### Parks in Washington DC
* Washington, DC is a city of around 670,000 people (U.S. Census, 2022). Although the city is urban, parks make up approximately 20% of its land. The city has set a goal of over 40% tree canopy cover by 2035 (National Capital Planning Commission, n.d.). For more information on Washington, DC parks, see: https://www.ncpc.gov/docs/CapitalSpace_Washingtons_Parks_and_Open_Space.pdf

* Washington, DC is often regarded as having one of the best park systems in the country. It has consistently scored highly in the Trust for Public Land (TPL) ratings for park access and park equity (The Public Land Trust, 2022; Patino & Poon, 2021).

* However, others have noted variation across the city in park size, quality, and tree coverage. Large parks with greater tree coverage tend to be farther from areas with higher housing density (Urban Institute, 2022). Others have pointed to poor maintenance and inadequate resources in DC's neighborhood parks, due in part to federal control of the D.C. park system, which leaves D.C. parks competing with national parks for resources (Pusatory & Henry, 2023; GWToday.com, 2023).

<figure style="display: inline-block;  padding-left:100px; padding-bottom:20px; margin-bottom:20px">
    <img alt="Kenilworth Aquatic Gardens" src="https://upload.wikimedia.org/wikipedia/commons/6/65/Kenilworth_Aquatic_Gardens%2C_Washington%2C_D.C.jpg" style="float:center; height: 200px; vertical-align: top;"/>
    <figcaption style="text-align: center; height: 20px; vertical-align: top" > Kenilworth Aquatic Gardens
        <br><a href="https://upload.wikimedia.org/wikipedia/commons/6/65/Kenilworth_Aquatic_Gardens%2C_Washington%2C_D.C.jpg">Judy Gallagher</a>, CC by 2.0, via Wikimedia Commons
        </figcaption>
</figure>

<figure style="display: inline-block; padding-bottom:20px; margin-bottom:20px">
    <img src="https://live.staticflickr.com/3024/2995931336_78f99eb20a_3k.jpg" style="float:left; height: 200px; vertical-align: top;" alt="Rock Creek Park"/>
    <figcaption style="text-align: center; height: 20px; vertical-align: top" >Rock Creek Park
        <br>Source: <a href="https://www.flickr.com/photos/dionhinchcliffe/2995931336"> Dion Hinchcliffe</a>, CC by 2.0, via Flickr
        </figcaption>
</figure>


### NDVI In Washington DC Neighborhood Clusters
* This project uses National Agriculture Imagery Program (NAIP) data to examine NDVI in Washington, DC neighborhood clusters.
* The Normalized Difference Vegetation Index (NDVI) compares the near-infrared (NIR) and red light reflected at a location: NDVI = (NIR - Red) / (NIR + Red). Values range from -1 to 1, and higher values indicate that more healthy vegetation is present.
* Data on neighborhood clusters was obtained from the Washington DC Open Data website. NAIP data was downloaded and merged for each neighborhood cluster. NDVI summary statistics were calculated.
* The neighborhood cluster for the National Mall and the Potomac River was omitted due to the number of NAIP images associated with this cluster (n=18). I was unable to successfully merge the data for this cluster due to the number of images.
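
The NDVI calculation described above can be sketched on a few pixels with NumPy. This is a minimal illustration rather than the notebook's processing code: `compute_ndvi` is a hypothetical helper, and the red and near-infrared values are made up for the example.

```python
import numpy as np


def compute_ndvi(red, nir):
    """Return NDVI = (NIR - Red) / (NIR + Red), with NaN where NIR + Red is zero."""
    red = np.asarray(red, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    total = nir + red
    # Suppress divide-by-zero warnings; those pixels are set to NaN below
    with np.errstate(divide="ignore", invalid="ignore"):
        ndvi = np.where(total == 0, np.nan, (nir - red) / total)
    return ndvi


# Made-up band values for three pixels:
# vegetation-like, pavement-like, and a pixel with no signal
red = [50, 120, 0]
nir = [200, 130, 0]
ndvi = compute_ndvi(red, nir)
print(ndvi)
```

The first pixel reflects far more near-infrared than red light, so its NDVI (0.6) is much higher than that of the second pixel (0.04). The third pixel has no signal in either band and is returned as NaN so that it does not distort summary statistics.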

<figure style="display: inline-block;  padding-left:100px; padding-bottom:0px">
    <img alt="National Arboretum" src="https://live.staticflickr.com/65535/51146464327_adc103a3a4_3k.jpg" style="float:center; width: 650px; vertical-align: top;"/>
    <figcaption style="text-align: center; width: 650px; vertical-align: top" > National Arboretum
        <br><a href="https://www.flickr.com/photos/22711505@N05/51146464327">Ron Cogswell</a>, CC by 2.0, via Flickr
        </figcaption>
</figure>

In [1]:
# Import libraries
import os
import shutil
from glob import glob

import earthpy as et
import earthpy.earthexplorer as etee
import geopandas as gpd
import geoviews as gv
import holoviews as hv
import numpy as np
import pandas as pd
import rioxarray as rxr
import rioxarray.merge as rxrmerge
from bokeh.models import HoverTool


In [2]:
# Define Data Directories for Project
data_dir = os.path.join(et.io.HOME, et.io.DATA_NAME)
dc_dir = os.path.join(data_dir, 'dc-neighborhoods')
ndvi_dir = os.path.join(data_dir, 'dc-green-space', 'processed')

# Create each directory if it does not already exist
for file_dir in [dc_dir, ndvi_dir]:
    os.makedirs(file_dir, exist_ok=True)

In [3]:
# Save url for DC Neighborhood boundaries
dc_url = ("https://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/"
           "Administrative_Other_Boundaries_WebMercator/MapServer/17/"
           "query?outFields=*&where=1%3D1&f=geojson")

# Read DC neighborhood boundaries (GeoJSON) into a GeoDataFrame
dc_nbd_gdf = gpd.read_file(dc_url)


In [4]:
# Define path to DC Neighborhood Data
dc_path = os.path.join(dc_dir, 'dc-neighborhood.geojson')


# If the data does not already exist, save data to directory
if not os.path.exists(dc_path):
    # Save url for DC Neighborhood boundaries
    dc_url = ("https://maps2.dcgis.dc.gov/dcgis/rest/services/DCGIS_DATA/"
            "Administrative_Other_Boundaries_WebMercator/MapServer/17/"
            "query?outFields=*&where=1%3D1&f=geojson"
    )
    # Save DC neighborhood data to a file
    gpd.read_file(dc_url).to_file(dc_path)

# Create GeoDataFrame of DC neighborhood data, indexed by cluster name
dc_gdf = gpd.read_file(dc_path).set_index("NAME")

# Select Two Clusters for Testing Data
neigh_gdf = (
    dc_gdf
    .loc[["Cluster 9", "Cluster 31"]]
)

In [5]:
def download_neighborhood_data(name, geometry, start, end):
    """
    Download NAIP raster for a given geometry, start date, and end date

    Downloads data from the National Agriculture Imagery Program (NAIP)
    given a spatial and temporal extent.
    <citation>

    Parameters
    ==========
    name : str
      The name used to label the download
    geometry : shapely.geometry.Polygon
      The geometry to derive the download extent from. 
      Must have a `.bounds` attribute.
    start : str
      The start date as 'YYYY-MM-DD'
    end : str
      The end date as 'YYYY-MM-DD'

    Returns
    =======
    downloader : earthpy.earthexplorer.EarthExplorerDownloader
      Object with information about the download, including the data directory.
    """
    print(f'Neighborhood Name: {name}')
    # Create bounding box
    bbox = etee.BBox(*geometry.bounds)
    # Create downloader
    naip_downloader = etee.EarthExplorerDownloader(
        dataset="NAIP", 
        label=name.lower().replace(" ", "-"),
        bbox=bbox,
        start=start,
        end=end,
        store_credential=True)
    # Request and download data
    naip_downloader.submit_download_request()
    naip_downloader.download(override=False)
    return naip_downloader

ndvi_stats_path = os.path.join(ndvi_dir, 'neighborhood-ndvi-stats.csv')
if os.path.exists(ndvi_stats_path):
    ndvi_stats_df = pd.read_csv(ndvi_stats_path, index_col="neighborhood")
else:
    print('NDVI Statistics File does not exist...')
    ndvi_stats_df = pd.DataFrame()

# # Run to test
# for neighborhood_name, details in neigh_gdf.iterrows():
#     if neighborhood_name in ndvi_stats_df.index:
#         print("Neighborhood stats have already been calculated. Skipping")
#         continue
#
#     downloader = download_neighborhood_data(
#         neighborhood_name, details.geometry, '2021-01-01', '2021-12-31')

In [6]:
def load_and_merge_arrays(name):
    """
    Load in and merge downloaded arrays
    
    Parameters
    ==========
    name : str
        The name used to label the download
    
    Returns
    =======
    merged_da : xarray.DataArray
        Data array with merged data.
    """
    # Merge data for each neighborhood
    print(f'\nNeighborhood Name: {name}')
    data_path = os.path.join(
        et.io.HOME, et.io.DATA_NAME,
        name.lower().replace(' ', '-'))
    # Define paths to tif data
    tif_paths = glob(os.path.join(data_path, '*.tif'))
    # Load tifs
    das = [rxr.open_rasterio(tp, masked=True) for tp in tif_paths]
    # Merge arrays
    merged_da = rxrmerge.merge_arrays(das)
    return merged_da

# # Run to test
# merged_da = load_and_merge_arrays('Cluster 9')
# merged_da

In [7]:
# Function to compute NDVI summary statistics
def calculate_ndvi_stats(gdf, da, stats_path, override=False):
    """
    Calculate NDVI summary statistics and save to statistics file

    Uses downloaded National Agriculture Imagery Program (NAIP) data.

    Parameters
    ==========
    gdf : gpd.GeoDataFrame
        Single row with the neighborhood name and boundary
    da : xarray.DataArray
        Multispectral raster data from NAIP
    stats_path : pathlike
        The path to the statistics file to save results
    override : bool
        If True, recalculate even if the neighborhood is already in the
        statistics file (the new row is appended, not replaced)
    """
    name = str(gdf.index[0])
    print(f"\nNeighborhood Name: {name}")

    # Treat a missing or zero-byte statistics file as empty
    file_is_empty = (
        not os.path.exists(stats_path) or os.path.getsize(stats_path) == 0)
    print(f"Stats file is empty? {file_is_empty}")

    # Skip neighborhoods that already have statistics unless overriding
    if not file_is_empty:
        stats_df = pd.read_csv(stats_path)
        if name in list(stats_df.neighborhood) and (not override):
            print("Stats already calculated. Skipping...")
            return

    # Reproject the neighborhood boundary to match the raster CRS
    reprojected_gdf = gdf.to_crs(da.rio.crs)
    # Crop NAIP data array to the neighborhood
    naip_crop_da = da.rio.clip_box(*reprojected_gdf.total_bounds)
    naip_da = naip_crop_da.rio.clip(reprojected_gdf.geometry)

    # Calculate NDVI from the clipped array (NAIP band 4 = NIR, band 1 = red)
    nir_da = naip_da.sel(band=4)
    red_da = naip_da.sel(band=1)
    ndvi_da = (nir_da - red_da) / (nir_da + red_da)

    print("Writing stats to file")
    # Write the header only when starting a new file; otherwise append
    mode = "w" if file_is_empty else "a"
    # Calculate summary statistics
    pd.DataFrame(
        dict(
            neighborhood=[name],
            ndvi_25pctl=[np.nanpercentile(ndvi_da, 25)],
            ndvi_75pctl=[np.nanpercentile(ndvi_da, 75)],
            ndvi_mean=[float(ndvi_da.mean())],
            ndvi_std=[float(ndvi_da.std())],
            ndvi_median=[float(ndvi_da.median())],
            ndvi_min=[float(ndvi_da.min())],
            ndvi_max=[float(ndvi_da.max())]
        )
    ).to_csv(stats_path, mode=mode, header=file_is_empty, index=False)


# # Run to test
# calculate_ndvi_stats(dc_gdf.loc[['Cluster 9']], merged_da, ndvi_stats_path)
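
The append-to-CSV step in `calculate_ndvi_stats` depends on writing the header exactly once. The sketch below isolates that pattern under simplifying assumptions: `append_stats_row` is a hypothetical helper, the file lives in a temporary directory, and the two rows contain made-up values rather than real results.

```python
import os
import tempfile

import pandas as pd


def append_stats_row(row, stats_path):
    """Append one row of statistics, writing the header only for a new or empty file."""
    # A missing or zero-byte file needs a header; any other file is appended to
    needs_header = (
        not os.path.exists(stats_path) or os.path.getsize(stats_path) == 0
    )
    pd.DataFrame([row]).to_csv(
        stats_path,
        mode="w" if needs_header else "a",
        header=needs_header,
        index=False,
    )


# Usage with made-up values (not real NDVI results)
demo_path = os.path.join(tempfile.mkdtemp(), "demo-ndvi-stats.csv")
append_stats_row({"neighborhood": "Cluster A", "ndvi_mean": 0.21}, demo_path)
append_stats_row({"neighborhood": "Cluster B", "ndvi_mean": 0.35}, demo_path)

demo_df = pd.read_csv(demo_path, index_col="neighborhood")
print(demo_df)
```

Checking the file size as well as its existence means a zero-byte file left behind by an interrupted run still receives a header, so later reads with `index_col="neighborhood"` keep working.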

In [8]:
# Dropping Cluster 45 (River and National Mall; there were 18 .tif files 
# and I was never able to get the merge step to go through without crashing)
dc_gdf = dc_gdf.drop('Cluster 45')

In [9]:
# Redefine NDVI stats path
ndvi_stats_path = os.path.join(ndvi_dir, 'neighborhood-ndvi-stats.csv')

# Loop through all DC neighborhoods to download data, merge data,
# and calculate NDVI statistics using functions
for neighborhood_name, details in dc_gdf.iterrows():
    # Reload the statistics on each pass so newly written rows are seen
    if not os.path.exists(ndvi_stats_path):
        print('NDVI statistics file does not exist...')
        ndvi_stats_df = pd.DataFrame()
    else:
        ndvi_stats_df = pd.read_csv(ndvi_stats_path, index_col='neighborhood')

    # Skip neighborhoods whose statistics have already been calculated
    if neighborhood_name in ndvi_stats_df.index:
        continue
        
    downloader = download_neighborhood_data(
        neighborhood_name, details.geometry, '2021-01-01', '2021-12-31')
    merged_da = load_and_merge_arrays(neighborhood_name)
    calculate_ndvi_stats(
        dc_gdf.loc[[neighborhood_name]], merged_da, ndvi_stats_path)
    
    shutil.rmtree(downloader.data_dir)

In [10]:
# Create local copy of csv file in repository
shutil.copyfile(ndvi_stats_path, "ndvi_summary_stats.csv")
None

In [11]:
# Read in NDVI Summary Statistics
ndvi_stats_df = pd.read_csv(ndvi_stats_path, index_col="neighborhood")

# Join NDVI statistics to the neighborhood boundaries
joined_dc_df = dc_gdf.join(ndvi_stats_df, how="left")
# Copy the neighborhood name (the index) to a column for use in the hover tool
joined_dc_df['name'] = joined_dc_df.index

# Define hover tool for Choropleth
tooltips = [
    ('Neighborhood Cluster', '@name'),
    ('Neighborhood Names', '@NBH_NAMES'),
    ('NDVI', '@ndvi_mean')
]
hover = HoverTool(tooltips=tooltips)

# Create Choropleth of NDVI Statistics
choropleth = gv.Polygons(
    joined_dc_df,
    vdims=['ndvi_mean', 'name', 'NBH_NAMES']
).opts(cmap="RdYlGn",
       title="NDVI in Washington, DC Neighborhoods",
       xaxis=None,
       yaxis=None,
       colorbar=True, 
       colorbar_position="right",
       tools=[hover],
       width=600,
       height=350) * gv.tile_sources.CartoLight 

# Save choropleth to HTML
hv.save(choropleth, 'dc_ndvi_choropleth.html')



### Summary of NDVI in DC Neighborhoods
* A map of NDVI in DC neighborhoods can be viewed at [the following page](https://pth6570.github.io/notebooks/dc_ndvi_choropleth.html).
* NDVI is lowest in the central areas of Washington, D.C.
* Neighborhoods near the edges of DC tend to have higher NDVI values.
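
One way to check these patterns against the statistics table is to rank clusters by mean NDVI. The sketch below is illustrative only: `example_stats_df` and its cluster names and values are made up, and in the notebook the same calls would be applied to `ndvi_stats_df`.

```python
import pandas as pd

# Made-up values for illustration only; the real table is ndvi_stats_df
example_stats_df = pd.DataFrame(
    {"ndvi_mean": [0.12, 0.31, 0.27, 0.08]},
    index=pd.Index(
        ["Cluster A", "Cluster B", "Cluster C", "Cluster D"],
        name="neighborhood",
    ),
)

# Sort ascending so the least vegetated clusters come first
ranked = example_stats_df.sort_values("ndvi_mean")
lowest = ranked.head(2)
highest = ranked.tail(2)

print("Lowest mean NDVI:")
print(lowest)
print("Highest mean NDVI:")
print(highest)
```

Comparing the cluster names in `lowest` and `highest` with their positions on the map gives a quick check on the central-versus-edge pattern.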

### Data Citation:


### Citations:
* Pusatory, M. & Henry, J. (2023, May 25). DC named best big-city park system, but some say data is skewed. WUSA9. Retrieved November 25, 2023 from: https://www.wusa9.com/article/tech/science/environment/dc-named-best-big-city-park-system-push-back/65-a919748b-4244-4ece-9977-d1c96ed35e5c
* GWToday.com (2023, May 8). Report Finds U.S. Control of D.C. Parks Exacerbates Inequities in the City. Retrieved November 25, 2023 from: https://gwtoday.gwu.edu/report-finds-us-control-dc-parks-exacerbates-inequities-city
* Patino, M. & Poon, L. (2021, May 27). The Inequality of American Parks. Bloomberg.com. Retrieved November 25, 2023 from: https://www.bloomberg.com/news/articles/2021-05-27/the-cities-where-people-of-color-can-walk-to-a-park
* The Public Land Trust (2022, May 30). Washington, DC, Named Best Big City Park System in USA, Lifted by Strong Scores for Park Access and Park Equity. Retrieved November 24, 2023 from: https://www.tpl.org/media-room/washington-dc-named-best-big-city-park-system-usa-lifted-strong-scores-park-access-and
* Urban Institute (2022, August 18). “Not All Parks Are Created Equal”: How Communities Can Ensure Parks Are Accessible for All Residents. Retrieved November 24, 2023 from: https://housingmatters.urban.org/feature/not-all-parks-are-created-equal-how-communities-can-ensure-parks-are-accessible-all
* National Capital Planning Commission (n.d.). About Washington’s Parks and Open Space. Retrieved November 25, 2023 from: https://www.ncpc.gov/docs/CapitalSpace_Washingtons_Parks_and_Open_Space.pdf

In [12]:
%%capture
%%bash
jupyter nbconvert multispectral_analysis.ipynb --to html --no-input