# STEP 6: Calculate zonal statistics

In order to evaluate the connection between vegetation health and
redlining, we need to summarize NDVI across the same geographic areas as
we have redlining information.

First, import variables from previous notebooks:

In [56]:
store -r denver_redlining_gdf ndvi_da

<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-task"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Try It: Import packages</div></div><div class="callout-body-container callout-body"><p>Some packages are included that will help you calculate statistics
for areas imported below. Add packages for:</p>
<ol type="1">
<li>Interactive plotting of tabular and vector data</li>
<li>Working with categorical data in <code>DataFrame</code>s</li>
</ol></div></div>

In [53]:
# Interactive plots with pandas
# Ordered categorical data
import regionmask # Convert shapefile to mask
from xrspatial import zonal_stats # Calculate zonal statistics
import cartopy.crs as ccrs # CRSs
import numpy as np

<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-task"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Try It: Convert vector to raster</div></div><div class="callout-body-container callout-body"><p>You can convert your vector data to a raster mask using the
<code>regionmask</code> package. You will need to give
<code>regionmask</code> the geographic coordinates of the grid you are
using for this to work:</p>
<ol type="1">
<li>Replace <code>gdf</code> with your redlining
<code>GeoDataFrame</code>.</li>
<li>Add code to put your <code>GeoDataFrame</code> in the same CRS as
your raster data.</li>
<li>Replace <code>x_coord</code> and <code>y_coord</code> with the x and
y coordinates from your raster data.</li>
</ol></div></div>

In [76]:
denver_redlining_gdf = denver_redlining_gdf.to_crs(ccrs.Mercator())

In [72]:
ndvi_da.coords['y'].values

array([4400025., 4399995., 4399965., 4399935., 4399905., 4399875.,
       4399845., 4399815., 4399785., 4399755., 4399725., 4399695.,
       4399665., 4399635., 4399605., 4399575., 4399545., 4399515.,
       4399485., 4399455., 4399425., 4399395., 4399365., 4399335.,
       4399305., 4399275., 4399245., 4399215., 4399185., 4399155.,
       4399125., 4399095., 4399065., 4399035., 4399005., 4398975.,
       4398945., 4398915., 4398885., 4398855., 4398825., 4398795.,
       4398765., 4398735., 4398705., 4398675., 4398645., 4398615.,
       4398585., 4398555., 4398525., 4398495., 4398465., 4398435.,
       4398405., 4398375., 4398345., 4398315., 4398285., 4398255.,
       4398225., 4398195., 4398165., 4398135., 4398105., 4398075.,
       4398045., 4398015., 4397985., 4397955., 4397925., 4397895.,
       4397865., 4397835., 4397805., 4397775., 4397745., 4397715.,
       4397685., 4397655., 4397625., 4397595., 4397565., 4397535.,
       4397505., 4397475., 4397445., 4397415., 4397385., 43973

In [73]:
# zones
denver_redlining_mask = regionmask.mask_geopandas(
    denver_redlining_gdf,
    lon_or_obj=ndvi_da.coords['x'].values,
    lat=ndvi_da.coords['y'].values,
    # The regions do not overlap
    overlap=False,
    # We're not using geographic coordinates
    wrap_lon=False
)

  return func(*args, **kwargs)


In [74]:
denver_redlining_mask

<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-task"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Try It: Calculate zonal statistics</div></div><div class="callout-body-container callout-body"><p>Calculate zonal status using the <code>zonal_stats()</code> function.
To figure out which arguments it needs, use either the
<code>help()</code> function in Python, or search the internet.</p></div></div>

In [45]:
# Calculate NDVI stats for each redlining zone

stats_df = zonal_stats(zones=denver_redlining_mask, values=ndvi_da)

stats_df

Unnamed: 0,zone,mean,max,min,sum,std,var,count


<link rel="stylesheet" type="text/css" href="./assets/styles.css"><div class="callout callout-style-default callout-titled callout-task"><div class="callout-header"><div class="callout-icon-container"><i class="callout-icon"></i></div><div class="callout-title-container flex-fill">Try It: Plot regional statistics</div></div><div class="callout-body-container callout-body"><p>Plot the regional statistics:</p>
<ol type="1">
<li>Merge the NDVI values into the redlining
<code>GeoDataFrame</code>.</li>
<li>Use the code template below to convert the <code>grade</code> column
(<code>str</code> or <code>object</code> type) to an ordered
<code>pd.Categorical</code> type. This will let you use ordered color
maps with the grade data!</li>
<li>Drop all <code>NA</code> grade values.</li>
<li>Plot the NDVI and the redlining grade next to each other in linked
subplots.</li>
</ol></div></div>

In [8]:
# Merge the NDVI stats with redlining geometry into one `GeoDataFrame`

# Change grade to ordered Categorical for plotting
gdf.grade = pd.Categorical(
    gdf.grade,
    ordered=True,
    categories=['A', 'B', 'C', 'D']
)

# Drop rows with NA grades
denver_ndvi_gdf = denver_ndvi_gdf.dropna()

# Plot NDVI and redlining grade in linked subplots

In [10]:
store denver_ndvi_gdf