# Basic Spatial Analysis with Python

## Overview
In this lecture, we will investigate the differences between Euclidean distance and Manhattan distance, and between a buffer and a convex hull. As an example, we will identify the census block groups in Champaign County that have access to healthcare resources (i.e., hospitals, emergency medical services, and urgent care facilities).

## Data
* Census block group: https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2020&layergroup=Block+Groups
* Healthcare resources: 
    * Hospitals: https://hifld-geoplatform.opendata.arcgis.com/maps/hospitals/about
    * Emergency Medical Service (EMS) Stations: https://hifld-geoplatform.opendata.arcgis.com/datasets/geoplatform::emergency-medical-service-ems-stations/about
    * Urgent Care Facilities: https://hifld-geoplatform.opendata.arcgis.com/datasets/geoplatform::urgent-care-facilities/about

## A glance at the differences
### Euclidean distance (4058 feet) vs Manhattan distance (1.2 Miles; 6336 feet)
<img src="./data/euclidean_vs_manhattan.jpg" style="width: 600px;"/>
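
As a rough illustration of the two measures, the sketch below computes both for a pair of made-up planar coordinates in feet. The points are hypothetical and are not the locations shown in the figure. Manhattan distance is taken here as the sum of the axis-aligned offsets, which is what travel along a rectangular street grid amounts to.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def manhattan(p, q):
    """Distance when movement is restricted to the x and y directions."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

# Hypothetical origin and destination in a projected CRS (units: feet)
origin = (0.0, 0.0)
destination = (3000.0, 4000.0)

print(euclidean(origin, destination))   # 5000.0
print(manhattan(origin, destination))   # 7000.0
```

Manhattan distance is never shorter than Euclidean distance for the same pair of points, which is why a distance threshold reaches fewer places along a street grid than along a straight line.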

### Buffer (blue) vs Convex Hull (red)
<img src="./data/convex_hull_vs_buffer.jpg" style="width: 400px;"/>

## 1. Data Preparation

In [None]:
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import os
import warnings
warnings.filterwarnings('ignore')

In [None]:
# Healthcare resources
hc = gpd.read_file('./data/healthcare.shp')
hc.head()

In [None]:
# Census block groups
cbg = gpd.read_file('./data/census_block_group.shp')
cbg.head()

Let's examine the geographical distribution of healthcare resources and census block groups. 
For details, see `Week6/Geospatial_Data_Visualization.ipynb`. <br>
**Note**: We can specify the drawing order of layers with the `zorder` parameter. A layer with a higher `zorder` is drawn on top. 

In [None]:
# Plot results
fig, ax = plt.subplots(figsize=(7, 10))

hc.plot(ax=ax, column='TYPE', markersize=100, legend=True, zorder=2)
cbg.boundary.plot(ax=ax, linestyle='dotted', lw=0.5, color='black', zorder=1)

plt.show()

You will find that the units of the map coordinates are decimal degrees (longitude and latitude). Because we want to measure the distance between census block groups and healthcare resources, we need to reproject the two GeoDataFrames (`hc` and `cbg`) from a geographic coordinate system (NAD83; EPSG 4269) to a projected coordinate system (SPCS83 Illinois East zone, in meters; EPSG 26971). 
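
To see why decimal degrees are a poor unit for distance, the following back-of-the-envelope sketch estimates the ground length of one degree near Champaign County. It assumes a spherical Earth with a mean radius of 6,371 km and an approximate latitude of 40.1°N; both values are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean radius of a spherical Earth
LATITUDE_DEG = 40.1        # approximate latitude of Champaign County (assumption)

# Ground length of one degree of latitude (nearly constant everywhere)
km_per_deg_lat = math.pi * EARTH_RADIUS_KM / 180

# Ground length of one degree of longitude shrinks with the cosine of the latitude
km_per_deg_lon = km_per_deg_lat * math.cos(math.radians(LATITUDE_DEG))

print(f'1 degree of latitude  = {km_per_deg_lat:.1f} km')
print(f'1 degree of longitude = {km_per_deg_lon:.1f} km')
```

Because one degree covers different ground distances in the east-west and north-south directions, a buffer of a fixed number of degrees would not be a circle on the ground. A projected CRS in meters avoids this problem.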

In [None]:
hc.crs

In [None]:
hc = hc.to_crs(epsg=26971)
cbg = cbg.to_crs(epsg=26971)
hc.crs

## 2. Simple Buffer Analysis

First of all, let's determine how far we can travel within a given time. Here, we assume that **10 minutes** is the threshold travel time and **30 MPH** is the travel speed in the study area. Therefore, the travel distance is 5 miles. 

\begin{gather*}
\text{Distance} = \text{Speed} \times \text{Time}\\
\\
5\ \text{miles} = 30\ \text{MPH} \times \frac{10\ \text{minutes}}{60\ \text{minutes/hour}}
\end{gather*}

In [None]:
travel_time = 10
dist = 30 * travel_time / 60

# Convert miles to meters to match the units of the coordinate system (EPSG 26971).
# Note: 1.6 km per mile is a rounded factor (the exact value is 1.609344).
dist = dist * 1.6 * 1000
print(f'{dist} meters is the threshold distance that can be traveled within {travel_time} minutes.')
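
The cell above uses the rounded factor of 1.6 km per mile. As a small sketch, the hypothetical helper below (`travel_distance_m` is not part of the lecture code) wraps the same calculation and defaults to the exact definition of 1 mile = 1,609.344 meters, so that the two results can be compared.

```python
METERS_PER_MILE = 1609.344  # exact definition of the international mile

def travel_distance_m(speed_mph, minutes, meters_per_mile=METERS_PER_MILE):
    """Distance in meters covered at speed_mph for the given number of minutes."""
    miles = speed_mph * minutes / 60
    return miles * meters_per_mile

exact = travel_distance_m(30, 10)
rounded = travel_distance_m(30, 10, meters_per_mile=1600)

print(f'exact: {exact:.2f} m, rounded: {rounded:.2f} m')
```

The two values differ by less than 50 meters, which is negligible at this scale, so the rest of the notebook keeps the rounded 8,000-meter threshold.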

There are several ways to create a buffer, and they all produce the same result. 

In [None]:
# Call buffer from GeoDataFrame
hc_buffer = hc.buffer(dist)
print(type(hc_buffer))
hc_buffer

In [None]:
# Call buffer from a GeoSeries (it has to be the geometry column)
hc_buffer = hc.geometry.buffer(dist)
hc_buffer

In [None]:
# Iterate through the GeoDataFrame and call buffer
for idx, row in hc.iterrows():
    print(row['geometry'].buffer(dist))

In [None]:
# Plot two layers
fig, ax = plt.subplots(figsize=(7, 10))

hc_buffer.boundary.plot(ax=ax, color='blue', lw=0.5, zorder=1)
hc.plot(ax=ax, column='TYPE', markersize=100, legend=True, zorder=2)
cbg.boundary.plot(ax=ax, linestyle='dotted', lw=0.5, color='black', zorder=1)

plt.show()

Given that the type of `hc_buffer` is `GeoSeries` (not `GeoDataFrame`), you can iterate over its elements with `GeoSeries.items()`. The older `iteritems()` method was removed in pandas 2.0.

In [None]:
for idx, buf in hc_buffer.items():
    print(buf)

In [None]:
# The following code will not run because a GeoSeries has no .iterrows() method.
# for idx, buf in hc_buffer.iterrows():
#     print(buf)

You may remember that you can filter a `GeoDataFrame` with a boolean mask, as in the pattern below. Let's see how we can select the census block groups within each buffer.  
```python
cbg.loc[cbg.geometry.within()]
```
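
To make the `within` test concrete, here is a simplified, pure-Python sketch that does not use GeoPandas. It treats a buffer as an ideal circle and relies on the fact that a polygon lies inside a circle exactly when all of its vertices do. The function name `polygon_within_circle` and all coordinates are made up for illustration.

```python
import math

def polygon_within_circle(vertices, center, radius):
    """True if every vertex (and therefore the whole polygon) is inside the circle."""
    return all(math.dist(v, center) <= radius for v in vertices)

center, radius = (0.0, 0.0), 8000.0   # hypothetical facility and 8 km buffer

inside_block = [(1000, 1000), (2000, 1000), (2000, 2000), (1000, 2000)]
straddling_block = [(7000, 0), (9000, 0), (9000, 2000), (7000, 2000)]

print(polygon_within_circle(inside_block, center, radius))      # True
print(polygon_within_circle(straddling_block, center, radius))  # False
```

The second block overlaps the buffer but is not selected, which suggests why `within` tends to leave out large block groups near the edge of a buffer.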

In [None]:
# `buf` still holds the last buffer from the loop above
cbg.loc[cbg.geometry.within(buf)]

In [None]:
within_cbg = []
for idx, buf in hc_buffer.items():
    temp_gdf = cbg.loc[cbg.geometry.within(buf)]
    within_cbg.extend(temp_gdf['GEOID'].to_list())
    
within_cbg

You will notice that the resulting list `within_cbg` is longer than the total number of census block groups in `cbg`. This is because of duplicates: a block group that lies within several overlapping buffers is appended once for each of them. `set()` keeps only the unique values. 
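
The duplication is easy to reproduce with plain lists. In this sketch the GEOIDs are made-up placeholders, and each inner list stands for the block groups found within one buffer.

```python
# Hypothetical GEOIDs selected by three overlapping buffers
selected_per_buffer = [
    ['A', 'B', 'C'],
    ['B', 'C', 'D'],
    ['C', 'D'],
]

collected = []
for geoids in selected_per_buffer:
    collected.extend(geoids)   # same pattern as within_cbg.extend(...)

print(len(collected))          # 8
print(len(set(collected)))     # 4
print(sorted(set(collected)))  # ['A', 'B', 'C', 'D']
```

A `set` does not preserve order, which is fine here because the values are only used for membership tests.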

In [None]:
len(within_cbg)

In [None]:
len(cbg)

In [None]:
len(set(within_cbg))

Now, we will select only the census block groups within the buffers. Let's use `.loc[]` to obtain the result. <br>
**Note**: You can invert the selection by adding `~` in front of the condition.  

In [None]:
cbg.loc[cbg['GEOID'].isin(set(within_cbg))]

In [None]:
cbg.loc[~cbg['GEOID'].isin(set(within_cbg))]

In [None]:
y_buffer = cbg.loc[cbg['GEOID'].isin(set(within_cbg))]
n_buffer = cbg.loc[~cbg['GEOID'].isin(set(within_cbg))]

In [None]:
# Plot results
fig, ax = plt.subplots(figsize=(7, 10))

hc_buffer.boundary.plot(ax=ax, color='blue', lw=0.5, zorder=2)
y_buffer.plot(ax=ax, color='#ef8a62', zorder=1)
n_buffer.plot(ax=ax, color='#bababa', zorder=1)

hc.plot(ax=ax, markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=ax, linestyle='dotted', lw=0.5, color='black', zorder=1)

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

plt.show()

---
### *Exercise*

Suppose that we want to examine how sensitive the result is to both the travel time and the method for selecting geographical units. <br>
Here, we want to **1)** increase the travel time from 10 minutes to 15 minutes and **2)** use the `intersects` function instead of the `within` function. Examine the following code and make the necessary changes. You can check your answer with the cell below. 

```python
travel_time_15 = 10  # Travel time (minutes)
dist_15 = 30 * travel_time_15 / 60  # Distance in miles (30 MPH * hours)
dist_15 = dist_15 * 1.6 * 1000  # Convert miles to meters (1 mile is about 1.6 km)
print(f'{dist_15} meters is the threshold distance that can be traveled within {travel_time_15} minutes.')

# Create buffers
hc_buffer_15 = hc.geometry.buffer(dist_15)

# Collect GEOIDs of accessible census block groups
within_cbg = []
for idx, buf in hc_buffer_15.items():
    temp_gdf = cbg.loc[cbg.geometry.within(buf)]
    within_cbg.extend(temp_gdf['GEOID'].to_list())

# Split the original census block group GeoDataFrame
# by whether each block group is accessible or not.
y_buffer_15 = cbg.loc[cbg['GEOID'].isin(set(within_cbg))]
n_buffer_15 = cbg.loc[~cbg['GEOID'].isin(set(within_cbg))]

```
---

In [None]:
# Your answer here (Modify the following code)

travel_time_15 = 10  # Travel time (minutes)
dist_15 = 30 * travel_time_15 / 60  # Distance in miles (30 MPH * hours)
dist_15 = dist_15 * 1.6 * 1000  # Convert miles to meters (1 mile is about 1.6 km)
print(f'{dist_15} meters is the threshold distance that can be traveled within {travel_time_15} minutes.')

# Create buffers
hc_buffer_15 = hc.geometry.buffer(dist_15)

# Collect GEOIDs of accessible census block groups
within_cbg = []
for idx, buf in hc_buffer_15.items():
    temp_gdf = cbg.loc[cbg.geometry.within(buf)]
    within_cbg.extend(temp_gdf['GEOID'].to_list())

# Split the original census block group GeoDataFrame
# by whether each block group is accessible or not.
y_buffer_15 = cbg.loc[cbg['GEOID'].isin(set(within_cbg))]
n_buffer_15 = cbg.loc[~cbg['GEOID'].isin(set(within_cbg))]

**Check your answer with the cell below. Your output should look similar to the figure below.**

![](./data/exercise_1.jpg)

In [None]:
# Check your answer: Plot results
fig, axes = plt.subplots(1, 2, figsize=(15, 10))

# Within 10 minutes (8km) buffer
hc_buffer.boundary.plot(ax=axes[0], color='blue', lw=0.5, zorder=2)
y_buffer.plot(ax=axes[0], color='#ef8a62', zorder=1)
n_buffer.plot(ax=axes[0], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[0], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[0], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[0].set_title('Within 10 minutes (8km) buffer')
axes[0].get_xaxis().set_visible(False)
axes[0].get_yaxis().set_visible(False)

# Intersects with 15 minutes (12km) buffer
hc_buffer_15.boundary.plot(ax=axes[1], color='blue', lw=0.5, zorder=2)
y_buffer_15.plot(ax=axes[1], color='#ef8a62', zorder=1)
n_buffer_15.plot(ax=axes[1], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[1], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[1], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[1].set_title('Intersects with 15 minutes (12km) buffer')
axes[1].get_xaxis().set_visible(False)
axes[1].get_yaxis().set_visible(False)

plt.show()

You may think that having a large number of buffers makes the map cluttered and want to dissolve (union) them. You can use `.unary_union` for this purpose. <br>
**Note**: The result of `unary_union` is a `shapely` geometry. You need to convert it back to a `GeoSeries` or `GeoDataFrame` for plotting. 

In [None]:
hc_buffer_union = hc.geometry.buffer(dist).unary_union
print(type(hc_buffer_union))
# hc_buffer_union.plot()   # Will raise an error: a shapely geometry has no .plot() method
hc_buffer_union

In [None]:
hc_buffer_union_1 = gpd.GeoSeries(hc_buffer_union)
print(type(hc_buffer_union_1))
hc_buffer_union_1.plot()

Given that the aggregated buffer is a single geometry, you don't need to iterate over every row to select census block groups. Instead, you can use the following one-liner. 

In [None]:
cbg.loc[cbg.geometry.within(hc_buffer_union)]

The following shows that the two approaches (i.e., iteration and unary_union) produce the same outcome. 

In [None]:
y_buffer_1 = cbg.loc[cbg.geometry.within(hc_buffer_union)]
n_buffer_1 = cbg.loc[~cbg.geometry.within(hc_buffer_union)]

In [None]:
# Plot two layers
fig, axes = plt.subplots(1, 2, figsize=(15, 10))

# Iteration approach
hc_buffer.boundary.plot(ax=axes[0], color='blue', lw=0.5, zorder=2)
y_buffer.plot(ax=axes[0], color='#ef8a62', zorder=1)
n_buffer.plot(ax=axes[0], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[0], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[0], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[0].set_title('Iteration approach')
axes[0].get_xaxis().set_visible(False)
axes[0].get_yaxis().set_visible(False)

# Union Approach
hc_buffer_union_1.boundary.plot(ax=axes[1], color='blue', lw=0.5, zorder=2)
y_buffer_1.plot(ax=axes[1], color='#ef8a62', zorder=1)
n_buffer_1.plot(ax=axes[1], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[1], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[1], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[1].set_title('Union approach')
axes[1].get_xaxis().set_visible(False)
axes[1].get_yaxis().set_visible(False)

plt.show()

In [None]:
# In summary, you just need three lines of code to do the buffer analysis. 
hc_buffer_union = hc.geometry.buffer(dist).unary_union

y_buffer_1 = cbg.loc[cbg.geometry.within(hc_buffer_union)]
n_buffer_1 = cbg.loc[~cbg.geometry.within(hc_buffer_union)]

---
### *Exercise*

Here, we also want to **1)** increase the travel time from 10 minutes to 15 minutes and **2)** use the `intersects` function instead of the `within` function. Examine the following code and make the necessary changes. You can check your answer with the cell below. 

```python
hc_buffer_union_15 = hc.geometry.buffer(dist).unary_union  # Create buffers and make a union

# Split the original census block group GeoDataFrame
# by whether each block group is accessible or not.
y_buffer_15_1 = cbg.loc[cbg.geometry.within(hc_buffer_union)]
n_buffer_15_1 = cbg.loc[~cbg.geometry.within(hc_buffer_union)]
```
---

In [None]:
# Your answer here (Modify the following code)
hc_buffer_union_15 = hc.geometry.buffer(dist).unary_union  # Create buffers and make a union

# Split the original census block group GeoDataFrame
# by whether each block group is accessible or not.
y_buffer_15_1 = cbg.loc[cbg.geometry.within(hc_buffer_union)]
n_buffer_15_1 = cbg.loc[~cbg.geometry.within(hc_buffer_union)]

**Check your answer with the cell below. Your output should look similar to the figure below.**

![](./data/exercise_2.jpg)

In [None]:
# Plot two layers
fig, axes = plt.subplots(1, 2, figsize=(15, 10))

# Union approach: within the 10-minute (8km) buffer
hc_buffer_union_1.boundary.plot(ax=axes[0], color='blue', lw=0.5, zorder=2)
y_buffer_1.plot(ax=axes[0], color='#ef8a62', zorder=1)
n_buffer_1.plot(ax=axes[0], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[0], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[0], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[0].set_title('Within 10 minutes (8km) buffer')
axes[0].get_xaxis().set_visible(False)
axes[0].get_yaxis().set_visible(False)


# Union approach: intersects with the 15-minute (12km) buffer
gpd.GeoSeries(hc_buffer_union_15).boundary.plot(ax=axes[1], color='blue', lw=0.5, zorder=2)
y_buffer_15_1.plot(ax=axes[1], color='#ef8a62', zorder=1)
n_buffer_15_1.plot(ax=axes[1], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[1], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[1], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[1].set_title('Intersects with 15 minutes (12km) buffer')
axes[1].get_xaxis().set_visible(False)
axes[1].get_yaxis().set_visible(False)


plt.show()

## 3. Euclidean distance based on OD (Origin-Destination) Matrix

The buffer analysis can underestimate or overestimate access. For example, the upper-left buffer does not select any census block group, because no block group lies entirely within it. <br>
To increase the accuracy of the analysis, we can measure the distance between every census block group and every healthcare resource. Here, we take advantage of the `.distance()` method of `shapely` geometries and represent each census block group by its centroid. 
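
Before applying this to the real data, the idea of an OD matrix can be sketched with a few made-up coordinates in meters. The names `facilities` and `centroids` and their values are hypothetical; `math.dist` plays the role that `.distance()` plays for `shapely` points.

```python
import math

threshold = 8000  # meters, the same threshold as `dist` in this notebook

# Hypothetical facility locations and block-group centroids (projected, meters)
facilities = {'H1': (0, 0), 'H2': (20000, 0)}
centroids = {'cbg_a': (3000, 4000), 'cbg_b': (12000, 0), 'cbg_c': (20000, 6000)}

# OD matrix: distance from every facility (origin) to every centroid (destination)
od = {
    (f, c): math.dist(f_xy, c_xy)
    for f, f_xy in facilities.items()
    for c, c_xy in centroids.items()
}

# A block group is accessible if at least one facility is closer than the threshold
accessible = sorted({c for (f, c), d in od.items() if d < threshold})
print(accessible)  # ['cbg_a', 'cbg_c']
```

Note that `cbg_b` is exactly 8,000 meters from `H2` and is excluded, because the comparison is a strict "less than".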


In [None]:
hc.head()

In [None]:
cbg.head()

In [None]:
hc.at[0, 'geometry']

In [None]:
cbg.at[0, 'geometry']

In [None]:
cbg.at[0, 'geometry'].centroid

In [None]:
hc.at[0, 'geometry'].distance(cbg.at[0, 'geometry'].centroid)

In [None]:
# Plot results
fig, ax = plt.subplots(figsize=(7, 10))

# Create points of origin and destination and connect them. 
ori = hc.at[0, 'geometry']
dest = cbg.at[0, 'geometry'].centroid

plt.plot(ori.x, ori.y, 'ro')
plt.plot(dest.x, dest.y, 'ro')
plt.plot([ori.x, dest.x], [ori.y, dest.y], color='black', lw=2)

# Decoration
hc.plot(ax=ax, markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=ax, linestyle='dotted', lw=0.5, color='black', zorder=1)

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

plt.show()

In [None]:
# Calculate Euclidean distance between every healthcare resource and census block group
within_eucli = []

for idx_h, row_h in hc.iterrows():  # Iterate through healthcare resources
    for idx_c, row_c in cbg.iterrows():  # Iterate through census block groups
        
        temp_dist = row_h.geometry.distance(row_c.geometry.centroid)  # Measure distance between two locations
        print(f'From HC {idx_h} to CBG {idx_c}, Distance: {round(temp_dist)} m' )
        
        if temp_dist < dist:  # dist: 8000 meters
            within_eucli.append(row_c['GEOID'])    # append GEOID of a CBG if the distance is less than the threshold


In [None]:
set(within_eucli)

In [None]:
# Select census block groups within the distance 
y_eucli = cbg.loc[cbg['GEOID'].isin(set(within_eucli))]
n_eucli = cbg.loc[~cbg['GEOID'].isin(set(within_eucli))]

In [None]:
# Plot Euclidean distance result
fig, ax = plt.subplots(figsize=(7, 10))

hc_buffer_union_1.boundary.plot(ax=ax, color='blue', lw=0.5, zorder=2)
y_eucli.plot(ax=ax, color='#ef8a62', zorder=1)
n_eucli.plot(ax=ax, color='#bababa', zorder=1)

# Decoration
hc.plot(ax=ax, markersize=50, color='blue', zorder=2)
cbg.centroid.plot(ax=ax, markersize=5, color='black')  # Centroids of Census Block Groups
cbg.boundary.plot(ax=ax, linestyle='dotted', lw=0.5, color='black', zorder=1)

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

plt.show()

In [None]:
# In summary, the following is the necessary code for the euclidean distance approach.

# Calculate euclidean distance between every healthcare and census block group
within_eucli = []

for idx_h, row_h in hc.iterrows():  # Iterate through healthcare resources
    for idx_c, row_c in cbg.iterrows():  # Iterate through census block groups
        
        temp_dist = row_h.geometry.distance(row_c.geometry.centroid)  # Measure distance between two locations
       
        if temp_dist < dist:  # dist: 8000 meters   
            within_eucli.append(row_c['GEOID'])  # append GEOID of a CBG if the distance is less than the threshold

# Select census block groups within the distance 
y_eucli = cbg.loc[cbg['GEOID'].isin(set(within_eucli))]
n_eucli = cbg.loc[~cbg['GEOID'].isin(set(within_eucli))]

## 4. Manhattan Distance based on OD Matrix

Given that people actually travel along the road network, Manhattan distance (measured here as the shortest-path distance along the road network) provides more accurate information. <br>
For this purpose, we will employ two packages, `osmnx` and `networkx`. 
* <a href=https://osmnx.readthedocs.io/en/stable/>`osmnx`</a> is a Python package to retrieve, model, analyze, and visualize street networks from OpenStreetMap. It is built on top of `networkx`. Users can download and model walkable, drivable, or bikeable urban networks with a single line of Python code, and then easily analyze and visualize them. 
* <a href=https://networkx.org/>`NetworkX`</a> is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.



In [None]:
# Import necessary packages
import osmnx as ox
import networkx as nx
from tqdm import tqdm

### 4.1. Data preprocessing

We can import a road network from anywhere in the world with <a href=https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.graph.graph_from_place>`ox.graph_from_place()`</a>. The returned object is a `networkx` graph (`MultiDiGraph`). 

In [None]:
G = ox.graph_from_place('Champaign County, IL, USA', network_type='drive', simplify=True)
G

In [None]:
ox.plot_graph(G)

We need to match the CRS to measure distances appropriately; therefore, we project the graph to EPSG 26971.

In [None]:
G = ox.projection.project_graph(G, to_crs='epsg:26971')
ox.plot_graph(G)

The next step is to find the nearest OSM node for each feature in `hc` and `cbg`, so that we can run network analysis on the OSM network. 
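
Conceptually, snapping a location to the network means finding the node with the smallest straight-line distance to it. The brute-force sketch below shows that idea with made-up node IDs and coordinates. It is only an illustration and not how `osmnx` is implemented; a real library would likely rely on a spatial index for speed.

```python
import math

# Hypothetical network nodes: node id -> (x, y) in a projected CRS (meters)
toy_nodes = {101: (0, 0), 102: (500, 0), 103: (500, 400), 104: (0, 400)}

def nearest_node(nodes_xy, x, y):
    """Return the id of the node closest to (x, y) and its distance."""
    node_id = min(nodes_xy, key=lambda n: math.dist(nodes_xy[n], (x, y)))
    return node_id, math.dist(nodes_xy[node_id], (x, y))

node_id, snap_dist = nearest_node(toy_nodes, 460, 370)
print(node_id, round(snap_dist, 1))  # 103 50.0
```

The snapping distance itself is not added to the network distance in this notebook, so a location far from any road is treated as if it sat on its nearest node.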

In [None]:
hc.at[0, 'geometry'] # Geometry of the Carle Foundation Hospital

In [None]:
# This function returns the OSM id of the node
ox.distance.nearest_nodes(G=G, 
                          X=hc.at[0, 'geometry'].x, 
                          Y=hc.at[0, 'geometry'].y,
#                           return_dist=True
                         )

In [None]:
# This function finds the nearest OSM node for each row of a given GeoDataFrame.
# If the geometry type is Point, it is used without modification, but 
# if the geometry type is Polygon or MultiPolygon, its centroid is used to find the nearest node. 

def find_nearest_osm(network, gdf):
    for idx, row in tqdm(gdf.iterrows(), total=gdf.shape[0]):
        if row.geometry.geom_type == 'Point':
            nearest_osm = ox.distance.nearest_nodes(network, 
                                                    X=row.geometry.x, 
                                                    Y=row.geometry.y
                                                   )
        elif row.geometry.geom_type == 'Polygon' or row.geometry.geom_type == 'MultiPolygon':
            nearest_osm = ox.distance.nearest_nodes(network, 
                                        X=row.geometry.centroid.x, 
                                        Y=row.geometry.centroid.y
                                       )
        else:
            print(row.geometry.geom_type)
            continue

        gdf.at[idx, 'nearest_osm'] = nearest_osm

    return gdf

In [None]:
hc = find_nearest_osm(G, hc)
cbg = find_nearest_osm(G, cbg)

In [None]:
hc.head()

In [None]:
cbg.head()

It is also possible to convert the OSM road network to GeoPandas GeoDataFrame with `ox.graph_to_gdfs()`. 

In [None]:
nodes, edges = ox.graph_to_gdfs(G, nodes=True, edges=True, node_geometry=True)

In [None]:
edges.head()

In [None]:
nodes.head()

### 4.2. Calculate the shortest path between two locations

We can calculate the shortest path between two locations and obtain two different results: the path itself and its length.
* <a href=https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path.html>nx.shortest_path()</a>: Compute shortest paths in the graph.
* <a href=https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.shortest_path_length.html>nx.shortest_path_length()</a>: Compute shortest path lengths in the graph.
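
To show what these two functions compute, here is a compact Dijkstra sketch on a made-up four-node graph, written with the standard library only. The graph, the node names, and the helper `dijkstra_path` are illustrative assumptions, not the `networkx` implementation.

```python
import heapq

# Hypothetical road graph: node -> {neighbor: edge length in meters}
toy_graph = {
    'A': {'B': 400, 'C': 1000},
    'B': {'A': 400, 'C': 300, 'D': 900},
    'C': {'A': 1000, 'B': 300, 'D': 200},
    'D': {'B': 900, 'C': 200},
}

def dijkstra_path(graph, source, target):
    """Return (path, length) of the shortest path from source to target."""
    best = {source: 0}
    previous = {}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            break
        if d > best[node]:
            continue  # stale queue entry
        for neighbor, length in graph[node].items():
            candidate = d + length
            if candidate < best.get(neighbor, float('inf')):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in best:
        raise ValueError(f'{target} is not reachable from {source}')
    # Walk back from the target to rebuild the path
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]

path, length = dijkstra_path(toy_graph, 'A', 'D')
print(path, length)  # ['A', 'B', 'C', 'D'] 900
```

The list of nodes corresponds to what a shortest-path query returns, and the summed edge length corresponds to a shortest-path-length query. The direct edge from `A` to `C` is skipped because the detour through `B` is shorter.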

In [None]:
hc.loc[0]

In [None]:
hc.loc[1]

In [None]:
# Returns the node ids of the shortest path
routes = nx.shortest_path(G=G, 
                          source=hc.loc[0, 'nearest_osm'], 
                          target=hc.loc[1, 'nearest_osm'], 
                          weight='length',
                          method='dijkstra'
                         )
routes

In [None]:
ox.plot.plot_graph_route(G, routes)

In [None]:
# Returns the length of the shortest path
nx.shortest_path_length(G=G, 
                        source=hc.loc[0, 'nearest_osm'], 
                        target=hc.loc[1, 'nearest_osm'], 
                        weight='length',
                        method='dijkstra'
                       )

Verify the distance on Google Maps. 

https://www.google.com/maps/dir/Carle+Foundation+Hospital,+West+Park+Street,+Urbana,+IL/The+Pavilion+Foundation,+809+W+Church+St,+Champaign,+IL+61820/@40.1199771,-88.2449266,15z/data=!3m1!4b1!4m14!4m13!1m5!1m1!1s0x880cd7719b423a01:0x1cbc0832642e6bd9!2m2!1d-88.2155548!2d40.1169714!1m5!1m1!1s0x880cd0ba1673d039:0x1222076f6d85d29c!2m2!1d-88.2576659!2d40.1177169!3e0

In [None]:
# Calculate Manhattan distance between every healthcare and census block group
within_manh = []

for idx_h, row_h in tqdm(hc.iterrows(), total=hc.shape[0]):
    for idx_c, row_c in cbg.iterrows():
        
        temp_dist = nx.shortest_path_length(G=G, 
                                            source=row_h['nearest_osm'], 
                                            target=row_c['nearest_osm'], 
                                            weight='length',
                                            method='dijkstra'
                                           )
#         print(f'From HC {idx_h} to CBG {idx_c}, Distance: {round(temp_dist)} m' )
        
        if temp_dist < dist:
            within_manh.append(row_c['GEOID'])


In [None]:
# Select census block groups within the distance 
y_manh = cbg.loc[cbg['GEOID'].isin(set(within_manh))]
n_manh = cbg.loc[~cbg['GEOID'].isin(set(within_manh))]

In [None]:
# Plot Manhattan distance result
fig, ax = plt.subplots(figsize=(7, 10))

hc_buffer_union_1.boundary.plot(ax=ax, color='blue', lw=0.5, zorder=2)
y_manh.plot(ax=ax, color='#ef8a62', zorder=1)
n_manh.plot(ax=ax, color='#bababa', zorder=1)

# Decoration
hc.plot(ax=ax, markersize=50, color='blue', zorder=2)
cbg.centroid.plot(ax=ax, markersize=5, color='black')  # Centroids of Census Block Groups
cbg.boundary.plot(ax=ax, linestyle='dotted', lw=0.5, color='black', zorder=1)

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

plt.show()

In [None]:
# In summary, the following is the necessary code for the Manhattan distance approach.

# Obtain Network from Open Street Map and project to a local CRS
G = ox.graph_from_place('Champaign County, IL, USA', network_type='drive', simplify=True)
G = ox.projection.project_graph(G, to_crs='epsg:26971')

# Find the nearest OSM node from the given GeoDataFrame (NOTE: find_nearest_osm is a user-defined function)
hc = find_nearest_osm(G, hc)
cbg = find_nearest_osm(G, cbg)

# Calculate Manhattan distance between every healthcare and census block group
within_manh = []
for idx_h, row_h in tqdm(hc.iterrows(), total=hc.shape[0]):  # Iterate through healthcare resources
    for idx_c, row_c in cbg.iterrows():   # Iterate through census block groups
        
        # Measure Manhattan distance between two locations (actually between two OSM nodes)
        temp_dist = nx.shortest_path_length(G=G, source=row_h['nearest_osm'], target=row_c['nearest_osm'], weight='length', method='dijkstra')
        
        if temp_dist < dist:  # dist: 8000 meters
            within_manh.append(row_c['GEOID'])  # append GEOID of a CBG if the distance is less than the threshold          

# Select census block groups within the distance 
y_manh = cbg.loc[cbg['GEOID'].isin(set(within_manh))]
n_manh = cbg.loc[~cbg['GEOID'].isin(set(within_manh))]


## 5. Compare the results

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(20, 10))

# First approach: Buffer analysis
hc_buffer_union_1.boundary.plot(ax=axes[0], color='blue', lw=0.5, zorder=2)
y_buffer_1.plot(ax=axes[0], color='#ef8a62', zorder=1)
n_buffer_1.plot(ax=axes[0], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[0], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[0], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[0].set_title('Buffer analysis')
axes[0].get_xaxis().set_visible(False)
axes[0].get_yaxis().set_visible(False)

# Second approach: Euclidean distance OD matrix
hc_buffer_union_1.boundary.plot(ax=axes[1], color='blue', lw=0.5, zorder=2)
y_eucli.plot(ax=axes[1], color='#ef8a62', zorder=1)
n_eucli.plot(ax=axes[1], color='#bababa', zorder=1)

# Decoration
hc.plot(ax=axes[1], markersize=50, color='blue', zorder=2)
cbg.centroid.plot(ax=axes[1], markersize=5, color='black')  # Centroids of Census Block Groups
cbg.boundary.plot(ax=axes[1], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[1].set_title('Euclidean distance')
axes[1].get_xaxis().set_visible(False)
axes[1].get_yaxis().set_visible(False)

# Third approach: Manhattan distance OD matrix
hc_buffer_union_1.boundary.plot(ax=axes[2], color='blue', lw=0.5, zorder=2)
y_manh.plot(ax=axes[2], color='#ef8a62', zorder=1)
n_manh.plot(ax=axes[2], color='#bababa', zorder=1)

# Decoration
hc.plot(ax=axes[2], markersize=50, color='blue', zorder=2)
cbg.centroid.plot(ax=axes[2], markersize=5, color='black')  # Centroids of Census Block Groups
cbg.boundary.plot(ax=axes[2], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[2].set_title('Manhattan distance')
axes[2].get_xaxis().set_visible(False)
axes[2].get_yaxis().set_visible(False)

plt.show()

## 6. The most efficient and accurate way: Manhattan distance with Convex Hull

One caveat of calculating the Manhattan-distance OD matrix is its computational intensity: it is slow. <br>
One workaround is to find the nodes that are accessible from each healthcare resource and create their convex hull. This provides higher accuracy than the buffer analysis, while the computational speed is similar to it. 

* <a href=https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.weighted.single_source_dijkstra_path_length.html>nx.single_source_dijkstra_path_length</a>: Find shortest weighted path lengths in G from a source node.
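
The sketch below mimics the idea behind a single-source search with a distance cutoff: starting from one node, it records the shortest distance to every node that can be reached without exceeding the cutoff. The toy graph and the function `reachable_within` are assumptions made for illustration.

```python
import heapq

# Hypothetical road graph: node -> {neighbor: edge length in meters}
toy_graph = {
    'H': {'n1': 3000, 'n2': 5000},
    'n1': {'H': 3000, 'n3': 4000},
    'n2': {'H': 5000, 'n4': 4000},
    'n3': {'n1': 4000},
    'n4': {'n2': 4000},
}

def reachable_within(graph, source, cutoff):
    """Return {node: shortest distance} for nodes no farther than cutoff from source."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > best[node]:
            continue  # stale queue entry
        for neighbor, length in graph[node].items():
            candidate = d + length
            if candidate <= cutoff and candidate < best.get(neighbor, float('inf')):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best

print(reachable_within(toy_graph, 'H', cutoff=8000))
# {'H': 0, 'n1': 3000, 'n2': 5000, 'n3': 7000}
```

One search per facility is enough to find every reachable node, which is why this approach avoids the nested loop over all facility and block-group pairs.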

In [None]:
# This returns a dictionary with the OSM node id as key and the network distance as value.
# Only nodes within the cutoff distance (dist) of the source node are included.
temp_nodes = nx.single_source_dijkstra_path_length(G, hc.loc[0, 'nearest_osm'], cutoff=dist, weight='length')
temp_nodes

In [None]:
fig, ax = plt.subplots(figsize=(10, 10))

# Select nodes within threshold distance
nodes.loc[nodes.index.isin(temp_nodes.keys())].plot(ax=ax, color='black', markersize=1)
nodes.loc[~nodes.index.isin(temp_nodes.keys())].plot(ax=ax, color='grey', markersize=1)

gpd.GeoSeries(nodes.loc[nodes.index.isin(temp_nodes.keys()), 'geometry'].unary_union.convex_hull).boundary.plot(ax=ax, color='red', lw=1, zorder=2)
gpd.GeoSeries(hc.loc[0, 'geometry']).buffer(dist).boundary.plot(ax=ax, color='blue', lw=1, zorder=2)
gpd.GeoSeries(hc.loc[0, 'geometry']).plot(ax=ax, color='yellow', markersize=100)

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

In [None]:
nodes, edges = ox.graph_to_gdfs(G, nodes=True, edges=True, node_geometry=True)
nodes

In [None]:
access_nodes = nodes.loc[nodes.index.isin(temp_nodes.keys()), 'geometry']
access_nodes

In [None]:
access_nodes.unary_union  # Union all nodes
access_nodes.unary_union.convex_hull # Create convex hull from the unioned nodes
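
For intuition about what `.convex_hull` returns, the following sketch computes the convex hull of a handful of made-up node coordinates with Andrew's monotone chain algorithm. This is a textbook technique swapped in for illustration and is not necessarily the routine that `shapely` uses internally.

```python
def cross(o, a, b):
    """Z-component of the cross product OA x OB (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Hypothetical accessible nodes (projected coordinates)
toy_points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (3, 2)]
print(convex_hull(toy_points))  # [(0, 0), (4, 0), (4, 3), (0, 3)]
```

A convex hull fills in any concavities of the reachable area, so it can include places that are farther than the threshold along the network. It is best read as an approximation of the accessible area.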

Now we use a single (not nested) for loop to calculate the accessible area of every healthcare resource at once. 

In [None]:
hulls = []

for idx, row in tqdm(hc.iterrows(), total=hc.shape[0]):
    temp_nodes = nx.single_source_dijkstra_path_length(G, row['nearest_osm'], cutoff=dist, weight='length')
    access_nodes = nodes.loc[nodes.index.isin(temp_nodes.keys()), 'geometry']
    hulls.append(access_nodes.unary_union.convex_hull)  # one convex hull per healthcare resource

# Build the GeoSeries in one step (Series.append() was removed in pandas 2.0)
convex_hulls = gpd.GeoSeries(hulls, crs=nodes.crs)

convex_hulls

In [None]:
fig, ax = plt.subplots(figsize=(7, 10))

hc_buffer.boundary.plot(ax=ax, color='blue', lw=0.5, zorder=2)
convex_hulls.boundary.plot(ax=ax, color='red', lw=0.5, zorder=2)

## Decoration purpose
hc.plot(ax=ax, markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=ax, linestyle='dotted', lw=0.5, color='black', zorder=1)

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

We still don't want overlapping convex hulls for visualization purposes. Let's use `unary_union` again. 

In [None]:
convex_hulls_union = convex_hulls.unary_union
convex_hulls_union = gpd.GeoSeries(convex_hulls_union)
convex_hulls_union.plot()

In [None]:
y_convex_hull = cbg.loc[cbg.geometry.centroid.within(convex_hulls_union[0])]
n_convex_hull = cbg.loc[~cbg.geometry.centroid.within(convex_hulls_union[0])]

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(20, 10))

# First: Buffer analysis
hc_buffer_union_1.boundary.plot(ax=axes[0], color='blue', lw=0.5, zorder=2)
y_buffer_1.plot(ax=axes[0], color='#ef8a62', zorder=1)
n_buffer_1.plot(ax=axes[0], color='#bababa', zorder=1)

## Decoration purpose
hc.plot(ax=axes[0], markersize=50, color='blue', zorder=2)
cbg.boundary.plot(ax=axes[0], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[0].set_title('Buffer analysis')
axes[0].get_xaxis().set_visible(False)
axes[0].get_yaxis().set_visible(False)

# Second: Manhattan distance OD matrix
hc_buffer_union_1.boundary.plot(ax=axes[1], color='blue', lw=0.5, zorder=2)
y_manh.plot(ax=axes[1], color='#ef8a62', zorder=1)
n_manh.plot(ax=axes[1], color='#bababa', zorder=1)

# Decoration
hc.plot(ax=axes[1], markersize=50, color='blue', zorder=2)
cbg.centroid.plot(ax=axes[1], markersize=5, color='black')  # Centroids of Census Block Groups
cbg.boundary.plot(ax=axes[1], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[1].set_title('Manhattan distance')
axes[1].get_xaxis().set_visible(False)
axes[1].get_yaxis().set_visible(False)

# Third: Convex Hull
convex_hulls_union.boundary.plot(ax=axes[2], color='blue', lw=0.5, zorder=2)
y_convex_hull.plot(ax=axes[2], color='#ef8a62', zorder=1)
n_convex_hull.plot(ax=axes[2], color='#bababa', zorder=1)

# Decoration
hc.plot(ax=axes[2], markersize=50, color='blue', zorder=2)
cbg.centroid.plot(ax=axes[2], markersize=5, color='black')  # Centroids of Census Block Groups
cbg.boundary.plot(ax=axes[2], linestyle='dotted', lw=0.5, color='black', zorder=1)
axes[2].set_title('Convex Hull')
axes[2].get_xaxis().set_visible(False)
axes[2].get_yaxis().set_visible(False)

axes[0].set_xlim(axes[2].get_xlim())
axes[0].set_ylim(axes[2].get_ylim())

axes[1].set_xlim(axes[2].get_xlim())
axes[1].set_ylim(axes[2].get_ylim())

plt.show()