# Lab 7: Accessibility (Cumulative Opportunity)

In this lab, we will measure accessibility using the **Cumulative Opportunity** method. This method evaluates accessibility by counting the opportunities (e.g., jobs, services) available within a specified distance or travel time from a given location. It can also account for the effect of distance, known as *distance decay*, by weighting each opportunity with a decay function.  
The method is defined as follows:

$$\LARGE O_i = {\sum_{j\in {\left\{{t_{ij}} \le {t_0} \right\}}}^{}{S_j}{f(t_{ij})}}$$

where:<br>
$O_i$: the cumulative opportunity of location $i$ (e.g., the number of hospitals accessible from location $i$). <br>
$S_j$: the degree of supply (e.g., number of doctors) at location $j$. <br>
$t_{ij}$: the travel time between locations $i$ and $j$. <br>
$t_0$: the threshold travel time of the analysis. <br>
$f(t_{ij})$: a distance decay function that decreases the weight of opportunities as the distance increases. <br>
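
As a rough illustration of the definition above, the sum can be written in a few lines of plain Python. This is only a sketch: the function name `cumulative_opportunity`, the hospital labels, and the travel times are hypothetical and are not part of the lab data. With the default decay $f(t_{ij}) = 1$, the result is simply the count of opportunities within the threshold.

```python
from typing import Callable, Dict


def cumulative_opportunity(
    travel_times: Dict[str, float],
    supply: Dict[str, float],
    t0: float,
    decay: Callable[[float], float] = lambda t: 1.0,
) -> float:
    """Sum S_j * f(t_ij) over every supply location j with t_ij <= t0."""
    return sum(
        supply[j] * decay(t_ij)
        for j, t_ij in travel_times.items()
        if t_ij <= t0
    )


# Hypothetical travel times (seconds) from one location i to three hospitals.
travel_times = {"A": 240.0, "B": 540.0, "C": 900.0}
supply = {"A": 1, "B": 1, "C": 1}  # every hospital counts as one opportunity

# With a 600-second threshold and no decay, only hospitals A and B are counted.
o_i = cumulative_opportunity(travel_times, supply, t0=600)
print(o_i)
```

Passing a different `decay` function changes how much each reachable opportunity contributes without changing which opportunities are included.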

In addition, this lab uses the Gaussian distribution as the distance decay function $f(t_{ij})$. It is defined as:
$$\LARGE
G(t_{ij}, t_0) =
\begin{cases}
\frac{e^{-\frac{1}{2} \left(\frac{t_{ij}}{t_0}\right)^2} - e^{-\frac{1}{2}}}{1 - e^{-\frac{1}{2}}} & \text{if } t_{ij} \leq t_0 \\\\
0 & \text{if } t_{ij} > t_0
\end{cases}
$$

where: <br>
$G(t_{ij}, t_0)$: the Gaussian distribution value at distance $t_{ij}$ with a threshold $t_0$. <br>
$e$: the base of the natural logarithm (approximately 2.71828). <br>

## Notes:
**Before you submit your lab, make sure everything runs as expected WITHOUT ANY ERROR.** <br>
**Make sure you fill in any place that says `YOUR CODE HERE` or `YOUR ANSWER HERE`.**

In [None]:
FULL_NAME = ""

In [None]:
# Import necessary packages
import geopandas as gpd
import pandas as pd
import networkx as nx
import osmnx as ox
import math
import matplotlib.pyplot as plt

## 1. Data Preparation

**1.1.** (0 points) Load `emd_5179.geojson` from the data folder into a variable named `emd_gdf` with GeoPandas.<br>
**1.2.** (0 points) Load `hospitals_seoul.geojson` from the data folder into a variable named `hospital_gdf` with GeoPandas. <br>

In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert type(emd_gdf) == gpd.GeoDataFrame
assert emd_gdf.shape == (426, 4)
assert emd_gdf.crs == 'EPSG:5179'
assert hospital_gdf.shape == (14, 7)
assert hospital_gdf.crs == 'EPSG:5179'

print('Success!')

**1.3.** (2 points) Load `road_network_seoul.graphml` from the data folder into a variable named `G` with OSMnx. <br>

In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert type(G) == nx.classes.multidigraph.MultiDiGraph
assert len(G.nodes) == 31253
assert len(G.edges) == 64513

print('Success!')

**1.4.** (2 points) For each hospital in `hospital_gdf`, find the nearest node in the network `G` and store the result in a new column named `nearest_nodes` in the `hospital_gdf` GeoDataFrame.

**1.5.** (2 points) For each centroid of an administrative region (읍면동) in `emd_gdf`, find the nearest node in the network `G` and store the result in a new column named `nearest_nodes` in the `emd_gdf` GeoDataFrame.
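
Conceptually, a nearest-node search picks the network node with the smallest distance to a point. The sketch below shows that idea in plain Python on made-up coordinates; the node IDs, coordinates, and the helper name `nearest_node` are hypothetical. In the lab itself a library routine such as `ox.distance.nearest_nodes(G, X, Y)` from OSMnx is the more practical choice, since it handles a whole column of coordinates at once.

```python
import math

# Hypothetical network nodes: node ID -> (x, y) in a projected CRS (metres).
nodes = {10: (0.0, 0.0), 11: (100.0, 0.0), 12: (100.0, 100.0)}


def nearest_node(x, y, nodes):
    """Return the ID of the node with the smallest straight-line distance to (x, y)."""
    return min(nodes, key=lambda n: math.dist((x, y), nodes[n]))


# The point (90, 20) lies closest to node 11.
print(nearest_node(90.0, 20.0, nodes))
```

Straight-line distance is only meaningful here because the coordinates are assumed to be in a projected CRS, as the lab data is (EPSG:5179).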


In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert hospital_gdf.loc[hospital_gdf['사업장명'] == '연세대학교의과대학세브란스병원', 'nearest_nodes'].values[0] == 637022
assert hospital_gdf.loc[hospital_gdf['사업장명'] == '삼성서울병원', 'nearest_nodes'].values[0] == 847578
assert emd_gdf.loc[emd_gdf['ADM_NM'] == '회기동', 'nearest_nodes'].values[0] == 347649
assert emd_gdf.loc[emd_gdf['ADM_NM'] == '여의동', 'nearest_nodes'].values[0] == 224341

print('Success!')

## 2. Get the accessible area of a hospital

In this section, we will calculate the accessible area of a hospital using the Cumulative Opportunity method.  
As illustrated in the image below, you will compute the opportunity for a hospital (e.g., `경희대학교병원`) within a specified travel time threshold (e.g., 10 minutes).  
Here, opportunity is defined as the number of hospitals that can be reached within the given travel time threshold.


<div style="text-align: center">
  <img src="./data/q2.png" width="500">
</div>

**2.1.** (2 points) Select the nearest node of `경희대학교병원` from `hospital_gdf` and store it in a variable named `hospital_node`. Be sure to select the ID as an integer.

In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert hospital_node == 647525

print('Success!')

**2.2.** (2 points) Find the nodes reachable from `hospital_node` within 10 minutes (600 seconds) and store the result (a dictionary) in a variable named `access_nodes_dic`. <br> You can use the <a href=https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.weighted.single_source_dijkstra_path_length.html>`nx.single_source_dijkstra_path_length()`</a> function.<br>
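
To make the expected result more concrete, here is a plain-Python sketch of what a threshold-limited shortest-path search returns: a dictionary mapping each reachable node to its shortest travel time from the source. The toy graph and the helper name `travel_times_within` are hypothetical. In the lab, the NetworkX function linked above does this work through its `cutoff` and `weight` arguments; the name of the edge attribute that stores travel time is not stated here, so check the edges of `G` before choosing `weight`.

```python
import heapq


def travel_times_within(graph, source, cutoff):
    """Return {node: shortest travel time} for nodes reachable within `cutoff`."""
    best = {}                # settled nodes and their shortest travel times
    queue = [(0.0, source)]  # (travel time so far, node)
    while queue:
        time, node = heapq.heappop(queue)
        if node in best:
            continue         # already settled with a shorter or equal time
        best[node] = time
        for neighbor, edge_time in graph.get(node, {}).items():
            new_time = time + edge_time
            # Only keep exploring paths that stay within the threshold.
            if neighbor not in best and new_time <= cutoff:
                heapq.heappush(queue, (new_time, neighbor))
    return best


# Hypothetical directed network: node -> {neighbor: travel time in seconds}.
toy_graph = {
    1: {2: 200.0, 3: 500.0},
    2: {3: 150.0, 4: 450.0},
    3: {4: 100.0},
    4: {},
}

reachable = travel_times_within(toy_graph, source=1, cutoff=600)
print(reachable)
```

The source node itself appears in the result with a travel time of 0, and a node whose shortest travel time equals the threshold is still included.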


In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert type(access_nodes_dic) == dict
assert len(access_nodes_dic) == 626
assert round(access_nodes_dic[348067]) == 600
assert round(access_nodes_dic[346684]) == 160

print('Success!')

**2.3.** (2 points) Select the administrative regions (읍면동) in `emd_gdf` whose nearest nodes are among the nodes reachable from `hospital_node` (the keys of `access_nodes_dic`), and store them in a variable named `hospital_emd`.  
You may use the `.isin()` method of a pandas Series and the `.keys()` method of a dictionary.

<div style="text-align: center">
  <img src="./data/q2_3.jpg" width="700">
</div>


In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert type(hospital_emd) == gpd.GeoDataFrame
assert hospital_emd.shape == (12, 5)

print('Success!')

**2.4.** (2 points) Retrieve the travel time from `hospital_node` to each administrative region (읍면동) in `hospital_emd`, and store it in a new column named `time` in the `hospital_emd` GeoDataFrame.  
You can use the `.apply()` method with a lambda function to look up each region's travel time in `access_nodes_dic` and store it in the `hospital_emd` GeoDataFrame.
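
The logic of steps 2.3 and 2.4 can be sketched in plain Python, independent of GeoPandas: first keep the regions whose nearest node is a key of the search result, then look up each kept region's travel time. All names and values below are hypothetical. In the lab, `.isin()` plays the role of the first step and `.apply()` with a lambda the second.

```python
# Hypothetical search result: node ID -> travel time in seconds.
access_times = {101: 0.0, 102: 141.0, 103: 597.0}

# Hypothetical regions, each with the network node nearest to its centroid.
regions = [
    {"name": "Region A", "nearest_node": 102},
    {"name": "Region B", "nearest_node": 103},
    {"name": "Region C", "nearest_node": 999},  # not reached within the threshold
]

# Step 1: keep only regions whose nearest node is a key of the search result.
reached = [r for r in regions if r["nearest_node"] in access_times]

# Step 2: look up each kept region's travel time by its nearest node.
for r in reached:
    r["time"] = access_times[r["nearest_node"]]

print(reached)
```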

<div style="text-align: center">
  <img src="./data/q2_4.jpg" width="700">
</div>

In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert 'time' in hospital_emd.columns
assert hospital_emd.shape == (12, 6)
assert round(hospital_emd.loc[hospital_emd['ADM_NM'] == '회기동', 'time'].values[0]) == 141
assert round(hospital_emd.loc[hospital_emd['ADM_NM'] == '중화2동', 'time'].values[0]) == 597

print('Success!')

**2.5.** (2 points) Now, you will implement the Gaussian distribution as a distance decay function. <br> Run the `gaussian` function below, then calculate the weight of the hospital for each administrative region (읍면동) in `hospital_emd` using the Gaussian distribution with a threshold of 600 seconds. <br> <br>With this decay function, the weight is 1 when the travel time is 0 seconds and decreases as the travel time increases, reaching 0 at the threshold. <br> Store the result in a new column named `access` in the `hospital_emd` GeoDataFrame.
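
To get a feel for how the weight falls off, the short sketch below evaluates the Gaussian decay formula at a few illustrative travel times for a 600-second threshold. The name `gaussian_weight` is hypothetical and is used only to keep this example separate from the lab's own `gaussian` function; the travel times are made up.

```python
import math


def gaussian_weight(t_ij: float, t0: float) -> float:
    """Gaussian distance decay, rescaled so the weight is 1 at t_ij = 0 and 0 at t_ij = t0."""
    if t_ij > t0:
        return 0.0
    return (math.exp(-0.5 * (t_ij / t0) ** 2) - math.exp(-0.5)) / (1 - math.exp(-0.5))


# Print the weight at a few illustrative travel times (seconds).
for t in (0, 150, 300, 450, 600, 750):
    print(t, round(gaussian_weight(t, 600), 3))
```

Subtracting $e^{-\frac{1}{2}}$ and dividing by $1 - e^{-\frac{1}{2}}$ is what pins the weight to exactly 1 at zero travel time and exactly 0 at the threshold.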

<div style="text-align: center">
  <img src="./data/q2_5.jpg" width="700">
</div>


For your information, the Gaussian distribution is defined as:

$$\LARGE
G(t_{ij}, t_0) =
\begin{cases}
\frac{e^{-\frac{1}{2} \left(\frac{t_{ij}}{t_0}\right)^2} - e^{-\frac{1}{2}}}{1 - e^{-\frac{1}{2}}} & \text{if } t_{ij} \leq t_0 \\\\
0 & \text{if } t_{ij} > t_0
\end{cases}
$$

where: <br>
$G(t_{ij}, t_0)$: the Gaussian distribution value at distance $t_{ij}$ with a threshold $t_0$. <br>
$e$: the base of the natural logarithm (approximately 2.71828). <br>

In [None]:
# Run this cell to implement the gaussian function

def gaussian(dij, d0):  # Gaussian probability distribution
    # dij: travel distance/time between i and j
    # d0: travel distance/time threshold
    # val: value of the Gaussian function
    
    if d0 >= dij:
        val = (math.exp(-1 / 2 * ((dij / d0) ** 2)) - math.exp(-1 / 2)) / (1 - math.exp(-1 / 2))
        return val
    else:
        return 0

In [None]:
# Your code here


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert 'access' in hospital_emd.columns
assert hospital_emd.shape == (12, 7)
assert round(hospital_emd.loc[hospital_emd['ADM_NM'] == '회기동', 'access'].values[0], 3) == 0.931
assert round(hospital_emd.loc[hospital_emd['ADM_NM'] == '중화2동', 'access'].values[0], 3) == 0.008

print('Success!')

## 3. Measure hospital accessibility for the entire city

(9 points) With the parameters below, calculate the accessibility of hospitals for the entire city. Save the result in a new column named `access` in the `emd_gdf` GeoDataFrame. <br>
The expected result is shown in the image below. <br> 
- **Travel time threshold**: 30 minutes (1800 seconds) <br>
- **Distance decay function**: Gaussian distribution <br>
- **Supply**: Equal weight for all hospitals (1) <br>
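
One way to organise this calculation is to loop over the hospitals and, for each one, add its decayed weight to every region it reaches within the threshold. The sketch below shows only that accumulation pattern on made-up data; the hospital and region labels, the travel times, and the helper name `gaussian_weight` are hypothetical, and this is not necessarily the intended solution. In the lab, the travel times would come from a network search per hospital.

```python
import math
from collections import defaultdict

T0 = 1800  # travel time threshold in seconds (30 minutes)


def gaussian_weight(t_ij, t0):
    """Gaussian decay from the lab's formula: 1 at t_ij = 0, 0 at and beyond t0."""
    if t_ij > t0:
        return 0.0
    return (math.exp(-0.5 * (t_ij / t0) ** 2) - math.exp(-0.5)) / (1 - math.exp(-0.5))


# Hypothetical input: for each hospital, the travel time (seconds) to some regions.
times_from_hospital = {
    "Hospital 1": {"Region A": 0.0, "Region B": 900.0},
    "Hospital 2": {"Region A": 1800.0, "Region B": 300.0, "Region C": 2400.0},
}
all_regions = ["Region A", "Region B", "Region C", "Region D"]

access = defaultdict(float)
for hospital, times in times_from_hospital.items():
    supply = 1  # equal weight for every hospital
    for region, t in times.items():
        access[region] += supply * gaussian_weight(t, T0)

# Regions that no hospital reaches within the threshold keep a score of 0.
scores = {region: access[region] for region in all_regions}
print(scores)
```

Regions that no hospital reaches end up with a score of 0, so they need an explicit default rather than being left out of the result.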

<br>

<div style="text-align: center">
  <img src="./data/q3.png" width="600">
</div>




In [None]:
# Your code here



In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
assert emd_gdf.shape == (426, 6)
assert round(emd_gdf.loc[emd_gdf['ADM_NM'] == '회기동', 'access'].values[0], 3) == 2.65
assert emd_gdf.loc[emd_gdf['access'] == 0].shape[0] == 5

print('Success!')

You can also check the results with the following code:

In [None]:
# Set a font that can render the Korean labels; enable the line that matches your OS.
# Windows
# plt.rcParams['font.family'] = 'Malgun Gothic'

# Mac
plt.rcParams['font.family'] = 'AppleGothic'

import matplotlib.patheffects as pe
gu_gdf = gpd.read_file('./data/gu_gdf.geojson')

fig, ax = plt.subplots(figsize=(10, 10))
emd_gdf.plot(column='access', cmap='Blues', legend=True, scheme='UserDefined', ax=ax,
                classification_kwds={'bins': [1, 2, 3, 4, 5]},
                )

gu_gdf.boundary.plot(ax=ax, color='black', linewidth=0.5)

# Annotation
for idx, row in gu_gdf.iterrows():  # Iterate over every row in the `gu_gdf` GeoDataFrame
    ax.text(s=row['ADM_NM'],  # String to be displayed; name stored in `ADM_NM`
            x=row['geometry'].centroid.x,  # X coordinate of label
            y=row['geometry'].centroid.y,  # Y coordinate of label
            fontsize=15,
            color='black',
            ha='center',  # Horizontal alignment
            va='center',  # Vertical alignment
            path_effects=[pe.withStroke(linewidth=2, foreground="white")]  # Draw a white outline around the text
           )

ax.get_xaxis().set_visible(False)  # Remove ticks and labels
ax.get_yaxis().set_visible(False)  # Remove ticks and labels

plt.show()

### *You have finished Lab 7: Accessibility*
Please name your Jupyter notebook `GEOG4038_Lab7_[YOUR_STUDENT_ID].ipynb` and upload it to https://e-campus.khu.ac.kr/.