# Analysis of Emergency Obstetric Care (EmOC) in Kano
> Note: All notebooks need the [environment dependencies](https://github.com/GIScience/openrouteservice-examples#local-installation)
> as well as an [openrouteservice API key](https://openrouteservice.org/dev/#/signup) to run

prepare environment dependencies document

## Abstract
The rapid growth of urban areas has put substantial pressure on local services and infrastructure, particularly in African cities. With migrants moving into cities and transient households moving within cities, traditional means of collecting data (e.g., censuses and household surveys) are inadequate and often overlook informal settlements and households. As a consequence, there is a chronic lack of basic data about deprived households and entire settlements. Given that urban poor residents rely predominantly on private and informal service providers for healthcare and other services, they are rarely captured in routine service data, including health information management systems. This is even more critical for women in need of maternal health care. 

Considering the different phases of maternity: antenatal care, interpartern or delivery, and postnatal care, the team decided to focus on interpartern or delivery phase being the most critical. The intertwined relationship between maternal health care and urban deprivation has been documented and described in the literature [Abascal et al., 2022](https://doi.org/10.1016/j.compenvurbsys.2022.101770). The IDEAMAPS Data Ecosystem team aims to analyse the conditions in which vulnerable communities relate to emergency maternal care (EmOC) in the city of Kano. To do so, the analysis is divided into three main components: 
1. **EmOC Offer**: Based on the geospatial database of travel times [(Macharia et al., 2023)](https://doi.org/10.1038/s41597-023-02651-9) and the team's field validation, we characterised 145 HC facilities offering EmOC in Kano, their service levels and relative costs.
2. **EmOC Accessibility**: The team used different routing services, including the OSM-based openrouteservice API, to calculate the travel times to the nearest EmOC facility for each 100x100m grid cell in Kano. 
3. **EmOC Demand**: The team discussed a set of socio-economic factors that determine the way communities from slums and other deprived areas demand or interact with EmOC services such as available income, employment, education, age, medical practitioners' age and gender as well as religious beliefs and social practices. despite not having access to specific data, the team discussed the potential impacts on demand for EmOC services in Kano based on these factors.



### Workflow:

The notebook gives an overview of the distribution of centres offering EmOC in Kano, their classification and how they can be accessed by car. Open source data from OpenStreetMap and tools (such as the openrouteservice) were used to create accessibility measures such as travel times and isochrones. Spatial analysis and other data analytics functions led to generating outputs within the 100x100m grid cells that categorised them into three levels: low, medium, and high.

* **Preprocessing**: Get data for EmOC facilities.
* **Analysis for Offer**:
    * Filter and classify EmOC facilities based on discussed criteria.
    * Visualise EmOC faccilities in their categories.
* **Analysis for Accessibility**:
    * Compute travel times to facilities using openrouteservice API or other routing services.
    * Generate areas for low, medium and high categories based on discussed criteria.
* **Analysis for Demmand**:
    * Derive socio-economic descriptors based on discussed criteria.
* **Result**: Visualize results as maps and export model outputs.


### Datasets and Tools:
* [A geospatial database of close to reality travel times to obstetric emergency care in 15 Nigerian conurbations](https://figshare.com/s/8868db0bf3fd18a9585d) - A curated list of health care facilities offering EmOC in Nigeria [(Macharia et al., 2023)](https://doi.org/10.1038/s41597-023-02651-9).
* [openrouteservice](https://openrouteservice.org/) - generate isochrones on the OpenStreetMap road network


# Python Workflow

This study integrates various Python geospatial analysis libraries and packages to support spatial data processing, visualization, and isochrone generation. The os module is used to interact with the operating system, managing file paths and reading environment variables such as API keys. folium library along with its MarkerCluster plugin, facilitates the creation of interactive maps for visualizing large-scale geospatial data. The openrouteservice.client serves as an interface to the OpenRouteService API, enabling the extraction of isochrones. pandas library for data analysis, provides functions for analyzing, cleaning, exploring, and manipulating data, while fiona supports reading and writing real-world data using multi-layered GIS formats, such as shapefiles. The shapely package is employed for the manipulation and analysis of planar geometric objects.

## Setting up the virtual environment

```bash
# Create a new virtual environment
python -m venv .venv
activate .venv/bin/activate
pip install -r requirements.txt
```

## To run your notebook in VS Code

```bash
pip install -U ipykernel
python -m ipykernel install --user --name=.venv
```

In [23]:
import os
from IPython.display import display

import folium
from folium.plugins import MarkerCluster
import openrouteservice
import time

import time
import pandas as pd
import numpy as np
import fiona as fn
import geopandas as gpd
from shapely.geometry import shape, mapping
from shapely.geometry import Point
from shapely.geometry import box
from scipy.spatial import cKDTree
from tqdm import tqdm

import rasterio
from rasterio.transform import xy
from rasterio.mask import mask
import rasterstats as rs

import math

## Preprocessing
In this study, users first requested an ORS Matrix API key from the [OpenRouteService](https://openrouteservice.org/) platform and subsequently interacted with the OpenRouteService API through the instantiation of the OpenRouteService client. This is the OpenRouteService [API documentation](https://openrouteservice.org/dev/#/api-docs/introduction) for ORS Core-Version 9.0.0. 

Generate a [API Key](https://openrouteservice.org/dev/#/home?tab=1) (Token) it is necessary to sign up at the OpenRouteService dashboard by using your E-mail address or sign up with your GitHub. After logging in, go to the Dashboard by clicking on your profile icon and navigate to the API Keys section. Click "Create API Key" to generate a free key and then choose a service plan (the free plan has limited requests per day). Copy the API Key and store it securely. 

OpenRouteService primarily uses API keys for authentication. However, if a token is required for certain endpoints, you can send a request with your API key in the Authorization header. This process facilitated various geospatial analysis functions, including isochrone generation.

### API Key
Make sure you have a .env file in the root directory with the following content:
```bash
    OPENROUTESERVICE_API_KEY='your_api_key'
```

In [None]:
# Read the api key from the .env file
from dotenv import load_dotenv
%load_ext dotenv
%dotenv
api_key = os.getenv('OPENROUTESERVICE_API_KEY')
ors = client.Client(key=api_key)

For this study different kind of data were used. The dataset on healthcare facilities is sourced from a research ([Macharia, P.M. et al., 2023](https://doi.org/10.1038/s41597-023-02651-9)) which provides A geospatial database of close-to-reality travel times to obstetric emergency care in 15 Nigerian conurbations. The dataset were filtered by state name to isolate facilities in Kano and converted CSV file to shapefile based on coordinates using [QGIS](https://qgis.org/). 

The Level 2 administrative boundary data is sourced from [Humanitarian Data Exchange](https://data.humdata.org/) were used to correlate the isochrones and healthcare facility distribution with specific administrative regions. The data were filtered based on the administrative region name (lganame) to focus the analysis on Kano.

Despite being official, administrative boundaries may not reflect the actual patterns of human settlement or economic activity. Therefore, the team used the Functional Urban Area (FUA) as a complementary definition of the study areas. The FUA is defined by [the Joint Research Centre of the European Commission](https://commission.europa.eu/about/departments-and-executive-agencies/joint-research-centre_en) as the actual urban sprawl and human activities, encompassing the core city and economically or socially integrated surrounding regions. The FUA was obtained from [the Global Human Settlement Layer (GHSL) ](https://human-settlement.emergency.copernicus.eu/)dataset, which provides spatial data for functional urban areas worldwide. 

* [Datasets of health facilities](https://doi.org/10.6084/m9.figshare.22689667.v2) (15/07/2023)
* [Shapefile of district boundaries](https://data.humdata.org/dataset/nigeria-admin-level-2) - Admin Level 2 (data from Humanitarian Data Exchange, 25/11/2015)
* [Functional Urban Areas](https://human-settlement.emergency.copernicus.eu/download.php?ds=FUA) - data from Global Human Settlement Layer(2015)

In [2]:
# Set paths to access data
# Define directories
data_inputs = '../scripts/data_inputs/'
data_temp = '../scripts/data_temp/'
data_outputs = '../scripts/data_outputs/'

## 1. Data Collection

### Validated healthcare facilities

In [3]:
health_facilities_validated = data_inputs + 'Healthcare_facilities_validated.csv'
healthcare_facilities = pd.read_csv(health_facilities_validated)

In [4]:
# Filter the rows where the 'Validation of HCFs Categorization' column Public/Private Basic EmOC
valid_categories = ['Public Comprehensive EmOC']
healthcare_facilities = healthcare_facilities[healthcare_facilities['Validation of HCFs Categorization'].isin(valid_categories)]

# Create geometry column from longitude and latitude
healthcare_facilities['geometry'] = [Point(np.array([lon, lat])) for lon, lat in zip(healthcare_facilities['longitude'], healthcare_facilities['latitude'])]

# Convert to GeoDataFrame with the correct CRS
healthcare_facilities = gpd.GeoDataFrame(healthcare_facilities, geometry='geometry', crs="EPSG:4326")

# Save the filtered GeoDataFrame as a GeoPackage
healthcare_facilities.to_file(data_inputs + 'healthcare_facilities_validated.gpkg', driver="GPKG")

In [5]:
healthcare_facilities

Unnamed: 0,orig_order,state,lga,ward,urban_conurb,uid,facility_code,ontime_code,facility_name,reg_number,...,longitude,operation_status,registration_status,license_status,created,last_updated,last_updated_ontime,Validation of HCFs Categorization,Unnamed: 33,geometry
2,1188.0,9.0,Dala,Gobirawa,9.0,23493984.0,19/42/1/2/1/0001,100901003.0,Mariya Sanusi General Hospital,,...,8.473443,Operational,Unknown,Unknown,2018-01-01 01:01:01,2020-01-10 08:28:17,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.47344 12.05615)
9,1195.0,9.0,Dala,Kofar Ruwa,9.0,,,100901010.0,National Orthopaedic Hospital,,...,8.563672,Operational,,,,,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.56367 11.99456)
10,1196.0,9.0,Dawakin Kudu,Dawaki,9.0,66247006.0,19/09/1/2/1/0001,100902001.0,Dawakin Kudu General Hospital,,...,8.58769,Operational,Registered,Licensed,2018-01-01 01:01:01,2020-01-02 14:17:58,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.58769 11.83897)
13,1199.0,9.0,Dawakin Tofa,Dawakin East,9.0,29007435.0,19/10/1/2/1/0001,100903002.0,Dawakin Tofa General Hospital,,...,8.331265,Operational,Registered,Licensed,2018-01-01 01:01:01,2020-01-02 14:18:50,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.33127 12.10734)
19,1205.0,9.0,Fagge,Kwachiri,9.0,18135799.0,19/12/1/2/1/0001,100904005.0,Nigerian Armed Forces Specialist Hospital,,...,8.517041,Operational,Registered,Licensed,2018-01-01 01:01:01,2020-01-13 11:28:44,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.51704 12.03325)
24,1210.0,9.0,Fagge,Kwachiri,9.0,12757068.0,19/12/1/2/1/0004,100904010.0,465 Nigerian Airforce Base Hospital,,...,8.53178,Operational,Registered,Licensed,2018-01-01 01:01:01,2019-12-30 22:56:13,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.53178 12.04531)
42,1228.0,9.0,Gezawa,Ketawa,9.0,31150772.0,19/18/1/2/1/0001,100905002.0,Gezawa General Hospital,,...,8.751751,Operational,Unknown,Unknown,2018-01-01 01:01:01,2020-01-06 10:14:29,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.75175 12.08834)
62,1248.0,9.0,Kano Municipal,Tudun Nufawa,9.0,22027518.0,19/21/1/2/1/0005,100907006.0,Sabo Bakin Zuwo General Hospital,,...,8.50923,Operational,Registered,Licensed,2018-01-01 01:01:01,2020-05-02 12:50:57,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.50923 12.00065)
63,1249.0,9.0,Kano Municipal,She-She,9.0,78857304.0,19/21/1/2/1/0002,100907007.0,Marmara General Hospital,,...,8.5124,Operational,Registered,Licensed,2018-01-01 01:01:01,2020-01-10 08:33:08,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.51240 11.99712)
65,1251.0,9.0,Kano Municipal,Kankarofi,9.0,30647869.0,19/21/1/2/1/0004,100907009.0,Nuhu Bamalli General Hospital,,...,8.52929,Operational,Registered,Licensed,2018-01-01 01:01:01,2020-01-13 11:35:23,28/09/2022 09:00,Public Comprehensive EmOC,,POINT (8.52929 11.99102)


### Population Grid Data (1km resolution) from WorldPop

In [6]:
FUA = gpd.read_file(data_inputs + 'functional_area.gpkg')
# Read the raster dataset
raster_path = data_inputs + 'nga_f_15_49_2015_1km.tif'

In [7]:
# Ensure raster dataset is open within the 'with' block
with rasterio.open(raster_path) as dataset:
    population_data = dataset.read(1)  # Read the first band of the raster
    transform = dataset.transform  # Get affine transformation parameters
    
    # Clip the raster using the FUA geometry
    geometries = [FUA.geometry.unary_union.__geo_interface__]
    clipped_image, clipped_transform = mask(dataset, geometries, crop=True)

    # Update the metadata for the clipped image
    clipped_meta = dataset.meta.copy()
    clipped_meta.update({
        "height": clipped_image.shape[1],
        "width": clipped_image.shape[2],
        "transform": clipped_transform
    })
    
    # Extract the centroids of non-zero population grid cells from the clipped image
    rows, cols = np.where(clipped_image[0] > 0)  # Use clipped_image to get non-zero population
    grid_cells = [[*transform * (col + 0.5, row + 0.5)] for row, col in zip(rows, cols)]

    # Get the population values for these non-zero grid cells
    population_values = clipped_image[0][rows, cols]  # Extract population values for non-zero cells

# Filter out grid cells with population greater than 50
grid_cells_filtered = []
population_filtered = []

for i in range(len(population_values)):
    if population_values[i] > 50:  # Only include grid cells with population > 50
        grid_cells_filtered.append(grid_cells[i])
        population_filtered.append(population_values[i])

In [8]:
# Save the grid cells (centroids) with population data to a CSV
grid_df = pd.DataFrame(grid_cells, columns=["longitude", "latitude"])
grid_df['population'] = population_values  # Add the population count for each grid cell

# Generate unique grid codes
uid_set = set()
def generate_unique_uid():
    uid = np.random.randint(10000, 100000)
    while uid in uid_set:
        uid = np.random.randint(10000, 100000)
    uid_set.add(uid)
    return uid

grid_df['grid_code'] = [generate_unique_uid() for _ in range(len(grid_df))]

# Save the DataFrame to CSV
grid_csv_path = data_inputs + 'population_centroids.csv'
grid_df.to_csv(grid_csv_path, index=False)

In [9]:
grid_df

Unnamed: 0,longitude,latitude,population,grid_code
0,2.881250,13.882917,194.130783,32129
1,2.872917,13.874583,451.999268,36562
2,2.881250,13.874583,161.425812,35674
3,2.864583,13.866250,276.582611,96921
4,2.872917,13.866250,453.410889,60605
...,...,...,...,...
1685,2.914583,13.407917,141.400116,39787
1686,2.889583,13.399583,66.997810,38803
1687,2.897917,13.399583,187.014084,68088
1688,2.906250,13.399583,148.789215,31072


## 2. Spatial Analysis Pipeline
### Using OpenRouteService (ORS) Matrix API to calculate the travel time and distance from each population grid centroid to the nearest healthcare facility 

In [19]:
# insert your ORS api key
api_key = '5b3ce3597851110001cf6248ff032155cbce4db1a0e2e70efb739a13'
client = openrouteservice.Client(key=api_key)

In [20]:
healthcare_facilities = gpd.read_file(data_inputs + 'healthcare_facilities_validated.gpkg')
grid_df = gpd.read_file(data_inputs + 'population_centroids.gpkg')

In [210]:
origin_gdf = grid_df
origin_name_column = 'grid_code'
destination_gdf = healthcare_facilities.dropna(subset=['geometry'])
destination_name_column = 'facility_name'

In [211]:
origin_gdf["longitude"] = origin_gdf.geometry.x
origin_gdf["latitude"] = origin_gdf.geometry.y

destination_gdf["longitude"] = destination_gdf.geometry.x
destination_gdf["latitude"] = destination_gdf.geometry.y

In [212]:
origins = list(zip(origin_gdf.geometry.x, origin_gdf.geometry.y))
destinations = list(zip(destination_gdf.geometry.x, destination_gdf.geometry.y))
locations = origins + destinations

In [213]:
origins_index = list(range(0, len(origins)))
destinations_index = list(range(len(origins), len(locations)))

In [214]:
batch_size = 20 # batch processing, 20 grids per time
request_counter = 0
duration_matrix = []

for i in range(0, len(origins), batch_size):
    if request_counter == 40:
        time.sleep(60)
        request_counter = 0  

    sources_batch = origins[i:i + batch_size]
    body = {
        "locations": sources_batch + destinations,
        "sources": list(range(len(sources_batch))),  
        "destinations": list(range(len(sources_batch), len(sources_batch) + len(destinations))),  
        "metrics": ['distance', 'duration']
    }

    try:
        response = requests.post('https://api.openrouteservice.org/v2/matrix/driving-car', json=body, headers=headers)
        response.raise_for_status()  

        duration_matrix.append(response.json())
        request_counter += 1

        if len(duration_matrix) % 50 == 0:
            time.sleep(20)

    except requests.exceptions.RequestException as err:
        time.sleep(10)

print(f"Completed {len(duration_matrix)} requests.")

Completed 84 requests.


In [219]:
response = requests.post('https://api.openrouteservice.org/v2/matrix/driving-car', json=body, headers=headers)

print(f"Status Code: {response.status_code}")

print(f"Response Content: {response.text}")

Status Code: 403
Response Content: {
    "error": "Quota exceeded"
}


In [190]:
distances = response.json()['distances']
distances

[[779520.94,
  793625.94,
  814492.56,
  766195.81,
  785995.81,
  788711.31,
  816226.69,
  787205.69,
  788453.94,
  790600.31,
  802855.25,
  827062.5,
  791710.31,
  793781.63,
  794885.63,
  794427.25,
  779204.94],
 [787360.44,
  801465.5,
  822332.13,
  774035.31,
  793835.31,
  796550.88,
  824066.19,
  795045.19,
  796293.5,
  798439.81,
  810694.75,
  834902.0,
  799549.88,
  801621.13,
  802725.13,
  802266.75,
  787044.5],
 [786819.63,
  800924.63,
  821791.25,
  773494.5,
  793294.5,
  796010.0,
  823525.38,
  794504.38,
  795752.63,
  797899.0,
  810153.94,
  834361.19,
  799009.0,
  801080.31,
  802184.31,
  801725.94,
  786503.69],
 [780621.69,
  794726.75,
  815593.31,
  767296.56,
  787096.56,
  789812.13,
  817327.44,
  788306.44,
  789554.75,
  791701.06,
  803956.0,
  828163.25,
  792811.13,
  794882.38,
  795986.38,
  795528.0,
  780305.75],
 [779712.25,
  793817.31,
  814683.94,
  766387.13,
  786187.19,
  788902.69,
  816418.06,
  787397.0,
  788645.31,
  790791

In [191]:
durations = response.json()['durations']
durations

[[36082.53,
  36952.84,
  37795.06,
  35671.26,
  36509.13,
  36800.58,
  38211.63,
  36704.46,
  36756.05,
  36682.0,
  37317.34,
  38213.54,
  36766.77,
  36964.05,
  36875.58,
  36885.84,
  36066.2],
 [36650.47,
  37520.79,
  38363.01,
  36239.2,
  37077.08,
  37368.52,
  38779.57,
  37272.4,
  37324.0,
  37249.95,
  37885.29,
  38781.48,
  37334.71,
  37531.99,
  37443.52,
  37453.79,
  36634.14],
 [36585.57,
  37455.89,
  38298.11,
  36174.3,
  37012.18,
  37303.63,
  38714.68,
  37207.5,
  37259.1,
  37185.05,
  37820.39,
  38716.58,
  37269.81,
  37467.09,
  37378.63,
  37388.89,
  36569.25],
 [36214.62,
  37084.93,
  37927.15,
  35803.35,
  36641.23,
  36932.67,
  38343.72,
  36836.55,
  36888.15,
  36814.09,
  37449.43,
  38345.63,
  36898.86,
  37096.14,
  37007.67,
  37017.93,
  36198.29],
 [36105.49,
  36975.8,
  37818.02,
  35694.22,
  36532.09,
  36823.54,
  38234.59,
  36727.42,
  36779.02,
  36704.96,
  37340.3,
  38236.5,
  36789.73,
  36987.01,
  36898.54,
  36908.8,


In [216]:
for origin_index, item in origin_gdf.iterrows():
    origin_name = item[origin_name_column]
    origin_x = item.geometry.x
    origin_y = item.geometry.y
    origin_durations = durations[origin_index]

    min_duration, min_index = min((duration, idx) for idx, duration in enumerate(origin_durations))
    destination_index = destinations_index[min_index]
    destination_x, destination_y = locations[destination_index]

    distance = distances[origin_index][min_index]

    filtered = healthcare_facilities[
        (destination_gdf.geometry.x == destination_x) & (destination_gdf.geometry.y == destination_y)
    ]

    if not filtered.empty:
        destination_row = filtered.iloc[0]
        destination_name = destination_row[destination_name_column]
        
duration_matrix.append(
    [origin_name, origin_y, origin_x, destination_name, destination_y, destination_x, distance, min_duration])

IndexError: list index out of range

In [None]:
# Origin: population grid cell, Desitination: HCF
results_df = pd.DataFrame(results, columns=["origin_name","origin_lon", "origin_lat", "destination_name","dest_lon", "dest_lat", "distance", "duration"])
output_csv = data_inputs + 'nearest_facility_travel_time.csv'
results_df.to_csv(output_csv, index=False)

In [None]:
matrix_df = pd.DataFrame(duration_matrix, 
                  columns =['origin_name', 'origin_lon', 'origin_lat', 'destination_name', 'dest_lon', 'dest_lat', 'distance', 'duration'])
matrix_df

In [None]:
output_file = 'matrix.gpkg'
output_path = os.path.join(output_folder, output_file)

origin_gdf.to_file(driver='GPKG', filename=output_path, layer='origins')
destination_gdf.to_file(driver='GPKG', filename=output_path, layer='destinations')
matrix_gdf.to_file(driver='GPKG', filename=output_path, layer='duration_matrix')

## Enhanced Two-Step Floating Catchment Area (E2SFCA) method

In [92]:
origin_dest = pd.read_csv(data_inputs + 'nearest_facility_travel_time.csv')

In [88]:
# Function
from math import *
d = 10 * 60 # try max duration 10mins car
W = 0.1
beta = - d ** 2 / log(W)
print(beta)

156346.01348517067


In [176]:
# Generate a unique code for Each HCF
origin_dest['unique_code'] = origin_dest[['dest_lon', 'dest_lat']].apply(lambda x: hash(tuple(x)), axis=1)
origin_dest['grid_code'] = origin_dest[['origin_lon', 'origin_lat']].apply(lambda x: hash(tuple(x)), axis=1)

In [177]:
print(origin_dest.head())

   origin_lon  origin_lat  dest_lon   dest_lat  distance  duration  \
0    8.512500   12.245833  8.461099  12.068206   1044.27  22674.95   
1    8.504167   12.237500  8.461099  12.068206   1164.85  24349.76   
2    8.512500   12.237500  8.461099  12.068206    975.57  21538.03   
3    8.520833   12.237500  8.461099  12.068206    993.45  21960.36   
4    8.495833   12.229167  8.461099  12.068206   1036.83  20817.05   

           unique_code            grid_code  Weight  Pop_W  
0 -8427746937895610222  6161116049963123496     0.0    0.0  
1 -8427746937895610222  7417919066492957684     0.0    0.0  
2 -8427746937895610222 -4295567797866426869     0.0   -0.0  
3 -8427746937895610222  4538009176888898832     0.0    0.0  
4 -8427746937895610222  -559700342194000249     0.0   -0.0  


In [178]:
# Convert 'duration' to numeric, coercing errors to NaN
origin_dest['duration'] = pd.to_numeric(origin_dest['duration'], errors='coerce')

# Drop rows with NaN values in 'duration' column
origin_dest = origin_dest.dropna(subset=['duration'])
origin_dest['grid_code'] = pd.to_numeric(origin_dest['grid_code'], errors='coerce')

origin_dest_acc = origin_dest  # Backup

In [179]:
# Apply Gaussian decay function to calculate the weight of each grid to healthcare facilities based on the travel duration. d is the travel time and beta is the decay parameter previously calculated.
# The weight decreases as the duration increases, meaning facilities that are further away have less impact.
origin_dest_acc['Weight'] = origin_dest_acc['duration'].apply(lambda d: round(math.exp(-d**2/beta), 8))

In [180]:
# Compute the Weighted Population (Pop_W), the population of each grid cell is multiplied by the corresponding weight to calculate the weighted population.
origin_dest_acc['Pop_W'] = origin_dest_acc['grid_code'] * origin_dest_acc['Weight']

In [181]:
# Sum the Weighted Population for Each Healthcare Facility
# It aggregates the population from all grid cells contributing to each healthcare facility
origin_dest_sum = origin_dest_acc.groupby(by='unique_code')['Pop_W'].sum().reset_index()

In [182]:
# Merge the Sum of Weighted Population Back into the Original Data
origin_dest_acc = origin_dest_acc.merge(origin_dest_sum, on='unique_code')

In [183]:
# supply value is set to 1 for simplicity (capacity of HCF)
supply = 1
origin_dest_acc = origin_dest_acc.rename(columns={'Pop_W_y': 'Pop_W_S'})  # Pop_W_S: Population Weight Sum

In [184]:
# Compute the Supply-Demand Ratio (Rj)
origin_dest_acc['supply_demand_ratio'] = 1 / origin_dest_acc.Pop_W_S
origin_dest_acc['supply_demand_ratio'].replace([np.inf, np.nan], 0, inplace=True)

In [185]:
# Calculate Rj * Weight for Each Grid Cell
origin_dest_acc['supply_W'] = origin_dest_acc['supply_demand_ratio'] * origin_dest_acc.Weight

In [186]:
# Compute Accessibility Index (Ai) for Each Grid Cell
origin_dest_acc['Accessibility'] = origin_dest_acc.groupby('grid_code')['supply_W'].transform('sum')

In [187]:
# Normalize
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
origin_dest_acc['Accessibility_standard'] = scaler.fit_transform(origin_dest_acc[['Accessibility']])

In [188]:
origin_dest_acc

Unnamed: 0,origin_lon,origin_lat,dest_lon,dest_lat,distance,duration,unique_code,grid_code,Weight,Pop_W_x,Pop_W_S,supply_demand_ratio,supply_W,Accessibility,Accessibility_standard
0,8.512500,12.245833,8.461099,12.068206,1044.27,22674.95,-8427746937895610222,6161116049963123496,0.000000e+00,0.000000e+00,2.353813e+17,4.248425e-18,0.000000e+00,0.000000e+00,1.067727e-17
1,8.504167,12.237500,8.461099,12.068206,1164.85,24349.76,-8427746937895610222,7417919066492957684,0.000000e+00,0.000000e+00,2.353813e+17,4.248425e-18,0.000000e+00,0.000000e+00,1.067727e-17
2,8.512500,12.237500,8.461099,12.068206,975.57,21538.03,-8427746937895610222,-4295567797866426869,0.000000e+00,-0.000000e+00,2.353813e+17,4.248425e-18,0.000000e+00,0.000000e+00,1.067727e-17
3,8.520833,12.237500,8.461099,12.068206,993.45,21960.36,-8427746937895610222,4538009176888898832,0.000000e+00,0.000000e+00,2.353813e+17,4.248425e-18,0.000000e+00,0.000000e+00,1.067727e-17
4,8.495833,12.229167,8.461099,12.068206,1036.83,20817.05,-8427746937895610222,-559700342194000249,0.000000e+00,-0.000000e+00,2.353813e+17,4.248425e-18,0.000000e+00,0.000000e+00,1.067727e-17
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1995,8.495833,11.754167,8.424150,11.778010,812.77,12135.34,-5045608071072975233,-5675272957297663404,0.000000e+00,-0.000000e+00,2.920534e+16,3.424031e-17,0.000000e+00,0.000000e+00,1.067727e-17
1996,8.420833,11.745833,8.424150,11.778010,434.77,10131.45,-5045608071072975233,-1423105009493966142,0.000000e+00,-0.000000e+00,2.920534e+16,3.424031e-17,0.000000e+00,0.000000e+00,1.067727e-17
1997,8.429167,11.745833,8.424150,11.778010,160.26,3784.06,-5045608071072975233,5310152199856200921,3.806000e-05,2.021044e+14,2.920534e+16,3.424031e-17,1.303186e-21,1.303186e-21,1.067857e-17
1998,8.437500,11.745833,8.424150,11.778010,258.30,4434.23,-5045608071072975233,-8701372282082393122,8.500000e-07,-7.396166e+12,2.920534e+16,3.424031e-17,2.910426e-23,2.910426e-23,1.067730e-17
