## Assessing Service Accessibility for Asylum Seekers in Chicago

In [1]:
import pandas as pd
import geopandas as gpd
import folium

from shapely.geometry import Point, MultiPolygon

pd.set_option('display.max_columns', None)

## I. Loading and Cleaning Data

In [2]:
# Location Data
shelters = pd.read_excel("../data/shelter_data.xlsx",sheet_name=0,engine="openpyxl")
public_libraries = pd.read_json("https://data.cityofchicago.org/resource/x8fc-8rcq.json")
grocery_shops = pd.read_json("https://data.cityofchicago.org/resource/ce29-twzt.json")
wifi_points = pd.read_json("https://data.cityofchicago.org/resource/4jzv-pgsc.json")
public_clinics = pd.read_json("https://data.cityofchicago.org/resource/kcki-hnch.json")
public_schools = pd.read_json("https://data.cityofchicago.org/resource/tz49-n8ze.json")
bus_stations = gpd.read_file("../data/cta_bus_stops/bus_stops_location.shp")
rail_stations = gpd.read_file("../data/cta_rail_stations/CTA_RailStations.shp")

# Boundary Data
neigh_bound = gpd.read_file("../data/neighborhood_boundaries/neigh_bound.shp")
parks_bound = gpd.read_file("../data/parks_boundaries/parks_bound.shp")

### I.a Geocode Datasets

The data sources don’t share a homogeneous format, so we need to ensure that the locations of shelters, public libraries, grocery shops, Wi-Fi points, public schools, clinics, bus stops, and rail stops are properly geocoded. Furthermore, because we’ll perform spatial operations, it is important that all geospatial data use the same coordinate reference system (CRS) so that distance measurements are accurate.
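
To illustrate why the choice of CRS matters for distance measurement, here is a minimal sketch in plain Python. It assumes approximate coordinates for downtown Chicago and a hypothetical helper, `haversine_miles`, neither of which comes from the notebook. It shows that one degree of longitude covers a noticeably shorter distance than one degree of latitude at this latitude (about 51 miles versus about 69 miles), so distances or buffers expressed in degrees would be distorted. In geopandas, the usual remedy is to reproject with `GeoDataFrame.to_crs` to a projected CRS whose units are meters or feet before measuring.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Approximate coordinates for downtown Chicago (illustrative assumption)
lat, lon = 41.88, -87.63

one_degree_north = haversine_miles(lat, lon, lat + 1, lon)
one_degree_east = haversine_miles(lat, lon, lat, lon + 1)

print(f"1 degree of latitude  ~ {one_degree_north:.1f} miles")
print(f"1 degree of longitude ~ {one_degree_east:.1f} miles")
```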

In [3]:
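# Define a function that geocodes a DataFrame into a GeoDataFrame
def geocode_df(df, idx):
    # `idx` (the dataset name) is kept for the caller's convenience; it is not used here
    df = df.copy()  # avoid modifying the caller's DataFrame

    # Each column name needs its own membership test:
    # ("latitude" and "longitude") in df.columns would only check "longitude"
    has_lat_lon = "latitude" in df.columns and "longitude" in df.columns
    has_x_y = "y" in df.columns and "x" in df.columns

    if not has_lat_lon:
        if has_x_y:
            # Some sources store coordinates as x (longitude) and y (latitude)
            df = df.rename(columns={"y": "latitude", "x": "longitude"})
        else:
            # Otherwise, extract the coordinates from the nested "location" field
            df["latitude"] = df["location"].apply(lambda loc: loc["latitude"])
            df["longitude"] = df["location"].apply(lambda loc: loc["longitude"])

    # Build point geometries in (longitude, latitude) order, using WGS 84 (EPSG:4326)
    df["geometry"] = [Point(lon, lat) for lon, lat in zip(df["longitude"], df["latitude"])]
    gdf = gpd.GeoDataFrame(df, geometry="geometry", crs="EPSG:4326")

    return gdf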
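
A note on checking for coordinate columns: in Python, an expression such as `("latitude" and "longitude") in df.columns` tests only the last name, because `and` returns its second operand before `in` is applied. The short sketch below uses a hypothetical column list (not taken from any of the datasets) to show the difference between that form and an explicit test of every name.

```python
# Hypothetical column list that has "longitude" but no "latitude"
columns = ["site_name", "longitude"]

# ("latitude" and "longitude") evaluates to "longitude" before `in` is applied,
# so only the last name is actually tested.
chained_check = ("latitude" and "longitude") in columns

# Testing every name explicitly gives the intended result.
explicit_check = all(name in columns for name in ("latitude", "longitude"))

# Equivalent set-based form.
subset_check = {"latitude", "longitude"} <= set(columns)

print(chained_check, explicit_check, subset_check)  # True False False
```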

In [4]:
dfs_geocode = ["shelters", "public_libraries", "grocery_shops", "wifi_points",
               "public_clinics", "public_schools"]

for df_name in dfs_geocode:
    input_df = globals()[df_name]  # Get the DataFrame by name
    geo_df = geocode_df(input_df,df_name)  # Call the geocode_df function
    globals()["geo_" + df_name] = geo_df  # Assign the result to a new global variable
    print("Created geo_",df_name," using ", df_name,sep="")

Created geo_shelters using shelters
Created geo_public_libraries using public_libraries
Created geo_grocery_shops using grocery_shops
Created geo_wifi_points using wifi_points
Created geo_public_clinics using public_clinics
Created geo_public_schools using public_schools


### I.b Subsetting the Data

In [9]:
# Keep only the columns needed for the analysis in each dataset
geo_public_libraries = geo_public_libraries[['name_','latitude','longitude','geometry']]
geo_grocery_shops = geo_grocery_shops[['community_area_name','latitude','longitude','geometry']]
geo_wifi_points = geo_wifi_points[['organization_name','latitude','longitude','geometry']]
geo_public_clinics = geo_public_clinics[['site_name','latitude','longitude','geometry']]
geo_public_schools = geo_public_schools[['school_nm','latitude','longitude','geometry']]
bus_stations = bus_stations[['public_nam','geometry']]
rail_stations = rail_stations[['LONGNAME','geometry']]
neigh_bound = neigh_bound[['pri_neigh','sec_neigh','geometry']]
parks_bound = parks_bound[['park','geometry']]

## II. Defining Buffer Zones

We need to define what distance is considered accessible for each type of service. After this, we’ll create buffer zones around each shelter. The size of these buffers will be based on the distances defined as accessible.

* **Public libraries**: Following a study by [Donnelly (2015)](https://www.sciencedirect.com/science/article/abs/pii/S0740818815000869) and this post about [average distance to libraries in the US](https://atcoordinates.info/2016/02/22/average-distance-to-public-libraries-in-the-us/), we use a two-mile buffer to define accessibility to library services.

* **Grocery shops**: Based on research from the [Department of Agriculture](https://www.ers.usda.gov/data-products/food-access-research-atlas/documentation/) and [Wilde et al. (2017)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5998793/), a grocery store is considered accessible in an urban area if it's within a 1-mile radius of a person's location.

* **Wi-Fi points**: Given the lack of studies on accessibility to internet services, we define a buffer zone of 1 mile around Wi-Fi points.

* **Public Clinics**: Based on a paper by [Luo and Wang (2021)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8238135/) on Spatial Accessibility to Healthcare in Chicago, we define a buffer zone of 15 miles for public clinics.

* **Public Schools**: Using a study from the [US Department of Transportation](https://nhts.ornl.gov/briefs/travel%20to%20school.pdf) and one from the [Kinder Institute for Urban Research](https://kinder.rice.edu/research/staying-neighborhood-examining-distance-zoned-schools-and-access-transportation#:~:text=The%20average%20distance%20between%20an,to%20enroll%20in%20that%20school.), we define a school as accessible if it's within a 2-mile radius of a family household in urban areas.

* **Transportation**: Based on research from the [US Department of Transportation](https://safety.fhwa.dot.gov/ped_bike/ped_transit/ped_transguide/ch4.cfm), most people are willing to walk for five to ten minutes, or approximately a quarter to half a mile, to a transit stop. Given that Chicago is one of the major cities in the US, we define a 1/2-mile radius as the accessible distance.
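
The distances listed above can be collected in one place and converted to meters, which is convenient if the buffers are later drawn in a projected CRS measured in meters (geopandas' `GeoSeries.buffer` interprets its distance in the units of the CRS). This is a minimal sketch: the dictionary keys and variable names are illustrative assumptions, not identifiers from the notebook.

```python
METERS_PER_MILE = 1609.344  # exact, by definition of the international mile

# Accessibility radius per service type, in miles, as listed above.
# The keys are illustrative names, not identifiers from the notebook.
ACCESS_RADIUS_MILES = {
    "public_libraries": 2.0,
    "grocery_shops": 1.0,
    "wifi_points": 1.0,
    "public_clinics": 15.0,
    "public_schools": 2.0,
    "transit_stops": 0.5,
}

# Convert every radius to meters
access_radius_m = {
    service: miles * METERS_PER_MILE
    for service, miles in ACCESS_RADIUS_MILES.items()
}

for service, radius in access_radius_m.items():
    print(f"{service:>16}: {radius:,.0f} m")
```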

## III. Calculating Accessibility Indices

### III.a Individual Indices

For each shelter, we’ll calculate how many services of each type (grocery shops, Wi-Fi points, etc.) fall within its buffer zone using spatial operations. This will give us a raw count of accessible services. To make these counts comparable across shelters, we’ll normalize them to a uniform range (e.g., 0 to 1 or 0 to 100). For the sake of simplicity, we’ll treat every service as equally important, so there is no need to define specific weights.
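
The counting and normalization steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the notebook's implementation: the coordinates are made up, the helpers `haversine_miles`, `count_within`, and `min_max` are hypothetical, great-circle distance stands in for a buffer-and-spatial-join in geopandas, and min-max scaling is only one of several possible normalizations.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_within(shelter, services, radius_miles):
    """Count the service locations within radius_miles of a shelter."""
    return sum(
        1 for lat, lon in services
        if haversine_miles(shelter[0], shelter[1], lat, lon) <= radius_miles
    )


def min_max(values):
    """Rescale values to [0, 1]; return all zeros if every value is identical."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up coordinates, for illustration only
toy_shelters = {"A": (41.88, -87.63), "B": (41.95, -87.65), "C": (41.75, -87.60)}
toy_groceries = [(41.881, -87.632), (41.885, -87.64), (41.951, -87.651), (41.70, -87.55)]

# Raw counts of grocery shops within 1 mile of each shelter
counts = {name: count_within(loc, toy_groceries, 1.0) for name, loc in toy_shelters.items()}

# Normalized scores in the range 0 to 1
scores = dict(zip(counts, min_max(list(counts.values()))))

print(counts)  # {'A': 2, 'B': 1, 'C': 0}
print(scores)  # {'A': 1.0, 'B': 0.5, 'C': 0.0}
```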

### III.b Aggregated Index

After calculating the normalized individual indices, we’ll combine them into a final index for each shelter. This can be a simple average if all services are considered equally important.
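
As a minimal sketch of this step, the example below averages hypothetical normalized scores for two made-up shelters. The function `aggregate_index` and its optional `weights` argument are assumptions introduced for illustration; with no weights it reduces to the simple average described above.

```python
from statistics import mean

# Hypothetical normalized scores (0 to 1) per shelter and service type
normalized = {
    "Shelter A": {"libraries": 1.0, "groceries": 0.5, "wifi": 0.0, "clinics": 1.0},
    "Shelter B": {"libraries": 0.5, "groceries": 1.0, "wifi": 1.0, "clinics": 0.5},
}


def aggregate_index(scores, weights=None):
    """Combine per-service scores into one index.

    With weights=None this is a simple average (all services equally important);
    otherwise it is a weighted average using the given per-service weights.
    """
    if weights is None:
        return mean(list(scores.values()))
    total_weight = sum(weights[service] for service in scores)
    return sum(scores[service] * weights[service] for service in scores) / total_weight


aggregated = {name: aggregate_index(scores) for name, scores in normalized.items()}
print(aggregated)  # {'Shelter A': 0.625, 'Shelter B': 0.75}
```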

## IV. Visualization

Finally, we’ll use the aggregated index to rank shelters by their access to essential services. We’ll then create a geospatial visualization that shows the accessibility of each shelter.
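
The ranking step could look like the sketch below. The index values, the function `color_for`, and its thresholds (0.66 and 0.33) are arbitrary assumptions chosen for illustration; the resulting color labels are the kind of value that could be passed to map markers in a library such as folium.

```python
# Hypothetical aggregated indices, for illustration only
aggregated = {"Shelter A": 0.625, "Shelter B": 0.75, "Shelter C": 0.30}


def color_for(score):
    """Map an index in [0, 1] to a marker color (thresholds are arbitrary)."""
    if score >= 0.66:
        return "green"
    if score >= 0.33:
        return "orange"
    return "red"


# Rank shelters from most to least accessible
ranking = sorted(aggregated.items(), key=lambda item: item[1], reverse=True)

for rank, (name, score) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: index={score:.3f}, color={color_for(score)}")
```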