# Travel Time Matrix DEMO

Authors: [Irene Farah](https://www.linkedin.com/in/imfarah/),  [Julia Koschinsky](https://www.linkedin.com/in/julia-koschinsky-657599b1/), [Logan Noel](https://www.linkedin.com/in/lmnoel/).   
Contact: [Julia Koschinsky](mailto:jkoschinsky@uchicago.edu)  

Research assistance of [Shiv Agrawal](http://simonlab.uchicago.edu/people/ShivAgrawal.html), [Caitlyn Tien](https://www.linkedin.com/in/caitlyn-tien-0b784b161/) and [Richard Lu](https://www.linkedin.com/in/richard-lu-576874155/) is gratefully acknowledged.

Center for Spatial Data Science  
University of Chicago  

July 30, 2019

**_Input Requirements_**  

To construct a travel time matrix, the CSV table must contain **ID, latitude, and longitude** columns for the origins and the destinations. Note that destinations must fall within the spatial extent of the origins.
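
A quick pre-flight check can catch missing columns, or destinations outside the origins' extent, before any street network is processed. The sketch below is only an illustration: `read_points` and `outside_extent` are hypothetical helpers (not part of `spatial_access`), the column names `id`, `lat`, `lon` are assumed, and the sample coordinates are made up.

```python
import csv
import io

REQUIRED = ("id", "lat", "lon")  # assumed column names for this sketch


def read_points(csv_text):
    """Parse CSV text into a list of dicts, checking the required columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [
        {"id": row["id"], "lat": float(row["lat"]), "lon": float(row["lon"])}
        for row in reader
    ]


def outside_extent(origins, destinations):
    """Return IDs of destinations outside the origins' bounding box."""
    lats = [p["lat"] for p in origins]
    lons = [p["lon"] for p in origins]
    return [
        d["id"]
        for d in destinations
        if not (min(lats) <= d["lat"] <= max(lats)
                and min(lons) <= d["lon"] <= max(lons))
    ]


# Made-up sample points for illustration only.
origins = read_points("id,lat,lon\nA,41.70,-87.70\nB,41.90,-87.60\n")
dests = read_points("id,lat,lon\n1,41.80,-87.65\n2,42.10,-87.65\n")
print(outside_extent(origins, dests))  # ['2']
```

Destinations flagged this way can be dropped or inspected on a map before building the matrix.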

In [1]:
from spatial_access.p2p import *

In [None]:
cd ../..

In [4]:
%matplotlib inline

**View structure of data example: Health Facilities in Chicago.**  
Health Facilities Data: http://makosak.github.io/chihealthaccess/index.html

First 5 sources (tract centroids):

In [5]:
df_sources = pd.read_csv('./data/input_data/sources/tracts2010.csv')
df_sources.head()

| Unnamed: 0 | geoid10 | lon | lat | Pop2014 | Pov14 | community |
|---|---|---|---|---|---|---|
| 0 | 17031842400 | -87.63004 | 41.742475 | 5157 | 769 | 44 |
| 1 | 17031840300 | -87.681882 | 41.832094 | 5881 | 1021 | 59 |
| 2 | 17031841100 | -87.635098 | 41.851006 | 3363 | 2742 | 34 |
| 3 | 17031841200 | -87.683342 | 41.855562 | 3710 | 1819 | 31 |
| 4 | 17031838200 | -87.675079 | 41.870416 | 3296 | 361 | 28 |


First 5 destinations (health facilities):

In [6]:
df_dests = pd.read_csv('./data/input_data/destinations/health_chicago.csv')
df_dests.head()

| Unnamed: 0 | ID | Facility | lat | lon | Type | target | category | community |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | American Indian Health Service of Chicago, Inc. | 41.956676 | -87.651879 | 5 | 127000 | Other Health Providers | 3 |
| 1 | 2 | Hamdard Center for Health and Human Services | 41.997852 | -87.669535 | 5 | 190000 | Other Health Providers | 77 |
| 2 | 3 | Infant Welfare Society of Chicago | 41.924904 | -87.71727 | 5 | 137000 | Other Health Providers | 22 |
| 3 | 4 | Mercy Family - Henry Booth House Family Health... | 41.841694 | -87.62479 | 5 | 159000 | Other Health Providers | 35 |
| 4 | 6 | Cook County - Dr. Jorge Prieto Health Center | 41.847143 | -87.724975 | 5 | 166000 | Other Health Providers | 30 |


### Travel Time Matrices  

**Specifications for the asymmetric and symmetric distance matrices:**  

- **network_type**: can be walk, drive, bike, or otp (otp allows you to read in an external file from OpenTripPlanner)
- **primary_input**: sources file
- **secondary_input**: destinations file (omit to calculate an NxN matrix on the primary_input)
- **read_from_file**: tmx or csv filename (read in external matrix files)
- **primary_hints**: dictionary that contains column names (lat/lon/ID)
- **secondary_hints**: dictionary that contains column names (lat/lon/ID)
- **debug**: if set to `True`, enables more detailed logging output
- **configs**: defaults to None, else pass in an instance of **Configs.py** to override default values.  
    The following arguments in **configs** can be changed:  
    - walk_speed: numeric (km/hr). Default is set to 5 km/hr.
    - bike_speed: numeric (km/hr). Default is set to 15.5 km/hr.
    - default_drive_speed: numeric (km/hr). Default is set to 40 km/hr.
    - walk_node_penalty:  numeric (seconds). Default is set to 0.
    - bike_node_penalty:  numeric (seconds). Default is set to 0.
    - drive_node_penalty:  numeric (seconds). Default is set to 4.
    - speed_limit_dict: dictionary {edge type (string) : speed in km/hr}
    - use_meters: if `True` output will be in meters. If `False`, output will be in seconds.
    - disable_area_threshold: if `True`, enables computation for areas exceeding the bounding box area constraint (set to 5,000 square km in NetworkInterface.py).
    - require_extended_range: if `True`, use unsigned integers instead of unsigned shorts as the value type to increase the maximum range.
    - epsilon: factor by which to increase the requested bounding box. Increasing epsilon may result in increased accuracy for points at the edge of the bounding box, but will increase computation times. Default is set to 0.05.
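
To give a feel for how the speed, node penalty, and range settings interact, here is a minimal sketch. It assumes a simple model in which the time for one network edge is its length divided by the travel speed, plus a fixed penalty per node; the library's exact calculation is not shown in this demo, so treat `edge_seconds` and `needs_extended_range` as hypothetical helpers. The range check assumes an unsigned short is 16 bits wide, which is typical.

```python
USHORT_MAX = 2**16 - 1  # largest value a 16-bit unsigned short can hold (65,535)


def edge_seconds(length_m, speed_kmh, node_penalty_s=0.0):
    """Assumed model: time to traverse one edge plus a per-node penalty."""
    metres_per_second = speed_kmh * 1000 / 3600
    return length_m / metres_per_second + node_penalty_s


def needs_extended_range(value):
    """True if a value would not fit in a 16-bit unsigned short."""
    return value > USHORT_MAX


walk = edge_seconds(1000, speed_kmh=5)                      # default walk speed
drive = edge_seconds(1000, speed_kmh=40, node_penalty_s=4)  # default drive values
print(round(walk), round(drive))        # 720 94
print(needs_extended_range(20 * 3600))  # True: 72,000 s exceeds 65,535
```

Under this assumption, travel times longer than about 18 hours (or distances over roughly 65 km when `use_meters` is `True`) would need `require_extended_range`.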

## Asymmetric Travel Time Matrix 
---
You can create an asymmetric matrix from source points to destination points (takes ~20 min for this example). This is useful when you only need to generate results once (as opposed to repeatedly for the same origins but different destinations).

**Please map your latitude and longitude values before reading them in to make sure they are correct. For example, if incorrect lat/lon values fall far outside your actual spatial extent, the computation will take an excessively long time or stall.**
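
In addition to mapping the points, a rough numerical check can reveal a stray coordinate. The sketch below estimates the bounding box area with a flat-earth approximation (about 111.32 km per degree of latitude, scaled by the cosine of the latitude for longitude) and compares it with the 5,000 square km constraint mentioned earlier. This is not the library's own calculation; `bbox_area_sq_km` is a hypothetical helper and the mistyped longitude is invented for illustration.

```python
import math

KM_PER_DEGREE = 111.32       # approximate length of one degree of latitude
AREA_THRESHOLD_SQ_KM = 5000  # bounding box constraint mentioned in this demo


def bbox_area_sq_km(points):
    """Approximate area of the bounding box around (lat, lon) pairs."""
    for lat, lon in points:
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"implausible coordinate: ({lat}, {lon})")
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    mid_lat = math.radians((min(lats) + max(lats)) / 2)
    height = (max(lats) - min(lats)) * KM_PER_DEGREE
    width = (max(lons) - min(lons)) * KM_PER_DEGREE * math.cos(mid_lat)
    return height * width


# Three tract centroids from the sample rows shown earlier, as (lat, lon).
good = [(41.742475, -87.63004), (41.832094, -87.681882), (41.870416, -87.675079)]
# The same points plus one with a mistyped longitude (hypothetical error).
bad = good + [(41.851006, -78.635098)]

for name, pts in (("good", good), ("bad", bad)):
    area = bbox_area_sq_km(pts)
    status = "OVER THRESHOLD" if area > AREA_THRESHOLD_SQ_KM else "ok"
    print(name, round(area), "sq. km", status)
```

A single bad point is enough to push the bounding box past the threshold, which is why checking the extent first saves time.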

**WALKING**

In [19]:
# Calculate asymmetric distance matrix for walking (takes ~6 minutes to run) 
w_asym_mat = TransitMatrix('walk', 
                           primary_input='./data/input_data/sources/tracts2010.csv',
                           secondary_input='./data/input_data/destinations/health_chicago.csv',
                           primary_hints={'idx' : 'geoid10', 'population': 'skip', 'lat': 'lat', 'lon': 'lon'},
                           secondary_hints={'idx': 'ID', 'capacity': 'skip', 'category': 'category', 'lat': 'lat', 'lon': 'lon'})
                       
w_asym_mat.process()

INFO:spatial_access.p2p:Approx area of bounding box: 2,445.05 sq. km
INFO:spatial_access.p2p:All operations completed in 14.79 seconds


In [None]:
#Saved as walk_asym_health_tracts.csv
w_asym_mat.write_csv(outfile = "./data/output_data/matrices/walk_asym_health_tracts.csv")

In [None]:
# Saved as walk_asym_health_tracts.tmx
w_asym_mat.write_tmx(outfile = "./data/output_data/matrices/walk_asym_health_tracts.tmx")
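
For downstream work it is often handy to reshape the saved CSV from a wide matrix into long (origin, destination, value) records. The sketch below assumes the file has one row per origin and one column per destination, with the origin ID in an unnamed first column; the excerpt and its values are hypothetical, and `wide_to_long` is an illustrative helper rather than part of the package.

```python
import csv
import io

# Hypothetical excerpt of a matrix file: first column holds origin IDs,
# remaining columns are destination IDs, values are travel times in seconds.
matrix_csv = """,1,2,3
17031842400,900,1500,2400
17031840300,1200,600,1800
"""


def wide_to_long(csv_text):
    """Convert a wide origin x destination matrix into (origin, dest, value) rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    dest_ids = rows[0][1:]
    long_rows = []
    for row in rows[1:]:
        origin_id, values = row[0], row[1:]
        for dest_id, value in zip(dest_ids, values):
            long_rows.append((origin_id, dest_id, int(value)))
    return long_rows


for record in wide_to_long(matrix_csv):
    print(record)
```

With a real file, replace the inline string with the contents of the CSV written above.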

**Example of overriding default Configs**

Here we disable the large bounding box constraint and lower the default drive speed. We keep the matrix output as travel times rather than distances by setting `use_meters` to `False`. If you want to work with distances instead of travel times, set this parameter to `True`.

In [None]:
from spatial_access.Configs import Configs  
custom_config = Configs()
custom_config.disable_area_threshold=True  
custom_config.default_drive_speed=35
custom_config.use_meters=False

# then run:
w_asym_mat = TransitMatrix('walk',primary_input='./data/input_data/sources/tracts2010.csv',
                           secondary_input='./data/input_data/destinations/health_chicago.csv',
                           primary_hints={'idx' : 'geoid10', 'population': 'skip', 'lat': 'lat', 'lon': 'lon'},
                           secondary_hints={'idx': 'ID', 'capacity': 'skip', 'category': 'category', 'lat': 'lat', 'lon': 'lon'},
                           configs=custom_config)
w_asym_mat.process()

# Make sure to pass configs=custom_config as the last argument; otherwise the default values are used

**DRIVING**

In [None]:
# Calculate asymmetric distance matrix for driving (takes ~1.5 minutes to run) 
d_asym_mat = TransitMatrix('drive', 
                           primary_input='./data/input_data/sources/tracts2010.csv', 
                           secondary_input='./data/input_data/destinations/health_chicago.csv',
                           primary_hints={'idx' : 'geoid10', 'population': 'skip', 'lat': 'lat', 'lon': 'lon'},
                           secondary_hints={'idx': 'ID', 'capacity': 'skip', 'category': 'category', 'lat': 'lat', 'lon': 'lon'})

d_asym_mat.process()

In [None]:
#Saved as drive_asym_health_tracts.csv
d_asym_mat.write_csv(outfile = "./data/output_data/matrices/drive_asym_health_tracts.csv")

In [None]:
# Saved as drive_asym_health_tracts.tmx
d_asym_mat.write_tmx(outfile = "./data/output_data/matrices/drive_asym_health_tracts.tmx")

----

## Symmetric Matrix

You can also create a symmetric travel time matrix, e.g. from each tract to all the other tracts (in this case, an 801 x 801 matrix). Then, you can merge destinations to this matrix using shared IDs or spatial joins in a GIS, GeoDa, or R to create an asymmetric matrix as above. If you have several different destinations for the same spatial extent (or want to run simulations), the advantage of merging them with a symmetric matrix is that you only have to compute the travel times once.
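
The idea of reusing one symmetric matrix for many destination sets can be sketched with plain dictionaries. The tract labels, clinic names, and the assignment of clinics to tracts below are hypothetical; `to_asymmetric` is an illustrative helper, not a function of the package.

```python
# Symmetric tract-to-tract travel times in seconds (illustrative values).
sym = {
    "T1": {"T1": 0, "T2": 9881, "T3": 9106},
    "T2": {"T1": 9881, "T2": 0, "T3": 3326},
    "T3": {"T1": 9106, "T2": 3326, "T3": 0},
}

# Hypothetical assignment of each destination to the tract that contains it.
facility_tract = {"clinic_a": "T2", "clinic_b": "T3", "clinic_c": "T3"}


def to_asymmetric(sym_matrix, dest_to_tract):
    """Look up origin-to-destination times via each destination's tract."""
    return {
        origin: {dest: row[tract] for dest, tract in dest_to_tract.items()}
        for origin, row in sym_matrix.items()
    }


asym = to_asymmetric(sym, facility_tract)
print(asym["T1"])  # {'clinic_a': 9881, 'clinic_b': 9106, 'clinic_c': 9106}
```

A new set of destinations only needs a new `facility_tract` mapping; the travel times are not recomputed. The trade-off is precision: every destination inherits the travel time of its tract, so a facility in the origin's own tract gets a time of 0.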


**WALKING**

In [None]:
# Specify walking distance matrix (takes ~3 min to run) 
w_sym_mat = TransitMatrix('walk', 
                           primary_input='./data/input_data/sources/tracts2010.csv',
                           primary_hints={'idx' : 'geoid10', 'population': 'skip', 'lat': 'lat', 'lon': 'lon'})
# Run process
w_sym_mat.process()

In [None]:
# Saved as walk_sym_health_tracts.csv
w_sym_mat.write_csv(outfile = "./data/output_data/matrices/walk_sym_health_tracts.csv")

In [None]:
# Saved as walk_sym_health_tracts.tmx
w_sym_mat.write_tmx(outfile = "./data/output_data/matrices/walk_sym_health_tracts.tmx")

**DRIVING**

In [None]:
# Specify driving distance matrix (takes ~1.5 minutes to run) 
d_sym_mat = TransitMatrix('drive', 
                           primary_input='./data/input_data/sources/tracts2010.csv',
                           primary_hints={'idx' : 'geoid10', 'population': 'skip', 'lat': 'lat', 'lon': 'lon'})

# Run process. For driving, p2p queries OSM to fetch the street network and then outputs the shortest-path transit matrix
d_sym_mat.process()


In [None]:
# Saved as drive_sym_health_tracts.csv
d_sym_mat.write_csv(outfile = "./data/output_data/matrices/drive_sym_health_tracts.csv")

In [None]:
# Saved as drive_sym_health_tracts.tmx
d_sym_mat.write_tmx(outfile = "./data/output_data/matrices/drive_sym_health_tracts.tmx")

#### Spatial Join (snap destinations to origins)

Now you can snap the destination points to the areas of origin. Before you do this, map origins and destinations to understand how the two layers are related: e.g., when points fall on the boundary of an area, the area they are assigned to can be arbitrary. If destinations fall within areas, you can use a within function that joins each destination to the area it falls into. If origins and destinations share a geoID, you can also merge the data that way.
The following image shows that, in this case, we can safely run a function that assigns each destination point to the area that surrounds it. 

<img src="figures/snap.png" width="500" title="Optional title">
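
To make the "within" idea concrete, here is a small pure-Python sketch that uses ray casting to decide which area contains a point. This is a stand-in for illustration only: a GIS or GeoPandas applies proper geometry predicates, and the two square areas below are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices. Points exactly on an edge may be
    classified either way, which is the boundary ambiguity noted above.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) to the right cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Two hypothetical square areas sharing the edge x = 1.
areas = {
    "west": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "east": [(1, 0), (2, 0), (2, 1), (1, 1)],
}


def assign_area(x, y):
    """Return the name of the first area containing the point, else None."""
    for name, polygon in areas.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None


print(assign_area(0.5, 0.5))  # west
print(assign_area(1.5, 0.5))  # east
print(assign_area(3.0, 0.5))  # None
```

Points that return `None` correspond to destinations outside every area, which an inner spatial join would drop.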

**Spatial join of health facilities and travel time matrix**

We need to join the health facilities with the travel time matrix generated before. This will generate an asymmetric matrix with the travel times from all tracts in Chicago to the health facility destinations.

In [47]:
# Read destination files to join with boundaries 
import geopandas as gpd  # needed for read_file and sjoin if not already imported
health_gdf = gpd.read_file('./data/input_data/destinations/health_chicago.shp')
health_gdf.head()
# Use the symmetric matrix calculated above or read your previously saved results:
sym_walk = pd.read_csv('./data/output_data/matrices/walk_sym_health_tracts.csv')

# Read boundaries files 
boundaries_gdf = gpd.read_file('./data/input_data/sources/tracts2010.shp')

# Rename the ID name in order to match both data frames. 
sym_walk= sym_walk.rename(index=str, columns={"Unnamed: 0": "geoid10"})


# Spatial join of amenities within each area of analysis.
# This drops points outside of the tracts shapefile (from 199 to 182 data points).
# Note: newer GeoPandas versions use predicate='within' in place of op='within'.
s_join = gpd.sjoin(health_gdf, boundaries_gdf, how='inner', op='within')

# Convert geopanda dataframe to non-spatial dataframe to join 
jb_df = pd.DataFrame(s_join)


# Make sure the ID has the same data type in both data frames.
# sym_walk.dtypes
# jb_df.dtypes
jb_df.geoid10 = jb_df.geoid10.astype('int64')  # tract IDs exceed the 32-bit integer range
jb_df = pd.DataFrame(jb_df['geoid10'])

# Join the symmetric matrix with the spatially joined data (with geoid10 id)
j_asym = pd.merge(sym_walk, jb_df, on='geoid10', how='left')

j_asym.to_csv('./data/output_data/matrices/walk_asym_health_tracts_join.csv')
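
The data type check matters because a join on IDs of different types finds no matches without raising an error. The sketch below shows the effect with plain Python values; it assumes, for illustration, that one side holds integers and the other holds strings. It also shows why an 11-digit tract ID needs a 64-bit integer type.

```python
INT32_MAX = 2**31 - 1

matrix_ids = [17031842400, 17031840300]      # integers (assumed for this sketch)
joined_ids = ["17031842400", "17031840300"]  # strings (assumed for this sketch)


def count_matches(left, right):
    """Count IDs in `left` that also appear in `right` (same type and value)."""
    right_set = set(right)
    return sum(1 for value in left if value in right_set)


print(count_matches(matrix_ids, joined_ids))                    # 0: an int never equals a str
print(count_matches(matrix_ids, [int(v) for v in joined_ids]))  # 2 after casting
print(matrix_ids[0] > INT32_MAX)                                # True: too large for 32 bits
```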

In [48]:
#Check the output is correct
j_asym.head()

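| Unnamed: 0 | geoid10 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 793 | 794 | 795 | 796 | 797 | 798 | 799 | 800 | 801 | Unnamed: 802 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 9881 | 9106 | 11593 | 12167 | 8364 | 7089 | 27241 | 7104 | ... | 9824 | 15701 | 16077 | 16334 | 15570 | 15201 | 22104 | 13666 | 8169 | |
| 1 | 2 | 9881 | 0 | 3326 | 2115 | 3592 | 6092 | 14890 | 18531 | 16327 | ... | 4472 | 9291 | 8947 | 8756 | 8394 | 8751 | 13449 | 3924 | 4295 | |
| 2 | 3 | 9106 | 3326 | 0 | 3297 | 3777 | 9084 | 15494 | 18504 | 15926 | ... | 7464 | 6881 | 7245 | 7437 | 6673 | 6381 | 13367 | 5388 | 7287 | |
| 3 | 4 | 11593 | 2115 | 3297 | 0 | 1670 | 7905 | 16709 | 16992 | 18146 | ... | 6285 | 7568 | 7205 | 7014 | 6652 | 7028 | 11910 | 2482 | 6108 | |
| 4 | 5 | 12167 | 3592 | 3777 | 1670 | 0 | 9382 | 17433 | 15746 | 18870 | ... | 7762 | 6141 | 5778 | 5587 | 5225 | 5601 | 10609 | 2939 | 7585 | |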


In [49]:
j_asym.shape

(801, 803)

Now that you have an origin-destination matrix, you can proceed to estimate spatial access metrics based on it. For the purposes of this demo, we will use drive_asym_health_tracts.csv and walk_asym_health_tracts.csv to run the metrics.
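
As a preview of what can be derived from an origin-destination matrix, the sketch below computes two simple measures per origin: the travel time to the nearest destination and the number of destinations within a time threshold. These are generic illustrations, not the package's metric functions; the tract and clinic names, the values, and the 30-minute threshold are all hypothetical.

```python
# Hypothetical origin -> {destination: travel time in seconds}.
od = {
    "tract_a": {"clinic_1": 600, "clinic_2": 2400},
    "tract_b": {"clinic_1": 3000, "clinic_2": 1500},
}


def nearest_time(times):
    """Travel time to the closest destination."""
    return min(times.values())


def count_within(times, threshold_s):
    """Number of destinations reachable within the threshold."""
    return sum(1 for t in times.values() if t <= threshold_s)


for origin, times in od.items():
    print(origin, nearest_time(times), count_within(times, threshold_s=1800))
```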