 # Analysis of Transportation Network Companies (TNCs) and Taxis in Chicago
 ## Performing a hotspot (Getis-Ord Gi*) analysis
 This notebook performs a hotspot (Getis-Ord Gi*) analysis on selected taxi and TNC data for Chicago. The starting datasets were generated by selecting one week of trips from each source: November 5 to November 11, 2018 for the TNC trips, and November 7 to November 13, 2016 for the taxi trips.

 A project by:<br><br>
 Juan Francisco Saldarriaga<br>
 Senior Data and Design Researcher<br>
 Brown Institute for Media Innovation<br>
 School of Journalism, Columbia University<br>
 jfs2118@columbia.edu<br>
 <br>
 and<br><br>
 David King<br>
 School of Geographical Sciences and Urban Planning<br>
 Faculty Advisor, Barrett Honors College<br>
 Arizona State University<br>
 david.a.king@asu.edu<br>

 The original data for this project can be found at:
 * Taxi trips: [Chicago Data Portal](https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew), accessed on June 12, 2019.
 * TNC trips: [Chicago Data Portal](https://data.cityofchicago.org/Transportation/Transportation-Network-Providers-Trips/m6dm-c72p), accessed on April 26, 2019.
 * Chicago census tracts: [US Census Bureau](https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2018&layergroup=Census+Tracts), accessed on June 12, 2019.
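
For orientation, the Getis-Ord Gi* statistic named in the title can be sketched in a few lines of NumPy. This is a minimal illustration of the standard z-score form of the statistic, not the implementation used later in this project. The function name `getis_ord_gi_star`, the toy values, and the binary neighbour matrix are all assumptions made for the example.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Return one Gi* z-score per location.

    values  : 1-D array with one observation per location (e.g. trip counts)
    weights : (n, n) spatial weights matrix; Gi* includes the location
              itself, so the diagonal is expected to be non-zero
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    x_bar = x.mean()                              # global mean
    s = np.sqrt((x ** 2).mean() - x_bar ** 2)     # global standard deviation
    w_sum = w.sum(axis=1)                         # sum of weights per location
    w_sq_sum = (w ** 2).sum(axis=1)               # sum of squared weights
    numerator = w @ x - x_bar * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical toy data: five locations on a line, each a neighbour of
# itself and of the locations directly next to it.
toy_values = np.array([10.0, 9.0, 1.0, 1.0, 1.0])
toy_weights = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
])
toy_scores = getis_ord_gi_star(toy_values, toy_weights)
print(toy_scores.round(2))
```

Large positive scores mark clusters of high values (hot spots) and large negative scores mark clusters of low values (cold spots). In the toy data the first location scores highest because it and its neighbour both hold high values.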

**Importing libraries (pandas, NumPy, GeoPandas, Shapely, and Matplotlib)**

In [1]:
import pandas as pd
import numpy as np
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point
%matplotlib inline

**Setting up global paths and file names**

In [4]:
inputDataPath = '../input/'
outputDataPath = '../output/'
tncInputFileName = 'SelectedTNC_Trips_181105_181111_Datetime.csv'
taxiInputFileName = 'SelectedTaxi_Trips_161107_161113_Datetime.csv'
illinoisCensusTractsFileName = 'tl_2018_17_tract.shp'
# tncOutputFileName = 'SelectedTNC_Trips_181105_181111_Datetime.csv'
# taxiOutputFileName = 'SelectedTaxi_Trips_161107_161113_Datetime.csv'

**Loading TNC and taxi data**

In [3]:
tncData = pd.read_csv(inputDataPath + tncInputFileName, delimiter=',', index_col=0)
taxiData = pd.read_csv(inputDataPath + taxiInputFileName, delimiter=',', index_col=0)



**Loading Illinois census tracts shapefile and selecting Cook County**

In [7]:
illinoisCT = gpd.read_file(inputDataPath + illinoisCensusTractsFileName)

In [8]:
illinoisCT.head()

Unnamed: 0,STATEFP,COUNTYFP,TRACTCE,GEOID,NAME,NAMELSAD,MTFCC,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON,geometry
0,17,91,11700,17091011700,117.0,Census Tract 117,G5020,S,2370100,102060,41.1294653,-87.8735796,"POLYGON ((-87.887682 41.13594, -87.887643 41.1..."
1,17,91,11800,17091011800,118.0,Census Tract 118,G5020,S,1790218,55670,41.1403452,-87.8760059,"POLYGON ((-87.89409599999999 41.143875, -87.89..."
2,17,119,400951,17119400951,4009.51,Census Tract 4009.51,G5020,S,5170038,169066,38.7277628,-90.100262,POLYGON ((-90.11191599999999 38.70280899999999...
3,17,119,400952,17119400952,4009.52,Census Tract 4009.52,G5020,S,5751222,305905,38.7301928,-90.082751,"POLYGON ((-90.09442 38.720308, -90.093604 38.7..."
4,17,189,950300,17189950300,9503.0,Census Tract 9503,G5020,S,30383680,349187,38.3567671,-89.3783135,"POLYGON ((-89.413484 38.307848, -89.413478 38...."


In [106]:
cookCT = illinoisCT[illinoisCT['COUNTYFP'] == '031']

In [100]:
cookCT.head()

Unnamed: 0,STATEFP,COUNTYFP,TRACTCE,GEOID,NAME,NAMELSAD,MTFCC,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON,geometry
35,17,31,630200,17031630200,6302,Census Tract 6302,G5020,S,634189,0,41.8027454,-87.6940453,"POLYGON ((-87.70386499999999 41.804309, -87.70..."
48,17,31,430400,17031430400,4304,Census Tract 4304,G5020,S,548003,0,41.7621978,-87.5903116,"POLYGON ((-87.595444 41.759068, -87.595208 41...."
49,17,31,430600,17031430600,4306,Census Tract 4306,G5020,S,333359,0,41.7644133,-87.571373,"POLYGON ((-87.576302 41.766214, -87.575689 41...."
50,17,31,430800,17031430800,4308,Census Tract 4308,G5020,S,328071,0,41.76075,-87.5712983,"POLYGON ((-87.57622099999999 41.762515, -87.57..."
51,17,31,430500,17031430500,4305,Census Tract 4305,G5020,S,342833,0,41.7643139,-87.58127,"POLYGON ((-87.58632899999999 41.766121, -87.58..."


In [11]:
cookCT.shape

(1319, 13)
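
The county filter above compares `COUNTYFP` with the string `'031'`, so the leading zero matters. As a hedged cross-check, the same tracts can be selected from the first five characters of `GEOID`, which combine the two-digit state code and the three-digit county code. The sketch below uses a small hypothetical table (`toy_tracts`) rather than the shapefile.

```python
import pandas as pd

# Hypothetical stand-in for the attribute table of the tract shapefile.
toy_tracts = pd.DataFrame({
    'COUNTYFP': ['091', '031', '119', '031'],
    'GEOID': ['17091011700', '17031630200', '17119400951', '17031430400'],
})

# Filter used in the notebook: county FIPS code stored as text.
cook_by_county = toy_tracts[toy_tracts['COUNTYFP'] == '031']

# Cross-check: state code '17' followed by county code '031'.
cook_by_geoid = toy_tracts[toy_tracts['GEOID'].str.startswith('17031')]

print(len(cook_by_county), len(cook_by_geoid))
```

If the two selections ever disagree on the real data, the codes were probably read as numbers instead of text somewhere along the way.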

**Selecting TNC and taxi trips for their weekday and weekend peak hours (Wednesday and Saturday)**

In [22]:
tncWeekdayPeak = tncData[(tncData['StartDateTime'] >= '2018-11-07 17:00:00') & (tncData['StartDateTime'] < '2018-11-07 18:00:00')]

In [23]:
tncWeekdayPeak.head()

Unnamed: 0,Trip ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,Fare,...,Shared Trip Authorized,Trips Pooled,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location,StartDateTime,EndDateTime
2184,06c3f91590e2514551b871df15bd3f3f12589a83,11/07/2018 05:15:00 PM,11/07/2018 06:00:00 PM,2042.0,12.4,,17031840000.0,,44.0,22.5,...,False,1,,,,41.742488,-87.630045,POINT (-87.6300448953 41.7424875717),2018-11-07 17:15:00,2018-11-07 18:00:00
2611,0907f260a2dc6377c6845134dd0677f1d885160f,11/07/2018 05:15:00 PM,11/07/2018 05:45:00 PM,1965.0,19.4,,17031980000.0,,76.0,27.5,...,False,1,,,,41.979071,-87.90304,POINT (-87.9030396611 41.9790708201),2018-11-07 17:15:00,2018-11-07 17:45:00
6335,201d44b5b0d2bfe7a92d117a8311fc0ec37ec441,11/07/2018 05:00:00 PM,11/07/2018 05:30:00 PM,1139.0,11.5,,,76.0,,17.5,...,False,1,41.980264,-87.913625,POINT (-87.913624596 41.9802643146),,,,2018-11-07 17:00:00,2018-11-07 17:30:00
8487,2e4cbd22a24c2fbb83f75dc0be1afd46d0d8a7e0,11/07/2018 05:00:00 PM,11/07/2018 05:45:00 PM,2920.0,10.6,,,2.0,,20.0,...,True,1,42.001571,-87.695013,POINT (-87.6950125892 42.001571027),,,,2018-11-07 17:00:00,2018-11-07 17:45:00
9482,349a7319a0796b2b86e05a405e118217ed557c4e,11/07/2018 05:30:00 PM,11/07/2018 06:00:00 PM,1487.0,13.0,17031980000.0,,76.0,,20.0,...,False,1,41.979071,-87.90304,POINT (-87.9030396611 41.9790708201),,,,2018-11-07 17:30:00,2018-11-07 18:00:00


In [24]:
tncWeekdayPeak.shape

(17425, 23)

In [25]:
taxiWeekdayPeak = taxiData[(taxiData['StartDateTime'] >= '2016-11-09 18:00:00') & (taxiData['StartDateTime'] < '2016-11-09 19:00:00')]

In [26]:
taxiWeekdayPeak.head()

Unnamed: 0,Trip ID,Taxi ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,...,Payment Type,Company,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location,StartDateTime,EndDateTime
435894,7523fe185136debdf4dc8050ed5f2db14502494b,d38a242610110db56d50433ee978b21d989c43fd63cc98...,11/09/2016 06:00:00 PM,11/09/2016 06:00:00 PM,300.0,0.4,,,,,...,Credit Card,,,,,,,,2016-11-09 18:00:00,2016-11-09 18:00:00
435895,a97686ee780b89c9cfd89f99ed33c00653a42869,5bc4fb505b377f571a14166c27aeee000fbf3769630eda...,11/09/2016 06:00:00 PM,11/09/2016 06:00:00 PM,0.0,0.0,,,2.0,2.0,...,Credit Card,,42.001571,-87.695013,POINT (-87.6950125892 42.001571027),42.001571,-87.695013,POINT (-87.6950125892 42.001571027),2016-11-09 18:00:00,2016-11-09 18:00:00
435896,30b57a67ef5db6faa3d8a47e2d682870c3ef3a83,f7b3881bae139702c0198535ff8fd725cf0f3f95872204...,11/09/2016 06:00:00 PM,11/09/2016 06:00:00 PM,600.0,0.0,17031840000.0,17031840000.0,32.0,32.0,...,Cash,Blue Ribbon Taxi Association Inc.,41.880994,-87.632746,POINT (-87.6327464887 41.8809944707),41.880994,-87.632746,POINT (-87.6327464887 41.8809944707),2016-11-09 18:00:00,2016-11-09 18:00:00
435904,2a06c330ef84b2f6c05028033a249c1b08a75145,0dbad465512058f7f0a3c4633db7bcfd588be8b0569cc7...,11/09/2016 06:00:00 PM,11/09/2016 06:00:00 PM,360.0,0.4,17031840000.0,17031320000.0,32.0,32.0,...,Cash,,41.880994,-87.632746,POINT (-87.6327464887 41.8809944707),41.884987,-87.620993,POINT (-87.6209929134 41.8849871918),2016-11-09 18:00:00,2016-11-09 18:00:00
435908,3709e742855fb9098033a781f97e5254a540901f,7ebc0131e37b5de496a799105e161e918b6b965238885e...,11/09/2016 06:00:00 PM,11/09/2016 06:00:00 PM,0.0,0.0,,,,,...,Cash,Taxi Affiliation Services,,,,,,,2016-11-09 18:00:00,2016-11-09 18:00:00


In [27]:
taxiWeekdayPeak.shape

(3233, 25)

In [28]:
tncWeekendPeak = tncData[(tncData['StartDateTime'] >= '2018-11-10 23:00:00') & (tncData['StartDateTime'] < '2018-11-11 00:00:00')]

In [29]:
tncWeekendPeak.head()

Unnamed: 0,Trip ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,Fare,...,Shared Trip Authorized,Trips Pooled,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location,StartDateTime,EndDateTime
1848,050c3d8b4244601175837e990daef1e00436f7e3,11/10/2018 11:15:00 PM,11/11/2018 12:00:00 AM,2235.0,19.0,,,,66.0,27.5,...,False,1,,,,41.771849,-87.695666,POINT (-87.695666342 41.7718485152),2018-11-10 23:15:00,2018-11-11 00:00:00
2854,0a83de781474a3ec3cc23642f03a500c02f7ab44,11/10/2018 11:00:00 PM,11/10/2018 11:45:00 PM,2555.0,34.4,,17031840000.0,,24.0,42.5,...,False,1,,,,41.898306,-87.653614,POINT (-87.6536139825 41.8983058696),2018-11-10 23:00:00,2018-11-10 23:45:00
4418,139b83c701d1d131ece47bf46a2a7aa9eba598e1,11/10/2018 11:45:00 PM,11/10/2018 11:45:00 PM,431.0,1.7,,,2.0,,5.0,...,False,1,42.001571,-87.695013,POINT (-87.6950125892 42.001571027),,,,2018-11-10 23:45:00,2018-11-10 23:45:00
4596,14a4efe0dd9d4e1dc7853fdacf928f166cd78099,11/10/2018 11:45:00 PM,11/11/2018 12:15:00 AM,2115.0,24.4,17031740000.0,,74.0,,32.5,...,False,1,41.697269,-87.698582,POINT (-87.6985820537 41.6972691922),,,,2018-11-10 23:45:00,2018-11-11 00:15:00
5914,1d10bf2c26a8f5b781dface1e36180f67257503f,11/10/2018 11:45:00 PM,11/11/2018 12:00:00 AM,1097.0,6.8,,,62.0,,12.5,...,False,1,41.792982,-87.724208,POINT (-87.7242081939 41.7929819032),,,,2018-11-10 23:45:00,2018-11-11 00:00:00


In [30]:
tncWeekendPeak.shape

(24810, 23)

In [31]:
taxiWeekendPeak = taxiData[(taxiData['StartDateTime'] >= '2016-11-12 19:00:00') & (taxiData['StartDateTime'] < '2016-11-12 20:00:00')]

In [32]:
taxiWeekendPeak.head()

Unnamed: 0,Trip ID,Taxi ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,...,Payment Type,Company,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location,StartDateTime,EndDateTime
580571,cec1dddeb5c5511a9d6c4793bd90f59d219457c4,bab9f72c23fb8c0e0cdbbdf97b0d978b9f9cb6d731acaa...,11/12/2016 07:00:00 PM,11/12/2016 07:00:00 PM,540.0,1.1,17031080000.0,17031320000.0,8.0,32.0,...,Cash,,41.898332,-87.620763,POINT (-87.6207628651 41.8983317935),41.884987,-87.620993,POINT (-87.6209929134 41.8849871918),2016-11-12 19:00:00,2016-11-12 19:00:00
580573,eb297e5edcad4a209dc1b09268c691e0a21e21c5,faf9edb027f7718349ba31b94b6e146443f3e0486400f7...,11/12/2016 07:00:00 PM,11/12/2016 07:00:00 PM,240.0,0.0,17031840000.0,17031330000.0,33.0,33.0,...,Cash,Blue Ribbon Taxi Association Inc.,41.849247,-87.624135,POINT (-87.6241352979 41.8492467545),41.85935,-87.617358,POINT (-87.6173580061 41.859349715),2016-11-12 19:00:00,2016-11-12 19:00:00
580577,ce7a029e56262bb0b2c02c6836a86ea281d53bb5,9329ed84930ef60622b0e9bf10745816a941e22908309b...,11/12/2016 07:00:00 PM,11/12/2016 07:00:00 PM,300.0,0.62,17031080000.0,17031080000.0,8.0,8.0,...,Cash,,41.892042,-87.631864,POINT (-87.6318639497 41.8920421365),41.900266,-87.632109,POINT (-87.6321092196 41.9002656868),2016-11-12 19:00:00,2016-11-12 19:00:00
580579,ce66276ace4f5571becd6777d9c0b75deaf7519e,5c4dbf120a97d6d82d93388ef3ec44ea96fff151a5f179...,11/12/2016 07:00:00 PM,11/12/2016 07:00:00 PM,480.0,1.5,17031080000.0,17031080000.0,8.0,8.0,...,Cash,Choice Taxi Association,41.890922,-87.618868,POINT (-87.6188683546 41.8909220259),41.893216,-87.637844,POINT (-87.6378442095 41.8932163595),2016-11-12 19:00:00,2016-11-12 19:00:00
580581,ce2e72f4f197e7e76b409a8ea77897293e821e4c,dc45a54b6ba04f8e92435b925b709a1c6ba222f9f9c302...,11/12/2016 07:00:00 PM,11/12/2016 07:00:00 PM,540.0,1.3,17031080000.0,17031080000.0,8.0,8.0,...,Cash,,41.891972,-87.612945,POINT (-87.6129454143 41.8919715078),41.891972,-87.612945,POINT (-87.6129454143 41.8919715078),2016-11-12 19:00:00,2016-11-12 19:00:00


In [33]:
taxiWeekendPeak.shape

(2831, 25)
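
The four selections above use hard-coded one-hour windows, and the notebook does not show how those hours were chosen. One way such a peak hour could be found, sketched here on hypothetical data (`toy_trips`), is to count trips per clock hour and take the busiest one. The sketch also converts `StartDateTime` to a datetime type so that the comparison no longer relies on string ordering.

```python
import pandas as pd

# Hypothetical trips; only the columns needed for the example.
toy_trips = pd.DataFrame({
    'Trip ID': ['a', 'b', 'c', 'd', 'e', 'f'],
    'StartDateTime': [
        '2018-11-07 16:45:00', '2018-11-07 17:00:00', '2018-11-07 17:15:00',
        '2018-11-07 17:45:00', '2018-11-07 18:00:00', '2018-11-07 18:30:00',
    ],
})
toy_trips['StartDateTime'] = pd.to_datetime(toy_trips['StartDateTime'])

# Count trips per clock hour and pick the hour with the most trips.
hour_start = toy_trips['StartDateTime'].dt.floor('h')
trips_per_hour = toy_trips.groupby(hour_start)['Trip ID'].count()
peak_start = trips_per_hour.idxmax()
peak_end = peak_start + pd.Timedelta(hours=1)

# Half-open window [peak_start, peak_end), matching the filters above.
toy_peak = toy_trips[(toy_trips['StartDateTime'] >= peak_start)
                     & (toy_trips['StartDateTime'] < peak_end)]
print(peak_start, len(toy_peak))
```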

**Merging peak hour data with Cook County census tracts**

In [88]:
len(tncWeekdayPeak[tncWeekdayPeak['Pickup Census Tract'].isnull()])

4045

In [107]:
tncWeekdayPeak_Pivot = tncWeekdayPeak.pivot_table(index='Pickup Census Tract', values='Trip ID', aggfunc='count')

In [108]:
tncWeekdayPeak_Pivot.head()

Pickup Census Tract,Trip ID
17031010000.0,13
17031010000.0,2
17031010000.0,11
17031010000.0,13
17031010000.0,14


In [109]:
tncWeekdayPeak_Pivot['Trip ID'].sum()

13380
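
The total of 13,380 is consistent with the earlier outputs: 17,425 weekday peak trips minus the 4,045 with no pickup census tract leaves 13,380. This happens because `pivot_table` drops rows whose index value is missing. A minimal sketch of that behaviour on hypothetical data (`toy_peak`):

```python
import numpy as np
import pandas as pd

# Hypothetical trips: two of the five have no pickup census tract.
toy_peak = pd.DataFrame({
    'Trip ID': ['a', 'b', 'c', 'd', 'e'],
    'Pickup Census Tract': [17031010100.0, 17031010100.0, np.nan,
                            17031010201.0, np.nan],
})

missing_tracts = toy_peak['Pickup Census Tract'].isna().sum()
toy_pivot = toy_peak.pivot_table(index='Pickup Census Tract',
                                 values='Trip ID', aggfunc='count')
counted_trips = toy_pivot['Trip ID'].sum()

# Trips counted in the pivot plus trips without a tract equals all trips.
print(counted_trips, missing_tracts, len(toy_peak))
```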

In [110]:
tncWeekdayPeak_Pivot.reset_index(inplace=True)

In [111]:
tncWeekdayPeak_Pivot.head()

Unnamed: 0,Pickup Census Tract,Trip ID
0,17031010000.0,13
1,17031010000.0,2
2,17031010000.0,11
3,17031010000.0,13
4,17031010000.0,14


In [112]:
tncWeekdayPeak_Pivot.dtypes

Pickup Census Tract    float64
Trip ID                  int64
dtype: object

In [113]:
# Use an explicit 64-bit integer: the 11-digit tract IDs overflow a 32-bit int on some platforms
tncWeekdayPeak_Pivot['Pickup Census Tract'] = tncWeekdayPeak_Pivot['Pickup Census Tract'].astype('int64').astype('str')

In [114]:
tncWeekdayPeak_Pivot.head()

Unnamed: 0,Pickup Census Tract,Trip ID
0,17031010100,13
1,17031010201,2
2,17031010202,11
3,17031010300,13
4,17031010400,14
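
The conversion above exists because the tract IDs were read as floats, while `GEOID` in the census tracts table is text. A small sketch of the float-to-text step on hypothetical IDs (`toy_ids`), using an explicit 64-bit integer as the intermediate type:

```python
import pandas as pd

# Hypothetical tract IDs as they appear after reading the CSV (float64).
toy_ids = pd.Series([17031010100.0, 17031010201.0, 17031010202.0])

# Float -> 64-bit integer -> text, so the values can match GEOID strings.
toy_ids_text = toy_ids.astype('int64').astype('str')
print(toy_ids_text.tolist())
```

Eleven-digit integers are represented exactly in float64, so no digits are lost on the way through the float column.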


In [115]:
cookCT = pd.merge(cookCT, tncWeekdayPeak_Pivot, how='left', left_on='GEOID', right_on='Pickup Census Tract')

In [117]:
cookCT.head()

Unnamed: 0,STATEFP,COUNTYFP,TRACTCE,GEOID,NAME,NAMELSAD,MTFCC,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON,geometry,Pickup Census Tract,Trip ID
0,17,31,630200,17031630200,6302,Census Tract 6302,G5020,S,634189,0,41.8027454,-87.6940453,"POLYGON ((-87.70386499999999 41.804309, -87.70...",,
1,17,31,430400,17031430400,4304,Census Tract 4304,G5020,S,548003,0,41.7621978,-87.5903116,"POLYGON ((-87.595444 41.759068, -87.595208 41....",,
2,17,31,430600,17031430600,4306,Census Tract 4306,G5020,S,333359,0,41.7644133,-87.571373,"POLYGON ((-87.576302 41.766214, -87.575689 41....",,
3,17,31,430800,17031430800,4308,Census Tract 4308,G5020,S,328071,0,41.76075,-87.5712983,"POLYGON ((-87.57622099999999 41.762515, -87.57...",,
4,17,31,430500,17031430500,4305,Census Tract 4305,G5020,S,342833,0,41.7643139,-87.58127,"POLYGON ((-87.58632899999999 41.766121, -87.58...",,


In [118]:
cookCT['Trip ID'].sum()

13380.0
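
The merged total matches the pivot total, which suggests that every tract with trips found a matching `GEOID`. Tracts without trips are left as `NaN`. A hedged sketch of how the match could be audited and the gaps filled, on hypothetical tables (`toy_tracts`, `toy_counts`), follows. Treating missing counts as zero is an assumption about what the later analysis needs, not something the notebook states.

```python
import pandas as pd

# Hypothetical tract table and per-tract trip counts.
toy_tracts = pd.DataFrame({'GEOID': ['17031010100', '17031010201', '17031630200']})
toy_counts = pd.DataFrame({
    'Pickup Census Tract': ['17031010100', '17031010201'],
    'Trip ID': [13, 2],
})

# indicator=True adds a '_merge' column showing where each row came from.
toy_merged = pd.merge(toy_tracts, toy_counts, how='left',
                      left_on='GEOID', right_on='Pickup Census Tract',
                      indicator=True)

tracts_without_trips = (toy_merged['_merge'] == 'left_only').sum()

# Assumption: a tract with no recorded pickups counts as zero trips.
toy_merged['Trip ID'] = toy_merged['Trip ID'].fillna(0).astype('int64')
print(tracts_without_trips, toy_merged['Trip ID'].tolist())
```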

In [None]:
# Unexecuted scratch cell: `selectedData` and `geometry` are not defined above
crs = {'init': 'epsg:4326'}
selectedData = gpd.GeoDataFrame(selectedData, crs=crs, geometry=geometry)

In [34]:
cookCT.crs

{'init': 'epsg:4269'}

In [None]:
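tncWeekdayPeak_located = tncWeekdayPeak.dropna(subset=['Pickup Centroid Longitude', 'Pickup Centroid Latitude'])  # 'Pickup Centroid Location' holds WKT text, so points are built from the coordinate columns
tncWeekdayPeak_pickupGeo = gpd.GeoDataFrame(tncWeekdayPeak_located, crs={'init': 'epsg:4269'}, geometry=[Point(xy) for xy in zip(tncWeekdayPeak_located['Pickup Centroid Longitude'], tncWeekdayPeak_located['Pickup Centroid Latitude'])])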
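
The `Pickup Centroid Location` column stores text such as `POINT (-87.6327464887 41.8809944707)`, which is well-known text (WKT) rather than a geometry object. A minimal sketch, on hypothetical rows (`toy_trips`), of parsing that text into Shapely points before building a GeoDataFrame:

```python
import pandas as pd
from shapely import wkt

# Hypothetical trips: the second one has no pickup location.
toy_trips = pd.DataFrame({
    'Trip ID': ['a', 'b'],
    'Pickup Centroid Location': ['POINT (-87.6327464887 41.8809944707)', None],
})

# Keep only rows that have a location, then parse the WKT text.
toy_located = toy_trips.dropna(subset=['Pickup Centroid Location']).copy()
toy_located['geometry'] = toy_located['Pickup Centroid Location'].apply(wkt.loads)

first_point = toy_located['geometry'].iloc[0]
print(first_point.x, first_point.y)
```

The resulting `geometry` column could then be passed to `gpd.GeoDataFrame` through its `geometry` argument, together with the coordinate reference system the points are assumed to use.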