 # Analysis of Transportation Network Companies (TNCs) and Taxis in Chicago
 ## Selecting data to be analyzed
 This notebook takes the original downloaded files and selects one week of data from each dataset. For the TNC trips, the selected week is November 5 - November 11, 2018. For the taxi trips, the selected week is November 7 - November 13, 2016.

 A project by:<br><br>
 Juan Francisco Saldarriaga<br>
 Senior Data and Design Researcher<br>
 Brown Institute for Media Innovation<br>
 School of Journalism, Columbia University<br>
 jfs2118@columbia.edu<br>
 <br>
 and<br><br>
 David King<br>
 School of Geographical Sciences and Urban Planning<br>
 Faculty Advisor, Barrett Honors College<br>
 Arizona State University<br>
 david.a.king@asu.edu<br>

 The original data for this project can be found at:
 * Taxi trips: [Chicago Data Portal](https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew), accessed on June 12, 2019.
 * TNC trips: [Chicago Data Portal](https://data.cityofchicago.org/Transportation/Transportation-Network-Providers-Trips/m6dm-c72p), accessed on April 26, 2019.

 **Importing libraries (pandas and NumPy)**

In [1]:
import pandas as pd
import numpy as np

 **Setting global paths and filenames**

In [2]:
inputDataPath = '../input/'
outputDataPath = '../output/'
tncInputFileName = 'TNP_Trips.csv'
taxiInputFileName = 'TaxiTrips.csv'
tncOutputFileName = 'SelectedTNC_Trips_181105_181111.csv'
taxiOutputFileName = 'SelectedTaxi_Trips_161107_161113.csv'

 **Loading and exploring TNC data**

In [16]:
tncData = pd.read_csv(inputDataPath + tncInputFileName, delimiter=',')

In [17]:
tncData.head()

Unnamed: 0,Trip ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,Fare,...,Additional Charges,Trip Total,Shared Trip Authorized,Trips Pooled,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
0,022ad3b7f1320d4e52cce3d5931eb0a0cee16c48,11/01/2018 08:45:00 AM,11/01/2018 09:30:00 AM,3140.0,38.5,,17031980000.0,,56.0,47.5,...,9.1,66.6,False,1,,,,41.785999,-87.750934,POINT (-87.7509342894 41.785998518)
1,0377167460a4d5d5e015c642b460e56ac88dab71,11/01/2018 02:00:00 AM,11/01/2018 02:15:00 AM,1031.0,5.3,,,70.0,,10.0,...,2.5,12.5,False,1,41.745758,-87.708366,POINT (-87.7083657043 41.7457577128),,,
2,03a2ac30a46af881e6a2e6af06a3a779b67c0802,11/01/2018 08:15:00 PM,11/01/2018 09:15:00 PM,4125.0,49.9,17031080000.0,,8.0,,62.5,...,3.8,66.3,False,1,41.892042,-87.631864,POINT (-87.6318639497 41.8920421365),,,
3,03fe17b0509941aa04744e9e4478ed5ded56b2eb,11/01/2018 03:45:00 AM,11/01/2018 04:15:00 AM,1229.0,11.7,17031830000.0,,22.0,,7.5,...,2.5,10.0,True,3,41.916005,-87.675095,POINT (-87.6750951155 41.9160052737),,,
4,040590c0bf5b22f8ccf7d8f19873c612bebfd480,11/01/2018 05:00:00 PM,11/01/2018 06:00:00 PM,3383.0,12.3,,17031840000.0,,32.0,27.5,...,2.5,30.0,False,1,,,,41.880994,-87.632746,POINT (-87.6327464887 41.8809944707)


In [18]:
tncData.tail()

Unnamed: 0,Trip ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,Fare,...,Additional Charges,Trip Total,Shared Trip Authorized,Trips Pooled,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
17432006,ffffd4c98d53b50e74ba212b3865579855ad0165,11/15/2018 03:15:00 AM,11/15/2018 03:30:00 AM,298.0,1.0,,,60.0,60.0,5.0,...,2.5,7.5,False,1,41.83615,-87.648788,POINT (-87.6487879519 41.8361501547),41.83615,-87.648788,POINT (-87.6487879519 41.8361501547)
17432007,ffffe75577e2095f9645ca574bcb40225e52f118,12/22/2018 07:00:00 PM,12/22/2018 07:00:00 PM,367.0,1.1,,,27.0,28.0,2.5,...,2.5,5.0,True,3,41.878914,-87.705897,POINT (-87.7058971305 41.8789144956),41.874005,-87.663518,POINT (-87.6635175498 41.874005383)
17432008,ffffefd286942b361f65bbc5ede5182af0286eae,11/03/2018 07:00:00 AM,11/03/2018 07:15:00 AM,959.0,8.3,,,8.0,22.0,20.0,...,2.5,22.5,False,1,41.899602,-87.633308,POINT (-87.6333080367 41.899602111),41.922761,-87.699155,POINT (-87.69915534320002 41.9227606205)
17432009,fffff09aa0fe6b2f3a1127499aec069b69aa91ab,11/24/2018 05:45:00 PM,11/24/2018 05:45:00 PM,278.0,0.6,,,19.0,20.0,2.5,...,2.5,5.0,False,1,41.927261,-87.765502,POINT (-87.7655016086 41.9272609555),41.924347,-87.73474,POINT (-87.7347397536 41.9243470769)
17432010,fffffd66c52c5e40d1f2eed1fbf8aabcc61c39a0,11/11/2018 08:45:00 AM,11/11/2018 09:15:00 AM,1000.0,9.0,,,15.0,24.0,12.5,...,2.5,15.0,False,1,41.954028,-87.763399,POINT (-87.7633990316 41.9540276487),41.901207,-87.676356,POINT (-87.6763559892 41.90120699410001)


In [19]:
tncData.shape

(17432011, 21)
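
 With roughly 17.4 million rows, loading the whole file in one call can be memory-hungry. As a hedged alternative, the sketch below reads a CSV in pieces through the `chunksize` parameter of `pd.read_csv` and keeps only the rows that pass a filter. The helper name `filter_in_chunks`, the chunk size, and the in-memory sample CSV are illustrative assumptions, not part of the original notebook.

```python
import io
import pandas as pd

def filter_in_chunks(source, keep, chunksize):
    """Read a CSV piece by piece and keep only rows where keep(chunk) is True."""
    pieces = []
    for chunk in pd.read_csv(source, chunksize=chunksize):
        # keep(chunk) must return a boolean Series aligned with the chunk
        pieces.append(chunk[keep(chunk)])
    return pd.concat(pieces)

# Made-up stand-in for the real CSV file (not project data)
csv_text = (
    "Trip ID,Trip Miles\n"
    "a,1.2\n"
    "b,0.0\n"
    "c,5.3\n"
    "d,0.7\n"
    "e,9.0\n"
)
long_trips = filter_in_chunks(
    io.StringIO(csv_text),
    lambda c: c['Trip Miles'] > 1.0,
    chunksize=2,
)
print(long_trips)
```

 For the real file, the chunk size would be much larger (hundreds of thousands of rows, for example), and the filter could be the one-week selection on the start timestamp.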

In [20]:
tncData.dtypes

Trip ID                        object
Trip Start Timestamp           object
Trip End Timestamp             object
Trip Seconds                  float64
Trip Miles                    float64
Pickup Census Tract           float64
Dropoff Census Tract          float64
Pickup Community Area         float64
Dropoff Community Area        float64
Fare                          float64
Tip                             int64
Additional Charges            float64
Trip Total                    float64
Shared Trip Authorized           bool
Trips Pooled                    int64
Pickup Centroid Latitude      float64
Pickup Centroid Longitude     float64
Pickup Centroid Location       object
Dropoff Centroid Latitude     float64
Dropoff Centroid Longitude    float64
Dropoff Centroid Location      object
dtype: object

 **Selecting TNC data for one week (November 5 - November 11, 2018)**

In [21]:
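# The timestamps are plain strings, so these comparisons are lexicographic.
# '00:00:00 AM' is not a real clock time; it only serves as a bound that sorts before every hour from 01 to 12.
tncSelectedData = tncData[(tncData['Trip Start Timestamp'] >= '11/05/2018 00:00:00 AM') & (tncData['Trip Start Timestamp'] < '11/12/2018 00:00:00 AM')]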
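
 Because `Trip Start Timestamp` has dtype `object`, the selection compares text rather than dates. It gives the intended week only as long as the file does not mix years or months in a way that breaks the text ordering. A minimal sketch of the same selection on parsed timestamps follows. It assumes the `MM/DD/YYYY hh:mm:ss AM/PM` format seen in the outputs, and `select_week` is a hypothetical helper name, not part of the original notebook.

```python
import pandas as pd

def select_week(df, start, end, column='Trip Start Timestamp'):
    """Return rows whose timestamp falls in the half-open interval [start, end)."""
    # The format string is an assumption based on values such as
    # '11/01/2018 08:45:00 AM' shown in the notebook output.
    parsed = pd.to_datetime(df[column], format='%m/%d/%Y %I:%M:%S %p')
    mask = (parsed >= pd.Timestamp(start)) & (parsed < pd.Timestamp(end))
    return df[mask]

# Tiny illustrative frame (made-up rows, not project data)
sample = pd.DataFrame({'Trip Start Timestamp': [
    '11/04/2018 11:45:00 PM',
    '11/05/2018 12:00:00 AM',
    '11/11/2018 11:45:00 PM',
    '11/12/2018 12:00:00 AM',
]})
selected = select_week(sample, '2018-11-05', '2018-11-12')
print(selected)
```

 The half-open interval keeps midnight on November 5 and drops midnight on November 12, which matches the intent of the `>=` and `<` bounds in the cell above.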

In [22]:
tncSelectedData.head()

Unnamed: 0,Trip ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,Fare,...,Additional Charges,Trip Total,Shared Trip Authorized,Trips Pooled,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
1038,0000093625f90e660c8275040a7f57e335ae147b,11/10/2018 10:45:00 AM,11/10/2018 10:45:00 AM,314.0,1.2,17031060000.0,17031070000.0,6.0,7.0,5.0,...,2.5,7.5,False,1,41.936159,-87.661265,POINT (-87.6612652184 41.936159071),41.921701,-87.655912,POINT (-87.6559118484 41.9217014922)
1039,00000988893ef92c0d44374d79aef52631cb7087,11/11/2018 01:15:00 AM,11/11/2018 01:15:00 AM,105.0,0.7,17031830000.0,17031830000.0,22.0,22.0,2.5,...,2.5,5.0,False,1,41.919225,-87.671446,POINT (-87.671445766 41.9192250505),41.919225,-87.671446,POINT (-87.671445766 41.9192250505)
1041,0008f2d890c585c0729178d46bd0352ea8428b9a,11/07/2018 04:30:00 PM,11/07/2018 04:45:00 PM,1209.0,7.4,,17031570000.0,,57.0,12.5,...,2.5,15.0,True,1,,,,41.816264,-87.718927,POINT (-87.7189271854 41.8162644525)
1045,00131a71fdd0237dd323903986622464410f0200,11/07/2018 06:30:00 AM,11/07/2018 07:00:00 AM,1914.0,19.7,,,11.0,,27.5,...,3.45,30.95,False,1,41.97883,-87.771167,POINT (-87.771166703 41.9788295262),,,
1048,00164c1eb30516cbd645e04d044678b4e9f31bd2,11/06/2018 02:00:00 AM,11/06/2018 02:15:00 AM,434.0,1.7,,,,25.0,5.0,...,2.5,7.5,False,1,,,,41.890609,-87.756047,POINT (-87.7560467111 41.8906088526)


In [23]:
tncSelectedData.tail()

Unnamed: 0,Trip ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,Fare,...,Additional Charges,Trip Total,Shared Trip Authorized,Trips Pooled,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
17431970,fffe6e97132e6c727c0702fb37febabf9dd6bfec,11/06/2018 09:00:00 PM,11/06/2018 09:00:00 PM,57.0,0.3,,,28.0,28.0,0.0,...,2.5,2.5,True,1,41.874005,-87.663518,POINT (-87.6635175498 41.874005383),41.874005,-87.663518,POINT (-87.6635175498 41.874005383)
17431994,ffff620bb8413cd21ecda2c007860db177ce260d,11/09/2018 08:00:00 PM,11/09/2018 08:30:00 PM,1031.0,4.4,,,,11.0,10.0,...,2.5,12.5,True,1,,,,41.97883,-87.771167,POINT (-87.771166703 41.9788295262)
17431999,ffffb0b495c0894f0f10333fc418454dee21fc13,11/08/2018 09:15:00 AM,11/08/2018 09:45:00 AM,1848.0,12.0,,,42.0,8.0,20.0,...,2.5,22.5,False,1,41.778877,-87.594925,POINT (-87.5949254391 41.7788768603),41.899602,-87.633308,POINT (-87.6333080367 41.899602111)
17432000,ffffb4b4c7c835bcb2ebc46f46a2a0e45bc36be9,11/10/2018 11:45:00 PM,11/11/2018 12:15:00 AM,2141.0,15.0,,,,20.0,25.0,...,2.5,27.5,False,1,,,,41.924347,-87.73474,POINT (-87.7347397536 41.9243470769)
17432010,fffffd66c52c5e40d1f2eed1fbf8aabcc61c39a0,11/11/2018 08:45:00 AM,11/11/2018 09:15:00 AM,1000.0,9.0,,,15.0,24.0,12.5,...,2.5,15.0,False,1,41.954028,-87.763399,POINT (-87.7633990316 41.9540276487),41.901207,-87.676356,POINT (-87.6763559892 41.90120699410001)


In [24]:
tncSelectedData.shape

(2140207, 21)

 **Exporting selected TNC data as a `.csv` file**

In [25]:
tncSelectedData.to_csv(outputDataPath + tncOutputFileName)
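
 By default, `to_csv` writes the DataFrame index as an unnamed first column, so the exported file carries the original row labels. The sketch below shows what that means when such a file is read back. It uses a made-up frame and an in-memory buffer in place of the real output path.

```python
import io
import pandas as pd

# Made-up frame whose index mimics the row labels kept after filtering
frame = pd.DataFrame({'Trip Miles': [1.2, 0.7]}, index=[1038, 1039])

buffer = io.StringIO()
frame.to_csv(buffer)   # default behavior: the index becomes the first column
buffer.seek(0)

plain = pd.read_csv(buffer)                  # index comes back as 'Unnamed: 0'
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)  # first column used as the index again

print(list(plain.columns))
print(list(restored.index))
```

 Passing `index_col=0` when reading, or `index=False` when writing, avoids the extra `Unnamed: 0` column. Which one fits depends on whether the original row labels are needed downstream.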

 **Loading and exploring Taxi data**

In [3]:
taxiData = pd.read_csv(inputDataPath + taxiInputFileName, delimiter=',')

In [4]:
taxiData.head()

Unnamed: 0,Trip ID,Taxi ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,...,Extras,Trip Total,Payment Type,Company,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
0,493d28d1aef5aaf0764ea3192d92090ac75121ed,ff214d6d48867d32b43c8caf27613acc83d3a03a290c37...,11/08/2016 03:15:00 PM,10/11/2016 06:00:00 AM,,3.4,,,6.0,8.0,...,0.0,18.0,Credit Card,Taxi Affiliation Services,41.944227,-87.655998,POINT (-87.6559981815 41.9442266014),41.899602,-87.633308,POINT (-87.6333080367 41.899602111)
1,d61104499ad3c93c3279b18919f401f4dda8b804,cea95974258777edc45fc0717c4538f0c93545566da157...,11/01/2016 12:00:00 AM,11/01/2016 12:00:00 AM,240.0,2.2,,,8.0,24.0,...,1.0,13.0,Credit Card,Northwest Management LLC,41.899602,-87.633308,POINT (-87.6333080367 41.899602111),41.901207,-87.676356,POINT (-87.6763559892 41.9012069941)
2,03aed5a1f1f6ab50515e8c9423ad2c0c2eaf8bbd,40aed61177db25fbb6eea25b46c3f674ddbd7d42d93e6f...,11/01/2016 12:00:00 AM,11/01/2016 12:00:00 AM,360.0,1.3,17031080000.0,17031080000.0,8.0,8.0,...,1.0,8.0,Cash,Northwest Management LLC,41.902788,-87.626146,POINT (-87.6261455896 41.9027880476),41.890922,-87.618868,POINT (-87.6188683546 41.8909220259)
3,b186013b681a47873f5a66bab88ecf19eccb899c,fcbee6a4836402c8a068a12f39eb6fd45c0f8b0172c9b3...,11/01/2016 12:00:00 AM,11/01/2016 12:00:00 AM,120.0,0.5,,,6.0,6.0,...,1.0,5.5,Cash,Taxi Affiliation Services,41.944227,-87.655998,POINT (-87.6559981815 41.9442266014),41.944227,-87.655998,POINT (-87.6559981815 41.9442266014)
4,6b7f11da6f79d9167fd75e8bf2e4c26c53634f6e,309ff9f30190e038d6801051fc1b42cdefdd4169e46eed...,11/01/2016 12:00:00 AM,11/01/2016 12:00:00 AM,360.0,1.3,17031320000.0,17031830000.0,32.0,28.0,...,0.0,7.0,Cash,Taxi Affiliation Services,41.884987,-87.620993,POINT (-87.6209929134 41.8849871918),41.885281,-87.657233,POINT (-87.6572331997 41.8852813201)


In [5]:
taxiData.tail()

Unnamed: 0,Trip ID,Taxi ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,...,Extras,Trip Total,Payment Type,Company,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
1296082,4f3ee987e0f66e3fe7d257a4146a0123df773908,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/12/2016 01:00:00 AM,,,0.0,17031830000.0,,28.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.885281,-87.657233,POINT (-87.6572331997 41.8852813201),,,
1296083,6635adf207dc85b4b38457d898fd3bb8d31dc1a9,72298402ab35fd20241fec18cc9c8b604ce672dae84cec...,11/06/2016 01:45:00 AM,,,0.0,17031060000.0,,6.0,,...,0.0,0.0,Unknown,Taxi Affiliation Services,41.942585,-87.656644,POINT (-87.6566440918 41.9425851797),,,
1296084,77eecb021758e0038da55d282b3a5e627bbab9f7,d3d2542b9c37c4577b664694adfc3ad2b1efcddf4f692a...,11/29/2016 02:00:00 PM,,,0.0,17031320000.0,,32.0,,...,0.0,0.0,Unknown,KOAM Taxi Association,41.884987,-87.620993,POINT (-87.6209929134 41.8849871918),,,
1296085,6c6caed72bce663d16f1b1dc780cddbe7110075b,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/18/2016 10:15:00 PM,,,0.0,17031080000.0,,8.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.893216,-87.637844,POINT (-87.6378442095 41.8932163595),,,
1296086,832ec8b8d5666827b996030705c8252785d15d3b,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/19/2016 03:00:00 AM,,,0.0,17031240000.0,,24.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.89967,-87.669838,POINT (-87.6698377982 41.8996701799),,,


In [6]:
taxiData.shape

(1296087, 23)

In [7]:
taxiData.dtypes

Trip ID                        object
Taxi ID                        object
Trip Start Timestamp           object
Trip End Timestamp             object
Trip Seconds                  float64
Trip Miles                    float64
Pickup Census Tract           float64
Dropoff Census Tract          float64
Pickup Community Area         float64
Dropoff Community Area        float64
Fare                          float64
Tips                          float64
Tolls                         float64
Extras                        float64
Trip Total                    float64
Payment Type                   object
Company                        object
Pickup Centroid Latitude      float64
Pickup Centroid Longitude     float64
Pickup Centroid Location       object
Dropoff Centroid Latitude     float64
Dropoff Centroid Longitude    float64
Dropoff Centroid  Location     object
dtype: object

 **Selecting Taxi data for one week (November 7 - November 13, 2016)**

In [10]:
# As with the TNC data, the timestamps are strings and the comparisons are lexicographic.
taxiSelectedData = taxiData[(taxiData['Trip Start Timestamp'] >= '11/07/2016 00:00:00 AM') & (taxiData['Trip Start Timestamp'] < '11/14/2016 00:00:00 AM')]

In [11]:
taxiSelectedData.head()

Unnamed: 0,Trip ID,Taxi ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,...,Extras,Trip Total,Payment Type,Company,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
0,493d28d1aef5aaf0764ea3192d92090ac75121ed,ff214d6d48867d32b43c8caf27613acc83d3a03a290c37...,11/08/2016 03:15:00 PM,10/11/2016 06:00:00 AM,,3.4,,,6.0,8.0,...,0.0,18.0,Credit Card,Taxi Affiliation Services,41.944227,-87.655998,POINT (-87.6559981815 41.9442266014),41.899602,-87.633308,POINT (-87.6333080367 41.899602111)
305643,eb810f6b891baa03fe6684c798b5c8a1a9bda143,8b518cb35743db24b2498b46d3eb2e89321a5f26a4601f...,11/07/2016 12:00:00 AM,11/07/2016 12:00:00 AM,0.0,0.8,17031080000.0,17031080000.0,8.0,8.0,...,4.0,44.4,Credit Card,Taxi Affiliation Services,41.892073,-87.628874,POINT (-87.6288741572 41.8920726347),41.892073,-87.628874,POINT (-87.6288741572 41.8920726347)
305644,3f205337b016faad57fe90288c05814bd5ecfac1,d38890256d8a1e8146f0b15dc23e3b2a140b4fa3834698...,11/07/2016 12:00:00 AM,11/07/2016 12:00:00 AM,180.0,0.5,,,6.0,6.0,...,0.0,4.5,Cash,Taxi Affiliation Services,41.944227,-87.655998,POINT (-87.6559981815 41.9442266014),41.944227,-87.655998,POINT (-87.6559981815 41.9442266014)
305645,d97eb24911ebd9a9a1bb7f1c771ad963a09256dd,29f6d119e21d61401aa5156346aca14fad3d0fea0dc1c8...,11/07/2016 12:00:00 AM,11/07/2016 12:00:00 AM,300.0,0.8,,,32.0,32.0,...,0.0,5.5,Cash,Taxi Affiliation Services,41.878866,-87.625192,POINT (-87.6251921424 41.8788655841),41.878866,-87.625192,POINT (-87.6251921424 41.8788655841)
305652,3c18e189acc6b1a86e143e5b97a6ee4ae6d8906b,fdd5ae66de73dcf0baaf4a768c182b0c9e0579643672c0...,11/07/2016 12:00:00 AM,11/07/2016 12:00:00 AM,301.0,1.1,17031080000.0,17031320000.0,8.0,32.0,...,1.5,7.5,Cash,,41.892073,-87.628874,POINT (-87.6288741572 41.8920726347),41.884987,-87.620993,POINT (-87.6209929134 41.8849871918)


In [12]:
taxiSelectedData.tail()

Unnamed: 0,Trip ID,Taxi ID,Trip Start Timestamp,Trip End Timestamp,Trip Seconds,Trip Miles,Pickup Census Tract,Dropoff Census Tract,Pickup Community Area,Dropoff Community Area,...,Extras,Trip Total,Payment Type,Company,Pickup Centroid Latitude,Pickup Centroid Longitude,Pickup Centroid Location,Dropoff Centroid Latitude,Dropoff Centroid Longitude,Dropoff Centroid Location
1296063,62d526e28631c24e3c63a4cac32bc5068433a9f6,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/11/2016 10:00:00 AM,,,0.0,17031840000.0,,32.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.880994,-87.632746,POINT (-87.6327464887 41.8809944707),,,
1296067,6f2c440efc684c18966fc8fa930ae2fa54b593da,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/12/2016 03:00:00 PM,,,0.0,17031080000.0,,8.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.893216,-87.637844,POINT (-87.6378442095 41.8932163595),,,
1296070,559d7e545c8794dd75480c7c8838c5d7a91f40dd,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/08/2016 06:00:00 PM,,,0.0,17031080000.0,,8.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.902788,-87.626146,POINT (-87.6261455896 41.9027880476),,,
1296076,5275a3e55cbb1c095c087a3cff5082dba462281c,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/13/2016 04:45:00 AM,,,0.0,17031060000.0,,6.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.94914,-87.656804,POINT (-87.6568039088 41.9491397709),,,
1296082,4f3ee987e0f66e3fe7d257a4146a0123df773908,2a0dffcdb837d73460febb3e3882633bba0563b966825a...,11/12/2016 01:00:00 AM,,,0.0,17031830000.0,,28.0,,...,0.0,0.0,Unknown,Chicago Medallion Management,41.885281,-87.657233,POINT (-87.6572331997 41.8852813201),,,


In [13]:
taxiSelectedData.shape

(321107, 23)
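
 The selected taxi rows shown above include trips with a missing end timestamp and one trip whose end timestamp precedes its start. A hedged sketch of counting such rows before further analysis follows. `flag_suspect_trips` is a hypothetical helper, the timestamp format is assumed from the outputs, and the sample rows are made up to resemble them.

```python
import pandas as pd

def flag_suspect_trips(df):
    """Return two boolean Series: missing end time, and end earlier than start."""
    fmt = '%m/%d/%Y %I:%M:%S %p'  # assumed timestamp format
    start = pd.to_datetime(df['Trip Start Timestamp'], format=fmt)
    end = pd.to_datetime(df['Trip End Timestamp'], format=fmt)
    missing_end = end.isna()
    # Comparisons involving NaT evaluate to False, so missing ends are not double counted
    end_before_start = end < start
    return missing_end, end_before_start

# Made-up rows resembling the notebook output (not project data)
sample = pd.DataFrame({
    'Trip Start Timestamp': [
        '11/08/2016 03:15:00 PM',
        '11/07/2016 12:00:00 AM',
        '11/12/2016 01:00:00 AM',
    ],
    'Trip End Timestamp': [
        '10/11/2016 06:00:00 AM',
        '11/07/2016 12:00:00 AM',
        None,
    ],
})
missing_end, end_before_start = flag_suspect_trips(sample)
print(int(missing_end.sum()), int(end_before_start.sum()))
```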

 **Exporting selected Taxi data as a `.csv` file**

In [15]:
taxiSelectedData.to_csv(outputDataPath + taxiOutputFileName)