#### 2013 and 2018 parking inventory comparison
##### Author: Polina Butrina 

From Carol's email: "it could be worth doing some preliminary analysis at a lower level (tract maybe?) to see if there are significant geographic differences within any given study area. But I’m also OK with keeping the analysis at the study area summary level, especially given the big methodological shifts between the two datasets. "

", it would be helpful to run a simple comparison between the 2013 and 2018 results, identifying where the biggest differences emerge (per the Trend article). To the degree that we can, identify where these differences are attributable to real changes on the ground vs. data collection methodology—a conversation with Peter could be helpful on this front. A longer time trend analysis that includes additional historical data points might also be helpful for identifying methodology-based anomalies as well."


In [1]:
import pandas as pd
import numpy as np 
import os
import matplotlib.pyplot as plt
import seaborn as sns

In [5]:
#set working directory
path = "C:\\Users\\pbutrina\\Documents\\Python Scripts\\parking_inventory"
os.chdir(path)
os.getcwd()

'C:\\Users\\pbutrina\\Documents\\Python Scripts\\parking_inventory'

In [14]:
# load the 2013 and 2018 parking data from the Excel workbooks
# (read_excel has no `index` argument, so it is not passed here)
parking_cap_zones_2018 = pd.read_excel('parking_summaries_2018.xlsx', sheet_name='CapacityZones')
parking_occup_zones_2013 = pd.read_excel('copy_of_parking_summaries_13.xlsx', sheet_name='OccupancyZones')


In [52]:
data_2018 = parking_cap_zones_2018
data_2013 = parking_occup_zones_2013

In [119]:
data_2013.head()

Unnamed: 0,County,City,Zone,Tract,Block,Number of Lots,Total Stalls,Total AM Car Count,Total PM Car Count,Average Total Stalls,Average AM Car Count,Average PM Car Count,AM Occupancy Rate,PM Occupancy Rate,Average Daily Occupancy Rate,Tract_Parking
0,King,Bellevue,1.0,23803.0,1000.0,3.0,194.0,73,86,65,24,29,0.369231,0.446154,0.407692,8592.0
1,King,Bellevue,1.0,23803.0,1001.0,6.0,387.0,182,225,65,30,38,0.461538,0.584615,0.523077,8592.0
2,King,Bellevue,1.0,23803.0,1002.0,10.0,263.0,84,130,26,8,13,0.307692,0.5,0.403846,8592.0
3,King,Bellevue,1.0,23803.0,1003.0,3.0,225.0,129,129,75,43,43,0.573333,0.573333,0.573333,8592.0
4,King,Bellevue,1.0,23803.0,1004.0,6.0,2078.0,1019,1080,346,170,180,0.491329,0.520231,0.50578,8592.0


In [22]:
parking_occup_zones_2013.describe()

Unnamed: 0,Number of Lots,Total Stalls
count,981.0,981.0
mean,8.296636,631.181448
std,86.115565,6590.145362
min,1.0,0.0
25%,1.0,31.0
50%,2.0,79.0
75%,3.0,194.0
max,2443.0,180676.0


In [24]:
parking_cap_zones_2018.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 987 entries, 0 to 986
Data columns (total 8 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   County                    979 non-null    float64
 1   City                      986 non-null    object 
 2   Zone                      970 non-null    float64
 3   Tract                     970 non-null    float64
 4   Block                     970 non-null    float64
 5   Number of Lots            986 non-null    float64
 6   Total Stalls              986 non-null    float64
 7   Average Number of Stalls  986 non-null    float64
dtypes: float64(7), object(1)
memory usage: 61.8+ KB


In [62]:
data_2013.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 973 entries, 0 to 980
Data columns (total 15 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   County                        972 non-null    object 
 1   City                          972 non-null    object 
 2   Zone                          972 non-null    float64
 3   Tract                         972 non-null    float64
 4   Block                         972 non-null    float64
 5   Number of Lots                972 non-null    float64
 6   Total Stalls                  972 non-null    float64
 7   Total AM Car Count            972 non-null    object 
 8   Total PM Car Count            972 non-null    object 
 9   Average Total Stalls          972 non-null    object 
 10  Average AM Car Count          972 non-null    object 
 11  Average PM Car Count          972 non-null    object 
 12  AM Occupancy Rate             972 non-null    object 
 13  PM Oc

Check whether the same number of zones, tracts, and blocks was surveyed in 2013 and 2018. The counts show 43 tracts surveyed in 2013 and 42 in 2018, and 15 cities in 2013 versus 25 in 2018. See the detailed tables below.

In [91]:
#unique number of zones, tracts, and blocks surveyed in 2013 and 2018
data_2013.nunique()

County                            4
City                             15
Zone                             20
Tract                            43
Block                           254
Number of Lots                   19
Total Stalls                    371
Total AM Car Count              270
Total PM Car Count              271
Average Total Stalls            264
Average AM Car Count            190
Average PM Car Count            200
AM Occupancy Rate               511
PM Occupancy Rate               514
Average Daily Occupancy Rate    644
Tract_Parking                    43
dtype: int64

In [92]:
data_2018.nunique()

County                        4
City                         25
Zone                         20
Tract                        42
Block                       256
Number of Lots               27
Total Stalls                390
Average Number of Stalls    290
Tract_Parking                42
dtype: int64
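
The two `nunique()` outputs above are easier to compare when the shared key columns sit side by side. A minimal sketch, assuming small toy frames in place of `data_2013` and `data_2018` (the values and the helper name `compare_unique_counts` are illustrative, not from the original analysis):

```python
import pandas as pd

# Toy stand-ins for data_2013 and data_2018; values are illustrative only.
toy_2013 = pd.DataFrame({
    "City": ["Bellevue", "Bellevue", "Seattle"],
    "Tract": [23803, 23804, 5302],
    "Block": [1000, 1001, 2000],
})
toy_2018 = pd.DataFrame({
    "City": ["Bellevue", "Seattle", "Seattle", "Kent"],
    "Tract": [23803, 5302, 5302, 29203],
    "Block": [1000, 2000, 2001, 1005],
})


def compare_unique_counts(df_a, df_b, columns, labels=("2013", "2018")):
    """Return distinct-value counts per column for two frames, plus the change."""
    counts = pd.DataFrame({
        labels[0]: df_a[columns].nunique(),
        labels[1]: df_b[columns].nunique(),
    })
    counts["change"] = counts[labels[1]] - counts[labels[0]]
    return counts


summary = compare_unique_counts(toy_2013, toy_2018, ["City", "Tract", "Block"])
print(summary)
```

Restricting the comparison to columns that exist in both years avoids mixing in the occupancy columns, which only the 2013 sheet carries.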

Check how the number of parking stalls in each tract changed between 2013 and 2018.

In [59]:
# convert the key columns in data_2013 from strings to numbers so the 2013 and 2018 data can be compared
# rows that use "-" as a placeholder have to be dropped first
data_2013 = data_2013[data_2013.Zone != "-"].copy()  # .copy() avoids assigning into a view of the original frame
data_2013[["Zone", "Tract", "Block"]] = data_2013[["Zone", "Tract", "Block"]].apply(pd.to_numeric)
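
An alternative to filtering on `"-"` by hand is to let `pd.to_numeric` coerce any non-numeric placeholder to `NaN` and then drop rows with missing keys. The sketch below assumes a toy frame whose columns mimic the 2013 sheet; it also converts a car-count column, since `data_2013.info()` reports those columns as `object`. The data here is made up for illustration.

```python
import pandas as pd

# Toy frame mimicking the 2013 sheet, where "-" marks a missing value.
toy = pd.DataFrame({
    "Zone": ["1", "-", "4"],
    "Tract": ["23803", "-", "5302"],
    "Total AM Car Count": ["73", "-", "27"],
})

key_cols = ["Zone", "Tract"]
numeric_cols = key_cols + ["Total AM Car Count"]

cleaned = toy.copy()
# errors="coerce" turns anything that is not a number (such as "-") into NaN.
cleaned[numeric_cols] = cleaned[numeric_cols].apply(pd.to_numeric, errors="coerce")
# Keep only rows whose key columns were parsed successfully.
cleaned = cleaned.dropna(subset=key_cols)

print(cleaned.dtypes)
print(cleaned)
```

Coercion handles any placeholder string, not only `"-"`, at the cost of silently hiding unexpected values, so it is worth checking how many rows were dropped.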

In [83]:
# sum the parking stalls by tract for 2013 and 2018
# (select the column before summing so non-numeric columns are never aggregated)
data_2013_total_parking = data_2013.groupby("Tract")[["Total Stalls"]].sum()
data_2018_total_parking = data_2018.groupby("Tract")[["Total Stalls"]].sum()

In [98]:
#renaming the new column
data_2013_total_parking = data_2013_total_parking.rename(columns={"Total Stalls": "Total Stalls 2013"})
data_2018_total_parking = data_2018_total_parking.rename(columns={"Total Stalls": "Total Stalls 2018"})

In [99]:
#creating a new df with 2013 and 2018 parking stalls grouped by the tract by merging two tables
parking_dif = pd.merge(data_2013_total_parking, data_2018_total_parking, how='outer', on='Tract')
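
Because the merge is an outer join, tracts surveyed in only one year end up with `NaN` in the other year's column. A minimal sketch of how `indicator=True` could be used to list those tracts explicitly; the tract IDs 11111 and 22222 are hypothetical, while the other values are taken from the results table in this notebook:

```python
import pandas as pd

# Tract-level totals; tracts 11111 and 22222 are made up to show unmatched rows.
stalls_2013 = pd.DataFrame({
    "Tract": [5302.0, 81100.0, 11111.0],
    "Total Stalls 2013": [2820.0, 1317.0, 500.0],
})
stalls_2018 = pd.DataFrame({
    "Tract": [5302.0, 81100.0, 22222.0],
    "Total Stalls 2018": [10703.0, 4512.0, 640.0],
})

# indicator=True adds a "_merge" column: "both", "left_only", or "right_only".
merged = stalls_2013.merge(stalls_2018, how="outer", on="Tract", indicator=True)

# Tracts that appear in only one survey year.
only_one_year = merged[merged["_merge"] != "both"]
print(only_one_year[["Tract", "_merge"]])
```

Listing these tracts before calling `dropna()` makes it clear which areas fall out of the comparison because of coverage differences rather than real change.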

In [116]:
#calculating absolute difference and % difference
parking_dif['abs_dif'] = parking_dif['Total Stalls 2018'] - parking_dif['Total Stalls 2013']
parking_dif['perc_dif'] = (parking_dif['Total Stalls 2018'] - parking_dif['Total Stalls 2013'])/parking_dif['Total Stalls 2013']*100
parking_dif = parking_dif.dropna()
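
As a sanity check on the formula, the percent difference is `(stalls_2018 - stalls_2013) / stalls_2013 * 100`. The sketch below applies it to tract 5302 using the totals reported in this notebook (2,820 stalls in 2013 and 10,703 in 2018). The helper name `percent_change` and the choice to return `NaN` for a zero baseline are assumptions for illustration.

```python
import math


def percent_change(old, new):
    """Percent change from old to new; NaN when the baseline is zero."""
    if old == 0:
        return math.nan
    return (new - old) / old * 100


# Tract 5302: 2,820 stalls in 2013 and 10,703 stalls in 2018.
abs_dif = 10703.0 - 2820.0
perc_dif = percent_change(2820.0, 10703.0)

print(abs_dif)             # 7883.0
print(round(perc_dif, 6))  # 279.539007
```

In pandas, dividing by a zero baseline yields `inf` rather than `NaN`, so a tract with zero stalls in 2013 would survive `dropna()` and sort to the top of the table.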

In [117]:
parking_dif.sort_values(by = ['perc_dif','abs_dif'], ascending=False)

Tract,Total Stalls 2013,Total Stalls 2018,abs_dif,perc_dif
5302.0,2820.0,10703.0,7883.0,279.539007
81100.0,1317.0,4512.0,3195.0,242.596811
90900.0,968.0,1985.0,1017.0,105.061983
90102.0,773.0,1472.0,699.0,90.426908
9300.0,4538.0,6050.0,1512.0,33.318643
7200.0,11127.0,14112.0,2985.0,26.826638
8200.0,9039.0,10877.0,1838.0,20.334108
23700.0,2145.0,2357.0,212.0,9.88345
23804.0,29551.0,32342.0,2791.0,9.444689
23803.0,8592.0,9265.0,673.0,7.832868


In [118]:
(parking_dif['Total Stalls 2018'].sum()-parking_dif['Total Stalls 2013'].sum())/parking_dif['Total Stalls 2013'].sum()*100

12.789745179215833

In [124]:
data_2013[data_2013.Tract == 5302]

Unnamed: 0,County,City,Zone,Tract,Block,Number of Lots,Total Stalls,Total AM Car Count,Total PM Car Count,Average Total Stalls,Average AM Car Count,Average PM Car Count,AM Occupancy Rate,PM Occupancy Rate,Average Daily Occupancy Rate,Tract_Parking
514,King,University District,4.0,5302.0,1004.0,1.0,38.0,27,33,38,27,33,0.710526,0.868421,0.789474,2820.0
515,King,University District,4.0,5302.0,2000.0,2.0,100.0,60,66,50,30,33,0.6,0.66,0.63,2820.0
516,King,University District,4.0,5302.0,2001.0,2.0,112.0,73,74,56,37,37,0.660714,0.660714,0.660714,2820.0
517,King,University District,4.0,5302.0,2002.0,1.0,91.0,51,60,91,51,60,0.56044,0.659341,0.60989,2820.0
518,King,University District,4.0,5302.0,2009.0,3.0,999.0,696,745,333,232,248,0.696697,0.744745,0.720721,2820.0
519,King,University District,4.0,5302.0,2010.0,4.0,51.0,35,35,13,9,9,0.692308,0.692308,0.692308,2820.0
520,King,University District,4.0,5302.0,2011.0,2.0,144.0,91,90,72,46,45,0.638889,0.625,0.631944,2820.0
521,King,University District,4.0,5302.0,2014.0,1.0,13.0,9,9,13,9,9,0.692308,0.692308,0.692308,2820.0
522,King,University District,4.0,5302.0,2016.0,3.0,57.0,22,41,19,7,14,0.368421,0.736842,0.552632,2820.0
523,King,University District,4.0,5302.0,2018.0,5.0,1195.0,1083,1086,239,217,217,0.90795,0.90795,0.90795,2820.0
