## COVID's Impact on CitiBike Ridership in NYC

We have the <a href="https://drive.google.com/file/d/1LM7b3bQa-toyiXxLnVvK5MIACIRNoPOl/view?usp=share_link">dataset for all CitiBike ridership</a> during March 2020. 

How might we determine COVID's impact on ridership at the onset of the pandemic?




In [1]:
## import libraries and packages
import pandas as pd

In [2]:
## import data
df = pd.read_csv("citibike-tripdata-march-2020.csv")
df

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
0,1589,2020-03-01 00:00:03.6400,2020-03-01 00:26:32.9860,224,Spruce St & Nassau St,40.711464,-74.005524,3574,Prospect Pl & Underhill Ave,40.676969,-73.965790,16214,Subscriber,1980,1
1,389,2020-03-01 00:00:16.7560,2020-03-01 00:06:46.0620,293,Lafayette St & E 8 St,40.730207,-73.991026,223,W 13 St & 7 Ave,40.737815,-73.999947,29994,Subscriber,1991,2
2,614,2020-03-01 00:00:20.0580,2020-03-01 00:10:34.2200,379,W 31 St & 7 Ave,40.749156,-73.991600,515,W 43 St & 10 Ave,40.760094,-73.994618,39853,Subscriber,1991,1
3,597,2020-03-01 00:00:24.3510,2020-03-01 00:10:22.3390,3739,Perry St & Greenwich Ave,40.735918,-74.000939,325,E 19 St & 3 Ave,40.736245,-73.984738,42608,Subscriber,1989,1
4,1920,2020-03-01 00:00:26.1120,2020-03-01 00:32:26.2680,236,St Marks Pl & 2 Ave,40.728419,-73.987140,3124,46 Ave & 5 St,40.747310,-73.954510,36288,Subscriber,1993,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1068452,137,2020-03-31 23:56:06.0490,2020-03-31 23:58:23.3880,422,W 59 St & 10 Ave,40.770513,-73.988038,3356,Amsterdam Ave & W 66 St,40.774667,-73.984706,18851,Subscriber,1989,2
1068453,1548,2020-03-31 23:57:27.6850,2020-04-01 00:23:16.4110,523,W 38 St & 8 Ave,40.754666,-73.991382,442,W 27 St & 7 Ave,40.746647,-73.993915,36539,Subscriber,1993,1
1068454,308,2020-03-31 23:58:00.2690,2020-04-01 00:03:08.9500,528,2 Ave & E 31 St,40.742909,-73.977061,487,E 20 St & FDR Drive,40.733143,-73.975739,43023,Subscriber,1982,1
1068455,872,2020-03-31 23:58:42.9010,2020-04-01 00:13:15.5860,3043,Lewis Ave & Decatur St,40.681460,-73.934903,3755,DeKalb Ave & Franklin Ave,40.690648,-73.957462,43073,Customer,1990,1


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1068457 entries, 0 to 1068456
Data columns (total 15 columns):
 #   Column                   Non-Null Count    Dtype  
---  ------                   --------------    -----  
 0   tripduration             1068457 non-null  int64  
 1   starttime                1068457 non-null  object 
 2   stoptime                 1068457 non-null  object 
 3   start station id         1068457 non-null  int64  
 4   start station name       1068457 non-null  object 
 5   start station latitude   1068457 non-null  float64
 6   start station longitude  1068457 non-null  float64
 7   end station id           1068457 non-null  int64  
 8   end station name         1068457 non-null  object 
 9   end station latitude     1068457 non-null  float64
 10  end station longitude    1068457 non-null  float64
 11  bikeid                   1068457 non-null  int64  
 12  usertype                 1068457 non-null  object 
 13  birth year               1068457 non-null 
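
The `info()` output lists `starttime` and `stoptime` as `object`, which means the timestamps are stored as text. Date filters on text compare strings character by character; that happens to sort correctly for this year-month-day format, but parsing the column makes the comparison truly chronological. The sketch below uses a four-row toy frame with timestamps copied from the table above; the names `trips`, `cutoff`, `first_half`, and `second_half` are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy rows with timestamps copied from the table above;
# the real frame has about 1.07 million rows.
trips = pd.DataFrame({
    "starttime": [
        "2020-03-01 00:00:03.6400",
        "2020-03-15 23:59:52.7560",
        "2020-03-16 00:00:41.9630",
        "2020-03-31 23:58:42.9010",
    ],
    "start station id": [224, 470, 479, 3043],
})

# Parse the text timestamps into a datetime column.
trips["starttime"] = pd.to_datetime(trips["starttime"])

# Compare against a real timestamp instead of a string.
cutoff = pd.Timestamp("2020-03-16")
first_half = trips[trips["starttime"] < cutoff]
second_half = trips[trips["starttime"] >= cutoff]

print(trips["starttime"].dtype)
print(len(first_half), len(second_half))
```

Using `<` for one half and `>=` for the other guarantees that every trip lands in exactly one of the two frames.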

## Analysis strategy:

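1. Split March into two DataFrames: first half (March 1-15) and second half (March 16-31).
2. Count the trips that started at each station in each half, grouped by start station id.
3. Compute the percent change in trips per station between the two halves.
4. Compute the absolute change (second-half count minus first-half count) per station.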

In [4]:
## df for the first half of March (trips that started before March 16)
df1 = df.query('starttime < "2020-03-16"')
df1

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
0,1589,2020-03-01 00:00:03.6400,2020-03-01 00:26:32.9860,224,Spruce St & Nassau St,40.711464,-74.005524,3574,Prospect Pl & Underhill Ave,40.676969,-73.965790,16214,Subscriber,1980,1
1,389,2020-03-01 00:00:16.7560,2020-03-01 00:06:46.0620,293,Lafayette St & E 8 St,40.730207,-73.991026,223,W 13 St & 7 Ave,40.737815,-73.999947,29994,Subscriber,1991,2
2,614,2020-03-01 00:00:20.0580,2020-03-01 00:10:34.2200,379,W 31 St & 7 Ave,40.749156,-73.991600,515,W 43 St & 10 Ave,40.760094,-73.994618,39853,Subscriber,1991,1
3,597,2020-03-01 00:00:24.3510,2020-03-01 00:10:22.3390,3739,Perry St & Greenwich Ave,40.735918,-74.000939,325,E 19 St & 3 Ave,40.736245,-73.984738,42608,Subscriber,1989,1
4,1920,2020-03-01 00:00:26.1120,2020-03-01 00:32:26.2680,236,St Marks Pl & 2 Ave,40.728419,-73.987140,3124,46 Ave & 5 St,40.747310,-73.954510,36288,Subscriber,1993,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
746390,1651,2020-03-15 23:59:39.7010,2020-03-16 00:27:11.6730,402,Broadway & E 22 St,40.740343,-73.989551,3080,S 4 St & Rodney St,40.709340,-73.956080,41933,Customer,1990,1
746391,214,2020-03-15 23:59:51.2280,2020-03-16 00:03:25.5020,3538,W 110 St & Amsterdam Ave,40.802692,-73.962950,3539,W 116 St & Amsterdam Ave,40.806758,-73.960708,32977,Subscriber,1998,1
746392,315,2020-03-15 23:59:51.7010,2020-03-16 00:05:07.2340,450,W 49 St & 8 Ave,40.762272,-73.987882,523,W 38 St & 8 Ave,40.754666,-73.991382,41286,Subscriber,1980,1
746393,516,2020-03-15 23:59:52.7560,2020-03-16 00:08:28.8870,470,W 20 St & 8 Ave,40.743453,-74.000040,474,5 Ave & E 29 St,40.745168,-73.986831,37039,Subscriber,1951,1


In [5]:
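## df for the second half of March (trips that started on or after March 16)
## >= keeps a trip that starts exactly at midnight on March 16
df2 = df.query("starttime >= '2020-03-16'")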
df2

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
746395,366,2020-03-16 00:00:41.9630,2020-03-16 00:06:48.2820,479,9 Ave & W 45 St,40.760193,-73.991255,513,W 56 St & 10 Ave,40.768254,-73.988639,33359,Subscriber,1991,1
746396,235,2020-03-16 00:00:58.3680,2020-03-16 00:04:53.4700,3724,7 Ave & Central Park South,40.766741,-73.979069,3814,E 56 St & Madison Ave,40.761573,-73.972628,27574,Subscriber,1990,1
746397,1132,2020-03-16 00:01:08.1960,2020-03-16 00:20:00.6150,3795,10 St & 2 Ave,40.671907,-73.993612,307,Canal St & Rutgers St,40.714275,-73.989900,36440,Subscriber,1990,0
746398,176,2020-03-16 00:01:25.0740,2020-03-16 00:04:22.0380,482,W 15 St & 7 Ave,40.739355,-73.999318,497,E 17 St & Broadway,40.737050,-73.990093,36288,Subscriber,1991,1
746399,263,2020-03-16 00:01:48.9160,2020-03-16 00:06:11.9760,445,E 10 St & Avenue A,40.727408,-73.981420,2003,1 Ave & E 18 St,40.733812,-73.980544,38554,Subscriber,1960,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1068452,137,2020-03-31 23:56:06.0490,2020-03-31 23:58:23.3880,422,W 59 St & 10 Ave,40.770513,-73.988038,3356,Amsterdam Ave & W 66 St,40.774667,-73.984706,18851,Subscriber,1989,2
1068453,1548,2020-03-31 23:57:27.6850,2020-04-01 00:23:16.4110,523,W 38 St & 8 Ave,40.754666,-73.991382,442,W 27 St & 7 Ave,40.746647,-73.993915,36539,Subscriber,1993,1
1068454,308,2020-03-31 23:58:00.2690,2020-04-01 00:03:08.9500,528,2 Ave & E 31 St,40.742909,-73.977061,487,E 20 St & FDR Drive,40.733143,-73.975739,43023,Subscriber,1982,1
1068455,872,2020-03-31 23:58:42.9010,2020-04-01 00:13:15.5860,3043,Lewis Ave & Decatur St,40.681460,-73.934903,3755,DeKalb Ave & Franklin Ave,40.690648,-73.957462,43073,Customer,1990,1


In [6]:
## group by start station and count the trips that began at each one
# first half of March
df1_count = df1.groupby(["start station id", "start station name" ])\
["start station id"].count().reset_index(name = "count1")

df1_count

Unnamed: 0,start station id,start station name,count1
0,72,W 52 St & 11 Ave,2078
1,79,Franklin St & W Broadway,860
2,82,St James Pl & Pearl St,477
3,83,Atlantic Ave & Fort Greene Pl,715
4,116,W 17 St & 8 Ave,2545
...,...,...,...
883,3913,Sands St Gate,309
884,3914,West End Ave & W 78 St,455
885,3916,Pearl St & Peck Slip,996
886,3917,Willoughby St & Ashland Pl,217


In [7]:
## group by start station and count the trips that began at each one
# second half of March
df2_count = df2.groupby(["start station id", "start station name" ])\
["start station id"].count().reset_index(name = "count2")
df2_count 

Unnamed: 0,start station id,start station name,count2
0,72,W 52 St & 11 Ave,942
1,79,Franklin St & W Broadway,225
2,82,St James Pl & Pearl St,309
3,83,Atlantic Ave & Fort Greene Pl,321
4,116,W 17 St & 8 Ave,792
...,...,...,...
881,3914,West End Ave & W 78 St,256
882,3916,Pearl St & Peck Slip,453
883,3917,Willoughby St & Ashland Pl,78
884,3918,Avenue D & E 8 St,583


In [8]:
## confirm that station 72 shows up 2078 times in df1
df1.query("`start station id` == 72")

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
1029,347,2020-03-01 06:03:34.7570,2020-03-01 06:09:22.3400,72,W 52 St & 11 Ave,40.767272,-73.993929,500,Broadway & W 51 St,40.762288,-73.983362,36412,Subscriber,1980,1
1142,1032,2020-03-01 06:52:35.1240,2020-03-01 07:09:47.6050,72,W 52 St & 11 Ave,40.767272,-73.993929,3345,Madison Ave & E 99 St,40.789485,-73.952429,37630,Subscriber,1995,1
1144,375,2020-03-01 06:53:22.3110,2020-03-01 06:59:37.4490,72,W 52 St & 11 Ave,40.767272,-73.993929,458,11 Ave & W 27 St,40.751396,-74.005226,36068,Subscriber,1983,1
2498,277,2020-03-01 09:10:18.9770,2020-03-01 09:14:56.1020,72,W 52 St & 11 Ave,40.767272,-73.993929,448,W 37 St & 10 Ave,40.756604,-73.997901,36324,Subscriber,1997,1
2945,644,2020-03-01 09:32:04.6390,2020-03-01 09:42:48.7880,72,W 52 St & 11 Ave,40.767272,-73.993929,459,W 20 St & 11 Ave,40.746745,-74.007756,30643,Subscriber,1988,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
742682,457,2020-03-15 19:18:18.6460,2020-03-15 19:25:55.7540,72,W 52 St & 11 Ave,40.767272,-73.993929,465,Broadway & W 41 St,40.755136,-73.986580,38669,Subscriber,1997,1
743398,1310,2020-03-15 19:43:51.1040,2020-03-15 20:05:41.3150,72,W 52 St & 11 Ave,40.767272,-73.993929,388,W 26 St & 10 Ave,40.749718,-74.002950,27654,Customer,1969,0
743407,1272,2020-03-15 19:44:05.7530,2020-03-15 20:05:18.1500,72,W 52 St & 11 Ave,40.767272,-73.993929,388,W 26 St & 10 Ave,40.749718,-74.002950,38260,Customer,1969,0
743411,1878,2020-03-15 19:44:20.0680,2020-03-15 20:15:38.9650,72,W 52 St & 11 Ave,40.767272,-73.993929,415,Pearl St & Hanover Square,40.704718,-74.009260,18925,Customer,1969,0


In [9]:
## confirm that station 116 shows up 792 times in df2
df2.query("`start station id` == 116").shape

(792, 15)
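
The two spot checks above confirm single stations by hand. A complementary check, sketched here on made-up toy rows, is to assert that the per-station counts add up to the number of trips and that no station id appears on more than one row. The frame `trips` and its rows are hypothetical; only the station names come from the tables above.

```python
import pandas as pd

# Toy trips: station names come from the tables above, the rows are made up.
trips = pd.DataFrame({
    "start station id": [72, 72, 116, 116, 116],
    "start station name": ["W 52 St & 11 Ave"] * 2 + ["W 17 St & 8 Ave"] * 3,
})

# Same grouping as the notebook, using size() to count rows per group.
counts = (
    trips.groupby(["start station id", "start station name"])
    .size()
    .reset_index(name="count1")
)

# Every trip should be counted exactly once.
assert counts["count1"].sum() == len(trips)

# Each station id should appear on one row only;
# a station recorded under two names would break this.
assert counts["start station id"].is_unique

print(counts)
```

If the second assertion ever failed on the real data, a later merge on `start station id` alone would produce duplicated rows for that station.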

In [10]:
## small display trick: lay out cell outputs in a row
## so we can see dataframes side by side
## (unfortunately this does not appear to work in Colab)
from IPython.display import display, HTML

css = """
.output {
    flex-direction: row;
}
"""

HTML('<style>{}</style>'.format(css))
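
Where the CSS trick is not available, one possible fallback is to place the two tables next to each other with `pd.concat(..., axis=1)`. This is only a viewing aid sketched on toy frames (`first` and `second` are illustrative names, with counts copied from the tables in this notebook): rows are lined up by index position, not by station, so it should not be used for the comparison itself.

```python
import pandas as pd

# Small stand-ins for df1_count and df2_count; counts copied from the notebook tables.
first = pd.DataFrame({"start station id": [72, 79, 82], "count1": [2078, 860, 477]})
second = pd.DataFrame({"start station id": [72, 79], "count2": [942, 225]})

# axis=1 puts the frames next to each other; keys labels each block of columns.
side_by_side = pd.concat([first, second], axis=1, keys=["first half", "second half"])

print(side_by_side)
```

Because the second toy frame is shorter, its missing row shows up as `NaN`, which is a reminder that the two tables are not aligned station by station.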

In [11]:
display(df1_count)
display(df2_count)


Unnamed: 0,start station id,start station name,count1
0,72,W 52 St & 11 Ave,2078
1,79,Franklin St & W Broadway,860
2,82,St James Pl & Pearl St,477
3,83,Atlantic Ave & Fort Greene Pl,715
4,116,W 17 St & 8 Ave,2545
...,...,...,...
883,3913,Sands St Gate,309
884,3914,West End Ave & W 78 St,455
885,3916,Pearl St & Peck Slip,996
886,3917,Willoughby St & Ashland Pl,217


Unnamed: 0,start station id,start station name,count2
0,72,W 52 St & 11 Ave,942
1,79,Franklin St & W Broadway,225
2,82,St James Pl & Pearl St,309
3,83,Atlantic Ave & Fort Greene Pl,321
4,116,W 17 St & 8 Ave,792
...,...,...,...
881,3914,West End Ave & W 78 St,256
882,3916,Pearl St & Peck Slip,453
883,3917,Willoughby St & Ashland Pl,78
884,3918,Avenue D & E 8 St,583


In [12]:
march_total = pd.merge(df1_count, df2_count, on = "start station id")
march_total

Unnamed: 0,start station id,start station name_x,count1,start station name_y,count2
0,72,W 52 St & 11 Ave,2078,W 52 St & 11 Ave,942
1,79,Franklin St & W Broadway,860,Franklin St & W Broadway,225
2,82,St James Pl & Pearl St,477,St James Pl & Pearl St,309
3,83,Atlantic Ave & Fort Greene Pl,715,Atlantic Ave & Fort Greene Pl,321
4,116,W 17 St & 8 Ave,2545,W 17 St & 8 Ave,792
...,...,...,...,...,...
880,3913,Sands St Gate,309,Sands St Gate,247
881,3914,West End Ave & W 78 St,455,West End Ave & W 78 St,256
882,3916,Pearl St & Peck Slip,996,Pearl St & Peck Slip,453
883,3917,Willoughby St & Ashland Pl,217,Willoughby St & Ashland Pl,78
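
`pd.merge` performs an inner join by default, so a station that has trips in only one half of the month is silently dropped from `march_total`. The row labels in the tables above (up to 886 and 884 in the two count tables, 883 after merging) suggest that a few stations fall into that group. A minimal sketch of how to list them, using an outer join with `indicator=True`: station ids 9001 and 9002 and their counts are hypothetical, while the rows for stations 72 and 79 are copied from the tables.

```python
import pandas as pd

first = pd.DataFrame({
    "start station id": [72, 79, 9001],   # 9001 is a hypothetical station
    "count1": [2078, 860, 40],
})
second = pd.DataFrame({
    "start station id": [72, 79, 9002],   # 9002 is a hypothetical station
    "count2": [942, 225, 55],
})

# how="outer" keeps stations from either side;
# indicator=True adds a _merge column saying where each row came from.
both = pd.merge(first, second, on="start station id", how="outer", indicator=True)

# Stations that appear in only one half of the month.
only_one_half = both[both["_merge"] != "both"]

print(only_one_half)
```

Whether to keep such stations (for example by filling the missing count with 0) or to exclude them is an analysis decision; a percent change is undefined when the first-half count is 0.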


In [14]:
## drop extra column
march_total = march_total.drop(["start station name_y"], axis = "columns")
march_total

Unnamed: 0,start station id,start station name_x,count1,count2
0,72,W 52 St & 11 Ave,2078,942
1,79,Franklin St & W Broadway,860,225
2,82,St James Pl & Pearl St,477,309
3,83,Atlantic Ave & Fort Greene Pl,715,321
4,116,W 17 St & 8 Ave,2545,792
...,...,...,...,...
880,3913,Sands St Gate,309,247
881,3914,West End Ave & W 78 St,455,256
882,3916,Pearl St & Peck Slip,996,453
883,3917,Willoughby St & Ashland Pl,217,78


In [15]:
## calculate difference 

march_total["diff"] = march_total["count2"] - march_total["count1"]
march_total

Unnamed: 0,start station id,start station name_x,count1,count2,diff
0,72,W 52 St & 11 Ave,2078,942,-1136
1,79,Franklin St & W Broadway,860,225,-635
2,82,St James Pl & Pearl St,477,309,-168
3,83,Atlantic Ave & Fort Greene Pl,715,321,-394
4,116,W 17 St & 8 Ave,2545,792,-1753
...,...,...,...,...,...
880,3913,Sands St Gate,309,247,-62
881,3914,West End Ave & W 78 St,455,256,-199
882,3916,Pearl St & Peck Slip,996,453,-543
883,3917,Willoughby St & Ashland Pl,217,78,-139
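
One caveat on the raw difference: the two halves are not the same length. March 1-15 covers 15 days and March 16-31 covers 16, so the second half has one extra day in which to accumulate trips. A sketch of a per-day comparison follows, using three rows copied from the table above; the column names `daily1`, `daily2`, and `daily_diff` are illustrative.

```python
import pandas as pd

# Three rows copied from the march_total table above.
sample = pd.DataFrame({
    "start station id": [72, 79, 116],
    "count1": [2078, 860, 2545],
    "count2": [942, 225, 792],
})

DAYS_FIRST_HALF = 15    # March 1-15
DAYS_SECOND_HALF = 16   # March 16-31

# Average trips per day in each half.
sample["daily1"] = sample["count1"] / DAYS_FIRST_HALF
sample["daily2"] = sample["count2"] / DAYS_SECOND_HALF

# Change in average daily trips between the halves.
sample["daily_diff"] = sample["daily2"] - sample["daily1"]

print(sample.round(1))
```

For drops as large as the ones in this dataset the extra day barely changes the picture, but it matters for stations whose counts moved only slightly.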


In [18]:
## percent change
# ((new value - old value)  / old value) * 100 
march_total["pct_chg"] = \
((march_total["count2"] - march_total["count1"] )/ march_total["count1"]) * 100

march_total

Unnamed: 0,start station id,start station name_x,count1,count2,diff,pct_chg
0,72,W 52 St & 11 Ave,2078,942,-1136,-54.667950
1,79,Franklin St & W Broadway,860,225,-635,-73.837209
2,82,St James Pl & Pearl St,477,309,-168,-35.220126
3,83,Atlantic Ave & Fort Greene Pl,715,321,-394,-55.104895
4,116,W 17 St & 8 Ave,2545,792,-1753,-68.880157
...,...,...,...,...,...,...
880,3913,Sands St Gate,309,247,-62,-20.064725
881,3914,West End Ave & W 78 St,455,256,-199,-43.736264
882,3916,Pearl St & Peck Slip,996,453,-543,-54.518072
883,3917,Willoughby St & Ashland Pl,217,78,-139,-64.055300
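
As a worked check of the formula, the percent change can be recomputed by hand for single stations from the table. The helper `pct_change` below is a hypothetical name used only for this illustration.

```python
def pct_change(old, new):
    """Percent change from old to new: ((new - old) / old) * 100."""
    return (new - old) / old * 100


# Station 72 (W 52 St & 11 Ave): 2078 trips in the first half, 942 in the second.
station_72 = pct_change(2078, 942)

# Station 79 (Franklin St & W Broadway): 860 trips, then 225.
station_79 = pct_change(860, 225)

print(round(station_72, 6))
print(round(station_79, 6))
```

The printed values should agree with the `pct_chg` column for the same stations. Note that the function divides by the first-half count, so it is only defined for stations with at least one trip in the first half.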


In [21]:
## 20 stations with the biggest percent increase in ridership between
## the first and second half of March

march_total.sort_values(by ="pct_chg", ascending = False).head(20)


Unnamed: 0,start station id,start station name_x,count1,count2,diff,pct_chg
798,3825,Broadway & Furman Ave,13,51,38,292.307692
797,3824,Van Sinderen Ave & Truxton St,30,63,33,110.0
844,3871,Bushwick Ave & Furman Ave,14,28,14,100.0
840,3867,Somers St & Broadway,11,18,7,63.636364
837,3864,Central Ave & Covert St,35,57,22,62.857143
789,3816,Metropolitan Ave & Vandervoort Ave,19,30,11,57.894737
812,3839,Putnam Ave & Knickerbocker Ave,36,55,19,52.777778
424,3302,Columbus Ave & W 103 St,93,141,48,51.612903
803,3830,Halsey St & Evergreen Ave,27,40,13,48.148148
847,3874,Menahan St & Seneca Ave,20,29,9,45.0
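
Most of the stations at the top of this ranking had very few trips in the first half (13, 30, 14, 11, ...), so a handful of extra rides produces a very large percentage. One way to make the ranking more robust is to require a minimum first-half count before ranking. The sketch below uses four rows copied from the tables in this notebook; the threshold `MIN_BASELINE = 50` is an arbitrary assumption, not a value from the analysis.

```python
import pandas as pd

# Four rows copied from the march_total tables in this notebook.
sample = pd.DataFrame({
    "start station id": [3825, 3824, 3302, 3881],
    "count1": [13, 30, 93, 641],
    "count2": [51, 63, 141, 767],
})
sample["pct_chg"] = (sample["count2"] - sample["count1"]) / sample["count1"] * 100

# Hypothetical threshold: ignore stations with fewer first-half trips than this.
MIN_BASELINE = 50

busy = sample[sample["count1"] >= MIN_BASELINE]
ranked = busy.sort_values(by="pct_chg", ascending=False)

print(ranked)
```

With the threshold applied, the low-volume stations drop out and the ranking reflects stations where the increase involves a meaningful number of trips.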


In [22]:
## 20 stations with the biggest percent decrease in ridership between
## the first and second half of March

march_total.sort_values(by ="pct_chg", ascending = True).head(20)


Unnamed: 0,start station id,start station name_x,count1,count2,diff,pct_chg
777,3793,Stuyvesant Walk & 1 Av Loop,256,2,-254,-99.21875
238,517,Pershing Square South,2668,457,-2211,-82.871064
72,303,Mercer St & Spring St,1304,247,-1057,-81.058282
240,519,Pershing Square North,5577,1102,-4475,-80.240273
698,3664,North Moore St & Greenwich St,2756,558,-2198,-79.753266
323,3105,N 15 St & Wythe Ave,227,47,-180,-79.295154
715,3708,W 13 St & 5 Ave,1902,400,-1502,-78.969506
119,362,Broadway & W 37 St,989,209,-780,-78.867543
713,3704,47 Ave & Skillman Ave,94,20,-74,-78.723404
612,3546,Pacific St & Classon Ave,178,38,-140,-78.651685


In [23]:
## where was the biggest increase in ridership in absolute numbers
## between the first and second half?
## show the top 10

march_total.sort_values(by ="diff", ascending = False).head(10)


Unnamed: 0,start station id,start station name_x,count1,count2,diff,pct_chg
854,3881,12 Ave & W 125 St,641,767,126,19.656786
727,3726,Center Blvd & 51 Ave,316,405,89,28.164557
763,3775,Suydam St & Knickerbocker Ave,195,250,55,28.205128
424,3302,Columbus Ave & W 103 St,93,141,48,51.612903
597,3531,Frederick Douglass Blvd & W 129 St,299,343,44,14.715719
798,3825,Broadway & Furman Ave,13,51,38,292.307692
797,3824,Van Sinderen Ave & Truxton St,30,63,33,110.0
561,3493,E 118 St & 3 Ave,266,296,30,11.278195
282,3053,Marcy Ave & Lafayette Ave,156,183,27,17.307692
581,3514,Astoria Park S & Shore Blvd,283,307,24,8.480565


In [24]:
## where was the biggest decrease in ridership in absolute numbers
## between the first and second half?
## show the top 10

march_total.sort_values(by ="diff", ascending = True).head(10)


Unnamed: 0,start station id,start station name_x,count1,count2,diff,pct_chg
240,519,Pershing Square North,5577,1102,-4475,-80.240273
400,3255,8 Ave & W 31 St,4245,1042,-3203,-75.453475
204,477,W 41 St & 8 Ave,3849,960,-2889,-75.058457
170,435,W 21 St & 6 Ave,4433,1625,-2808,-63.343109
223,497,E 17 St & Broadway,4172,1413,-2759,-66.131352
149,402,Broadway & E 22 St,3970,1231,-2739,-68.992443
116,359,E 47 St & Park Ave,3396,743,-2653,-78.121319
230,505,6 Ave & W 33 St,3600,969,-2631,-73.083333
216,490,8 Ave & W 33 St,3552,989,-2563,-72.156532
130,379,W 31 St & 7 Ave,3045,791,-2254,-74.022989
