Temporal Analysis of Central Bike Flows
In the Analyzing Spatial Demand using Mapping section we observed an excess of supply in central London when we averaged the data over a week.
We want to understand how the flow of bikes changes at different times of day. Analyzing only the overall number of trips starting and ending at each time is not enough, because it ignores where those trips take place. Instead, we can create a polygon specifying the excess supply area and analyze the trips starting and ending inside it at different hours.
We use the Polygon class from the shapely library to create a geometric area representing central London. The function below returns two objects: polygon, a GeoDataFrame used for plotting on the map later, and polygon_geom, the underlying shapely geometry holding the spatial coordinates.
import geopandas as gpd
from shapely.geometry import Polygon

def createPoly():
    # Vertices (longitude, latitude) outlining the central London supply area
    lonList = [-0.18,-0.154,-0.148,-0.14,-0.13,-0.109,-0.105,-0.1,-0.09,-0.074,-0.074,-0.078,-0.117,-0.127,-0.14,-0.148,-0.162,-0.174,-0.18]
    latList = [51.5,51.518,51.52,51.521,51.523,51.522,51.524,51.523,51.527,51.524,51.509,51.5,51.497,51.493,51.493,51.492,51.489,51.494,51.495]
    polygon_geom = Polygon(zip(lonList, latList))
    # Wrap the geometry in a GeoDataFrame (WGS 84 coordinates) for plotting
    polygon = gpd.GeoDataFrame(index=[0], crs="EPSG:4326", geometry=[polygon_geom])
    return polygon, polygon_geom
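Later steps ask whether a station's point lies inside this polygon. As an illustration of what such a containment test does, here is a minimal, dependency-free sketch based on ray casting. It is not shapely's implementation, and the helper name point_in_polygon and the test coordinates are hypothetical; only the vertex lists come from the function above.

```python
# Vertex lists copied from createPoly() above.
LON_LIST = [-0.18, -0.154, -0.148, -0.14, -0.13, -0.109, -0.105, -0.1, -0.09,
            -0.074, -0.074, -0.078, -0.117, -0.127, -0.14, -0.148, -0.162,
            -0.174, -0.18]
LAT_LIST = [51.5, 51.518, 51.52, 51.521, 51.523, 51.522, 51.524, 51.523,
            51.527, 51.524, 51.509, 51.5, 51.497, 51.493, 51.493, 51.492,
            51.489, 51.494, 51.495]


def point_in_polygon(lon, lat, lons=LON_LIST, lats=LAT_LIST):
    """Hypothetical helper: ray-casting point-in-polygon test.

    Casts a ray from the point towards increasing longitude and counts how
    many polygon edges it crosses; an odd count means the point is inside.
    """
    inside = False
    j = len(lons) - 1  # index of the previous vertex (the polygon is closed)
    for i in range(len(lons)):
        xi, yi = lons[i], lats[i]
        xj, yj = lons[j], lats[j]
        # Does the edge (j -> i) straddle the latitude of the point?
        if (yi > lat) != (yj > lat):
            # Longitude at which the edge crosses that latitude
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


# Illustrative coordinates (assumptions, not taken from the station data)
central_point = (-0.12, 51.51)
eastern_point = (0.0, 51.5)
central_inside = point_in_polygon(*central_point)
eastern_inside = point_in_polygon(*eastern_point)
```

In the actual pipeline the same question is answered by the contains method of the shapely geometry, which also handles edge cases such as points lying exactly on the boundary.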
We can plot the polygon, which captures the central supply area, on the map we created in the previous section. First, let's take a look at how the polygon captures the weekly supply.
As can be seen, the polygon captures most of the supply within central London. Next, let's take a look at the same polygon with respect to demand.
We see that most of the demand occurs outside of the polygon at the end of the week. We now have a geometric shape that captures the majority of the supply in central London. We can analyze bike flows into this shape over 24 hours to gain useful insights into the flow of bikes at different times.
We want to understand the following:
* How many bikes flow in and out of the polygon at each hour
* How the demand changes over time in the polygon
To do this, we start by gathering the station data, trip data and polygon data. We also convert the station data to a GeoPandas GeoDataFrame.
trips,bikeStations = getAndProcessData.getAndProcessData()
bikeStations = gpd.GeoDataFrame(bikeStations,crs="EPSG:4326",geometry=gpd.points_from_xy(bikeStations.lon, bikeStations.lat) )
polygon,pg = createCentralSupplyPolygon.createPoly()
Next, we use logic similar to that used for processing the daily/weekly trips for mapping. We iterate over the 24 hours of the day and apply a time mask at each step.
for i in range(24):
    # Select the trips that end / start during hour i
    timeMaskEnd = (trips['End Date'].dt.hour >= i) & (trips['End Date'].dt.hour < i + 1)
    timeMaskStart = (trips['Start Date'].dt.hour >= i) & (trips['Start Date'].dt.hour < i + 1)
    # Count trips per end / start station for that hour
    endFrame = trips[timeMaskEnd]
    endFrameCounts = endFrame['EndStation Id'].value_counts()
    startFrame = trips[timeMaskStart]
    startFrameCounts = startFrame['StartStation Id'].value_counts()
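Because dt.hour is an integer, the range mask used in the loop selects exactly the rows whose hour equals i. The sketch below checks this on a small, made-up trips table (the rows are hypothetical; only the column names follow the code above) and shows one possible alternative that produces the per-hour, per-station counts in a single groupby instead of 24 masks.

```python
import pandas as pd

# Hypothetical toy data; column names follow the trips DataFrame above.
trips = pd.DataFrame({
    "Start Date": pd.to_datetime([
        "2020-01-06 07:05",
        "2020-01-06 07:40",
        "2020-01-06 08:15",
        "2020-01-07 07:30",
    ]),
    "StartStation Id": [1, 2, 1, 1],
})

hour = trips["Start Date"].dt.hour

# The range mask from the loop is equivalent to a simple equality test.
masks_match = all(
    ((hour >= i) & (hour < i + 1)).equals(hour == i) for i in range(24)
)

# Alternative: count trips per (hour, start station) in one pass.
starts_per_hour = trips.groupby([hour.rename("hour"), "StartStation Id"]).size()

# Counts for a single hour, comparable to startFrameCounts for i == 7
starts_at_7 = starts_per_hour.loc[7]
```

The groupby form avoids rescanning the whole table for every hour, which may matter for large trip files; the loop form stays closer to the step-by-step description in this section.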
After this, we apply some additional logic to identify which of these trips start and end at stations inside the polygon.
    # Trips starting in the polygon (still inside the hourly loop).
    # rename_axis("index") keeps the station-id column named 'index' across pandas versions.
    startFrameCountsInPoly = startFrameCounts.rename_axis("index").to_frame(name="count").reset_index()
    m = startFrameCountsInPoly.apply(lambda x: pg.contains(bikeStations.loc[x['index']].geometry), axis=1)
    startFrameCountsInPoly = startFrameCountsInPoly[m]
    # Trips ending in the polygon
    endFrameCountsInPoly = endFrameCounts.rename_axis("index").to_frame(name="count").reset_index()
    m = endFrameCountsInPoly.apply(lambda x: pg.contains(bikeStations.loc[x['index']].geometry), axis=1)
    endFrameCountsInPoly = endFrameCountsInPoly[m]
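The containment test above is repeated for every station at every hour, although a station's position does not change. One possible variation, sketched below with hypothetical values, is to work out the set of station IDs inside the polygon once and then filter each hour's counts by ID. In the real pipeline that set would have to come from a geometric test against the station geometries; here it is hard-coded purely for illustration.

```python
import pandas as pd

# Assumption: IDs of stations inside the polygon, computed once up front.
# These values are made up for the example.
stations_in_poly = [1, 3]

# Hypothetical per-station trip counts for one hour (index = station ID),
# playing the role of startFrameCounts in the loop above.
start_counts = pd.Series({1: 10, 2: 4, 3: 7})

# Keep only the stations whose ID is in the precomputed list.
counts_in_poly = start_counts[start_counts.index.isin(stations_in_poly)]
trips_starting_in_poly = int(counts_in_poly.sum())
```

This keeps the per-hour work to a cheap membership check on IDs while giving the same totals, provided the precomputed list matches the polygon test.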
What we are doing here is creating a copy of the series containing the station IDs and trip counts for all the trips starting and ending in that hour. We then create a mask that filters out stations not in the central supply polygon and apply it to the series.
Finally, we append the total number of trips starting and ending in the polygon to their respective lists (indexed by hour). We also record the cumulative demand inside the polygon at that hour, defined as the running total of trips starting in the polygon minus trips ending in it. At the end of the loop we have all the data specified above.
    # Totals for hour i (still inside the hourly loop)
    tripsStartingInPoly.append(startFrameCountsInPoly['count'].sum())
    tripsEndingInPoly.append(endFrameCountsInPoly['count'].sum())
    # Running net demand: trips out of the polygon minus trips into it
    demandInPoly = demandInPoly + startFrameCountsInPoly['count'].sum() - endFrameCountsInPoly['count'].sum()
    demandInPolyAtTime.append(demandInPoly)
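The running demand is simply a cumulative sum of the hourly differences between starts and ends. A minimal sketch with made-up hourly totals (the numbers are hypothetical, not results from the dataset) shows the arithmetic:

```python
from itertools import accumulate

# Hypothetical totals for four consecutive hours
trips_starting_in_poly = [2, 1, 5, 3]
trips_ending_in_poly = [1, 4, 6, 1]

# Net demand per hour: bikes leaving the polygon minus bikes arriving
net_demand = [s - e for s, e in zip(trips_starting_in_poly, trips_ending_in_poly)]

# Cumulative demand, equivalent to demandInPolyAtTime in the loop above
demand_at_time = list(accumulate(net_demand))
# net_demand     -> [1, -3, -1, 2]
# demand_at_time -> [1, -2, -3, -1]
```

A negative cumulative value means more bikes have arrived in the polygon than have left it so far, that is, a surplus of supply.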
We plot the data obtained by the previous code using Seaborn and obtain the following.
We see a clear pattern in the data. Between 5am-8am and 5pm-7pm there is a larger overall number of trips starting and ending in the polygon. This corresponds to the rush hour traffic before work starts and after it ends. We also note that there are more trips into the polygon than out of it during the morning rush hour, and more trips out of the polygon than into it during the evening rush hour. We can infer that the majority of these trips involve people entering central London in the morning and leaving it in the evening. We also plot the cumulative demand over time.
Notice that the cumulative demand is negative for almost the whole day, meaning the polygon almost never has a net demand for bikes. We see a sharp increase in supply (supply is negative demand) during the morning rush hour, and a decrease in supply during the evening rush hour, which corresponds to the trips out of the polygon seen in the previous graph. However, the decrease during the evening rush hour is not enough to offset the supply built up during the morning rush hour. This implies that more bikes come into the polygon during the morning rush hour than leave it during the evening rush hour.