Temporal Analysis of Central Bike Flows

Harsha Ramachandran edited this page Jul 25, 2022 · 11 revisions

From the Analyzing Spatial Demand using Mapping section, we observed an excess of supply in central London when we averaged the data over a week.

We want to understand how the flow of bikes changes over the course of the day. Analyzing only the overall number of trips starting and ending at different times is not enough, because it ignores where those trips take place. Instead, we can create a polygon that outlines the excess supply area and analyze the trips starting and ending inside it at different hours.

Creating Polygon

We use the Polygon class from the shapely library to create a geometric area representing central London. The function returns two objects: polygon, a GeoDataFrame used later for plotting on the map, and polygon_geom, the underlying shapely geometry that holds the spatial coordinate data.

import geopandas as gpd
from shapely.geometry import Polygon

def createPoly():
    # Vertices of the central London supply area, as parallel longitude/latitude lists
    lonList = [-0.18,-0.154,-0.148,-0.14,-0.13,-0.109,-0.105,-0.1,-0.09,-0.074,-0.074,-0.078,-0.117,-0.127,-0.14,-0.148,-0.162,-0.174,-0.18]
    latList = [51.5,51.518,51.52,51.521,51.523,51.522,51.524,51.523,51.527,51.524,51.509,51.5,51.497,51.493,51.493,51.492,51.489,51.494,51.495]
    polygon_geom = Polygon(zip(lonList, latList))
    # WGS 84 coordinates; the older {'init': 'epsg:4326'} dict form is deprecated
    polygon = gpd.GeoDataFrame(index=[0], crs="EPSG:4326", geometry=[polygon_geom])
    return polygon, polygon_geom
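
As a quick sanity check, the geometry can be tested against individual coordinates with shapely's contains method. The sketch below rebuilds the same polygon from the vertex lists above. The two test points (one near Trafalgar Square, one near Heathrow Airport) and the variable names are illustrative choices, not part of the original analysis.

```python
from shapely.geometry import Point, Polygon

lon_list = [-0.18, -0.154, -0.148, -0.14, -0.13, -0.109, -0.105, -0.1, -0.09, -0.074,
            -0.074, -0.078, -0.117, -0.127, -0.14, -0.148, -0.162, -0.174, -0.18]
lat_list = [51.5, 51.518, 51.52, 51.521, 51.523, 51.522, 51.524, 51.523, 51.527, 51.524,
            51.509, 51.5, 51.497, 51.493, 51.493, 51.492, 51.489, 51.494, 51.495]
central_polygon = Polygon(list(zip(lon_list, lat_list)))

# shapely uses (x, y) order, i.e. (longitude, latitude)
trafalgar_square = Point(-0.128, 51.508)  # illustrative point in central London
heathrow = Point(-0.454, 51.470)          # illustrative point far to the west

inside_central = central_polygon.contains(trafalgar_square)
inside_far = central_polygon.contains(heathrow)
print(inside_central, inside_far)
```

Passing longitude first matters here: swapping the order would place every point far outside the polygon without raising an error.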

Plotting Polygon on Map

We can plot the polygon that captures the central supply area on the map we created in the previous section. First, let's look at how the polygon captures the weekly supply.

polygon-supply

As can be seen, the polygon captures most of the supply within central London. Next, let's look at the same polygon with respect to demand.

polygon-demand

We see that most of the demand occurs outside of the polygon at the end of the week. We now have a geometric shape that captures the majority of the supply in central London. We can analyze bike flows into this shape over 24 hours to gain useful insights into the flow of bikes at different times.

Temporal Analysis of Bike Flows in Polygon

Data Manipulation Logic

We want to understand the following:

* How many bikes flow in and out of the polygon at each hour
* How does the demand change over time in the polygon

To do this, we start by gathering the station data, trip data and polygon data. We also convert the station data to a GeoPandas GeoDataFrame.

trips,bikeStations = getAndProcessData.getAndProcessData()
bikeStations = gpd.GeoDataFrame(bikeStations,crs="EPSG:4326",geometry=gpd.points_from_xy(bikeStations.lon, bikeStations.lat) )
polygon,pg = createCentralSupplyPolygon.createPoly()

Next, we use logic similar to that used for processing the daily/weekly trips for mapping. We iterate over the 24 hours of the day and apply a time mask at each step.

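for i in range(24):
    # Select the trips that end / start during hour i (from i:00 up to, but not including, i+1:00)
    timeMaskEnd = (trips['End Date'].dt.hour >= i) & (trips['End Date'].dt.hour < i + 1)
    timeMaskStart = (trips['Start Date'].dt.hour >= i) & (trips['Start Date'].dt.hour < i + 1)
    # Number of trips ending at each station during this hour
    endFrame = trips[timeMaskEnd]
    endFrameCounts = endFrame['EndStation Id'].value_counts()
    # Number of trips starting at each station during this hour
    startFrame = trips[timeMaskStart]
    startFrameCounts = startFrame['StartStation Id'].value_counts()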
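
To make the masking step concrete, here is a small sketch on a made-up trips table that reuses the column names from the text. It shows that the two-sided comparison is equivalent to testing the hour for equality, and it also sketches an alternative that counts all hours in a single pass with groupby. The data values and the variable names start_counts and counts_by_hour are assumptions for illustration only.

```python
import pandas as pd

# Toy trips table with the same column names as the text; the values are made up
trips = pd.DataFrame({
    "Start Date": pd.to_datetime([
        "2022-07-01 08:05", "2022-07-01 08:40",
        "2022-07-01 09:10", "2022-07-01 17:30",
    ]),
    "StartStation Id": [1, 1, 2, 3],
})

hour = 8
# Equivalent to (hour >= 8) & (hour < 9), because dt.hour is an integer
mask = trips["Start Date"].dt.hour == hour
start_counts = trips.loc[mask, "StartStation Id"].value_counts()

# Alternative: count every (hour, station) pair in one pass instead of looping
counts_by_hour = trips.groupby(
    [trips["Start Date"].dt.hour, "StartStation Id"]
).size()

print(start_counts)
print(counts_by_hour)
```

The groupby form avoids re-scanning the full table 24 times, which may matter for a large trips dataset. The explicit loop is easier to extend with per-hour logic such as the polygon filter.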

After this, we apply some additional logic to identify which of these trips start and end at stations inside the polygon.

    # trips starting in polygon (bikeStations must be indexed by station id for .loc to work)
    # rename_axis keeps the station id column named 'index' regardless of pandas version
    startFrameCountsInPoly = startFrameCounts.copy().rename_axis("index").to_frame(name="count").reset_index()
    m = startFrameCountsInPoly.apply(lambda x: pg.contains(bikeStations.loc[x['index']].geometry), axis=1)
    startFrameCountsInPoly = startFrameCountsInPoly[m]

    # trips ending in polygon
    endFrameCountsInPoly = endFrameCounts.copy().rename_axis("index").to_frame(name="count").reset_index()
    m = endFrameCountsInPoly.apply(lambda x: pg.contains(bikeStations.loc[x['index']].geometry), axis=1)
    endFrameCountsInPoly = endFrameCountsInPoly[m]

What we are doing here is creating a copy of the series containing the station IDs and trip counts for all the trips starting and ending in that hour. We then create a mask that filters out stations not in the central supply polygon and apply it to the copy.
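
Because station locations do not change from hour to hour, the point-in-polygon test could also be run once before the loop and its result reused. The following is a minimal sketch of that alternative, not the approach used above. It assumes a station table indexed by station ID with lon and lat columns. The rectangle, coordinates, counts and the name inside_ids are made-up stand-ins for the real data.

```python
import pandas as pd
from shapely.geometry import Point, Polygon

# Stand-in rectangle roughly covering central London (not the real polygon)
pg = Polygon([(-0.18, 51.49), (-0.07, 51.49), (-0.07, 51.525), (-0.18, 51.525)])

# Hypothetical station table indexed by station id
stations = pd.DataFrame(
    {"lon": [-0.128, -0.10, -0.30], "lat": [51.508, 51.51, 51.50]},
    index=[1, 2, 3],
)

# Run the spatial test once per station, outside any hourly loop
inside_ids = {
    station_id
    for station_id, row in stations.iterrows()
    if pg.contains(Point(row["lon"], row["lat"]))
}

# Made-up per-station trip counts for one hour (station id -> trips)
counts = pd.Series({1: 5, 2: 3, 3: 7})

# Keep only stations inside the polygon, then total their trips
counts_in_poly = counts[counts.index.isin(inside_ids)]
total = int(counts_in_poly.sum())
print(sorted(inside_ids), total)
```

With the set computed up front, each hourly step reduces to a cheap isin lookup instead of repeating the geometric test for every station.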


Finally, we append the total number of trips starting and ending in the polygon to their respective lists (indexed by hour). We also track the cumulative net demand inside the polygon up to that hour, computed as trips started minus trips ended. At the end of the loop we have all the data specified above.

    tripsStartingInPoly.append(startFrameCountsInPoly['count'].sum())
    tripsEndingInPoly.append(endFrameCountsInPoly['count'].sum())
    # running net demand: trips started minus trips ended inside the polygon so far
    demandInPoly = demandInPoly + startFrameCountsInPoly['count'].sum() - endFrameCountsInPoly['count'].sum()
    demandInPolyAtTime.append(demandInPoly)
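
The running total only works if the lists and the counter exist before the loop starts. Presumably the original script initializes the three lists as empty and demandInPoly as 0. The arithmetic itself can be checked on made-up numbers, as in the sketch below. The hourly totals and the snake_case names are illustrative assumptions.

```python
from itertools import accumulate

# Made-up totals for four consecutive hours (not real data)
trips_starting_in_poly = [2, 5, 10, 4]
trips_ending_in_poly = [1, 12, 6, 4]

# Net demand per hour: trips started minus trips ended inside the polygon
net_demand = [
    started - ended
    for started, ended in zip(trips_starting_in_poly, trips_ending_in_poly)
]

# Running total, equivalent to updating demandInPoly inside the loop starting from 0
demand_at_time = list(accumulate(net_demand))

print(net_demand)      # [1, -7, 4, 0]
print(demand_at_time)  # [1, -6, -2, -2]
```

A negative running total means more trips have ended in the polygon than started there, so bikes have accumulated inside it.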

Graphs and Analysis

We plot the data obtained by the previous code using Seaborn and obtain the following.

polygon-supply

We see a clear pattern in the data. Between 5am-8am and 5pm-7pm there is a larger overall number of trips starting and ending in the polygon. This corresponds to the rush hour traffic before work starts and after it ends. During the morning rush hour there are more trips into the polygon than out of it, while during the evening rush hour there are more trips out of the polygon than into it. We can infer that the majority of these trips involve people entering central London in the morning and leaving it in the evening. We also plot the cumulative demand over time.

polygon-demand-cum

Notice that the polygon almost never has net demand for bikes, as shown by the negative demand value. We see a sharp increase in supply (supply is negative demand) during the morning rush hour and a decrease in supply during the evening rush hour, which corresponds to the trips out of the polygon seen in the previous graph. However, the evening decrease is not enough to offset the supply that arrives during the morning rush hour. This implies that more bikes come in during the morning rush hour than leave during the evening rush hour.