## How to make a time lapse heatmap with Folium using NYC Bike Share Data

The following is an exercise in working with time series data from <a href="https://s3.amazonaws.com/tripdata/index.html" target="_blank">Citibike</a>.
<br>
I chose to work with one month; however, a web scraper could be built to pull new data as it is released each month.
<br>
The notebook takes _Feb 2020_ data and returns a time lapse heat map of each station's activity, aggregated by hour of day across the month. Activity is displayed on a color gradient in which warmer colors (toward red) indicate higher activity.
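
As a starting point for the scraper idea, the sketch below only builds the download URLs for a range of months; it does not fetch anything. The file name pattern (`<YYYYMM>-citibike-tripdata.csv.zip`) is an assumption based on the file name used later in this notebook, and the helper names `monthly_archive_url` and `month_range` are hypothetical. Check the bucket index before relying on it.

```python
from datetime import date

# Assumption: monthly archives sit directly under the public bucket and are
# named "<YYYYMM>-citibike-tripdata.csv.zip". Verify against the bucket index.
BASE_URL = "https://s3.amazonaws.com/tripdata"


def monthly_archive_url(year, month):
    """Return the assumed download URL for one month of trip data."""
    stamp = date(year, month, 1).strftime("%Y%m")
    return f"{BASE_URL}/{stamp}-citibike-tripdata.csv.zip"


def month_range(start, end):
    """Yield (year, month) pairs from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


# Three months ending with the month used in this notebook
urls = [monthly_archive_url(y, m)
        for y, m in month_range(date(2019, 12, 1), date(2020, 2, 1))]
```

Each URL could then be passed to a download step of your choice; that part is left out here because it depends on how the data should be stored.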

In [36]:
#import the packages
import pandas as pd
import numpy as np
import folium
from folium.plugins import HeatMap
from folium.plugins import HeatMapWithTime

Write a function that generates a Folium base map. It has default values that can be changed if needed; the lat/long location is the only required argument.

In [37]:
# function to generate base map, has default values for zoom and tiles
def generateBaseMap(loc, zoom=12, tiles='Stamen Toner', crs='EPSG2263'):
    '''
    Function that generates a Folium base map
    Input location lat/long
    Zoom level default 12
    Tiles default to Stamen Toner
    CRS default 2263 for NYC (kept for reference only; not passed to folium.Map)
    '''
    return folium.Map(location=loc,
                      control_scale=True,
                      zoom_start=zoom,
                      tiles=tiles)

### Generate Base map

Generate base map with custom function. Pass in list with NYC lat/long.

In [38]:
nyc = [40.7400, -73.985880]
base_map = generateBaseMap(nyc)
base_map

### Read in data

Read in one month of downloaded bikeshare data. 
<br>
Eventually this could grow into a larger project that scrapes each monthly release for a fuller time series study of Citibike data in NYC. For now, the purpose is to update the blog with a more accurate representation of a time lapse map.

In [39]:
df = pd.read_csv('./data/202002-citibike-tripdata.csv')
df.head()

| | tripduration | starttime | stoptime | start station id | start station name | start station latitude | start station longitude | end station id | end station name | end station latitude | end station longitude | bikeid | usertype | birth year | gender |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1404 | 2020-02-01 00:00:05.9460 | 2020-02-01 00:23:30.7240 | 316 | Fulton St & William St | 40.70956 | -74.006536 | 481 | S 3 St & Bedford Ave | 40.712605 | -73.962644 | 28874 | Customer | 1995 | 1 |
| 1 | 1301 | 2020-02-01 00:00:06.2230 | 2020-02-01 00:21:48.0580 | 237 | E 11 St & 2 Ave | 40.730473 | -73.986724 | 539 | Metropolitan Ave & Bedford Ave | 40.715348 | -73.960241 | 32588 | Subscriber | 1991 | 1 |
| 2 | 474 | 2020-02-01 00:00:15.7210 | 2020-02-01 00:08:10.3440 | 528 | 2 Ave & E 31 St | 40.742909 | -73.977061 | 3785 | W 42 St & 6 Ave | 40.75492 | -73.98455 | 41013 | Subscriber | 1994 | 1 |
| 3 | 487 | 2020-02-01 00:00:21.0520 | 2020-02-01 00:08:28.7520 | 380 | W 4 St & 7 Ave S | 40.734011 | -74.002939 | 3263 | Cooper Square & Astor Pl | 40.729515 | -73.990753 | 27581 | Subscriber | 1973 | 2 |
| 4 | 619 | 2020-02-01 00:00:27.4000 | 2020-02-01 00:10:47.0640 | 472 | E 32 St & Park Ave | 40.745712 | -73.981948 | 237 | E 11 St & 2 Ave | 40.730473 | -73.986724 | 29062 | Subscriber | 1994 | 1 |


In [40]:
# replace all space in column headers with underscore
df.columns = [col.replace(' ', '_') for col in df.columns]

In [41]:
df.shape

(1146830, 15)

Need to turn `starttime` into a datetime object so that I can pull an hour column from it. 

In [42]:
# the raw timestamps carry fractional seconds, so the format includes %f
df['starttime'] = pd.to_datetime(df['starttime'], format='%Y-%m-%d %H:%M:%S.%f')

Extract hours from datetime column

In [43]:
df['hour'] = df['starttime'].dt.hour
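
To see the parsing and hour extraction in isolation, here is a minimal sketch on two made-up rows shaped like the `starttime` values shown in the table above. The `sample` frame is hypothetical and only illustrates the two steps; it assumes every timestamp includes a fractional-seconds part.

```python
import pandas as pd

# Hypothetical two-row sample in the same shape as the raw starttime column
sample = pd.DataFrame({
    "starttime": ["2020-02-01 00:00:05.9460", "2020-02-01 17:45:10.1230"],
})

# %f matches the fractional seconds after the decimal point
sample["starttime"] = pd.to_datetime(sample["starttime"],
                                     format="%Y-%m-%d %H:%M:%S.%f")

# .dt.hour returns the hour of day as an integer from 0 to 23
sample["hour"] = sample["starttime"].dt.hour
hours = sample["hour"].tolist()
```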

Add a count column so that the number of rides taken from a given station during each hour can be summed.

In [44]:
df['count'] = 1

Create a new DataFrame grouped by `start_station_id`, `start_station_latitude`, and `start_station_longitude`, and sum the `count` column.

In [45]:
df2 = pd.DataFrame(df.groupby(['start_station_id', 'start_station_latitude', 'start_station_longitude'])['count']\
                        .sum().sort_values(ascending=False))

df2.head()

| start_station_id | start_station_latitude | start_station_longitude | count |
|---|---|---|---|
| 519 | 40.751873 | -73.977706 | 9393 |
| 435 | 40.74174 | -73.994156 | 7140 |
| 3255 | 40.750585 | -73.994685 | 6993 |
| 497 | 40.73705 | -73.990093 | 6836 |
| 402 | 40.740343 | -73.989551 | 6284 |


In [46]:
# create list of lat/long and count (as weight)
lst = df2.groupby(['start_station_latitude', 'start_station_longitude']).sum().reset_index().values.tolist()
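
The same station totals can be reached without a helper `count` column. The sketch below is an alternative, not the approach used in this notebook: it counts rows per group with `groupby(...).size()` on a small made-up frame (`trips` is hypothetical) and then builds the `[lat, long, weight]` rows that `HeatMap` takes as data.

```python
import pandas as pd

# Hypothetical miniature of the trip table: two rides from one station,
# one ride from another
trips = pd.DataFrame({
    "start_station_id": [519, 519, 435],
    "start_station_latitude": [40.751873, 40.751873, 40.74174],
    "start_station_longitude": [-73.977706, -73.977706, -73.994156],
})

group_cols = ["start_station_id",
              "start_station_latitude",
              "start_station_longitude"]

# size() counts rows per group, so no column of ones is needed
station_counts = (
    trips.groupby(group_cols)
    .size()
    .rename("count")
    .sort_values(ascending=False)
    .reset_index()
)

# One [lat, long, weight] row per station
heat_rows = station_counts[
    ["start_station_latitude", "start_station_longitude", "count"]
].values.tolist()
```

Because the latitude and longitude columns are floats, `.values.tolist()` returns the counts as floats too, which matches the shape of the lists printed later in this notebook.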

### Create Heat Map

In [47]:
# add data to basemap 
HeatMap(data=lst, radius=12).add_to(base_map);

# save base map as .html
base_map.save('./bike_station_HeatMap.html')

# call map 
base_map

## Create Heat Map with Time

In [48]:
df_hour_list = []
coord_cols = ['start_station_latitude', 'start_station_longitude']
for hour in df['hour'].sort_values().unique():
    # rides per station for this hour, as [lat, long, count] rows
    hourly = df.loc[df['hour'] == hour, coord_cols + ['count']]
    df_hour_list.append(hourly.groupby(coord_cols).sum().reset_index().values.tolist())
df_hour_list

[[[40.65708866668485, -74.00870203971863, 2.0],
  [40.6610633719006, -73.97945255041122, 15.0],
  [40.6627059, -73.9569115, 3.0],
  [40.6630619, -73.9538746, 11.0],
  [40.66314, -73.9605695, 4.0],
  [40.663779, -73.98396846, 3.0],
  [40.6642406, -73.9574686, 3.0],
  [40.66514681533792, -73.97637605667114, 5.0],
  [40.665816, -73.956934, 2.0],
  [40.6662078, -73.98199886, 7.0],
  [40.666287, -73.98895053, 4.0],
  [40.6663181, -73.9854617, 1.0],
  [40.666439306870814, -73.9605563879013, 1.0],
  [40.668127, -73.98377641, 2.0],
  [40.668132, -73.97363831, 3.0],
  [40.6685455, -73.99333264, 3.0],
  [40.668603, -73.9904394, 1.0],
  [40.6686273, -73.98700053, 2.0],
  [40.6686627, -73.97988067, 2.0],
  [40.6686744, -73.9618148, 1.0],
  [40.6691783, -73.9554162, 6.0],
  [40.6703837, -73.97839676, 3.0],
  [40.6704922, -73.98541675, 3.0],
  [40.6705135, -73.98876585, 6.0],
  [40.6707767, -73.9576801, 16.0],
  [40.6711978, -73.97484126, 1.0],
  [40.6716493, -73.9631145, 7.0],
  [40.671907, -73.993
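
The per-hour list can also be built in one grouped pass instead of filtering the full table once per hour. This is a sketch on a made-up frame (`trips`, `frames`, and `labels` are hypothetical names), not a drop-in replacement for the cell above. It also collects a label for each frame, which could be handed to the `index` argument of `HeatMapWithTime` so the slider shows the hour.

```python
import pandas as pd

# Hypothetical miniature of the trip table with the hour already extracted
trips = pd.DataFrame({
    "hour": [0, 0, 8, 8, 8],
    "start_station_latitude": [40.70956, 40.70956, 40.730473,
                               40.742909, 40.742909],
    "start_station_longitude": [-74.006536, -74.006536, -73.986724,
                                -73.977061, -73.977061],
})

coords = ["start_station_latitude", "start_station_longitude"]

# Count rides per (hour, station) in a single pass
counts = (trips.groupby(["hour"] + coords)
          .size()
          .rename("count")
          .reset_index())

frames, labels = [], []
for hour, group in counts.groupby("hour"):
    # One frame per hour: a list of [lat, long, count] rows
    frames.append(group[coords + ["count"]].values.tolist())
    labels.append(f"{int(hour):02d}:00")
```

Hours with no rides produce no frame in this version, so `labels` is worth keeping alongside `frames` to know which hour each step represents.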

In [49]:
# instantiate HeatMapWithTime
HeatMapWithTime(df_hour_list,radius=11, 
                autoplay=True,
                gradient={0.1: 'blue', 0.5: 'lime', 0.7: 'orange', 1: 'red'}, 
                min_opacity=0.4, 
                max_opacity=0.8, 
                use_local_extrema=True).add_to(base_map)

# save as html
base_map.save('./heatmapwithtime_bikeshare.html')

# call result
base_map
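
The weights passed above are raw ride counts. As far as I can tell, folium's documentation describes `HeatMapWithTime` weights as falling in the range (0, 1], so it may be worth scaling the counts first; treat that range as an assumption and check the docs for your folium version. The helper below (`normalize_frames` is a hypothetical name) is a minimal sketch that divides every weight by the largest weight across all frames, which keeps hours comparable to one another.

```python
def normalize_frames(frames):
    """Scale every weight by the largest weight found across all frames.

    Each frame is a list of [lat, long, weight] rows. Assumes at least
    one row exists and that the largest weight is positive.
    """
    peak = max(weight for frame in frames for _, _, weight in frame)
    return [
        [[lat, lon, weight / peak] for lat, lon, weight in frame]
        for frame in frames
    ]


# Hypothetical two-frame example
raw = [
    [[40.70956, -74.006536, 2.0]],
    [[40.730473, -73.986724, 1.0], [40.742909, -73.977061, 4.0]],
]
scaled = normalize_frames(raw)
```

With `use_local_extrema=True` the color scale adapts to the points in view, so whether this step changes the picture noticeably depends on the data.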