**Spatial Visualizations and Analysis of Uber Pickups in NYC**

This directory contains data on over 4.5 million Uber pickups in New York City from April to September 2014, and 14.3 million more Uber pickups from January to June 2015. Trip-level data on 10 other for-hire vehicle (FHV) companies, as well as aggregated data for 329 FHV companies, is also included. All the files are as they were received on August 3, September 15, and September 22, 2015.

> Folium Package

Folium is a Python library for visualizing spatial data interactively, directly inside the notebook environment that many people (myself included) prefer. It is intuitive to use, offers a high degree of interactivity with a low learning curve, and, best of all, is open source.

In [None]:
import pandas as pd
import folium

In [None]:
#Load the Data
df_apr14 = pd.read_csv('../input/uber-raw-data-apr14.csv')
df_may14 = pd.read_csv('../input/uber-raw-data-may14.csv')
df_jun14 = pd.read_csv('../input/uber-raw-data-jun14.csv')
df_jul14 = pd.read_csv('../input/uber-raw-data-jul14.csv')
df_aug14 = pd.read_csv('../input/uber-raw-data-aug14.csv')
df_sep14 = pd.read_csv('../input/uber-raw-data-sep14.csv')
df_janjune15 = pd.read_csv('../input/uber-raw-data-janjune-15.csv')

In [None]:
# Create a base map centered on NYC
def generateBaseMap(default_location=[40.693943, -73.985880], default_zoom_start=12):
    base_map = folium.Map(location=default_location, control_scale=True, zoom_start=default_zoom_start)
    return base_map
base_map = generateBaseMap()
base_map

In [None]:
df_14 = pd.concat([df_apr14, df_may14, df_jun14, df_jul14, df_aug14, df_sep14], sort=False, ignore_index=True)
df_14.head()

In [None]:
df_14.info()

The combined 2014 data gives us plenty to work with: over 4.5 million rows covering six months of rides.
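To see how those rows are spread across the months, a quick per-month count can help. The following is a minimal sketch, not part of the original notebook: the helper name `pickups_per_month` is hypothetical, and the three sample rows are made up purely to show the expected `Date/Time` layout.

```python
import pandas as pd

def pickups_per_month(df, time_col="Date/Time"):
    """Count rows per calendar month (hypothetical helper for illustration)."""
    # Parse the timestamps using the month/day/year layout assumed in this notebook
    timestamps = pd.to_datetime(df[time_col], format="%m/%d/%Y %H:%M:%S")
    # value_counts tallies rows per month; sort_index orders the months chronologically
    return timestamps.dt.month.value_counts().sort_index()

# Made-up sample rows, only to demonstrate the call
sample = pd.DataFrame({
    "Date/Time": ["4/1/2014 0:11:00", "4/30/2014 23:22:00", "5/1/2014 0:02:00"]
})
monthly = pickups_per_month(sample)
print(monthly)
```

In the notebook itself, the same function could be called as `pickups_per_month(df_14)` before the `Date/Time` column is converted.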

In [None]:
df_14['Date/Time'] = pd.to_datetime(df_14['Date/Time'], format='%m/%d/%Y %H:%M:%S')

In [None]:
# Use the vectorized .dt accessor rather than a row-by-row apply,
# which is typically much faster on millions of rows.
df_14['month'] = df_14['Date/Time'].dt.month
df_14['week'] = df_14['Date/Time'].dt.isocalendar().week  # ISO week number (pandas >= 1.1)
df_14['day'] = df_14['Date/Time'].dt.day
df_14['hour'] = df_14['Date/Time'].dt.hour

In [None]:
df_14.head()

In [None]:
# First, create a heatmap of Uber pickups in April from 12am to 5am.


from folium import plugins
from folium.plugins import HeatMap

df_copy = df_14[(df_14.hour < 5) & (df_14.month == 4)].copy()
df_copy['count'] = 1

In [None]:
df_hour_list = []
for hour in df_copy.hour.sort_values().unique():
    df_hour_list.append(df_copy.loc[df_copy.hour == hour, ['Lat', 'Lon', 'count']].groupby(['Lat', 'Lon']).sum().reset_index().values.tolist())


In [None]:
from folium.plugins import HeatMapWithTime
base_map = generateBaseMap(default_zoom_start=11)
HeatMapWithTime(df_hour_list, radius=5, gradient={0.2: 'blue', 0.4: 'lime', 0.6: 'orange', 1: 'red'}, min_opacity=0.5, max_opacity=0.8, use_local_extrema=True).add_to(base_map)
base_map

In [None]:
# Create a heatmap of Uber pickups in April from 5am to 9am.

In [None]:
df_copy2 = df_14[(df_14.hour >= 5) & (df_14.hour < 9) & (df_14.month == 4)].copy()
df_copy2['count'] = 1

In [None]:
df_hour_list2 = []
for hour in df_copy2.hour.sort_values().unique():
    df_hour_list2.append(df_copy2.loc[df_copy2.hour == hour, ['Lat', 'Lon', 'count']].groupby(['Lat', 'Lon']).sum().reset_index().values.tolist())


In [None]:
base_map_59 = generateBaseMap(default_zoom_start=11)
HeatMapWithTime(df_hour_list2, radius=5, gradient={0.2: 'blue', 0.4: 'lime', 0.6: 'orange', 1: 'red'}, 
                min_opacity=0.5, max_opacity=0.8, use_local_extrema=True).add_to(base_map_59)
base_map_59

In [None]:
# Create a heatmap of Uber pickups in April from 9am to 11am.

df_copy3 = df_14[(df_14.hour >= 9) & (df_14.hour < 11) & (df_14.month == 4)].copy()
df_copy3['count'] = 1

In [None]:
df_hour_list3 = []
for hour in df_copy3.hour.sort_values().unique():
    df_hour_list3.append(df_copy3.loc[df_copy3.hour == hour, ['Lat', 'Lon', 'count']].groupby(['Lat', 'Lon']).sum().reset_index().values.tolist())


In [None]:
base_map_911 = generateBaseMap(default_zoom_start=11)
HeatMapWithTime(df_hour_list3, radius=5, gradient={0.2: 'blue', 0.4: 'lime', 0.6: 'orange', 1: 'red'}, 
                min_opacity=0.5, max_opacity=0.8, use_local_extrema=True).add_to(base_map_911)
base_map_911
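The three time windows above repeat the same filter-and-aggregate steps, so they could be folded into one helper. The following is a minimal sketch under stated assumptions: the function name `hourly_heat_frames` is hypothetical, it expects the `Lat`, `Lon`, `month`, and `hour` columns built earlier, and the sample rows are made up for illustration.

```python
import pandas as pd

def hourly_heat_frames(df, month, start_hour, end_hour):
    """Build one [lat, lon, count] list per hour (hypothetical helper).

    Keeps pickups in `month` with start_hour <= hour < end_hour and
    returns (hours, frames), where frames[i] belongs to hours[i].
    """
    mask = (df["month"] == month) & (df["hour"] >= start_hour) & (df["hour"] < end_hour)
    subset = df.loc[mask, ["hour", "Lat", "Lon"]]

    hours, frames = [], []
    # groupby iterates the hours in ascending order
    for hour, group in subset.groupby("hour"):
        # Count pickups at each distinct coordinate pair within this hour
        counts = group.groupby(["Lat", "Lon"]).size().reset_index(name="count")
        hours.append(int(hour))
        frames.append(counts.values.tolist())
    return hours, frames

# Made-up sample rows, only to demonstrate the call
sample = pd.DataFrame({
    "Lat": [40.75, 40.75, 40.70, 40.72, 40.75],
    "Lon": [-73.98, -73.98, -74.00, -73.99, -73.98],
    "month": [4, 4, 4, 4, 5],
    "hour": [0, 0, 1, 6, 0],
})
hours, frames = hourly_heat_frames(sample, month=4, start_hour=0, end_hour=5)
print(hours)
print(frames)
```

With the real data, `frames` from `hourly_heat_frames(df_14, 4, 5, 9)` should be usable in place of `df_hour_list2` when calling `HeatMapWithTime`, and `hours` could be passed as its `index` argument to label each time step.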