# Map Visualization: New York taxi pickups

**Goal:** build several map visualizations of New York taxi pickups.

**Step 1.** Let’s prepare the dataset of New York taxi trips over the year.<br>
First, we import the libraries.

In [7]:
import numpy as np
import pandas as pd

In [8]:
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.style.use(['ggplot'])

Now we load the dataset.

In [9]:
data = pd.read_csv('data_taxi.csv')
data.head()

| | id | vendor_id | pickup_datetime | passenger_count | pickup_longitude | pickup_latitude | dropoff_longitude | dropoff_latitude | store_and_fwd_flag |
|---|---|---|---|---|---|---|---|---|---|
| 0 | id3004672 | 1 | 2016-06-30 23:59:58 | 1 | -73.988129 | 40.732029 | -73.990173 | 40.75668 | N |
| 1 | id3505355 | 1 | 2016-06-30 23:59:53 | 1 | -73.964203 | 40.679993 | -73.959808 | 40.655403 | N |
| 2 | id1217141 | 1 | 2016-06-30 23:59:47 | 1 | -73.997437 | 40.737583 | -73.98616 | 40.729523 | N |
| 3 | id2150126 | 2 | 2016-06-30 23:59:41 | 1 | -73.95607 | 40.7719 | -73.986427 | 40.730469 | N |
| 4 | id1598245 | 1 | 2016-06-30 23:59:33 | 1 | -73.970215 | 40.761475 | -73.96151 | 40.75589 | N |


Let’s check the data info.

In [10]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 625134 entries, 0 to 625133
Data columns (total 9 columns):
 #   Column              Non-Null Count   Dtype  
---  ------              --------------   -----  
 0   id                  625134 non-null  object 
 1   vendor_id           625134 non-null  int64  
 2   pickup_datetime     625134 non-null  object 
 3   passenger_count     625134 non-null  int64  
 4   pickup_longitude    625134 non-null  float64
 5   pickup_latitude     625134 non-null  float64
 6   dropoff_longitude   625134 non-null  float64
 7   dropoff_latitude    625134 non-null  float64
 8   store_and_fwd_flag  625134 non-null  object 
dtypes: float64(4), int64(2), object(3)
memory usage: 42.9+ MB


We need the pickup coordinates: longitude and latitude.<br>
We will also need **pickup_datetime**.<br>
All of these features already have a suitable dtype: **float64** for the coordinates and **object** for the datetime.<br>
The datetime is used only as marker text, so we will not convert it to another type.
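
If the timestamps were ever needed for filtering or grouping rather than only as popup text, the column could be parsed on demand. Here is a minimal sketch on a two-row stand-in frame. The `sample` frame and the `pickup_dt` column name are illustrative and not part of the dataset.

```python
import pandas as pd

# Tiny stand-in for the taxi data; values copied from the first rows shown above.
sample = pd.DataFrame({
    "pickup_datetime": ["2016-06-30 23:59:58", "2016-06-30 23:59:53"],
    "pickup_longitude": [-73.988129, -73.964203],
    "pickup_latitude": [40.732029, 40.679993],
})

# The raw column holds plain strings, which is all a popup label needs.
print(sample["pickup_datetime"].dtype)

# Parse into a separate (hypothetical) column only when date arithmetic is required.
sample["pickup_dt"] = pd.to_datetime(
    sample["pickup_datetime"], format="%Y-%m-%d %H:%M:%S"
)
print(sample["pickup_dt"].dt.hour.tolist())
```

Keeping the parsed values in a separate column leaves the original strings available for the markers.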

In [11]:
data.shape

(625134, 9)

As we can see, we have **625134** observations in total.

**Step 2.** Generating basic toner and terrain maps.

Let’s import the **folium** library.

In [12]:
import folium

Now we build a map centered on New York, using the corresponding latitude and longitude.

In [14]:
ny_map_base = folium.Map(location=[40.730610, -73.935242], zoom_start=4)
ny_map_base

Let’s check the toner version of the map, which is really useful for state, river, and lake borders.

In [15]:
ny_map_toner = folium.Map(location=[40.730610, -73.935242], zoom_start=4, tiles='Stamen Toner')
ny_map_toner

We can also look at a zoomed-in version of the map.<br>
Setting the **zoom_start** argument to 8 works well here.

In [16]:
ny_map_toner = folium.Map(location=[40.730610, -73.935242], zoom_start=8, tiles='Stamen Toner')
ny_map_toner
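
To get a feel for what the two zoom values mean, the approximate ground resolution can be estimated. The sketch below assumes standard Web Mercator tiling with 256-pixel tiles, which is an assumption about the tile layers used here. The function name `meters_per_pixel` is made up for this illustration.

```python
import math

# Meters per pixel at zoom 0 on the equator, for 256-pixel Web Mercator tiles.
EQUATOR_M_PER_PX_Z0 = 156543.03392

def meters_per_pixel(latitude_deg: float, zoom: int) -> float:
    """Approximate ground resolution of a Web Mercator map at a given latitude."""
    return EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(latitude_deg)) / (2 ** zoom)

# Compare the two zoom levels used so far at the latitude of New York.
for zoom in (4, 8):
    print(zoom, round(meters_per_pixel(40.730610, zoom), 1))
```

Each extra zoom step halves the value, so going from 4 to 8 makes every pixel cover a distance 16 times shorter.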

We can also check the terrain map, which shows terrain features and locations.

In [17]:
ny_map_terrain = folium.Map(location=[40.730610, -73.935242], zoom_start=4, tiles='Stamen Terrain')
ny_map_terrain

The zoomed-in version is also really useful.

In [18]:
ny_map_terrain = folium.Map(location=[40.730610, -73.935242], zoom_start=8, tiles='Stamen Terrain')
ny_map_terrain

**Step 3.** Building a New York map with the first 100 pickup locations from the dataset.

Let’s create the working dataset by taking the first 100 rows of the taxi records.

In [19]:
data_work = data.iloc[:100,:]
data_work.head()

| | id | vendor_id | pickup_datetime | passenger_count | pickup_longitude | pickup_latitude | dropoff_longitude | dropoff_latitude | store_and_fwd_flag |
|---|---|---|---|---|---|---|---|---|---|
| 0 | id3004672 | 1 | 2016-06-30 23:59:58 | 1 | -73.988129 | 40.732029 | -73.990173 | 40.75668 | N |
| 1 | id3505355 | 1 | 2016-06-30 23:59:53 | 1 | -73.964203 | 40.679993 | -73.959808 | 40.655403 | N |
| 2 | id1217141 | 1 | 2016-06-30 23:59:47 | 1 | -73.997437 | 40.737583 | -73.98616 | 40.729523 | N |
| 3 | id2150126 | 2 | 2016-06-30 23:59:41 | 1 | -73.95607 | 40.7719 | -73.986427 | 40.730469 | N |
| 4 | id1598245 | 1 | 2016-06-30 23:59:33 | 1 | -73.970215 | 40.761475 | -73.96151 | 40.75589 | N |


We can check the result. Everything is correct: we got 100 rows with 9 features.

In [20]:
data_work.shape

(100, 9)

Now we can use the latitude and longitude of New York for the corresponding map.<br>
Since we will work with pickup data (street-level detail), we will set the zoom value to 12.

In [21]:
latitude = 40.730610
longitude = -73.935242

In [22]:
ny_map_work = folium.Map(location=[latitude, longitude], zoom_start=12)
ny_map_work
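
The center above is hard-coded. As an alternative sketch, it could be derived from the pickups themselves by averaging their coordinates. `pickup_center` is a hypothetical helper, and a plain mean is only a reasonable approximation over a city-sized area.

```python
from statistics import fmean
from typing import Iterable, Tuple

def pickup_center(latitudes: Iterable[float], longitudes: Iterable[float]) -> Tuple[float, float]:
    """Return the mean (latitude, longitude) of a set of points."""
    return fmean(latitudes), fmean(longitudes)

# First three pickups from the table above.
center = pickup_center(
    [40.732029, 40.679993, 40.737583],
    [-73.988129, -73.964203, -73.997437],
)
print(center)
```

With the notebook's data, the helper should be callable as `pickup_center(data_work.pickup_latitude, data_work.pickup_longitude)`, and the result can be passed as `location` to `folium.Map`.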

Now we will set all the necessary markers on the map.<br>
We will use yellow border color and green fill color.

In [23]:
pickup_data = folium.map.FeatureGroup()
pickup_data

<folium.map.FeatureGroup at 0xb3a3de1b80>

In [24]:
for lat, lon in zip(data_work.pickup_latitude, data_work.pickup_longitude):
    pickup_data.add_child(
        folium.features.CircleMarker(
            [lat, lon],
            radius=5,
            color='yellow',
            fill=True,
            fill_color='green',
            fill_opacity=0.6
        )
    )

ny_map_work.add_child(pickup_data)

Instead of relying on the green dots alone, we can add actual markers.<br>
Let’s also attach information to each marker:<br>
we will use the **pickup_datetime** feature for this.<br>
This information is shown when a marker is clicked with the left mouse button.

In [25]:
pickup_data = folium.map.FeatureGroup()

for lat, lon in zip(data_work.pickup_latitude, data_work.pickup_longitude):
    pickup_data.add_child(
        folium.features.CircleMarker(
            [lat, lon],
            radius=5,
            color='yellow',
            fill=True,
            fill_color='green',
            fill_opacity=0.6
        )
    )

latitudes = list(data_work.pickup_latitude)
longitudes = list(data_work.pickup_longitude)
labels = list(data_work.pickup_datetime)

for lat, lng, label in zip(latitudes, longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(ny_map_work)

ny_map_work.add_child(pickup_data)

If markers are not necessary, we can use the same circle dots and have them show popup messages when clicked.<br>
For the popup text we use the same feature: **pickup_datetime**.

In [26]:
ny_map_work = folium.Map(location=[latitude, longitude], zoom_start=12)

for lat, lng, label in zip(data_work.pickup_latitude, data_work.pickup_longitude, data_work.pickup_datetime):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5, # define how big you want the circle markers to be
        color='yellow',
        fill=True,
        popup=label,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(ny_map_work)
    
ny_map_work
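
The popup text does not have to be the raw string. As a small sketch, the timestamp could be reformatted before it is passed to `popup`. The helper name `popup_label` and the output format are arbitrary choices made for illustration.

```python
from datetime import datetime

def popup_label(raw: str) -> str:
    """Turn a 'YYYY-MM-DD HH:MM:SS' string into a shorter popup label."""
    moment = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    # Drop the seconds and add a short prefix.
    return moment.strftime("Pickup: %Y-%m-%d at %H:%M")

print(popup_label("2016-06-30 23:59:58"))
```

In the loop above, this would mean passing `popup=popup_label(label)` instead of `popup=label`.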

We can also use clusters. There are many pickup dots, so they are hard to inspect unless the map is zoomed in.<br>
Clusters group nearby points, and the grouping in each area changes with the zoom level.

In [27]:
from folium import plugins

ny_map_work = folium.Map(location=[latitude, longitude], zoom_start=12)

pickup_data = plugins.MarkerCluster().add_to(ny_map_work)

for lat, lng, label in zip(data_work.pickup_latitude, data_work.pickup_longitude, data_work.pickup_datetime):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(pickup_data)

ny_map_work
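
To illustrate why the number of clusters changes with scale, here is a deliberately simplified grid-based grouping. It is not the algorithm used by the `MarkerCluster` plugin. It only mimics the idea that a coarser grid (lower zoom) merges more points. The function name and the cell-size rule are assumptions made for this sketch.

```python
from collections import Counter

def grid_clusters(points, zoom):
    """Count points per grid cell; cells shrink by half with every zoom step."""
    cell = 360.0 / (2 ** zoom)  # cell width in degrees at this zoom (assumed rule)
    return Counter((int(lat // cell), int(lon // cell)) for lat, lon in points)

# The five pickups from the table above.
pickups = [
    (40.732029, -73.988129),
    (40.679993, -73.964203),
    (40.737583, -73.997437),
    (40.771900, -73.956070),
    (40.761475, -73.970215),
]

for zoom in (8, 12, 16):
    print(zoom, len(grid_clusters(pickups, zoom)), "cluster(s)")
```

At a low zoom all five points fall into one cell, while at a high zoom each point ends up in its own cell. This is the same qualitative behavior the clustered map shows when zooming in and out.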

**Result.** We built several maps that visualize New York taxi activity: toner, terrain, and marker maps.<br>Each map is interactive, and the last one comes in several versions (from simple marker dots to cluster areas).