# Visualization using maps in Python


Another excellent tool for visualizing data processed in Python is `folium`. It builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library: you manipulate your data in Python, then visualize it on a Leaflet map. It supports both binding data to a map for choropleth visualizations and passing Vincent/Vega visualizations as markers on the map.

In [3]:
%matplotlib inline

In [7]:
!sudo -H pip3 install -U folium
!sudo -H pip3 install -U geopandas

Collecting folium
  Using cached https://files.pythonhosted.org/packages/88/89/8186c3441eb2a224d2896d9a8db6ded20ddd225f109e6144494a9893a0c1/folium-0.6.0-py3-none-any.whl
Collecting branca>=0.3.0 (from folium)
  Using cached https://files.pythonhosted.org/packages/b5/18/13c018655f722896f25791f1db687db5671bd79285e05b3dd8c309b36414/branca-0.3.0-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.3.0 folium-0.6.0
You are using pip version 18.0, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting geopandas
  Using cached https://files.pythonhosted.org/packages/24/11/d77c157c16909bd77557d00798b05a5b6615ed60acb5900fbe6a65d35e93/geopandas-0.4.0-py2.py3-none-any.whl
Collecting fiona (from geopandas)
  Downloading https://files.pythonhosted.org/packages/e3/bf/029958f4e3811ce7017fb5805d5203e8bde6c1816b902964acb2dec67863/Fiona-1.7.13-cp36-cp36m-manylinux1_x86_64.whl (15.8M

`folium` provides very detailed maps, so we can use it to visualize geodata confined to a small area. Let's get the data from the [Citibike API](http://www.citibikenyc.com/stations/json):

In [8]:
import requests
import pandas as pd

url = 'http://www.citibikenyc.com/stations/json'
results = requests.get(url).json()
data = results["stationBeanList"]

citibike = pd.DataFrame(data)
citibike.set_index('id', inplace=True)

citibike

| id | altitude | availableBikes | availableDocks | city | landMark | lastCommunicationTime | latitude | location | longitude | postalCode | stAddress1 | stAddress2 | stationName | statusKey | statusValue | testStation | totalDocks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 281 | | 40 | 11 | | | 2018-10-10 05:33:59 PM | 40.764397 | | -73.973715 | | Grand Army Plaza & Central Park S | | Grand Army Plaza & Central Park S | 1 | In Service | False | 57 |
| 304 | | 24 | 6 | | | 2018-10-10 05:34:47 PM | 40.704633 | | -74.013617 | | Broadway & Battery Pl | | Broadway & Battery Pl | 1 | In Service | False | 33 |
| 359 | | 5 | 45 | | | 2018-10-10 05:34:48 PM | 40.755103 | | -73.974987 | | E 47 St & Park Ave | | E 47 St & Park Ave | 1 | In Service | False | 53 |
| 377 | | 7 | 29 | | | 2018-10-10 05:33:58 PM | 40.722438 | | -74.005664 | | 6 Ave & Canal St | | 6 Ave & Canal St | 1 | In Service | False | 45 |
| 402 | | 17 | 20 | | | 2018-10-10 05:34:47 PM | 40.740343 | | -73.989551 | | Broadway & E 22 St | | Broadway & E 22 St | 1 | In Service | False | 39 |
| 426 | | 28 | 1 | | | 2018-10-10 05:34:13 PM | 40.717548 | | -74.013221 | | West St & Chambers St | | West St & Chambers St | 1 | In Service | False | 29 |
| 504 | | 7 | 36 | | | 2018-10-10 05:34:17 PM | 40.732219 | | -73.981656 | | 1 Ave & E 16 St | | 1 Ave & E 16 St | 1 | In Service | False | 44 |
| 514 | | 26 | 24 | | | 2018-10-10 05:34:23 PM | 40.760875 | | -74.002777 | | 12 Ave & W 40 St | | 12 Ave & W 40 St | 1 | In Service | False | 52 |
| 520 | | 9 | 30 | | | 2018-10-10 05:34:37 PM | 40.759923 | | -73.976485 | | W 52 St & 5 Ave | | W 52 St & 5 Ave | 1 | In Service | False | 39 |
| 3223 | | 22 | 29 | | | 2018-10-10 05:34:22 PM | 40.758997 | | -73.968654 | | E 55 St & 3 Ave | | E 55 St & 3 Ave | 1 | In Service | False | 53 |
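
Live feeds can change shape or go away, so it may help to check the payload before building a DataFrame from it. The sketch below assumes the same top-level `stationBeanList` key used above. The helper name `extract_stations` and the inline sample payload are hypothetical, with values copied from two rows of the table rather than fetched live.

```python
import json

# Hypothetical sample payload shaped like the feed used above.
SAMPLE = json.loads("""
{"stationBeanList": [
  {"id": 281, "stationName": "Grand Army Plaza & Central Park S",
   "availableBikes": 40, "totalDocks": 57, "statusValue": "In Service"},
  {"id": 304, "stationName": "Broadway & Battery Pl",
   "availableBikes": 24, "totalDocks": 33, "statusValue": "In Service"}
]}
""")


def extract_stations(payload):
    """Return {station id: record}, or raise if the payload has another shape."""
    stations = payload.get("stationBeanList")
    if not isinstance(stations, list):
        raise ValueError("payload has no 'stationBeanList' list")
    return {station["id"]: station for station in stations}


stations = extract_stations(SAMPLE)
print(sorted(stations))
```

Failing early with a clear message is usually easier to debug than a `KeyError` raised a few cells later.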


#### Selecting tiles

The first step is to create a map. At the most basic level, we select the location, the zoom level, and optionally the tiles (i.e., the style of the map) for the background. The default is 'OpenStreetMap', but for visualizations we often prefer other, more visually neutral styles. (See http://folium.readthedocs.io/en/latest/quickstart.html for more tile options.)

In [9]:
import folium
fmap = folium.Map(location=[40.73, -74], zoom_start=13, tiles='OpenStreetMap')
fmap

In [10]:
import folium
fmap = folium.Map(location=[40.73, -74], zoom_start=12,  tiles='cartodbpositron')
fmap

#### Adding Markers 

For every station, we add a marker to the map:
* Use the latitude and longitude for the location.
* Set the color of the marker to reflect the status of the station.
* Set the opacity to the fraction of the station's docks that currently hold a bike.
* Set the size of the circle in proportion to the size of the station (its number of docks).
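
The styling rules listed above can also be written as one small function, which makes them easy to check without drawing a map. This is a sketch only: the name `marker_style`, the `docks_per_unit` parameter, and the guard for stations reporting zero docks are assumptions, not part of the original notebook.

```python
def marker_style(station, docks_per_unit=10):
    """Map one station record to the CircleMarker styling values."""
    if station["statusValue"] != "In Service":
        # Stations that are not in service: fixed size, solid red.
        return {"radius": 5, "fill_color": "red", "fill_opacity": 1.0}

    total_docks = station["totalDocks"]
    # Assumed guard: avoid dividing by zero if a station reports no docks.
    share = station["availableBikes"] / total_docks if total_docks else 0.0
    return {
        "radius": total_docks / docks_per_unit,
        "fill_color": "green",
        # Clamp to [0, 1], the range an opacity value can take.
        "fill_opacity": min(max(share, 0.0), 1.0),
    }


style = marker_style(
    {"statusValue": "In Service", "availableBikes": 40, "totalDocks": 57}
)
print(style)
```

The returned dictionary uses the same keyword names as `folium.CircleMarker`, so it could be unpacked into the call with `**style`.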

In [11]:
for name, row in citibike.iterrows():
    
    # Define the opacity of the marker to be proportional to the percentage of bikes in the station
    opacity = row["availableBikes"]/row["totalDocks"] if row["statusValue"] == 'In Service' else 1.0
    # Make the color green for the working stations, red otherwise
    color = "green" if row["statusValue"] == 'In Service' else "red"
    # The size of the marker is proportional to the number of docks
    size = row["totalDocks"]/10 if row["statusValue"] == 'In Service' else 5

    # We create a marker on the map and we add it to the map
    folium.CircleMarker(location=[row["latitude"], row["longitude"]], 
                        radius = size,
                        color='black', weight=0.5, 
                        fill=True,
                        fill_opacity = opacity,
                        fill_color = color,
                       ).add_to(fmap)
    


In [12]:
fmap

#### Adding popups to the markers

For each marker, we can also attach a popup with text, HTML, or even other charts/visualizations. Here is an example of adding an HTML popup to each marker.
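
Because popup content is raw HTML, characters such as `&` in a station address are better escaped before being placed in the markup. A minimal sketch using only the standard library follows. The helper name `popup_html` is hypothetical, and it closes the `<p>` tag explicitly.

```python
from html import escape


def popup_html(address, available_bikes, total_docks):
    """Build the popup markup for one station, escaping the free-text address."""
    return (
        "<p style='font-family:sans-serif;font-size:11px'>"
        f"<strong>Address: </strong>{escape(str(address))}"
        f"<br><strong>Available Bikes: </strong>{int(available_bikes)}"
        f"<br><strong>Total Docks: </strong>{int(total_docks)}"
        "</p>"
    )


snippet = popup_html("Grand Army Plaza & Central Park S", 40, 57)
print(snippet)
```

The resulting string can be passed to `folium.IFrame(html=...)` in the same way as the concatenated string used in the notebook.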

In [13]:
fmap = folium.Map(location=[40.73, -74], zoom_start=13,  tiles='cartodbpositron')

for name, row in citibike.iterrows():
    
    # Define the opacity of the marker to be proportional to the percentage of bikes in the station
    opacity = row["availableBikes"]/row["totalDocks"] if row["statusValue"] == 'In Service' else 1.0
    # Make the color green for the working stations, red otherwise
    color = "green" if row["statusValue"] == 'In Service' else "red"
    # The size of the marker is proportional to the number of docks
    size = row["totalDocks"]/5 if row["statusValue"] == 'In Service' else 5
    
   
    # The code below defines a pop-up for each station with details such as 
    # the address, number of bikes, capacity, etc.
    html = "<p style='font-family:sans-serif;font-size:11px'>" + \
           "<strong>Address: </strong>" + row["stAddress1"] + \
           "<br><strong>Available Bikes: </strong>" + str(row["availableBikes"]) + \
           "<br><strong>Total Docks: </strong>" + str(row["totalDocks"]) + \
           "</p>"
    iframe = folium.IFrame(html=html, width=200, height=60)
    popup = folium.Popup(iframe, max_width=200)

    # We create a marker on the map and we add it to the map
    folium.CircleMarker(location=[row["latitude"], row["longitude"]], 
                        radius = size,
                        popup = popup, 
                        color='black', weight=0.5, 
                        fill=True,
                        fill_opacity = opacity,
                        fill_color = color,
                       ).add_to(fmap)

In [14]:
    
fmap

In [19]:
# This code below is for generating an image screenshot from a Folium map
# It is kind of convoluted, as it involves saving an HTML file
# and then launching a Selenium-driven browser and saving a screenshot
# NOTE: This requires having a properly working installation
# of Selenium


import os
import time

from pyvirtualdisplay import Display

display = Display(visible=0, size=(1600, 1600))
display.start()

delay=5
fn='citibike.html'
tmpurl='file://{path}/{mapfile}'.format(path=os.getcwd(),mapfile=fn)
fmap.save(fn)

In [18]:
!sudo pip3 install pyvirtualdisplay

The directory '/home/ubuntu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ubuntu/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pyvirtualdisplay
  Downloading https://files.pythonhosted.org/packages/39/37/f285403a09cc261c56b6574baace1bdcf4b8c7428c8a7239cbba137bc0eb/PyVirtualDisplay-0.2.1.tar.gz
Collecting EasyProcess (from pyvirtualdisplay)
  Downloading https://files.pythonhosted.org/packages/0d/f1/d2de7591e7dfc164d286fa16f051e6c0cf3141825586c3b04ae7cda7ac0f/EasyProcess-0.2.3.tar.gz
Installing collected packages: EasyProcess, pyvirtualdisplay
  Running setup.py install for EasyProcess ... don

In [None]:
from selenium import webdriver
browser = webdriver.Firefox()
browser.get(tmpurl)
#Give the map tiles some time to load
time.sleep(delay)
browser.save_screenshot('citibike.png')
browser.quit()
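
The `file://` URL above is assembled with string formatting, which can produce an invalid URL when the working directory contains spaces or when running on Windows. A hedged alternative is to let `pathlib` build it. The helper name `local_file_url` is hypothetical.

```python
from pathlib import Path


def local_file_url(filename):
    """Return a file:// URL for a file in the current working directory."""
    # Path.cwd() is absolute, which as_uri() requires; special characters
    # in the path are percent-encoded automatically.
    return (Path.cwd() / filename).as_uri()


map_url = local_file_url("citibike.html")
print(map_url)
```

The result can be passed to `browser.get(...)` in place of `tmpurl`.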

#### Using shapefiles 

In [None]:
# Dataset from NYC Open Data: https://data.cityofnewyork.us/City-Government/Neighborhood-Tabulation-Areas/cpf4-rkhq
# Create the target directory first, since curl does not create it
!mkdir -p data
!curl 'https://data.cityofnewyork.us/api/geospatial/cpf4-rkhq?method=export&format=GeoJSON' -o data/nyc-neighborhoods.geojson

In [None]:
# NYC Zipcodes
!curl 'https://data.cityofnewyork.us/download/i8iw-xf4u/application%2Fzip' -o 'data/ZIP_CODE_040114.zip'

In [None]:
import geopandas as gpd

nyc_neighborhoods = gpd.GeoDataFrame.from_file('data/nyc-neighborhoods.geojson')
nyc_neighborhoods = nyc_neighborhoods[['ntacode', 'ntaname', 'geometry']]
nyc_neighborhoods.head(10)

In [None]:
nyc_neighborhoods.plot(
    figsize=(20,20), 
    color = 'white', 
    edgecolor = 'black'
)

In [None]:
import folium
fmap = folium.Map(location=[40.73, -74], zoom_start=12, tiles='cartodbpositron')

folium.GeoJson(nyc_neighborhoods,
               name='NYC Neighborhoods',
               style_function=lambda feature: {
                    'fillColor': '#c0fefe',
                    'color': 'black',
                    'weight': 1,
                    'fillOpacity': 0.25
                }
              ).add_to(fmap)

fmap

We first transform the Citibike DataFrame into a GeoPandas GeoDataFrame by creating 
a column that contains Points, naming the column "geometry", and then 
setting the CRS (coordinate reference system) to match the one used for the NYC neighborhoods.

In [None]:
from shapely.geometry import Point

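# Point takes x first and then y, i.e. longitude before latitude
citibike['geometry'] = citibike.apply(lambda row: Point(row['longitude'], row['latitude']), axis=1)
# Pass the DataFrame to the GeoDataFrame constructor and name the geometry column explicitly
citibike_gs = gpd.GeoDataFrame(citibike, geometry='geometry')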
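
One detail that is easy to get wrong here is coordinate order: folium markers take `[latitude, longitude]`, while Shapely points and GeoJSON use x then y, that is, longitude first. The sketch below shows the same conversion in plain Python by building a GeoJSON feature for one station. The helper name `station_to_feature` is hypothetical, and the sample values come from the table earlier in the notebook.

```python
def station_to_feature(station_id, record):
    """Build a GeoJSON Feature for one station; coordinates are [lon, lat]."""
    return {
        "type": "Feature",
        "id": station_id,
        "geometry": {
            "type": "Point",
            # GeoJSON order: longitude (x) first, then latitude (y).
            "coordinates": [record["longitude"], record["latitude"]],
        },
        "properties": {"totalDocks": record["totalDocks"]},
    }


feature = station_to_feature(
    281, {"latitude": 40.764397, "longitude": -73.973715, "totalDocks": 57}
)
print(feature["geometry"]["coordinates"])
```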


#### Spatial Join using GeoPandas

Now we join the `nyc_neighborhoods` GeoDataFrame, which describes the NYC neighborhoods, with `citibike_gs`, which holds the locations of the Citibike stations.

In [None]:
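# Both layers must share a CRS before they can be joined spatially
citibike_gs.crs = nyc_neighborhoods.crs
# Note: newer GeoPandas releases rename the `op` argument to `predicate`
stations_to_neighborhoods = gpd.sjoin(nyc_neighborhoods, citibike_gs, how="inner", op='intersects')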
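
To build intuition for what the join decides for each station and neighborhood pair, here is a plain-Python point-in-polygon test based on ray casting. It illustrates the idea only and is not how GeoPandas implements `sjoin`. The rectangle below is a hypothetical stand-in for a neighborhood boundary, not real NTA geometry.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon described by ring?

    ring is a list of (x, y) vertices; the closing edge back to the first
    vertex is implied. Points lying exactly on an edge are not treated
    specially.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular "neighborhood" as (longitude, latitude) vertices.
area = [(-74.0, 40.75), (-73.96, 40.75), (-73.96, 40.77), (-74.0, 40.77)]

print(point_in_polygon(-73.973715, 40.764397, area))  # station 281
print(point_in_polygon(-74.013617, 40.704633, area))  # station 304
```

A spatial join repeats a test of this kind for the candidate pairs and keeps the rows where it succeeds.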

In [None]:
totaldocks = pd.pivot_table(
    data = stations_to_neighborhoods, 
    index = 'ntacode',
    values = 'totalDocks', 
    aggfunc = 'sum'
).drop(['MN99', 'BK99'], errors='ignore') # drop the 'misc' areas, if present

totaldocks.head(5)
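
For readers less familiar with `pivot_table`, the aggregation above amounts to summing `totalDocks` per `ntacode` and leaving out the miscellaneous areas. A plain-Python sketch of the same logic follows. The function name `total_docks_by_area` is hypothetical, and the codes `XX01` and `XX02` are made-up placeholders rather than real neighborhood codes.

```python
from collections import defaultdict


def total_docks_by_area(rows, skip=("MN99", "BK99")):
    """Sum totalDocks per ntacode, ignoring the codes listed in skip."""
    totals = defaultdict(int)
    for row in rows:
        if row["ntacode"] not in skip:
            totals[row["ntacode"]] += row["totalDocks"]
    return dict(totals)


# Hypothetical joined rows: one row per (neighborhood, station) match.
rows = [
    {"ntacode": "XX01", "totalDocks": 57},
    {"ntacode": "XX01", "totalDocks": 39},
    {"ntacode": "XX02", "totalDocks": 33},
    {"ntacode": "MN99", "totalDocks": 29},
]

print(total_docks_by_area(rows))
```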

In [None]:
# GeoPandas choropleths / not very appealing visually
# 
# nyc_neighborhoods.set_index('ntacode').join(totaldocks, how='left').fillna(0).plot(
#     figsize=(20,20), column='totalDocks', cmap='OrRd', scheme='Quantiles', linewidth=0.1)

In [None]:
fmap = folium.Map(location=[40.73, -74], zoom_start=12, tiles='cartodbpositron')
# folium.LayerControl().add_to(fmap)

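# Note: newer folium releases deprecate Map.choropleth in favor of folium.Choropleth(...).add_to(fmap)
fmap.choropleth(geo_data='data/nyc-neighborhoods.geojson', 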
                data=totaldocks.reset_index(),
                columns=['ntacode', 'totalDocks'],
                key_on='feature.properties.ntacode',
                fill_color='OrRd', 
                fill_opacity=0.5, 
                line_opacity=0.1,
                legend_name='Total Docks'
               )
#folium.LayerControl().add_to(fmap)
fmap
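
The `key_on` and `columns` arguments together describe a join between the GeoJSON features and the data table: `key_on` is a dotted path into each feature, and the first entry of `columns` names the matching key in the data. The sketch below illustrates that matching in plain Python. It is not folium's implementation, and the helper names and the `XX01`/`XX02` codes are hypothetical.

```python
def resolve_key(feature, key_on):
    """Follow a dotted path such as 'feature.properties.ntacode'."""
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on is expected to start with 'feature'")
    value = feature
    for part in parts[1:]:
        value = value[part]
    return value


def match_values(geojson, rows, key_on, key_column, value_column):
    """Return {feature key: value or None} for every feature in geojson."""
    lookup = {row[key_column]: row[value_column] for row in rows}
    matched = {}
    for feature in geojson["features"]:
        key = resolve_key(feature, key_on)
        matched[key] = lookup.get(key)
    return matched


# Hypothetical two-feature collection; geometry omitted for brevity.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"ntacode": "XX01"}},
        {"type": "Feature", "geometry": None, "properties": {"ntacode": "XX02"}},
    ],
}
rows = [{"ntacode": "XX01", "totalDocks": 96}]

matched = match_values(
    geojson, rows, "feature.properties.ntacode", "ntacode", "totalDocks"
)
print(matched)
```

Features without a matching data row end up with no value to color by, which is worth checking when a choropleth shows unexpectedly blank areas.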