# MVGmeinRad Part 1: Visualizing bike sharing data via Folium

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/76/MVGmeinRad-Logo.jpg/1200px-MVGmeinRad-Logo.jpg" width="500" height="500" />

Since 2012, the transportation operator (MVG) of my adopted city of Mainz has offered a bike rental system called [MVGmeinRad](https://de.wikipedia.org/wiki/MVGmeinRad) with more than [100 stations](https://www.mainzer-mobilitaet.de/en/mainzigartig-mobil/mit-mvgmeinrad/einfuehrung.html) citywide.
One can check via an interactive [map](https://www.mainzer-mobilitaet.de/mainzigartig-mobil/mit-mvgmeinrad/stationen.html) where each station is located and how many free bikes it currently has. The underlying data is readily available in JSON format and can thus be used for data exploration hassle-free. Unfortunately, only data on the stations themselves, and not on actual trips taken, is made available. However, even this limited information is worthwhile to explore.

The inspiration for this notebook came from several [great](http://luisvalesilva.com/datasimple/citibike.html) [blog](https://georgetsilva.github.io/posts/mapping-points-with-folium/) [posts](https://blog.prototypr.io/interactive-maps-with-python-part-1-aa1563dbe5a9) on how to graphically explore bike sharing data with the [folium](https://github.com/python-visualization/folium) package, which made me want to do the same for Mainz.

Let's first have a look at the interactive map of all [MVGmeinRad stations](https://www.mainzer-mobilitaet.de/mainzigartig-mobil/mit-mvgmeinrad/stationen.html):

<img src="https://raw.githubusercontent.com/MrPreacher/MVGmeinRad/master/Images/MVG_Screenshot.jpg">

Let's take a look at the station data in its raw JSON form:

In [1]:
#Load packages:
import os
os.chdir("D:\\MVGmeinRad")
import numpy as np
import pandas as pd
import datetime as dt
import time
from selenium import webdriver
import folium
import requests
import json
from matplotlib.colors import Normalize, rgb2hex
import matplotlib.cm as cm

In [2]:
#Load URL from text file and read current data in JSON format
url = pd.read_csv('MVG_URL.txt', header = None).values[0][0]
content = requests.get(url).json()
print("MVGmeinRad station information retrieved at",dt.datetime.now())
print(json.dumps(content[0:2], indent=1))

MVGmeinRad station information retrieved at 2019-02-09 22:07:55.384872
[
 {
  "latitude": "50.009604",
  "capacity": 16,
  "longitude": "8.268905",
  "id": 111,
  "blocked": false,
  "bikes_available": 5,
  "docks_available": 11,
  "address": "Taunusstra\u00dfe / Ecke Kaiserstra\u00dfe",
  "name": "Kaisertor",
  "address_hint": "Taunusstra\u00dfe / Ecke Kaiserstra\u00dfe"
 },
 {
  "latitude": "50.00252",
  "capacity": 18,
  "longitude": "8.271294",
  "id": 4,
  "blocked": false,
  "bikes_available": 2,
  "docks_available": 16,
  "address": "Haltestelle Schusterstra\u00dfe / Bauerngasse",
  "name": "Flachsmarkt",
  "address_hint": "Haltestelle Schusterstra\u00dfe / Bauerngasse"
 }
]
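
Note that `latitude` and `longitude` are quoted strings in the raw JSON. As a minimal sketch, assuming one wants to work with the parsed records directly rather than with a DataFrame, the coordinates could be converted to numbers as follows. The `raw` literal is a trimmed copy of the two records above, and the variable names are only illustrative:

```python
import json

# Trimmed copy of the two records shown above; note the quoted coordinates.
raw = '''[
 {"latitude": "50.009604", "longitude": "8.268905", "id": 111, "name": "Kaisertor"},
 {"latitude": "50.00252", "longitude": "8.271294", "id": 4, "name": "Flachsmarkt"}
]'''

stations = json.loads(raw)

# Convert the coordinate strings to floats so they can serve as map locations.
for station in stations:
    station["latitude"] = float(station["latitude"])
    station["longitude"] = float(station["longitude"])

print(stations[0]["name"], stations[0]["latitude"], stations[0]["longitude"])
```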


This information can be read and analyzed more easily by loading it into a Pandas DataFrame:

In [3]:
df = pd.read_json(url)
df.head()

| | address | address_hint | bikes_available | blocked | capacity | docks_available | id | latitude | longitude | name |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Taunusstraße / Ecke Kaiserstraße | Taunusstraße / Ecke Kaiserstraße | 5 | False | 16 | 11 | 111 | 50.009604 | 8.268905 | Kaisertor |
| 1 | Haltestelle Schusterstraße / Bauerngasse | Haltestelle Schusterstraße / Bauerngasse | 2 | False | 18 | 16 | 4 | 50.00252 | 8.271294 | Flachsmarkt |
| 2 | Treppenaufgang | Treppenaufgang | 6 | False | 12 | 6 | 97 | 50.00098 | 8.27668 | Rathaus-Parkhaus |
| 3 | Haltestelle Höfchen / Alte Universität | Haltestelle Höfchen / Alte Universität | 23 | False | 30 | 7 | 24 | 49.999648 | 8.271715 | Höfchen |
| 4 | Feldbergstraße / Ecke Wallaustraße | Feldbergstraße / Ecke Wallaustraße | 4 | False | 24 | 20 | 108 | 50.010559 | 8.260654 | Feldbergstraße |


We can see that the station data can be separated into three broad groups:

-  the station `name` and the unique station `id` help **identify** each station
-  **location information** is given via the station `address` and the `latitude` and `longitude` parameters
-  information on the station's **current state** comprises:
    -  `capacity` gives the total number of docks of a station
    -  `bikes_available` and `docks_available` give the current utilization of the station
    -  `blocked` is a boolean that indicates whether the station is temporarily unavailable
    
Unfortunately, there is no information on individual bike rides in the data. Still, the available information on the 119 bike stations allows us to look at the average utilization of each station, the correlation between stations, the utilization of the entire grid and so forth.
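
To make "utilization" concrete, here is a minimal sketch for a single snapshot. It assumes utilization is defined as available bikes divided by capacity; the five rows are copied from the `df.head()` output above, and the `utilization` column name is only illustrative:

```python
import pandas as pd

# Five stations copied from the df.head() output above (illustrative sample only).
sample = pd.DataFrame({
    "name": ["Kaisertor", "Flachsmarkt", "Rathaus-Parkhaus", "Höfchen", "Feldbergstraße"],
    "bikes_available": [5, 2, 6, 23, 4],
    "capacity": [16, 18, 12, 30, 24],
})

# Share of docks currently holding a bike, per station (hypothetical column name).
sample["utilization"] = sample["bikes_available"] / sample["capacity"]

# Utilization of the whole sample: total bikes over total docks.
grid_utilization = sample["bikes_available"].sum() / sample["capacity"].sum()

print(sample[["name", "utilization"]].round(2))
print("Grid utilization:", round(grid_utilization, 2))
```

Averages over time and correlations between stations would additionally require repeated snapshots, which a single download does not provide.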

However, in this notebook we will restrict ourselves to plotting interactive maps of Mainz and its network of bike rental stations. We will do so with the help of the `Folium` library, which makes it easy to create interactive maps built on the JavaScript library Leaflet.

Plotting a map is as easy as specifying the coordinates of the map's midpoint. One nice feature of `Folium` is that it offers different tilesets to alter the general look of the map. Two very nice tileset families are `CartoDB` and `Stamen`. Of the latter, my personal favorite is `stamenwatercolor`, which lets us depict the city of Mainz in the style of a beautiful watercolour painting.

Before we do so, though, we'll define a function that lets us save the interactive map as a static image via the `Selenium` library, as described in [this blog post](https://alcidanalytics.com/p/geographic-heatmap-in-python). The reason is that GitHub renders Jupyter notebooks statically and therefore cannot display an interactive map.

In [4]:
#Function to save Folium map as PNG file via the web browser:
from pathlib import Path

def save_map(folium_map,map_name):
    fn='testmap.html'
    folium_map.save(fn)
    #Build a file:// URL that also works for Windows paths
    tmpurl=Path(fn).resolve().as_uri()
    browser = webdriver.Firefox()
    try:
        browser.get(tmpurl)
        time.sleep(1) #give the map tiles time to load
        browser.save_screenshot(map_name + '.png')
    finally:
        browser.quit() #close the browser even if the screenshot fails

In [5]:
folium_map = folium.Map(location=[50.01, 8.24], tiles="stamenwatercolor", zoom_start=13.3)

#Save static map
save_map(folium_map,"MVG_stamenwatercolor")

#Display interactive map
#folium_map

<img src="https://raw.githubusercontent.com/MrPreacher/MVGmeinRad/master/Images/MVG_stamenwatercolor.png"  width="1000" height="2000" />

We will begin by recreating the interactive map of all stations shown at the top of this notebook. Let us begin with a relatively simple map that marks each station via its `latitude` and `longitude`. In addition, the respective station name should be displayed when a marker is selected:

In [6]:
#Define folium map
folium_map = folium.Map(location=[49.99, 8.17], zoom_start=12.3)

#Iterate over station and place on map
for point in range(df.shape[0]):
    folium.Marker(location=df.loc[point,['latitude', 'longitude']].values.tolist(),
                  popup=df['name'][point]).add_to(folium_map)

#Save static map
save_map(folium_map,"MVG_Overview")

#Display interactive map
#folium_map

<img src="https://raw.githubusercontent.com/MrPreacher/MVGmeinRad/master/Images/MVG_Overview.png" width="1000" height="2000" />

This map gives a good overview of the extent of the grid, which reaches nearby towns like Ingelheim and even crosses the Rhine into the neighboring federal state of Hesse.

However, the above visualization does not make use of the information regarding the current state of a station. As a finger exercise, we want to recreate the spirit of the original map by defining the icons for each station as follows:

-  stations blocked due to repairs are indicated by a wrench on a gray background
-  for each station we want to display the number of available bikes and free docks along with the name of the station
-  stations with fewer than two bikes available should be highlighted in red
-  a station with at least half of its bikes available should be indicated by a green background, all other stations by an orange background
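
The rules in this list can be sketched as a small pure function, which makes them easy to check without drawing a map. The name `marker_style` is hypothetical. The sketch assumes that `capacity` equals `bikes_available` plus `docks_available`, as is the case for the sample rows shown earlier, so that "at least half of its bikes available" is equivalent to having at least as many bikes as free docks:

```python
def marker_style(blocked, bikes_available, docks_available):
    """Return (icon, color) for a station following the rules listed above."""
    if blocked:
        return "wrench", "gray"      # station under repair
    if bikes_available < 2:
        return "bicycle", "red"      # fewer than two bikes left
    if bikes_available >= docks_available:
        return "bicycle", "green"    # at least half of the docks hold a bike
    return "bicycle", "orange"       # less than half of the docks hold a bike

print(marker_style(False, 5, 11))    # Kaisertor: 5 bikes, 11 free docks
print(marker_style(False, 23, 7))    # Höfchen: 23 bikes, 7 free docks
```

The order of the checks matters: a blocked station stays gray regardless of how many bikes it holds, and the red rule takes precedence over the green one.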

Unfortunately, the number of available bikes is not easily displayed directly on each station icon. However, the Font Awesome project allows for the easy incorporation of many different [icons](https://fontawesome.com/v4.7.0/icons/).

Finally, we choose a different map style via the `tiles=` option in the `folium.Map` statement. The resulting map resembles the original map much more closely and offers more information at first glance:


In [7]:
#Define folium map
folium_map = folium.Map(location=[50, 8.27], tiles="stamenterrain", zoom_start=14)

#Iterate over station and place on map
for index, row in df.iterrows():
    icon = "bicycle"
    if row["blocked"]:
        icon = "wrench"
        color = "gray"
    elif row["bikes_available"] <= 1:
        color = "red"
    elif row["bikes_available"] >= row["docks_available"]:
        color = "green"
    else:
        color = "orange"

    folium.Marker(location=(row["latitude"],row["longitude"]),
                  icon=folium.Icon(color=color, prefix = 'fa', icon=icon),
                  popup=row["name"]+"  |  Bikes available: " + str(row["bikes_available"]) 
                          +"  |  Free docks: " + str(row["docks_available"])).add_to(folium_map)
    
#Save map
save_map(folium_map,"MVG_FahrradAmpel")
#Display map
#folium_map

<img src="https://raw.githubusercontent.com/MrPreacher/MVGmeinRad/master/Images/MVG_FahrradAmpel.png" width="1000" height="2000" />

This map already gives us a good idea about the current utilization of the bike rental system. However, we can highlight other aspects of the data by using different map settings. For instance, `Folium`'s `CircleMarker` represents points on a map as circles. Thus, we can illustrate a station's capacity by the radius of its circle. At the same time we want to represent the share of available bikes by the circle's color, with green representing a high share of available bikes relative to a station's capacity and red indicating a low share, similar to before. But now we want to represent this in a continuous fashion by mapping the share to a [colormap](https://nbviewer.jupyter.org/github/python-visualization/folium/blob/v0.2.0/examples/Colormaps.ipynb).
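
As a minimal sketch of that mapping in isolation, the share can be turned into a hex colour with matplotlib's `RdYlGn` colormap. The helper name `share_to_hex`, the clamping to the range 0 to 1, and the treatment of a station with zero capacity as empty are assumptions made for this illustration:

```python
from matplotlib.colors import Normalize, rgb2hex
import matplotlib.cm as cm

# Clamp shares to [0, 1] so that out-of-range values stay on the colormap.
norm = Normalize(vmin=0.0, vmax=1.0, clip=True)

def share_to_hex(bikes_available, capacity):
    """Map the share of available bikes to a hex colour on the RdYlGn colormap."""
    # Assumption: a station without any docks is treated like an empty one.
    share = bikes_available / capacity if capacity else 0.0
    return rgb2hex(cm.RdYlGn(float(norm(share))))

print(share_to_hex(0, 16))    # red end of the colormap
print(share_to_hex(8, 16))    # yellowish middle
print(share_to_hex(16, 16))   # green end of the colormap
```

The resulting string can be passed as the `color` argument of a marker.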





In [8]:
#Define folium map
folium_map = folium.Map(location=[49.995123, 8.267426], zoom_start=13, 
                        tiles='cartodbdark_matter') #dark tileset to make colors stand out more clearly

#Iterate over station and place on map
for index, row in df.iterrows():
    
    #Compute fraction of available bikes relative to total capacity and map share on colormap
    bike_share =float(row["bikes_available"])/float(row["capacity"])
    color = rgb2hex(cm.RdYlGn(bike_share))
    
    folium.CircleMarker(location=(row["latitude"],row["longitude"]),
                        popup= row["name"] + ": " + str(row["bikes_available"]) + " out of " + str(row["capacity"]) + " bikes available",
                        radius=row["capacity"],
                        color=color,
                        fill=True).add_to(folium_map )
    
#Save map
save_map(folium_map,"MVG_Cartodbdark_matter")
#Display map
#folium_map

<img src="https://raw.githubusercontent.com/MrPreacher/MVGmeinRad/master/Images/MVG_Cartodbdark_matter.png" width="1000" height="2000" />

Another nice feature of the `Folium` library is its capability of overlaying a map with stunning heatmaps via external `plugins`. An implementation can be seen in [this Kaggle notebook](https://www.kaggle.com/rachan/how-to-folium-for-maps-heatmaps-time-analysis#). The important point to keep in mind is that one cannot pass a `Pandas DataFrame` directly but has to use a list of latitude-longitude points. The number of nearby points determines the "heat" of an area.
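
As a minimal sketch of that input format, the snippet below turns a small DataFrame into the nested lists the heatmap plugin works with: either `[lat, lon]` pairs or `[lat, lon, weight]` triples. The three rows are copied from the `df.head()` output earlier in this notebook, and the variable names are only illustrative:

```python
import pandas as pd

# Three stations copied from the df.head() output (illustrative sample only).
sample = pd.DataFrame({
    "latitude": [50.009604, 50.00252, 50.00098],
    "longitude": [8.268905, 8.271294, 8.27668],
    "bikes_available": [5, 2, 6],
})

# Unweighted input: one [lat, lon] pair per station.
points = sample[["latitude", "longitude"]].values.tolist()

# Weighted input: [lat, lon, weight], here with the number of bikes as weight.
weighted_points = sample[["latitude", "longitude", "bikes_available"]].values.tolist()

print(points[0])
print(weighted_points[0])
```

Because the selected columns mix floats and integers, the weights come out as floats after the conversion.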

We can illustrate the distribution of bike rental stations in this fashion:

In [9]:
from folium import plugins
from folium.plugins import HeatMap

#Define folium map
folium_map = folium.Map(location=[49.995123, 8.267426],
                        tiles='Stamen Toner', #Light tile makes the heatmap stand out more easily
                        zoom_start=13)

#Keep only latitude and longitude as list of lists
heat_data = df[["latitude", "longitude"]].values.tolist()

#Add to map
HeatMap(heat_data).add_to(folium_map)

#Save map
save_map(folium_map,"MVG_Heatmap_I")
#Display map
#folium_map

<img src="https://raw.githubusercontent.com/MrPreacher/MVGmeinRad/master/Images/MVG_Heatmap_I.png" width="1000" height="2000" />

This is a nice style of visualization, but heatmaps really shine when it comes to displaying movement over time. For instance, it seems worthwhile to look at changes in the number of bikes at each station. We will do so in a later notebook, once we have collected repeated observations over time.

At this point we will content ourselves with a heatmap of the available bikes per station for the given point in time. For this, we keep the number of available bikes as a weight variable along with the latitude-longitude pairs. The resulting heatmap reveals a clustering of available bikes around the two existing train stations in Mainz:

In [10]:
#Define folium map
folium_map = folium.Map(location=[49.995123, 8.267426],tiles='Stamen Toner',zoom_start=13)

#Also keep the number of available bikes, which serves as weight variable
heat_data = df[["latitude", "longitude","bikes_available"]].values.tolist()

#Add to map
HeatMap(heat_data).add_to(folium_map)

#Save map
save_map(folium_map,"MVG_Heatmap_II")
#Display map
#folium_map

<img src="https://raw.githubusercontent.com/MrPreacher/MVGmeinRad/master/Images/MVG_Heatmap_II.png" width="1000" height="2000" />