In [None]:
%%HTML
<style>
.rendered_html table, .rendered_html th, .rendered_html tr, .rendered_html td {
     font-size: 95%;
}
</style>

# Interactive Data Viz 
## With Jupyter & Python


#### Join me live in this notebook via Binder:   
➡️ [https://bit.ly/2MEFBRN](https://bit.ly/2MEFBRN)

<p>👋 Jes Simkin
 
<img src='https://emojis.slackmojis.com/emojis/images/1450822151/257/github.png?1450822151' width=25 align='left'/> &nbsp;&nbsp; @jessimk  

<img src='https://emojis.slackmojis.com/emojis/images/1450733056/231/twitter.png?1450733056' width=25 align='left'/> &nbsp;&nbsp; @ _jes5
</p>


In [None]:
# Loading packages

#!pip install pandas
import pandas as pd

#!pip install matplotlib
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [30, 15]
%matplotlib inline

#!pip install seaborn
import seaborn as sns

#!pip install plotly  #plotly.express is included in plotly 4.0+
import plotly.express as px

#!pip install folium
import folium
from folium.plugins import MarkerCluster, FeatureGroupSubGroup

# 🤞 What I hope you'll leave today with:

- Intro to interactive data viz with Jupyter & Python!
- Code to get you started with interactive plots in Jupyter!
- Two packages for today: Plotly Express & Folium!
- Two words for today: layers & context!


# 🚲 Mobi Bike Share Data

## June 2018 Trips
- Vancouver's public bike share system!  


- [Data subset for today lives here](https://raw.githubusercontent.com/jessimk/interactive_data_viz/master/mobi_data_presentation_subset.csv)      
    
- Original data from Mobi lives here:
    - https://www.mobibikes.ca/en/system-data

In [None]:
#loading our data into a pandas DataFrame
departures = pd.read_csv('mobi_data_presentation_subset.csv', index_col=[0])

print("Number of rows:", departures.shape[0])

departures.head(5)

# Questions to explore: 
### How are June 2018 trips distributed across Vancouver?
### Can we learn anything about peak departure hours by station across the city?
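
The second question leans on the `dept_station_peak_hour` column. This notebook does not show how that column was derived, so the sketch below is only one plausible definition: the most common departure hour at each station. The toy data and the `departure_hour` column are made up for illustration.

```python
import pandas as pd

# Toy trips (hypothetical data, not the Mobi subset)
toy_trips = pd.DataFrame({
    "departure_station": ["A", "A", "A", "B", "B", "B"],
    "departure_hour":    [8,   8,   17,  17,  17,  9],
})

# Assumed definition of "peak hour": the most frequent departure hour per station
peak_hour = (
    toy_trips.groupby("departure_station")["departure_hour"]
    .agg(lambda hours: hours.value_counts().idxmax())
    .rename("dept_station_peak_hour")
)

# Attach the station-level peak hour back onto every trip row
toy_trips = toy_trips.join(peak_hour, on="departure_station")
print(toy_trips)
```

Every trip from the same station ends up with the same peak hour, which is why colouring trips by this column effectively colours stations.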


# ✨ Why add interactivity to your data viz?

Same data. Same Jupyter notebook. 

Different levels of context.  

Different possibilities for engagement & communication.

Different options to avoid overplotting.

<img src="staticawesome.png" align = left width=375 ></img>  <img src="interactive_awesome.gif" align = right width=400></img>

In [None]:
#plotting mobi data with a scatter plot
plt.scatter(x=departures.longitude, y=departures.latitude,
            c=departures.dept_station_peak_hour, cmap="inferno")
plt.colorbar(label="peak_hour_by_station")
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Mobi Trip Departures (June 2018)')

In [None]:
#Plotting a hexbin plot with marginal distributions 
sns.jointplot(x=departures.longitude, 
              y=departures.latitude, 
              kind="hex", gridsize=20)

We're still missing a lot of info. Does this look like Vancouver?

# Adding interactive context

# Layers. Layers. Layers. 
###### (code + data)

## General Plotly Express function structure:

- plot type: 
    - scatter(), bar(), line(), histogram()
- pandas dataframe
- x,y
- color, size, symbol
- interactive text

and much more...

Source: [Plotly Express Documentation](https://www.plotly.express/plotly_express/)

In [None]:
#Interactive plot with 1 function call
#Scatterplot with marginal distributions
fig = px.scatter(departures, x="longitude", y="latitude", opacity=0.009,
                 #adding station name
                 hover_name="departure_station",
                 #adding marginal distributions
                 marginal_x='histogram',marginal_y='histogram')

fig.update_layout(title_font_size=18,
    title_text="Mobi Bike Trips (June 2018)") #adding a title
fig.show()
#Try zooming, panning, selecting, saving!

But we're still missing some geospatial context. 

What about streets and neighbourhoods? ...

#### Let's add map tiles to our geospatial data viz 🗺️

### What are map tiles? 

__Map tiles make up the background we'll add to our plot to give our  viz geospatial context! 🗺️__

Some things to know about map tile sets:
- Options include free to use, open-source, and paid licenses
- Check usage policies and cite your tile set
    - Plotly Express & Folium build map tile citations into plots ✓ 
    
We'll use these free tile sets:
- __'open-street-map'__ from [OpenStreetMap](https://www.openstreetmap.org/)
- __'stamen-toner'__ from [Stamen](https://maps.stamen.com/)  
  
Read more about map tiles [here.](https://wiki.openstreetmap.org/wiki/Tiles)
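
To make "tiles" a bit more concrete: a web map is a grid of small square images, and each zoom level doubles the number of tiles per side. The sketch below uses the standard Web Mercator "slippy map" formula described on the OpenStreetMap wiki to find which tile contains a point. The function name is my own, and the URL follows the conventional OpenStreetMap `zoom/x/y` pattern; the code only builds the URL and does not download anything.

```python
import math

def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Return the (x, y) index of the slippy-map tile containing a point."""
    n = 2 ** zoom  # number of tiles per side at this zoom level
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

# Roughly central Vancouver at zoom level 13
x, y = latlon_to_tile(49.275, -123.11, zoom=13)

# Conventional tile URL pattern: zoom / column / row
tile_url = f"https://tile.openstreetmap.org/13/{x}/{y}.png"
print(x, y, tile_url)
```

Plotting libraries do this lookup for you and fetch only the tiles needed for the current view, which is why panning and zooming stay responsive.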

In [None]:
#Plotting a scatterplot on a set of map tiles

#use scatter_mapbox() instead of scatter()
fig = px.scatter_mapbox(
    departures, lon="longitude", lat="latitude",
    color="dept_station_peak_hour", #colour by peak hour
    color_continuous_scale=px.colors.cyclical.IceFire, #colour palette
    zoom = 12, #starting zoom
    hover_name="departure_station" #Hover tooltip title
)
fig.update_layout(mapbox_style="stamen-toner") #map tiles
# fig.update_layout(mapbox_style="open-street-map") or try this one
fig.show()

# General Folium Structure

- __build a base map:__ lat & lon center, initial zoom, tiles
- __add layers!__
    - 📍 markers or groups of markers
    - map tiles
    - annotations
    - and more!
    
Disclaimer: we'll subset our data and plot a sample of only 800 trips, because Folium has reported bugs when displaying many markers in [Chrome browsers](https://github.com/python-visualization/folium/issues/812).   
  
Another option for large marker sets in Folium is to save the interactive map to an HTML file.

In [None]:
#initializing a base map

#default tiles are open street map tiles
m = folium.Map(location=[49.275, -123.11], #lat & lon to center map
               zoom_start=13)              #initial zoom

#display the map in jupyter
m

In [None]:
# creating a base map object
m = folium.Map(location=[49.275, -123.11], #lat & lon to center map
               zoom_start=13)              #initial zoom

#looping over rows and creating markers
#also subsetting our data to a random sample of 800 trips
for _, trip in departures.sample(800).iterrows():
    
    #adding peak hour colour: blue from noon onwards, orange for morning
    if trip['dept_station_peak_hour'] >= 12:
        peak_hour_colour = 'blue'
    else:
        peak_hour_colour = 'orange'
        
    #setting the variables we want to include in our marker
    coord = (trip['latitude'], trip['longitude'])
    station_name = trip['departure_station']
    peak_hr = str(trip['dept_station_peak_hour'])
    
    #creating a marker object with coordinates & pop-up text info
    folium.Marker(
        coord, 
        popup= "Station: "+station_name+" , Peak Departure Hour: "+peak_hr,
        icon=folium.Icon(
            color=peak_hour_colour,
            icon='bicycle', prefix='fa') #bike icon from font awesome library
        
    ).add_to(m) #and saving the marker to the map `m`
        
#displaying map in jupyter
m

Uh oh! Overplotting :(

Let's gather markers together into clusters to avoid overplotting.

 `base map`  

    `↖ marker clustering layer`  
    
        `↖ groups of markers`
        
            `↖ individual markers`


In [None]:
#subsetting data & splitting into AM & PM groups
subset_df = departures.sample(800)

#initializing base map
m = folium.Map(location=[49.275, -123.11], zoom_start=13)

#adding a marker cluster object to the map
mc = MarkerCluster(name='Base Layer', control=False)
m.add_child(mc)

#creating groups for markers so they can be toggled on/off.
#adding these groups to the marker cluster layer

Afternoon_Evening = FeatureGroupSubGroup(mc, 'Afternoon & Evening Departures', show=True)
m.add_child(Afternoon_Evening)

Morning = FeatureGroupSubGroup(mc, 'Morning Departures', show=True)
m.add_child(Morning)

#splitting at noon: peak hours before 12 count as morning
Morning_df = subset_df.query("dept_station_peak_hour < 12")
Afternoon_Evening_df = subset_df.query("dept_station_peak_hour >= 12")

In [None]:
# creating markers for morning trips
for _, trip in Morning_df.iterrows():
        
    #setting the variables we want to include in our marker
    coord = (trip['latitude'], trip['longitude'])
    station_name = trip['departure_station']
    peak_hr = str(trip['dept_station_peak_hour'])
    
    #creating a marker object with coordinates & pop-up text info
    folium.Marker(
        coord, 
        popup= "Station: "+station_name+" , Peak Departure Hour: "+peak_hr,
        icon=folium.Icon(
            icon='bicycle', prefix='fa') #bike icon from font awesome library
        
    ).add_to(Morning) #and saving the marker to the `morning` group
    
# creating markers for afternoon and evening trips
for _, trip in Afternoon_Evening_df.iterrows():
        
    #setting the variables we want to include in our marker
    coord = (trip['latitude'], trip['longitude'])
    station_name = trip['departure_station']
    peak_hr = str(trip['dept_station_peak_hour'])
    
    #creating a marker object with coordinates & pop-up text info
    folium.Marker(
        coord, 
        popup= "Station: "+station_name+" , Peak Departure Hour: "+peak_hr,
        icon=folium.Icon(
            icon='bicycle', prefix='fa') #bike icon from font awesome library
        
    ).add_to(Afternoon_Evening) #and saving the marker to the `afternoon & evening` group

In [None]:
#Grand Finale
#adding tile layer options
folium.TileLayer('OpenStreetMap').add_to(m)
folium.TileLayer('Stamen Toner').add_to(m)

# adding a sidebar menu to toggle layers on and off
folium.LayerControl().add_to(m)

m #displaying the map

## Summary

- Jupyter is set up well for interactive data viz
- Interactive features can help layer data + add (geospatial) context
- Plotly Express and Folium are two of many options
- Interactive features are within reach!


## Land Acknowledgement
I spoke about geospatial context and geospatial data so I also need to acknowledge that my work explores and takes place on the unceded lands of the Musqueam, Squamish, and Tsleil-Waututh nations. 

## Thank Yous
- [Tiffany Timbers](https://github.com/ttimbers) and [Reka Solymosi](https://github.com/maczokni) for recommending Binder & other advice!

- [Patrick Walls](https://github.com/patrickwalls) for organizing JupyterDay 2019!

## Questions?
<p>👋 Jes Simkin
 
<img src='https://emojis.slackmojis.com/emojis/images/1450822151/257/github.png?1450822151' width=25 align='left'/> &nbsp;&nbsp; @jessimk  

<img src='https://emojis.slackmojis.com/emojis/images/1450733056/231/twitter.png?1450733056' width=25 align='left'/> &nbsp;&nbsp; @ _jes5
</p>

## Further Readings & Resources

[Plotly Express Launch Blog](https://medium.com/plotly/introducing-plotly-express-808df010143d)

[Folium Documentation](https://github.com/python-visualization/folium)

Layering & Data Viz: [Ch 1.2 'What is the grammar of graphics?', ggplot2 book by Hadley Wickham](https://ggplot2-book.org/introduction.html)

Read more about map tiles [here.](https://wiki.openstreetmap.org/wiki/Tiles)

##### Try using these map tile sets:
- 'stamen-terrain', 'stamen-watercolor' from [Stamen](http://maps.stamen.com)
- 'carto-positron' and 'carto-darkmatter' from [Carto](https://carto.com/location-data-services/basemaps/)
- 'white-bg'