# <span style="color:#2462C0">Using Folium Library for Geographic Overlays</span>
### <span style="color:#2462C0">Further exploring CO2 Emissions per capita in the World Development Indicators Dataset</span>


In [1]:
import folium
import pandas as pd

#### <span style="color:#2462C0">Country coordinates for plotting</span>
source: https://github.com/python-visualization/folium/blob/master/examples/data/world-countries.json

In [2]:
country_geo = './Files/folium-master/examples/data/world-countries.json'
#country_geo = 'https://raw.githubusercontent.com/python-visualization/folium/master/examples/data/world-countries.json'

In [3]:
data = pd.read_csv('./Files/world-development-indicators/Indicators.csv')
data.shape

(5656458, 6)

In [4]:
data.head()

Unnamed: 0,CountryName,CountryCode,IndicatorName,IndicatorCode,Year,Value
0,Arab World,ARB,"Adolescent fertility rate (births per 1,000 wo...",SP.ADO.TFRT,1960,133.5609
1,Arab World,ARB,Age dependency ratio (% of working-age populat...,SP.POP.DPND,1960,87.7976
2,Arab World,ARB,"Age dependency ratio, old (% of working-age po...",SP.POP.DPND.OL,1960,6.634579
3,Arab World,ARB,"Age dependency ratio, young (% of working-age ...",SP.POP.DPND.YG,1960,81.02333
4,Arab World,ARB,Arms exports (SIPRI trend indicator values),MS.MIL.XPRT.KD,1960,3000000.0


Pull out CO2 emissions per capita for every country and aggregate region in 2011.

In [5]:
# raw string so the escaped parenthesis reaches the regex engine unchanged
hist_indicator = r'CO2 emissions \(metric'
hist_year = 2011

mask1 = data['IndicatorName'].str.contains(hist_indicator)
mask2 = data['Year'].isin([hist_year])

stage = data[mask1 & mask2]
stage.head()

Unnamed: 0,CountryName,CountryCode,IndicatorName,IndicatorCode,Year,Value
5026275,Arab World,ARB,CO2 emissions (metric tons per capita),EN.ATM.CO2E.PC,2011,4.7245
5026788,Caribbean small states,CSS,CO2 emissions (metric tons per capita),EN.ATM.CO2E.PC,2011,9.69296
5027295,Central Europe and the Baltics,CEB,CO2 emissions (metric tons per capita),EN.ATM.CO2E.PC,2011,6.911131
5027870,East Asia & Pacific (all income levels),EAS,CO2 emissions (metric tons per capita),EN.ATM.CO2E.PC,2011,5.859548
5028456,East Asia & Pacific (developing only),EAP,CO2 emissions (metric tons per capita),EN.ATM.CO2E.PC,2011,5.302499
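
The two masks above can be tried on a small frame first. The sketch below is only an illustration: the `toy` dataframe, its country codes, and its values are made up and are not part of the World Development Indicators data. It shows that escaping the parenthesis in a regex and passing `regex=False` for a literal match select the same rows.

```python
import pandas as pd

# Hypothetical stand-in for Indicators.csv (values are illustrative only)
toy = pd.DataFrame({
    "CountryCode": ["AAA", "AAA", "BBB", "BBB"],
    "IndicatorName": [
        "CO2 emissions (metric tons per capita)",
        "CO2 emissions (kt)",
        "CO2 emissions (metric tons per capita)",
        "CO2 emissions (metric tons per capita)",
    ],
    "Year": [2011, 2011, 2011, 2010],
    "Value": [1.0, 2.0, 3.0, 4.0],
})

# Option 1: regex match, with the opening parenthesis escaped in a raw string
by_regex = toy["IndicatorName"].str.contains(r"CO2 emissions \(metric")

# Option 2: plain substring match, no escaping needed
by_literal = toy["IndicatorName"].str.contains("CO2 emissions (metric", regex=False)

# Combine the indicator mask with the year mask, as in the cell above
year_mask = toy["Year"] == 2011
toy_stage = toy[by_regex & year_mask]
print(toy_stage)
```

Either option works here; the literal form may be easier to read when the indicator name contains several regex metacharacters.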


#### <span style="color:#2462C0">Setup data for plotting</span>
Create a dataframe with just the country codes and the values to plot

In [6]:
plot_data = stage[['CountryCode','Value']]
plot_data[:3]

Unnamed: 0,CountryCode,Value
5026275,ARB,4.7245
5026788,CSS,9.69296
5027295,CEB,6.911131


In [7]:
# label for legend
hist_indicator = stage.iloc[0]['IndicatorName']

#### <span style="color:#2462C0">Visualize CO2 emissions per capita using Folium</span>
Folium provides interactive maps with the ability to create sophisticated overlays for data visualization.

In [8]:
# set up a folium map zoomed out to world level
# (latitude must lie between -90 and 90, so center the map on the equator)
map = folium.Map(location=[0, 0], zoom_start=1.5)

In [9]:
# choropleth maps bind pandas dataframes and json geometries
# (folium.Choropleth replaces the older map.choropleth method, which recent folium releases no longer provide)
folium.Choropleth(geo_data=country_geo, data=plot_data,
             columns=['CountryCode', 'Value'],
             key_on='feature.id',
             fill_color='YlGnBu', fill_opacity=0.7, line_opacity=0.2,
             legend_name=hist_indicator).add_to(map)
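
To make the binding concrete: `key_on='feature.id'` names the field in each GeoJSON feature that is compared with the first entry of `columns` (here `CountryCode`). The sketch below is not folium's own code. It is a rough, pure-Python imitation of that lookup on a made-up two-feature GeoJSON (`toy_geo`, `toy_plot_data`, and `resolve_key` are hypothetical names), and it can help spot rows, such as aggregate regions like `ARB`, that are unlikely to have a matching polygon in a country-level file.

```python
import pandas as pd

# Hypothetical GeoJSON shaped like a country file: the code sits in the top-level "id"
toy_geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "AAA",
         "properties": {"name": "Country A"}, "geometry": None},
        {"type": "Feature", "id": "BBB",
         "properties": {"name": "Country B"}, "geometry": None},
    ],
}

# Hypothetical plot data; "ZZZ" stands in for a code with no polygon
toy_plot_data = pd.DataFrame({
    "CountryCode": ["AAA", "BBB", "ZZZ"],
    "Value": [1.0, 3.0, 5.0],
})


def resolve_key(feature, key_on):
    """Follow a dotted path such as 'feature.id' into one GeoJSON feature."""
    value = feature
    for part in key_on.split(".")[1:]:  # skip the leading 'feature'
        value = value[part]
    return value


# Collect the ids available in the geometry file
geo_ids = {resolve_key(f, "feature.id") for f in toy_geo["features"]}

# Rows that would be colored, and codes that have no geometry to color
matched = toy_plot_data[toy_plot_data["CountryCode"].isin(geo_ids)]
unmatched = sorted(set(toy_plot_data["CountryCode"]) - geo_ids)
print(matched)
print("no geometry for:", unmatched)
```

Running the same check against the real `country_geo` file (loaded with `json.load`) and `plot_data` would list the codes that the map cannot draw.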

In [10]:
map.save('plot_data.html')

In [11]:
# display the saved interactive html file inline
from IPython.display import HTML
HTML('<iframe src="plot_data.html" width="700" height="500"></iframe>')

The following list suggests a few plotting libraries to get started with, grouped by use case. It aims to give a few solid options for each case rather than overwhelm you with every option available.

The foundation: Matplotlib, the most widely used plotting library, is best for two-dimensional non-interactive plots. A possible replacement is pygal, which provides similar functionality but generates vector graphics (SVG) output and has a more user-friendly interface.

Specific use cases:

* Specialized **statistical plots**, such as automatically fitting a linear regression with a confidence interval, or scatter plots color-coded by category.

 * seaborn: it builds on top of Matplotlib, and it can also be used alongside Matplotlib simply as an easier way to specify color palettes and **plotting aesthetics**

* **Grammar of graphics plotting**: if you find the interface of Matplotlib too verbose, Python provides packages that follow a different plotting paradigm, modeled on R's ggplot2

 * ggplot: it provides similar functionality to Matplotlib and is also built on Matplotlib, but offers a different interface.
 * altair: it has a simpler interface than ggplot and generates JavaScript-based plots that are easily embedded in the Jupyter Notebook or exported as PNG.

* **Interactive plots**, i.e. plots with pan and zoom that work in the Jupyter Notebook and can also be exported as JavaScript to work standalone on a webpage.

 * bokeh: maintained by Continuum Analytics, the company behind Anaconda
 * plotly: it is both a library and a cloud service where you can store and share your visualizations (it has free and paid accounts)

* **Interactive map visualization**

 * folium: it creates HTML pages that include the Leaflet.js JavaScript mapping library to display data on top of maps.
 * plotly: it supports color-coded country/world maps embedded in the Jupyter Notebook.

* **Realtime plots** that update with streaming data, optionally integrated into a dashboard with user interaction.

 * bokeh plot server: it is part of Bokeh but requires launching a separate Python process that responds to events from the user interface or from streaming data updates.

* **3D plots** are not easy to interpret; it is worth first considering whether a combination of 2D plots could provide better insight into the data.

 * mplot3d: Matplotlib toolkit for 3D visualization

More Folium Examples can be found at:<br>
https://folium.readthedocs.io/en/latest/quickstart.html#getting-started <br>

Documentation at:<br>
https://media.readthedocs.org/pdf/folium/latest/folium.pdf