Data Source: https://www.kaggle.com/worldbank/world-development-indicators <br> Folder: 'world-development-indicators'

# Using Folium Library for Geographic Overlays

### Further exploring CO2 Emissions per capita in the World Development Indicators Dataset


Let's start by installing the folium module. (You can skip this if you've already installed it.)

In [1]:
!pip install folium

Collecting folium
  Downloading https://files.pythonhosted.org/packages/fd/a0/ccb3094026649cda4acd55bf2c3822bb8c277eb11446d13d384e5be35257/folium-0.10.1-py2.py3-none-any.whl (91kB)
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/81/6d/31c83485189a2521a75b4130f1fee5364f772a0375f81afff619004e5237/branca-0.4.0-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.4.0 folium-0.10.1


In [3]:
import folium
import pandas as pd

### Country coordinates for plotting

Download the raw file and save it as `geo/world-countries.json`: https://raw.githubusercontent.com/python-visualization/folium/588670cf1e9518f159b0eee02f75185301327342/examples/data/world-countries.json _(Right click, "Save as")_

In [14]:
country_geo = 'geo/world-countries.json'

In [15]:
# Read in the World Development Indicators Database
data = pd.read_csv('world-development-indicators/Indicators.csv')
data.shape

(5656458, 6)

In [16]:
data.head()

| | CountryName | CountryCode | IndicatorName | IndicatorCode | Year | Value |
|---|---|---|---|---|---|---|
| 0 | Arab World | ARB | Adolescent fertility rate (births per 1,000 wo... | SP.ADO.TFRT | 1960 | 133.5609 |
| 1 | Arab World | ARB | Age dependency ratio (% of working-age populat... | SP.POP.DPND | 1960 | 87.7976 |
| 2 | Arab World | ARB | Age dependency ratio, old (% of working-age po... | SP.POP.DPND.OL | 1960 | 6.634579 |
| 3 | Arab World | ARB | Age dependency ratio, young (% of working-age ... | SP.POP.DPND.YG | 1960 | 81.02333 |
| 4 | Arab World | ARB | Arms exports (SIPRI trend indicator values) | MS.MIL.XPRT.KD | 1960 | 3000000.0 |


Pull out CO2 emissions for every country in 2011.

In [17]:
# select CO2 emissions for all countries in 2011
# raw string so that \( reaches the regex engine as an escaped parenthesis
hist_indicator = r'CO2 emissions \(metric'
hist_year = 2011

mask1 = data['IndicatorName'].str.contains(hist_indicator) 
mask2 = data['Year'].isin([hist_year])

# apply our mask
stage = data[mask1 & mask2]
stage.head()

| | CountryName | CountryCode | IndicatorName | IndicatorCode | Year | Value |
|---|---|---|---|---|---|---|
| 5026275 | Arab World | ARB | CO2 emissions (metric tons per capita) | EN.ATM.CO2E.PC | 2011 | 4.7245 |
| 5026788 | Caribbean small states | CSS | CO2 emissions (metric tons per capita) | EN.ATM.CO2E.PC | 2011 | 9.69296 |
| 5027295 | Central Europe and the Baltics | CEB | CO2 emissions (metric tons per capita) | EN.ATM.CO2E.PC | 2011 | 6.911131 |
| 5027870 | East Asia & Pacific (all income levels) | EAS | CO2 emissions (metric tons per capita) | EN.ATM.CO2E.PC | 2011 | 5.859548 |
| 5028456 | East Asia & Pacific (developing only) | EAP | CO2 emissions (metric tons per capita) | EN.ATM.CO2E.PC | 2011 | 5.302499 |
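The backslash before the parenthesis in the pattern matters because `str.contains` treats its pattern as a regular expression by default. As a minimal sketch, the toy frame below (its rows are made up for illustration and are not taken from `Indicators.csv`) shows two equivalent ways to match the literal text:

```python
import re
import pandas as pd

# Toy frame; the rows are illustrative only, not real dataset values
toy = pd.DataFrame({
    'IndicatorName': [
        'CO2 emissions (metric tons per capita)',
        'CO2 emissions (kt)',
        'Population, total',
    ],
    'Year': [2011, 2011, 2011],
})

# Option 1: escape the regex metacharacters in the literal text
escaped_mask = toy['IndicatorName'].str.contains(re.escape('CO2 emissions (metric'))

# Option 2: switch regex matching off and search for the plain substring
literal_mask = toy['IndicatorName'].str.contains('CO2 emissions (metric', regex=False)

# Combine with the year filter, as in the cell above
matched = toy[escaped_mask & (toy['Year'] == 2011)]
```

Either option avoids hand-writing escape sequences; which one reads better is a matter of taste.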


### Set up our data for plotting

Create a data frame with just the country codes and the values we want plotted.

In [18]:
plot_data = stage[['CountryCode','Value']]
plot_data.head()

| | CountryCode | Value |
|---|---|---|
| 5026275 | ARB | 4.7245 |
| 5026788 | CSS | 9.69296 |
| 5027295 | CEB | 6.911131 |
| 5027870 | EAS | 5.859548 |
| 5028456 | EAP | 5.302499 |


In [19]:
# label for the legend
hist_indicator = stage.iloc[0]['IndicatorName']

## Visualize CO2 emissions per capita using Folium

Folium provides interactive maps with the ability to create sophisticated overlays for data visualization.

In [20]:
# Set up a folium map at a high-level zoom; location is [latitude, longitude],
# and latitude must lie between -90 and 90
map = folium.Map(location=[0, 0], zoom_start=1.5)

We'll use folium's built-in `Choropleth` class to attach the countries' geographic JSON and the plot data.

We need to specify the relevant columns. `key_on='feature.id'` refers to the field in the JSON object that holds the country code: each country's border information has the code attached as its `feature.id`. You can find this by reading the JSON object. This is the link that ties the two together: the country code in our data frame matches the `feature.id` in the JSON object.

Next, we specify some of the aesthetics, such as the color scheme and the opacity, and then we label the legend.
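To check the `feature.id` link before plotting, it can help to list the ids in the JSON and compare them with the country codes. The sketch below uses a tiny hand-written stand-in for `world-countries.json` (the helper name `feature_ids` and the sample features are assumptions made for illustration; real features also carry full border geometry):

```python
import json

# Hand-written stand-in for world-countries.json, with geometry left out
sample_geojson = json.loads('''
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "id": "USA", "properties": {"name": "United States of America"}, "geometry": null},
    {"type": "Feature", "id": "FRA", "properties": {"name": "France"}, "geometry": null}
  ]
}
''')

def feature_ids(geojson):
    """Collect the top-level id of every feature (the field key_on='feature.id' reads)."""
    return {feature['id'] for feature in geojson['features']}

ids = feature_ids(sample_geojson)

# Codes as they might appear in the data frame; ARB (Arab World) is a regional aggregate
codes = ['USA', 'FRA', 'ARB']
unmatched = sorted(code for code in codes if code not in ids)
```

With the real file, the same helper could be applied to `json.load(open(country_geo))`. Regional aggregates such as `ARB` would presumably show up as unmatched, since they have no country border to draw.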

In [25]:
# choropleth maps bind Pandas Data Frames and json geometries.  This allows us to quickly visualize data combinations
folium.Choropleth(geo_data=country_geo, data=plot_data,
             columns=['CountryCode', 'Value'],
             key_on='feature.id',
             fill_color='YlGnBu', fill_opacity=0.7, line_opacity=0.2,
             legend_name=hist_indicator).add_to(map)

<folium.features.Choropleth at 0x120d3b390>

In [26]:
# Save the Folium map to an HTML file
map.save('plot_data.html')

The plot is saved as an HTML file, which is itself interactive. To interact with the map inside the notebook, we save it and then load it back in through an iframe.

In [27]:
# Import the Folium interactive html file
from IPython.display import HTML
HTML('<iframe src=plot_data.html width=700 height=450></iframe>')

And now we have our map. Notice first that darker colors indicate higher CO2 emissions per capita. The US, some European countries, and several Middle Eastern countries stand out as high producers of CO2 per capita. But remember that this is CO2 emissions per capita, not total CO2 emissions, so a country with a large population could have high total emissions and still have low emissions per capita.
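The difference between total and per capita emissions can be made concrete with a little arithmetic. The figures below are hypothetical, chosen only to illustrate the point, and the helper `per_capita_tons` is a made-up name rather than anything in the dataset:

```python
# Hypothetical figures chosen only to illustrate the arithmetic
countries = {
    'Large country': {'total_kt': 2_000_000, 'population': 1_000_000_000},
    'Small country': {'total_kt': 100_000, 'population': 10_000_000},
}

def per_capita_tons(total_kt, population):
    """Convert total emissions in kilotons to metric tons per person."""
    return total_kt * 1000 / population

results = {name: per_capita_tons(**values) for name, values in countries.items()}
# Large country: 2.0 tons per person; Small country: 10.0 tons per person
```

Here the large country emits twenty times more in total, yet its per capita figure is one fifth of the small country's.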

More Folium Examples can be found at:<br>
http://python-visualization.github.io/folium/docs-v0.5.0/quickstart.html#Getting-Started <br>

Documentation at:<br>
http://python-visualization.github.io/folium/docs-v0.5.0/modules.html