In [None]:
import os
import pandas as pd
import numpy as np
import json
import folium
import matplotlib.pyplot as pp
folium.__version__ == '0.5.0'

First, we load the Excel file that was previously downloaded from the Eurostat website. It contains the unemployment rates of European countries in September 2017. We remove the rows for countries that have no value. For instance, we notice that since Brexit, no value is available for the UK.

In [None]:
df = pd.read_excel('eurostat.xlsx')
# Coerce non-numeric entries to NaN so that dropna() removes those rows
df['rate'] = pd.to_numeric(df['rate'], errors='coerce')
df = df.dropna()
df
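
As a minimal sketch of what the cleaning step does, the toy frame below (hypothetical values, not the Eurostat data) uses ":" as an assumed missing-value marker. With `errors='coerce'`, `pd.to_numeric` turns that entry into NaN, and `dropna` then removes the row.

```python
import pandas as pd

# Hypothetical toy data; ":" stands in for an assumed missing-value marker
toy = pd.DataFrame({
    "country": ["Belgium", "United Kingdom", "Greece"],
    "rate": ["7.1", ":", "20.6"],
})

# Non-numeric strings become NaN instead of raising an error
toy["rate"] = pd.to_numeric(toy["rate"], errors="coerce")

# Drop only the rows whose rate could not be parsed
clean = toy.dropna(subset=["rate"])
print(clean)
```

Restricting `dropna` to the `rate` column is a design choice for this sketch: it avoids dropping rows that are complete in `rate` but have gaps in unrelated columns.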

We add a country ID to the previous dataframe because it is easier to use as the "key_on" identifier in the choropleth function. We build the mapping by hand, matching the IDs from the TopoJSON file to the country names in the dataframe. This is manageable because we have fewer than 30 countries.

In [None]:
country_to_id = {
"Belgium": "BE",
"Bulgaria": "BG",
"Czech Republic": "CZ",
"Denmark": "DK",
"Germany": "DE",
"Estonia": "EE",
"Ireland": "IE",
"Greece": "GR",
"Spain": "ES",
"France": "FR",
"Croatia": "HR",
"Italy": "IT",
"Cyprus": "CY",
"Latvia": "LV",
"Lithuania": "LT",
"Luxembourg": "LU",
"Hungary": "HU",
"Malta": "MT",
"Netherlands": "NL",
"Austria": "AT",
"Poland": "PL",
"Portugal": "PT",
"Romania": "RO",
"Slovenia": "SI",
"Slovakia": "SK",
"Finland": "FI",
"Sweden": "SE",
"United Kingdom": "GB",
"Iceland": "IS"
}
df["country_id"] = df["country"].map(country_to_id)
df
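
Because the mapping is written by hand, a country whose name is spelled differently in the dataframe would silently get a NaN ID and disappear from the map. The sketch below shows one way to list such rows. It uses an excerpt of the mapping and a hypothetical, deliberately unmapped entry ("Norway") for illustration.

```python
import pandas as pd

# Excerpt of the hand-written mapping
country_to_id = {"Belgium": "BE", "Greece": "GR"}

# Hypothetical frame; "Norway" is included only to trigger a missing ID
toy = pd.DataFrame({"country": ["Belgium", "Greece", "Norway"]})
toy["country_id"] = toy["country"].map(country_to_id)

# Names absent from the dictionary map to NaN
unmapped = toy.loc[toy["country_id"].isna(), "country"].tolist()
print(unmapped)
```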

To choose the colour thresholds for the choropleth map, we looked at some summary statistics of the dataframe. Initially, we placed the thresholds at the rate values corresponding to the 25th, 50th and 75th percentiles of the European countries.

In [None]:
df.describe()
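
The quartile-based scale can also be built directly rather than read off the `describe()` output. This is a minimal sketch on hypothetical rates (not the Eurostat values); `quartile_scale` is an illustrative name, not one used elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical unemployment rates, in percent
rates = pd.Series([2.7, 3.6, 4.2, 5.0, 6.1, 7.0, 8.9, 11.0, 16.7, 20.6])

# 25th, 50th and 75th percentiles, as reported by describe()
quartiles = rates.quantile([0.25, 0.5, 0.75])

# Full scale: minimum, the three quartiles, maximum
quartile_scale = [rates.min()] + quartiles.tolist() + [rates.max()]
print(quartile_scale)
```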

However, we noticed that with such thresholds, some countries with very similar unemployment rates can end up with dramatically different colours. Hence, we decided to plot the rates on a single axis and to place the thresholds in the gaps between visible clusters.

In [None]:
m_europe = folium.Map([50, 10], tiles='cartodbpositron', zoom_start=4)
# One-dimensional scatter of the rates to reveal clusters
pp.plot(df["rate"], np.zeros_like(df["rate"]), 'x')
pp.show()
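
In this notebook the thresholds are picked by eye from the plot. As an alternative sketch (not the method used here), candidate cut points can be proposed automatically by sorting the rates and taking the midpoints of the widest gaps. The rates and the number of cuts (`n_cuts`) below are hypothetical.

```python
import numpy as np

# Hypothetical unemployment rates, in percent, sorted ascending
rates = np.sort(np.array([2.7, 3.6, 3.7, 5.8, 6.0, 6.2, 9.1, 9.4, 16.7, 20.6]))

# Distance between each pair of consecutive rates
gaps = np.diff(rates)

# Indices of the n_cuts widest gaps, put back in ascending order
n_cuts = 3
cut_idx = np.sort(np.argsort(gaps)[-n_cuts:])

# Place each threshold halfway across its gap
cuts = (rates[cut_idx] + rates[cut_idx + 1]) / 2
print(cuts)
```

Such automatic cuts tend to isolate outliers first, so a visual check like the plot above remains useful.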

Finally, we construct the choropleth map showing the unemployment rate in Europe in September 2017, using the thresholds identified in the previous plot, namely: 1, 4.2, 5, 7, 11, 24. We chose colours from the orange-red gradient (OrRd) because red has a negative connotation, and unemployment rates aren't exactly joyful values to look at. The strongest red marks the countries with the highest unemployment rates.

In [None]:
geo_path = r'topojson/europe.topojson.json'
# Use a context manager so the file is closed after loading
with open(geo_path) as f:
    geo_json_data = json.load(f)
# Thresholds placed between the clusters visible in the previous plot
i = 4.2
j = 5
k = 7
l = 11
m_europe.choropleth(geo_data=geo_json_data, topojson='objects.europe',
                    key_on='id',
                    data=df, columns=['country_id', 'rate'], fill_color='OrRd',
                    threshold_scale=[1, i, j, k, l, 24],
                    fill_opacity=0.7, line_opacity=0.2,
                    legend_name='Unemployment rate in September 2017 in Europe (%)')

folium.LayerControl().add_to(m_europe)
m_europe
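
To sanity-check a threshold scale before mapping, it can help to count how many countries fall into each colour bin. The sketch below does this with `pd.cut` on hypothetical rates. Note that `pd.cut` uses right-closed intervals by default, which is an assumption of this sketch and may differ from how folium assigns values lying exactly on a boundary.

```python
import pandas as pd

# The threshold scale used for the map
thresholds = [1, 4.2, 5, 7, 11, 24]

# Hypothetical unemployment rates, in percent
rates = pd.Series([2.7, 3.6, 4.5, 5.0, 6.1, 7.0, 8.9, 11.0, 16.7, 20.6])

# Assign each rate to an interval (left-open, right-closed by default)
bins = pd.cut(rates, bins=thresholds)

# Number of countries per bin, in threshold order
counts = bins.value_counts(sort=False)
print(counts)
```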


We want to compare Switzerland's unemployment rate with the rest of Europe, but the data is missing for this country. Hence, we take Switzerland's unemployment rate for September 2017 from the amstat dataset (question 2) and add it to the dataframe.

In [None]:
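# Values follow the dataframe's column order: country, rate, country_id
df.loc[30] = ["Switzerland", 3, "CH"]
m_europeswiss = folium.Map([50, 10], tiles='cartodbpositron', zoom_start=4)
# Re-plot the rates to check whether the clusters have changed
pp.plot(df["rate"], np.zeros_like(df["rate"]), 'x')
pp.show()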

We notice that the same threshold values apply with the added value for Switzerland. Hence we can reuse them for the new map.

In [None]:
geo_path = r'topojson/europe.topojson.json'
# Use a context manager so the file is closed after loading
with open(geo_path) as f:
    geo_json_data = json.load(f)
m_europeswiss.choropleth(geo_data=geo_json_data, topojson='objects.europe',
                    key_on='id',
                    data=df, columns=['country_id', 'rate'], fill_color='OrRd',
                    threshold_scale=[1, i, j, k, l, 24],
                    #fill_opacity=0.7, line_opacity=0.2,
                    legend_name='Unemployment rate in September 2017 in Europe (%)')

folium.LayerControl().add_to(m_europeswiss)
m_europeswiss

We can look at the updated summary statistics of the dataframe, which now include the Switzerland data.

In [None]:
df.describe()

Switzerland has a 3% unemployment rate, which is very low compared to the rest of Europe: it falls within the 25% of countries with the lowest rates in September 2017.
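
The quartile claim can be checked directly instead of being read off the `describe()` table. This is a minimal sketch on hypothetical rates; only the 3% value for Switzerland comes from the text above, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical rates in percent, including Switzerland's 3%
rates = pd.Series([2.7, 3.0, 3.6, 4.2, 5.0, 6.1, 7.0, 8.9, 11.0, 16.7, 20.6, 7.5])
swiss_rate = 3

# First quartile of the rates (the "25%" row of describe())
q1 = rates.quantile(0.25)

# Fraction of countries with a rate at or below Switzerland's
share_at_or_below = (rates <= swiss_rate).mean()

print(swiss_rate <= q1, share_at_or_below)
```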