# Homework 03 - Interactive Viz

## Deadline

Wednesday November 8th, 2017 at 11:59PM

## Important Notes

- Make sure you push your notebook to GitHub with all the cells already evaluated
- Note that maps do not render in a standard GitHub environment: you should export them to HTML and link them in your notebook.
- Remember that `.csv` is not the only data format. Though they might require additional processing, some formats provide better encoding support.
- Don't forget to add a textual description of your thought process, the assumptions you made, and the solution you plan to implement!
- Please write all your comments in English, and use meaningful variable names in your code

## Background

In this homework we will be exploring interactive visualization, which is a key ingredient of many successful data visualizations (especially when it comes to infographics).

Unemployment rates are major economic metrics and a matter of concern for governments around the world. Though the definition may seem straightforward at first glance (usually the number of unemployed people divided by the active population), it can be tricky to define consistently. For example, one must define what exactly unemployed means: looking for a job? Having declared their unemployment? Currently without a job? Should students or recent graduates be included? We could also wonder what the active population is: everyone in an age category (e.g. `16-64`)? Anyone interested in finding a job? Though these questions may seem subtle, they can have a large impact on the interpretation of the results: `3%` unemployment doesn't mean much if we don't know who is included in this percentage.
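
To make the effect of the definition concrete, here is a small sketch that computes the rate under two definitions of "unemployed". All figures are invented for illustration and do not come from eurostat or amstat.

```python
def unemployment_rate(unemployed, active_population):
    """Return the unemployment rate in percent."""
    if active_population <= 0:
        raise ValueError("active_population must be positive")
    return 100.0 * unemployed / active_population

# Hypothetical figures for one region, invented for illustration only.
registered_unemployed = 3000   # without a job and registered as such
employed_job_seekers = 2000    # have a job but are looking for another one
active_population = 100000

# Narrow definition: only people without a job count as unemployed.
narrow_rate = unemployment_rate(registered_unemployed, active_population)

# Broad definition: everyone looking for a job counts as unemployed.
broad_rate = unemployment_rate(
    registered_unemployed + employed_job_seekers, active_population
)

print("narrow definition: {:.1f}%".format(narrow_rate))  # 3.0%
print("broad definition:  {:.1f}%".format(broad_rate))   # 5.0%
```

The same region reports 3% or 5% depending only on who is counted, which is why the definition must accompany any published rate.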

In this homework you will be dealing with two different datasets from the statistics offices of the European Commission ([eurostat](http://ec.europa.eu/eurostat/data/database)) and the Swiss Confederation ([amstat](https://www.amstat.ch)). They provide a variety of datasets with plenty of information on many different statistics and demographics at their respective scales. Unfortunately, as is often the case in data analysis, these websites are not always straightforward to navigate. They may include a lot of obscure categories, not always be translated into your native language, have strange link structures, … Navigating this complexity is part of a data scientist's job: you will have to use a few tricks to get the right data for this homework.

For the visualization part, install [Folium](https://github.com/python-visualization/folium) (*HINT*: it is not available in your standard Anaconda environment, therefore search on the Web how to install it easily!). Folium's `README` comes with very clear examples, and links to their own IPython notebooks -- make good use of this information. For your own convenience, in this same directory you can already find two `.topojson` files, containing the geo-coordinates of:

- European countries (*liberal definition of EU*) (`topojson/europe.topojson.json`, [source](https://github.com/leakyMirror/map-of-europe))
- Swiss cantons (`topojson/ch-cantons.topojson.json`) 

These will be used as an overlay on the Folium maps.

## Assignment

1. Go to the [eurostat](http://ec.europa.eu/eurostat/data/database) website and try to find a dataset that includes the European unemployment rates at a recent date.

   Use this data to build a [Choropleth map](https://en.wikipedia.org/wiki/Choropleth_map) which shows the unemployment rate in Europe at a country level. Think about [the colors you use](https://carto.com/academy/courses/intermediate-design/choose-colors-1/), how you decide to [split the intervals into data classes](http://gisgeography.com/choropleth-maps-data-classification/) and which interactions you could add in order to make the visualization intuitive and expressive. Compare Switzerland's unemployment rate to that of the rest of Europe.

2. Go to the [amstat](https://www.amstat.ch) website to find a dataset that includes the unemployment rates in Switzerland at a recent date.

   > *HINT* Go to the `details` tab to find the raw data you need. If you do not speak French, German or Italian, think of using free translation services to navigate your way through. 

   Use this data to build another Choropleth map, this time showing the unemployment rate at the level of Swiss cantons. Again, try to make the map as expressive as possible, and comment on the trends you observe.

   The Swiss Confederation defines the rates you have just plotted as the number of people looking for a job divided by the size of the active population (scaled by 100). This is surely a valid choice, but as we discussed one could argue for a different categorization.

   Copy the map you have just created, but this time exclude from your statistics people who already have a job and are looking for a new one. How do your observations change? You can repeat this with different choices of categories to see how selecting different metrics can lead to different interpretations of the same data.

3. Use the [amstat](https://www.amstat.ch) website again to find a dataset that includes the unemployment rates in Switzerland at a recent date, this time making a distinction between *Swiss* and *foreign* workers.

   The State Secretariat for Economic Affairs (SECO) releases [a monthly report](https://www.seco.admin.ch/seco/fr/home/Arbeit/Arbeitslosenversicherung/arbeitslosenzahlen.html) on the state of the employment market. In the latest report (September 2017), it is noted that there is a discrepancy between the unemployment rates for *foreign* (`5.1%`) and *Swiss* (`2.2%`) workers.

   Show the difference in unemployment rates between the two categories in each canton on a Choropleth map (*HINT*: the easy way is to show two separate maps, but can you think of something better?). Where are the differences most visible? Why do you think that is?

   Now let's refine the analysis by adding the differences between age groups. As you may have guessed it is nearly impossible to plot so many variables on a map. Make a bar plot, which is a better suited visualization tool for this type of multivariate data.

4. *BONUS*: using the map you have just built, and the geographical information contained in it, could you give a *rough estimate* of the difference in unemployment rates between the areas divided by the [Röstigraben](https://en.wikipedia.org/wiki/R%C3%B6stigraben)?
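
For the question of how to split the intervals into data classes (items 1 and 2), the sketch below compares two common schemes, equal intervals and quantiles. It uses the January 2017 cantonal rates that appear in the amstat table later in this notebook. The choice of four classes and the helper names are assumptions made for this illustration.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

# January 2017 unemployment rates (%) for the 26 cantons, copied from the
# amstat table shown later in this notebook (the "Total" column is excluded).
rates = [3.9, 3.0, 2.2, 1.5, 2.0, 1.0, 1.3, 2.5, 2.6, 3.2, 3.2, 4.2, 3.1,
         3.6, 2.0, 1.4, 2.7, 1.9, 3.5, 2.6, 4.0, 5.2, 5.2, 6.6, 5.7, 5.3]
n_classes = 4  # arbitrary choice for this illustration

def equal_interval_breaks(values, k):
    """Split the range [min, max] into k classes of equal width."""
    low, high = min(values), max(values)
    step = (high - low) / k
    return [low + i * step for i in range(k)] + [high]

def quantile_breaks(values, k):
    """Choose breaks so that each class holds roughly the same number of values."""
    return [min(values)] + quantiles(values, n=k) + [max(values)]

def class_counts(values, breaks):
    """Count how many values fall into each class defined by the breaks."""
    inner = breaks[1:-1]  # the outer breaks are just min and max
    counts = Counter(bisect_right(inner, v) for v in values)
    return [counts[i] for i in range(len(breaks) - 1)]

for name, make_breaks in [("equal interval", equal_interval_breaks),
                          ("quantile", quantile_breaks)]:
    breaks = make_breaks(rates, n_classes)
    print(name, [round(b, 2) for b in breaks], class_counts(rates, breaks))
```

Equal intervals keep the legend easy to read but can leave some classes nearly empty when the data is skewed, while quantiles balance the number of cantons per color at the cost of uneven class widths.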

## Additional resources

Below are links to some interesting data visualization projects across the web. This list is not intended to be exhaustive, just to offer inspiration or information for the curious.



* [A map which takes its color scheme from images](https://www.mapbox.com/cartogram/)

* [Interesting map visualizations](http://www.viewsoftheworld.net/)

* [Dataviz Blog](https://bl.ocks.org/mbostock) by Mike Bostock, the creator of [`d3.js`](https://d3js.org/), a popular visualization tool for the web 

* [Collection of interesting visualizations](https://flowingdata.com/category/visualization/)

* [Gene explorer, for biologists](http://www.bar.utoronto.ca/GeneSlider/?datasource=CNSData&chr=1&start=3120&end=5000)

* [Exploring 100k stars (Chrome only)](https://stars.chromeexperiments.com/)

* [Interactive map of world trade over time](http://www.visualcapitalist.com/interactive-mapping-flow-international-trade/)

* [Visualizing deaths in conflicts across the world](http://www.informationisbeautiful.net/visualizations/senseless-conflict-deaths-per-hour/)

* [Where did immigrants to the US come from over time ?](http://metrocosm.com/animated-immigration-map/)

* [Listening to Wikipedia](http://listen.hatnote.com/)

* [A live map of Twitter](https://www.mapd.com/demos/tweetmap/)

* [Collection of cool visualizations](http://www.informationisbeautiful.net/)

* [A Choropleth map of Switzerland with mountains in relief](https://timogrossenbacher.ch/2016/12/beautiful-thematic-maps-with-ggplot2-only/)

* [Interactive data visualizations of the UK](https://mappl.uk/)

* [Most used word in each state of the US (xkcd)](https://imgs.xkcd.com/comics/state_word_map.png)

* [Drone deaths in Pakistan](http://drones.pitchinteractive.com/)

* [Map projection transitions](https://www.jasondavies.com/maps/transition/)

* [The five main projects of the Belt and Road project in China](http://multimedia.scmp.com/news/china/article/One-Belt-One-Road/index.html)

* [Full images of the Earth datastory](https://pudding.cool/2017/10/satellites/)

* [Surprise! Showing the unexpected](https://medium.com/@uwdata/surprise-maps-showing-the-unexpected-e92b67398865)


In [1]:
import pandas as pd
import numpy as np
from IPython.display import display

import os
import json
import folium as fo

# Question 2

## Data Massaging

In [2]:
DF = pd.read_excel('./2_1 Taux de chomage 2.xlsx', header=[0, 1])
DF.drop(('Mois', 'Mesures'), axis=1, inplace=True)
# https://stackoverflow.com/questions/36747750/remove-column-from-multi-index-dataframe
DF.columns = pd.MultiIndex.from_tuples(DF.columns.to_series())
DF = DF.transpose()

DF

Unnamed: 0,Unnamed: 1,Zurich,Berne,Lucerne,Uri,Schwyz,Obwald,Nidwald,Glaris,Zoug,Fribourg,...,Grisons,Argovie,Thurgovie,Tessin,Vaud,Valais,Neuchâtel,Genève,Jura,Total
Janvier 2017,Taux de chômage,3.9,3.0,2.2,1.5,2.0,1.0,1.3,2.5,2.6,3.2,...,1.9,3.5,2.6,4.0,5.2,5.2,6.6,5.7,5.3,3.7
Janvier 2017,Chômeurs inscrits,32387.0,16954.0,4985.0,297.0,1794.0,217.0,305.0,567.0,1756.0,5215.0,...,2117.0,12622.0,3967.0,6757.0,20672.0,9059.0,6052.0,13306.0,1930.0,164466.0
Février 2017,Taux de chômage,3.9,3.0,2.2,1.4,2.0,1.1,1.3,2.6,2.6,2.9,...,1.7,3.4,2.6,4.0,5.0,4.6,6.5,5.5,5.2,3.6
Février 2017,Chômeurs inscrits,31619.0,16738.0,4808.0,276.0,1766.0,228.0,306.0,576.0,1784.0,4811.0,...,1882.0,12551.0,3936.0,6623.0,19987.0,8033.0,6014.0,12971.0,1915.0,159809.0
Mars 2017,Taux de chômage,3.8,2.9,2.0,1.3,1.9,1.0,1.2,2.4,2.5,2.7,...,1.5,3.3,2.5,3.6,4.8,3.9,6.2,5.4,4.9,3.4
Mars 2017,Chômeurs inscrits,30841.0,16035.0,4493.0,256.0,1670.0,217.0,297.0,551.0,1729.0,4499.0,...,1668.0,12098.0,3721.0,6106.0,19027.0,6771.0,5707.0,12712.0,1782.0,152280.0
Avril 2017,Taux de chômage,3.6,2.7,1.9,1.2,1.8,1.0,1.1,2.3,2.5,2.7,...,2.0,3.2,2.3,3.3,4.6,3.7,5.9,5.3,4.7,3.3
Avril 2017,Chômeurs inscrits,29542.0,15322.0,4315.0,228.0,1580.0,205.0,269.0,517.0,1704.0,4362.0,...,2213.0,11628.0,3422.0,5566.0,18353.0,6405.0,5414.0,12329.0,1718.0,146327.0
Mai 2017,Taux de chômage,3.5,2.6,1.8,1.0,1.7,0.8,1.1,2.1,2.4,2.4,...,1.8,3.1,2.2,3.1,4.4,3.3,5.6,5.2,4.4,3.1
Mai 2017,Chômeurs inscrits,28624.0,14397.0,4082.0,196.0,1462.0,174.0,256.0,484.0,1652.0,3981.0,...,1989.0,11306.0,3262.0,5274.0,17614.0,5687.0,5136.0,12204.0,1615.0,139778.0


In [3]:
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    display(DF)
    display(DF.index)

Unnamed: 0,Unnamed: 1,Zurich,Berne,Lucerne,Uri,Schwyz,Obwald,Nidwald,Glaris,Zoug,Fribourg,Soleure,Bâle-Ville,Bâle-Campagne,Schaffhouse,Appenzell Rhodes-Extérieures,Appenzell Rhodes-Intérieures,St-Gall,Grisons,Argovie,Thurgovie,Tessin,Vaud,Valais,Neuchâtel,Genève,Jura,Total
Janvier 2017,Taux de chômage,3.9,3.0,2.2,1.5,2.0,1.0,1.3,2.5,2.6,3.2,3.2,4.2,3.1,3.6,2.0,1.4,2.7,1.9,3.5,2.6,4.0,5.2,5.2,6.6,5.7,5.3,3.7
Janvier 2017,Chômeurs inscrits,32387.0,16954.0,4985.0,297.0,1794.0,217.0,305.0,567.0,1756.0,5215.0,4744.0,4181.0,4659.0,1587.0,610.0,129.0,7597.0,2117.0,12622.0,3967.0,6757.0,20672.0,9059.0,6052.0,13306.0,1930.0,164466.0
Février 2017,Taux de chômage,3.9,3.0,2.2,1.4,2.0,1.1,1.3,2.6,2.6,2.9,3.2,4.1,3.1,3.7,1.9,1.3,2.6,1.7,3.4,2.6,4.0,5.0,4.6,6.5,5.5,5.2,3.6
Février 2017,Chômeurs inscrits,31619.0,16738.0,4808.0,276.0,1766.0,228.0,306.0,576.0,1784.0,4811.0,4602.0,4075.0,4656.0,1608.0,592.0,118.0,7334.0,1882.0,12551.0,3936.0,6623.0,19987.0,8033.0,6014.0,12971.0,1915.0,159809.0
Mars 2017,Taux de chômage,3.8,2.9,2.0,1.3,1.9,1.0,1.2,2.4,2.5,2.7,3.0,4.0,3.1,3.5,1.8,1.1,2.5,1.5,3.3,2.5,3.6,4.8,3.9,6.2,5.4,4.9,3.4
Mars 2017,Chômeurs inscrits,30841.0,16035.0,4493.0,256.0,1670.0,217.0,297.0,551.0,1729.0,4499.0,4420.0,3992.0,4536.0,1520.0,551.0,98.0,6983.0,1668.0,12098.0,3721.0,6106.0,19027.0,6771.0,5707.0,12712.0,1782.0,152280.0
Avril 2017,Taux de chômage,3.6,2.7,1.9,1.2,1.8,1.0,1.1,2.3,2.5,2.7,2.9,3.9,2.9,3.3,1.8,1.0,2.4,2.0,3.2,2.3,3.3,4.6,3.7,5.9,5.3,4.7,3.3
Avril 2017,Chômeurs inscrits,29542.0,15322.0,4315.0,228.0,1580.0,205.0,269.0,517.0,1704.0,4362.0,4281.0,3863.0,4345.0,1436.0,539.0,86.0,6685.0,2213.0,11628.0,3422.0,5566.0,18353.0,6405.0,5414.0,12329.0,1718.0,146327.0
Mai 2017,Taux de chômage,3.5,2.6,1.8,1.0,1.7,0.8,1.1,2.1,2.4,2.4,2.8,3.7,2.9,3.3,1.7,0.8,2.3,1.8,3.1,2.2,3.1,4.4,3.3,5.6,5.2,4.4,3.1
Mai 2017,Chômeurs inscrits,28624.0,14397.0,4082.0,196.0,1462.0,174.0,256.0,484.0,1652.0,3981.0,4074.0,3652.0,4227.0,1441.0,534.0,72.0,6383.0,1989.0,11306.0,3262.0,5274.0,17614.0,5687.0,5136.0,12204.0,1615.0,139778.0


MultiIndex(levels=[['Août 2017', 'Avril 2017', 'Février 2017', 'Janvier 2017', 'Juillet 2017', 'Juin 2017', 'Mai 2017', 'Mars 2017', 'Septembre 2017', 'Total'], ['Chômeurs inscrits', 'Taux de chômage']],
           labels=[[3, 3, 2, 2, 7, 7, 1, 1, 6, 6, 5, 5, 4, 4, 0, 0, 8, 8, 9, 9], [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]])

## Canton Name to ID Dictionary

### First Approach

In [4]:
DF_main = DF.transpose().reset_index()
DF_main

Unnamed: 0_level_0,index,Janvier 2017,Janvier 2017,Février 2017,Février 2017,Mars 2017,Mars 2017,Avril 2017,Avril 2017,Mai 2017,...,Juin 2017,Juin 2017,Juillet 2017,Juillet 2017,Août 2017,Août 2017,Septembre 2017,Septembre 2017,Total,Total
Unnamed: 0_level_1,Unnamed: 1_level_1,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,...,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits
0,Zurich,3.9,32387.0,3.9,31619.0,3.8,30841.0,3.6,29542.0,3.5,...,3.4,27925.0,3.4,27992.0,3.4,27514.0,3.3,27225.0,3.6,263669.0
1,Berne,3.0,16954.0,3.0,16738.0,2.9,16035.0,2.7,15322.0,2.6,...,2.4,13590.0,2.4,13633.0,2.5,13829.0,2.4,13658.0,2.7,134156.0
2,Lucerne,2.2,4985.0,2.2,4808.0,2.0,4493.0,1.9,4315.0,1.8,...,1.7,3884.0,1.7,3875.0,1.8,3992.0,1.7,3885.0,1.9,38319.0
3,Uri,1.5,297.0,1.4,276.0,1.3,256.0,1.2,228.0,1.0,...,0.8,159.0,0.7,129.0,0.6,123.0,0.6,112.0,1.0,1776.0
4,Schwyz,2.0,1794.0,2.0,1766.0,1.9,1670.0,1.8,1580.0,1.7,...,1.6,1411.0,1.7,1447.0,1.7,1466.0,1.7,1455.0,1.8,14051.0
5,Obwald,1.0,217.0,1.1,228.0,1.0,217.0,1.0,205.0,0.8,...,0.8,167.0,0.8,176.0,0.8,164.0,0.7,153.0,0.9,1701.0
6,Nidwald,1.3,305.0,1.3,306.0,1.2,297.0,1.1,269.0,1.1,...,1.0,252.0,1.0,241.0,1.0,247.0,1.0,248.0,1.1,2421.0
7,Glaris,2.5,567.0,2.6,576.0,2.4,551.0,2.3,517.0,2.1,...,2.0,441.0,1.8,415.0,1.9,435.0,1.8,416.0,2.2,4402.0
8,Zoug,2.6,1756.0,2.6,1784.0,2.5,1729.0,2.5,1704.0,2.4,...,2.3,1576.0,2.3,1574.0,2.4,1604.0,2.3,1543.0,2.4,14922.0
9,Fribourg,3.2,5215.0,2.9,4811.0,2.7,4499.0,2.7,4362.0,2.4,...,2.4,3892.0,2.7,4372.0,2.8,4667.0,2.7,4466.0,2.7,40265.0


In [8]:
import difflib
import warnings

geo_path = './topojson/ch-cantons.topojson.json'
with open(geo_path, encoding='utf-8') as geo_file:
    geo_json_data = json.load(geo_file)

# Map each canton name in the TopoJSON file to its two-letter ID.
_name2id = {item['properties']['name']: item['id']
            for item in geo_json_data['objects']['cantons']['geometries']}

def name2id(name):
    """Return the canton ID for a TopoJSON name, or NaN (with a warning) if unknown."""
    try:
        return _name2id[name]
    except KeyError:
        warnings.warn("Name '{}' not in the dictionary.".format(name))
        return np.nan

def get_closest_match(name, right_df, right_df_matching_column_name):
    """Return the entry of the given right_df column that is closest to name, or NaN."""
    result = difflib.get_close_matches(name,
                                       right_df[right_df_matching_column_name],
                                       n=1,
                                       cutoff=0.6)  # 0.6 is difflib's default similarity cutoff
    return result[0] if result else np.nan

DF_Names = pd.DataFrame(list(_name2id.items()), columns=['name', 'id'])

# Fuzzy-match the French canton names from amstat to the names used in the TopoJSON file.
DF_main['matched_index'] = DF_main['index'].apply(
    lambda x: get_closest_match(x, DF_Names, 'name'))
DF_main['ID'] = DF_main['matched_index'].apply(name2id)

DF_main


Unnamed: 0_level_0,index,Janvier 2017,Janvier 2017,Février 2017,Février 2017,Mars 2017,Mars 2017,Avril 2017,Avril 2017,Mai 2017,...,Juillet 2017,Juillet 2017,Août 2017,Août 2017,Septembre 2017,Septembre 2017,Total,Total,matched_index,ID
Unnamed: 0_level_1,Unnamed: 1_level_1,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,...,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Taux de chômage,Chômeurs inscrits,Unnamed: 20_level_1,Unnamed: 21_level_1
0,Zurich,3.9,32387.0,3.9,31619.0,3.8,30841.0,3.6,29542.0,3.5,...,3.4,27992.0,3.4,27514.0,3.3,27225.0,3.6,263669.0,Zürich,ZH
1,Berne,3.0,16954.0,3.0,16738.0,2.9,16035.0,2.7,15322.0,2.6,...,2.4,13633.0,2.5,13829.0,2.4,13658.0,2.7,134156.0,Bern/Berne,BE
2,Lucerne,2.2,4985.0,2.2,4808.0,2.0,4493.0,1.9,4315.0,1.8,...,1.7,3875.0,1.8,3992.0,1.7,3885.0,1.9,38319.0,Luzern,LU
3,Uri,1.5,297.0,1.4,276.0,1.3,256.0,1.2,228.0,1.0,...,0.7,129.0,0.6,123.0,0.6,112.0,1.0,1776.0,Uri,UR
4,Schwyz,2.0,1794.0,2.0,1766.0,1.9,1670.0,1.8,1580.0,1.7,...,1.7,1447.0,1.7,1466.0,1.7,1455.0,1.8,14051.0,Schwyz,SZ
5,Obwald,1.0,217.0,1.1,228.0,1.0,217.0,1.0,205.0,0.8,...,0.8,176.0,0.8,164.0,0.7,153.0,0.9,1701.0,Obwalden,OW
6,Nidwald,1.3,305.0,1.3,306.0,1.2,297.0,1.1,269.0,1.1,...,1.0,241.0,1.0,247.0,1.0,248.0,1.1,2421.0,Nidwalden,NW
7,Glaris,2.5,567.0,2.6,576.0,2.4,551.0,2.3,517.0,2.1,...,1.8,415.0,1.9,435.0,1.8,416.0,2.2,4402.0,Glarus,GL
8,Zoug,2.6,1756.0,2.6,1784.0,2.5,1729.0,2.5,1704.0,2.4,...,2.3,1574.0,2.4,1604.0,2.3,1543.0,2.4,14922.0,Zug,ZG
9,Fribourg,3.2,5215.0,2.9,4811.0,2.7,4499.0,2.7,4362.0,2.4,...,2.7,4372.0,2.8,4667.0,2.7,4466.0,2.7,40265.0,Fribourg,FR
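
The fuzzy matching used in this first approach can be examined in isolation. The sketch below is a simplified illustration: the candidate list is a hypothetical stand-in for the `name` properties of the TopoJSON file, and `closest_name` and `similarity` are helper names introduced here, not part of the notebook.

```python
import difflib

# Hypothetical target names, standing in for the "name" property of the
# TopoJSON geometries (the real file may spell some of them differently).
topojson_names = ["Zürich", "Bern/Berne", "Luzern", "Uri", "Genève"]

def closest_name(name, candidates, cutoff=0.6):
    """Return the most similar candidate, or None if nothing reaches the cutoff."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

def similarity(name, candidate):
    """Similarity ratio in [0, 1], computed with difflib.SequenceMatcher."""
    return difflib.SequenceMatcher(None, candidate, name).ratio()

for french_name in ["Zurich", "Uri", "Qqq"]:
    match = closest_name(french_name, topojson_names)
    score = similarity(french_name, match) if match else 0.0
    print("{!r} -> {!r} (similarity {:.2f})".format(french_name, match, score))
```

Because the match is only approximate, it is worth checking the `matched_index` column by eye: a name with no counterpart above the cutoff yields no match, and a short name can be paired with the wrong canton.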


In [18]:
DF_plot1 = DF_main[[('Janvier 2017', 'Taux de chômage'), 
                    ('ID', '')]]
DF_plot1.columns = ['data', 'id']
DF_plot1
# DF_main.columns
# DF_plot1.reindex_axis(['data', 'ID'], axis=1)
DF_main.columns

MultiIndex(levels=[['Août 2017', 'Avril 2017', 'Février 2017', 'Janvier 2017', 'Juillet 2017', 'Juin 2017', 'Mai 2017', 'Mars 2017', 'Septembre 2017', 'Total', 'index', 'matched_index', 'ID'], ['Chômeurs inscrits', 'Taux de chômage', '']],
           labels=[[10, 3, 3, 2, 2, 7, 7, 1, 1, 6, 6, 5, 5, 4, 4, 0, 0, 8, 8, 9, 9, 11, 12], [2, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 2, 2]])

In [30]:
idx = pd.IndexSlice
DF_main.loc[idx[:, :], idx[:, 'Taux de chômage']]
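
Selecting one measure from two-level columns can be done in more than one way. The sketch below rebuilds a tiny table with the Zurich and Berne figures shown in the outputs above and compares `pd.IndexSlice` with `DataFrame.xs`. The name `toy` and the reduced table are assumptions for illustration.

```python
import pandas as pd

# Small stand-in for the amstat table: two months, two measures per month.
# The values are the Zurich and Berne figures shown in the outputs above.
columns = pd.MultiIndex.from_product(
    [["Janvier 2017", "Février 2017"], ["Taux de chômage", "Chômeurs inscrits"]]
)
toy = pd.DataFrame(
    [[3.9, 32387.0, 3.9, 31619.0],
     [3.0, 16954.0, 3.0, 16738.0]],
    index=["Zurich", "Berne"],
    columns=columns,
)

# Option 1: IndexSlice keeps both column levels.
idx = pd.IndexSlice
rates_two_levels = toy.loc[:, idx[:, "Taux de chômage"]]

# Option 2: xs drops the selected level, leaving plain month names as columns.
rates_flat = toy.xs("Taux de chômage", level=1, axis=1)

print(rates_flat)
```

The `xs` form is convenient when the result feeds a choropleth, since each month becomes a single flat column name.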

In [61]:
plot_data = (DF_main.drop('Chômeurs inscrits', level=1, axis=1)
                  .drop(['Total', 'matched_index', 'index'], axis=1, level=0))
# Keep only the first level (month name or 'ID') as the column label.
_columns = [item[0] for item in plot_data.columns.tolist()]
plot_data.columns = _columns
plot_data

Unnamed: 0,Janvier 2017,Février 2017,Mars 2017,Avril 2017,Mai 2017,Juin 2017,Juillet 2017,Août 2017,Septembre 2017,ID
0,3.9,3.9,3.8,3.6,3.5,3.4,3.4,3.4,3.3,ZH
1,3.0,3.0,2.9,2.7,2.6,2.4,2.4,2.5,2.4,BE
2,2.2,2.2,2.0,1.9,1.8,1.7,1.7,1.8,1.7,LU
3,1.5,1.4,1.3,1.2,1.0,0.8,0.7,0.6,0.6,UR
4,2.0,2.0,1.9,1.8,1.7,1.6,1.7,1.7,1.7,SZ
5,1.0,1.1,1.0,1.0,0.8,0.8,0.8,0.8,0.7,OW
6,1.3,1.3,1.2,1.1,1.1,1.0,1.0,1.0,1.0,NW
7,2.5,2.6,2.4,2.3,2.1,2.0,1.8,1.9,1.8,GL
8,2.6,2.6,2.5,2.5,2.4,2.3,2.3,2.4,2.3,ZG
9,3.2,2.9,2.7,2.7,2.4,2.4,2.7,2.8,2.7,FR


### Second Approach

In [159]:
DF_main = DF.transpose().reset_index()
# Drop the last row ('Total') and keep only the rate columns of each month.
# DF_main was just rebuilt, so it has no 'matched_index' column to drop.
DF_plot = (DF_main.iloc[:-1, :].drop('Chômeurs inscrits', level=1, axis=1)
                  .drop('Total', axis=1, level=0))

_columns = [item[0] for item in DF_plot.columns.tolist()]
DF_plot.columns = _columns

# French canton names as used by amstat, and the matching two-letter IDs, in the same order.
canton_names = ['Zurich', 'Berne', 'Lucerne', 'Uri', 'Schwyz', 'Obwald', 'Nidwald',
                'Glaris', 'Zoug', 'Fribourg', 'Soleure', 'Bâle-Ville', 'Bâle-Campagne',
                'Schaffhouse', 'Appenzell Rhodes-Extérieures',
                'Appenzell Rhodes-Intérieures', 'St-Gall', 'Grisons', 'Argovie',
                'Thurgovie', 'Tessin', 'Vaud', 'Valais', 'Neuchâtel', 'Genève',
                'Jura']
canton_ids = ['ZH', 'BE', 'LU', 'UR', 'SZ', 'OW', 'NW', 'GL', 'ZG', 'FR',
              'SO', 'BS', 'BL', 'SH', 'AR', 'AI', 'SG', 'GR', 'AG', 'TG',
              'TI', 'VD', 'VS', 'NE', 'GE', 'JU']

canton_dict = dict(zip(canton_names, canton_ids))

DF_plot['id'] = DF_plot['index'].apply(lambda x: canton_dict[x])
DF_plot


Unnamed: 0,index,Janvier 2017,Février 2017,Mars 2017,Avril 2017,Mai 2017,Juin 2017,Juillet 2017,Août 2017,Septembre 2017,id
0,Zurich,3.9,3.9,3.8,3.6,3.5,3.4,3.4,3.4,3.3,ZH
1,Berne,3.0,3.0,2.9,2.7,2.6,2.4,2.4,2.5,2.4,BE
2,Lucerne,2.2,2.2,2.0,1.9,1.8,1.7,1.7,1.8,1.7,LU
3,Uri,1.5,1.4,1.3,1.2,1.0,0.8,0.7,0.6,0.6,UR
4,Schwyz,2.0,2.0,1.9,1.8,1.7,1.6,1.7,1.7,1.7,SZ
5,Obwald,1.0,1.1,1.0,1.0,0.8,0.8,0.8,0.8,0.7,OW
6,Nidwald,1.3,1.3,1.2,1.1,1.1,1.0,1.0,1.0,1.0,NW
7,Glaris,2.5,2.6,2.4,2.3,2.1,2.0,1.8,1.9,1.8,GL
8,Zoug,2.6,2.6,2.5,2.5,2.4,2.3,2.3,2.4,2.3,ZG
9,Fribourg,3.2,2.9,2.7,2.7,2.4,2.4,2.7,2.8,2.7,FR


## Choropleth map

In [203]:
months = plot_data.drop('ID', axis=1).columns.tolist()

# One base map per month, all centred on Switzerland.
ch_map_list = [fo.Map([46.8182, 8.22], tiles='cartodbpositron', zoom_start=7.3)
               for _ in months]

def add_choropleth(args):
    """Draw the unemployment rates of one month on the given map."""
    ch_map, month = args
    # Note: newer Folium releases replace Map.choropleth with the folium.Choropleth class.
    ch_map.choropleth(geo_data=geo_json_data, data=plot_data,
                      columns=['ID', month],
                      key_on='feature.id',
                      fill_color='OrRd', fill_opacity=0.3, line_opacity=0.3,
                      topojson='objects.cantons')
    return ch_map

ch_map_list_out = map(add_choropleth, zip(ch_map_list, months))

for m in ch_map_list_out:
    display(m)

In [108]:
list(zip(ch_map_list, months))

[(<folium.folium.Map at 0x110414d30>, 'Janvier 2017'),
 (<folium.folium.Map at 0x1105f1080>, 'Février 2017'),
 (<folium.folium.Map at 0x1107bd320>, 'Mars 2017'),
 (<folium.folium.Map at 0x1107a6940>, 'Avril 2017'),
 (<folium.folium.Map at 0x1107d26a0>, 'Mai 2017'),
 (<folium.folium.Map at 0x1107a6f98>, 'Juin 2017'),
 (<folium.folium.Map at 0x110528ac8>, 'Juillet 2017'),
 (<folium.folium.Map at 0x1106f09e8>, 'Août 2017'),
 (<folium.folium.Map at 0x1106043c8>, 'Septembre 2017')]

## Approach 2


In [202]:
## This cell can be skipped.
from branca.colormap import linear

vmax = DF_plot.max(numeric_only=True).max()
vmin = DF_plot.min(numeric_only=True).min()

colormap = linear.RdYlBu.scale(vmin, vmax)


geo_path = r'topojson/ch-cantons.topojson.json'
geo_json_data = json.load(open(geo_path))

swiss_coord = [46.8182, 8.22]

ch_map = fo.Map(location=swiss_coord,
                tiles='cartodbpositron',
                zoom_start=8)
# display(DF_plot['Mars 2017'])
df = DF_plot.set_index('id')

fo.TopoJson(
    geo_json_data, 
    'objects.cantons',
    name='Mars 2017',
    style_function = lambda feature: {
        'fillColor': colormap(df['Mars 2017'][feature['id']]),
        'color': 'black',
        'weight': 1,
        'dashArray': '5, 5',
        'fillOpacity': 0.7
    }
).add_to(ch_map)


fo.TopoJson(
    geo_json_data, 
    'objects.cantons',
    name='Avril 2017',
    style_function = lambda feature: {
        'fillColor': colormap(df['Avril 2017'][feature['id']]),
        'color': 'black',
        'weight': 1,
        'dashArray': '5, 5',
        'fillOpacity': 0.7
    }
).add_to(ch_map)

fo.LayerControl().add_to(ch_map)

ch_map
# DF_plot

In [201]:
from branca.colormap import linear

vmax = DF_plot.max(numeric_only=True).max()
vmin = DF_plot.min(numeric_only=True).min()

colormap = linear.RdYlBu.scale(vmin, vmax)

geo_path = r'topojson/ch-cantons.topojson.json'
geo_json_data = json.load(open(geo_path))

swiss_coord = [46.8182, 8.22]

ch_map = fo.Map(location=swiss_coord,
                tiles='cartodbpositron',
                zoom_start=8)

# fo.LayerControl(collapsed=True).add_to(ch_map)

def add_choropleth_layers(month):
    df = DF_plot.set_index('id')
    fo.TopoJson(
        geo_json_data, 
        'objects.cantons',
        name = month, 
        style_function = lambda feature: {
            'fillColor': colormap(df[month][feature['id']]),
            'color': 'black',
            'weight': 1,
            'dashArray': '5, 5',
            'fillOpacity': 0.7
        }).add_to(ch_map)
    return ch_map

months = plot_data.drop('ID', axis=1).columns.tolist()
out_list = list(map(add_choropleth_layers, months))
fo.LayerControl(collapsed=False).add_to(ch_map)

display(out_list[-1])
display(colormap)
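
The cell above computes `vmin` and `vmax` over all months before building the colormap, so that one color stands for the same rate on every layer. The plain-Python sketch below illustrates that idea with a simple two-color linear interpolation. It is a stand-in written for this explanation, not branca's implementation, and the two endpoint colors are arbitrary.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to a tuple of three integers."""
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))

def make_linear_scale(vmin, vmax, low_color, high_color):
    """Return a function mapping a value in [vmin, vmax] to a hex color."""
    low, high = hex_to_rgb(low_color), hex_to_rgb(high_color)

    def scale(value):
        # Clamp so that out-of-range values still get a valid color.
        t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
        channels = (round(a + t * (b - a)) for a, b in zip(low, high))
        return "#{:02x}{:02x}{:02x}".format(*channels)

    return scale

# Rates (%) for three cantons in two months, copied from the tables above.
rates_by_month = {
    "Janvier 2017": {"ZH": 3.9, "UR": 1.5, "FR": 3.2},
    "Septembre 2017": {"ZH": 3.3, "UR": 0.6, "FR": 2.7},
}

# One vmin/vmax over all months, so a color means the same rate on every layer.
all_rates = [r for month in rates_by_month.values() for r in month.values()]
scale = make_linear_scale(min(all_rates), max(all_rates), "#ffffcc", "#bd0026")

for month, rates in rates_by_month.items():
    print(month, {canton: scale(rate) for canton, rate in rates.items()})
```

If each month had its own scale instead, the darkest color would mark a different rate on each layer, and switching between layers would hide the overall decline from January to September.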