# Homework 3

In [1]:
import folium
import pandas as pd
import numpy as np
import json
import os

## Question 1: unemployment rate in Europe

### Definitions
**Unemployed persons** are all persons 15 to 74 years of age (16 to 74 years in ES, IT and the UK) who were not employed during the reference week, had actively sought work during the past four weeks and were ready to begin working immediately or within two weeks. Figures show the number of persons unemployed in thousands.

**Employed persons** are all persons who worked at least one hour for pay or profit during the reference week or were temporarily absent from such work. For the unemployment rate, only persons from 15 to 74 years of age are used.

The **unemployment rate** is the number of people unemployed as a percentage of the labour force. The labour force is the total number of people employed and unemployed. In the database, unemployment rates can be downloaded by choosing the unit "PC_ACT", Percentage of Active Population.
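
As a quick illustration of the definition above, the rate can be computed from head counts. The function name and the figures below are made up for the example and are not taken from the Eurostat data.

```python
def unemployment_rate(unemployed, employed):
    """Return the unemployment rate as a percentage of the labour force."""
    labour_force = employed + unemployed  # active population
    if labour_force <= 0:
        raise ValueError("labour force must be positive")
    return 100.0 * unemployed / labour_force

# Hypothetical figures in thousands of persons (not real data)
rate = unemployment_rate(unemployed=50, employed=950)
print(f"{rate:.1f} %")  # 5.0 %
```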

We used the Eurostat dataset http://ec.europa.eu/eurostat/web/products-datasets/-/ei_lmhr_m, which contains the monthly unemployment rate of each country.

The dataset reports the rate for three different populations:
- unemployed under 25 years
- unemployed over 25 years
- total

We've used the total unemployment rate as the indicator.

In [2]:
file_path = r'ei_lmhr_m_1_Data.csv'
# Loading the data in the dataframe
unemployment_df = pd.read_csv(file_path)
# Selecting the total unemployment rate (ILO definition)
unemployment_df = unemployment_df.loc[unemployment_df.INDIC=='Unemployment according to ILO definition - Total']
# Selecting the most recent data
unemployment_df = unemployment_df.loc[unemployment_df.TIME=='2017M07'] # data is lacking for more recent months
# Keeping useful columns
unemployment_df = unemployment_df[['GEO', 'GEO_LABEL', 'Value']]
# Resetting the index
unemployment_df.reset_index(drop=True, inplace = True)
unemployment_df.Value = pd.to_numeric(unemployment_df.Value)
unemployment_df.head()

Unnamed: 0,GEO,GEO_LABEL,Value
0,BE,Belgium,7.3
1,BG,Bulgaria,6.1
2,CZ,Czech Republic,2.8
3,DK,Denmark,5.8
4,DE,Germany (until 1990 former territory of the FRG),3.7


In [14]:
EU_coordinates = [53.5775, 23.106111]
EU_map = folium.Map(location=EU_coordinates, tiles='cartodbpositron', zoom_start=4)
unemployment_df

Unnamed: 0,GEO,GEO_LABEL,Value,Relative to Switzerland
0,BE,Belgium,7.3,-2.7
1,BG,Bulgaria,6.1,-3.9
2,CZ,Czech Republic,2.8,-7.2
3,DK,Denmark,5.8,-4.2
4,DE,Germany (until 1990 former territory of the FRG),3.7,-6.3
5,EE,Estonia,5.9,-4.1
6,IE,Ireland,6.2,-3.8
7,EL,Greece,21.0,11.0
8,ES,Spain,16.9,6.9
9,FR,France,9.7,-0.3


Things to take into account for the plot:
- number of classes
- color
- data classification method
- which interactivity could we add?
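
To make the classification question concrete, here is a small sketch comparing two common ways of choosing class breaks, equal intervals and quantiles, on the ten rates displayed above. The helper names are ours, and this is only one possible way to compute the breaks, using the standard library.

```python
import statistics

# Total unemployment rates (%) of the ten countries shown above, July 2017
rates = [7.3, 6.1, 2.8, 5.8, 3.7, 5.9, 6.2, 21.0, 16.9, 9.7]

def equal_interval_breaks(values, k):
    """Breaks for k classes of identical width between min and max."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(k)] + [hi]

def quantile_breaks(values, k):
    """Breaks for k classes holding roughly the same number of observations."""
    cuts = statistics.quantiles(values, n=k, method="inclusive")
    return [min(values)] + cuts + [max(values)]

print(equal_interval_breaks(rates, 4))
print(quantile_breaks(rates, 4))
```

With a skewed distribution such as this one (Greece and Spain lie far above the rest), equal intervals put seven of the ten countries in the lowest class, whereas quantiles spread them more evenly across colors. Recent folium versions should accept such a list through the `bins` argument of `folium.Choropleth`.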

In [107]:
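countries_geo = os.path.join('topojson', 'europe.topojson.json')
# Load the TopoJSON file and close the file handle afterwards
with open(countries_geo) as f:
    geo_json_data = json.load(f)
#geo_json_data = geo_json_data['objects']['europe']['geometries']

# Build the layer and attach it to the map; without add_to() it is never rendered
mymap = folium.TopoJson(geo_json_data, "objects.europe")
mymap.add_to(EU_map)
#folium.LayerControl().add_to(EU_map)
EU_map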

AttributeError: 'NoneType' object has no attribute 'get'

<folium.folium.Map at 0x111c16e80>

In [98]:
geo_json_data['objects']['europe']['geometries'][1]

{'arcs': [[12, 13, 14, 15, 16, 17, 18]],
 'id': 'AL',
 'properties': {'NAME': 'Albania'},
 'type': 'Polygon'}

#### Comparison with Switzerland

In [51]:
CH_unemployment_rate = 10.0  # placeholder, to be replaced with the actual Swiss rate
unemployment_df['Relative to Switzerland'] = unemployment_df.Value - CH_unemployment_rate
#print(unemployment_df.loc[unemployment_df['2016 ']==min(unemployment_df['2016 '])])
unemployment_df.head()

Unnamed: 0,GEO,GEO_LABEL,Value,Relative to Switzerland
0,BE,Belgium,7.3,-2.7
1,BG,Bulgaria,6.1,-3.9
2,CZ,Czech Republic,2.8,-7.2
3,DK,Denmark,5.8,-4.2
4,DE,Germany (until 1990 former territory of the FRG),3.7,-6.3


## Question 2: unemployment rate in Switzerland

Data that we will need:
- unemployment rate of each Swiss canton, 15-74 years (question 2)
- unemployment rate of each Swiss canton, 15-74 years, distinguishing between *Swiss* and *foreign* residents (question 3)
- unemployment rate of each Swiss canton, broken down by age group (question 3)

In [52]:
file_path = r'Taux_de_chômage_nationality.csv'
# Loading rates in Switzerland
df_CH_nationality = pd.read_csv(file_path, encoding='utf-16')
# Dropping the first and last rows of the dataframe (not needed) & resetting the index
df_CH_nationality.drop(df_CH_nationality.index[0], inplace=True)
df_CH_nationality.drop(df_CH_nationality.index[-1], inplace=True)
df_CH_nationality.reset_index(drop=True, inplace=True)
# Selecting columns: one rate per canton and nationality (Etrangers / Suisses)
df_CH_nationality = df_CH_nationality[['Canton', 'Nationalité', 'Juillet 2017']]
df_CH_nationality.head()

Unnamed: 0,Canton,Nationalité,Juillet 2017
0,Zurich,Etrangers,5.5
1,Zurich,Suisses,2.6
2,Berne,Etrangers,5.5
3,Berne,Suisses,1.8
4,Lucerne,Etrangers,3.9


In [53]:
file_path = r'Taux_de_chômage_age.csv'
# Loading rates in Switzerland
df_CH_age = pd.read_csv(file_path, encoding='utf-16')
# Dropping the first and last rows of the dataframe (not needed) & resetting the index
df_CH_age.drop(df_CH_age.index[0], inplace=True)
df_CH_age.drop(df_CH_age.index[-1], inplace=True)
df_CH_age.reset_index(drop=True, inplace=True)
# Selecting columns: one rate per canton and five-year age class
df_CH_age = df_CH_age[['Canton', "Classes d'âge par étapes de 5 ans", 'Juillet 2017']]
df_CH_age.head()

Unnamed: 0,Canton,Classes d'âge par étapes de 5 ans,Juillet 2017
0,Zurich,15-19 ans,4.0
1,Zurich,20-24 ans,3.4
2,Zurich,25-29 ans,3.6
3,Zurich,30-34 ans,3.9
4,Zurich,35-39 ans,3.9


In [54]:
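# Outer merge on canton: every age-class row is paired with every nationality row of the same canton
final_df = pd.merge(df_CH_age, df_CH_nationality, on='Canton', how='outer')
final_df.rename(columns={'Juillet 2017_x': 'data_age', 'Juillet 2017_y': 'data_nationality'}, inplace=True)
# Convert columns to numeric where possible; columns holding non-numeric cells are left unchanged
final_df = final_df.apply(pd.to_numeric, errors='ignore')
#final_df.groupby(["Classes d'âge par étapes de 5 ans", 'Nationalité']).mean()
# Rows whose age-based rate is missing (marked '...' in the source file)
final_df.loc[final_df.data_age=='...']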

Unnamed: 0,Canton,Classes d'âge par étapes de 5 ans,data_age,Nationalité,data_nationality
300,Appenzell Rhodes-Intérieures,15-19 ans,...,Etrangers,2.1
301,Appenzell Rhodes-Intérieures,15-19 ans,...,Suisses,0.6
306,Appenzell Rhodes-Intérieures,30-34 ans,...,Etrangers,2.1
307,Appenzell Rhodes-Intérieures,30-34 ans,...,Suisses,0.6
308,Appenzell Rhodes-Intérieures,35-39 ans,...,Etrangers,2.1
309,Appenzell Rhodes-Intérieures,35-39 ans,...,Suisses,0.6
316,Appenzell Rhodes-Intérieures,55-59 ans,...,Etrangers,2.1
317,Appenzell Rhodes-Intérieures,55-59 ans,...,Suisses,0.6
318,Appenzell Rhodes-Intérieures,60 ans et plus,...,Etrangers,2.1
319,Appenzell Rhodes-Intérieures,60 ans et plus,...,Suisses,0.6
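
The rows above show that the source file marks unavailable rates with the string `...`, which keeps `data_age` from being numeric. A minimal sketch of how such a marker could be turned into a missing value is given below. The helper `parse_rate` is hypothetical; in pandas the same effect is usually obtained with `pd.to_numeric(column, errors='coerce')`.

```python
import math

def parse_rate(raw, missing_marker="..."):
    """Convert a raw cell to float, mapping the missing marker to NaN."""
    text = str(raw).strip()
    if text == missing_marker:
        return math.nan
    return float(text)

# Illustrative cells mixing numeric rates and the missing marker
raw_cells = ["4.0", "3.4", "...", "3.9"]
parsed = [parse_rate(cell) for cell in raw_cells]

# Average only the rates that are actually available
available = [x for x in parsed if not math.isnan(x)]
mean_rate = sum(available) / len(available)
print(parsed, round(mean_rate, 2))
```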


In [55]:
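# Unweighted mean per canton over the numeric columns only
# (data_age still contains '...' strings, so only data_nationality is averaged)
df = final_df.groupby('Canton').mean(numeric_only=True)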
df.reset_index(inplace=True)
df

Unnamed: 0,Canton,data_nationality
0,Appenzell Rhodes-Extérieures,2.6
1,Appenzell Rhodes-Intérieures,1.35
2,Argovie,3.85
3,Berne,3.65
4,Bâle-Campagne,3.5
5,Bâle-Ville,3.75
6,Fribourg,3.4
7,Genève,5.2
8,Glaris,2.45
9,Grisons,1.4


In [56]:
CH_coordinates = [46.818188, 8.227512]
CH_map = folium.Map(location=CH_coordinates, tiles='cartodbpositron', zoom_start=7)
CH_map

In [57]:
CH_geo = os.path.join('topojson', 'ch-cantons.topojson.json')
# Load the TopoJSON file and close the file handle afterwards
with open(CH_geo) as f:
    geo_json_data = json.load(f)

CH_map.choropleth(
    geo_data=geo_json_data,
    data=df,
    columns=['Canton', 'data_nationality'],
    key_on='feature.id',
    fill_color='BuPu',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Unemployment Rate in Switzerland (%)')

#folium.LayerControl().add_to(CH_map)
CH_map
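
One thing to check before trusting the map: `key_on='feature.id'` only matches if the values of the `Canton` column equal the feature ids in the TopoJSON file. Assuming that the ids are the official two-letter canton abbreviations (an assumption, since the content of the Swiss file is not shown here), the French names would first need to be mapped to codes. A partial sketch, covering only the cantons visible in the outputs above:

```python
# French canton names as they appear in the data -> official two-letter codes.
# Partial on purpose: only cantons visible in the outputs above are listed.
CANTON_CODES = {
    "Zurich": "ZH",
    "Berne": "BE",
    "Lucerne": "LU",
    "Appenzell Rhodes-Extérieures": "AR",
    "Appenzell Rhodes-Intérieures": "AI",
    "Argovie": "AG",
    "Bâle-Campagne": "BL",
    "Bâle-Ville": "BS",
    "Fribourg": "FR",
    "Genève": "GE",
    "Glaris": "GL",
    "Grisons": "GR",
}

def to_canton_code(name):
    """Return the code for a French canton name, or None if it is not listed."""
    return CANTON_CODES.get(name.strip())

names = ["Genève", "Berne", "Valais"]
codes = [to_canton_code(n) for n in names]
# Names without a code would show up as unmatched regions on the map
unmatched = [n for n, c in zip(names, codes) if c is None]
print(codes, unmatched)
```

With a complete dictionary, the same mapping could be applied to the dataframe with `df['Canton'].map(CANTON_CODES)` and the resulting column used in `columns=[...]`.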

In [62]:
geo_json_data

[{'arcs': [[[0, 1, 2]], [[3]], [[4]], [[5, 6, 7, 8, 9, 10], [11]]],
  'id': 'AZ',
  'properties': {'NAME': 'Azerbaijan'},
  'type': 'MultiPolygon'},
 {'arcs': [[12, 13, 14, 15, 16, 17, 18]],
  'id': 'AL',
  'properties': {'NAME': 'Albania'},
  'type': 'Polygon'},
 {'arcs': [[[-12]], [[19, -3, 20, 21, -7], [-5], [-4]]],
  'id': 'AM',
  'properties': {'NAME': 'Armenia'},
  'type': 'MultiPolygon'},
 {'arcs': [[22, 23, 24, 25, 26, 27, 28, 29, 30, 31]],
  'id': 'BA',
  'properties': {'NAME': 'Bosnia and Herzegovina'},
  'type': 'Polygon'},
 {'arcs': [[32, 33, 34, 35, 36, 37]],
  'id': 'BG',
  'properties': {'NAME': 'Bulgaria'},
  'type': 'Polygon'},
 {'arcs': [[38]],
  'id': 'CY',
  'properties': {'NAME': 'Cyprus'},
  'type': 'Polygon'},
 {'arcs': [[[39]],
   [[40]],
   [[41]],
   [[42]],
   [[43]],
   [[44]],
   [[45]],
   [[46]],
   [[47]],
   [[48]],
   [[49]],
   [[50]],
   [[51]],
   [[52]],
   [[53]],
   [[54, 55]],
   [[56]],
   [[57]]],
  'id': 'DK',
  'properties': {'NAME': 'Denmar