In [14]:
import json
import folium
import pandas as pd 
import numpy as np
from IPython.display import display, Math, Latex

In [None]:
In order to view the maps correctly please visit the link : 

## Task 2

In this section, we zoom in on Switzerland. The overall unemployment rate is $3\%$, as stated by the [Swiss Confederation website](https://www.amstat.ch/v2/index.jsp), but here we are interested in the unemployment rate of each canton. In a second step, we consider two sub-categories of job seekers, those who are currently employed and those who are not, and investigate how these different definitions change the map of Switzerland.

### The data

The map of the Swiss cantons is stored in the `ch-cantons.topojson.json` file, where the objects are cantons, each defined by a two-letter id, a name and a list of arcs.

As for the unemployment data, it is available for download on the Swiss Confederation website under the section 'details'. We went to the category *2 Chomeurs et demandeurs d'emploi*, then *2-1 taux de chômage*.
When asked to choose the variables for the report, we selected the following:
- Current month
- Unemployment rate indicators: unemployment rates
- Unemployed indicators: registered unemployed
- Job seekers indicators: employed job seekers, unemployed job seekers
- Geographic characteristics: cantons and linguistic regions

This is an overview of the dataframe we obtain: 


In [15]:
# Read the .csv file (its headers were edited by hand to translate them into English).
df = pd.read_csv('U_R_CH_2.csv')
# Drop the last row (index 26), which holds the national total rather than a canton.
df = df.drop(26)
df.head(5)

Unnamed: 0,Index,Canton,Unemployment_rate,Registered_unemployed,Job_seekers,Job_seekers_Employed
0,0,Zurich,3.3,27225,34156,6931
1,1,Berne,2.4,13658,18385,4727
2,2,Lucerne,1.7,3885,6756,2871
3,3,Uri,0.6,112,257,145
4,4,Schwyz,1.7,1455,2229,774


We notice that:
- The cantons are listed in the same order as in the map TopoJSON file (checked manually).
- The cantons are identified by their names, which differ from the ones used in the map TopoJSON file.

In order to correctly match the data to the map, we ought to use the canton ids (e.g. 'ZH', 'BE', ...), which are not available in the dataframe.
One easy way to solve this problem is to extract the list of ids from the map TopoJSON file and add it to the dataframe as a new column. We emphasize that this operation is only valid because the cantons appear in the same order in both sources.

**Extracting the list of Ids from the map topojson file**


In [16]:
# Load the TopoJSON file as a dictionary.
cantons_geo_path = r'ch-cantons.topojson.json'
with open(cantons_geo_path) as topo_file:
    dic = json.load(topo_file)
# Access the list of cantons in the file.
elements = dic['objects']['cantons']['geometries']
# Here we can view the elements corresponding to Zürich, Bern and Luzern.
elements[0:3]

[{'arcs': [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],
  'id': 'ZH',
  'properties': {'name': 'Zürich'},
  'type': 'Polygon'},
 {'arcs': [[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], [23], [24]],
   [[25, 26]],
   [[27]],
   [[28, 29]]],
  'id': 'BE',
  'properties': {'name': 'Bern/Berne'},
  'type': 'MultiPolygon'},
 {'arcs': [[-12, 30, 31, 32, 33, 34]],
  'id': 'LU',
  'properties': {'name': 'Luzern'},
  'type': 'Polygon'}]

In [17]:
# Iterate over the elements of the list, extract each id, and store it in a new list "identification".
identification = [d['id'] for d in elements]
# Add identification as a column to the dataframe (this relies on both being in the same order).
df['id'] = identification
df

Unnamed: 0,Index,Canton,Unemployment_rate,Registered_unemployed,Job_seekers,Job_seekers_Employed,id
0,0,Zurich,3.3,27225,34156,6931,ZH
1,1,Berne,2.4,13658,18385,4727,BE
2,2,Lucerne,1.7,3885,6756,2871,LU
3,3,Uri,0.6,112,257,145,UR
4,4,Schwyz,1.7,1455,2229,774,SZ
5,5,Obwald,0.7,153,319,166,OW
6,6,Nidwald,1.0,248,436,188,NW
7,7,Glaris,1.8,416,713,297,GL
8,8,Zoug,2.3,1543,2615,1072,ZG
9,9,Fribourg,2.7,4466,7837,3371,FR


### Q-2.a
Now that we have a common identifier, we can match the unemployment rates to the cantons on the map.
First, we extract from the dataframe a series that contains only the variable we need (`Unemployment_rate`) and is indexed by the ids.
We then create a blank map centered on Switzerland and import the palette we will use to color the cantons: we need a sequential palette where light colors correspond to low unemployment rates and dark ones to high unemployment rates. We chose a reddish colormap since a high unemployment rate tends to be a negative economic indicator.


In [18]:
unemployment_dict = df.set_index('id')['Unemployment_rate']
unemployment_dict[0:5]

id
ZH    3.3
BE    2.4
LU    1.7
UR    0.6
SZ    1.7
Name: Unemployment_rate, dtype: float64

In [20]:
#create a blank map centered on Switzerland.
map_ch = folium.Map(location =[46.75, 8.25], zoom_start=7)
#map_ch

In [21]:
#Importing the colors : 
import branca.colormap as cm


colormap = cm.linear.YlOrRd.scale(
    df.Unemployment_rate.min(),
    df.Unemployment_rate.max())

print(colormap(5.0))

colormap

#be0623


We can now create our choropleth map via the function `TopoJson` and an adequate style function.

In [22]:
folium.TopoJson(
    open('ch-cantons.topojson.json'),
    'objects.cantons',
    # The fill color of each canton is read from the colormap at the canton's value in unemployment_dict.
    style_function=lambda feature: {
        'fillColor': colormap(unemployment_dict[feature['id']]),
        'color': 'black',
        'weight': 1,
        'dashArray': '5, 5',
        'fillOpacity': 0.9,
    },
    name='Unemployment rates',
).add_to(map_ch)

# we add a color scale 
colormap.caption = 'Unemployment color scale'
colormap.add_to(map_ch)

map_ch.save('CH_Unemployment_Rate1.html')
map_ch
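The style function above looks up every canton id directly, so a canton missing from the data would raise a `KeyError`. A more defensive variant could fall back to a neutral color. This is a minimal sketch under stated assumptions: `rate_to_color` is a hypothetical stand-in for the branca colormap (so the example runs without folium), and the gray fallback is our own choice, not something the original notebook does.

```python
# Unemployment rates keyed by canton id (values taken from the table above).
unemployment_by_id = {'ZH': 3.3, 'BE': 2.4, 'LU': 1.7}


def rate_to_color(rate, low=0.6, high=3.3):
    """Hypothetical stand-in for the colormap: white (low) to pure red (high)."""
    share = min(max((rate - low) / (high - low), 0.0), 1.0)
    level = int(round(255 * (1.0 - share)))
    return '#ff{0:02x}{0:02x}'.format(level)


def style_function(feature):
    """Return the style dict for one canton, gray when no rate is known."""
    rate = unemployment_by_id.get(feature['id'])
    fill = '#cccccc' if rate is None else rate_to_color(rate)
    return {
        'fillColor': fill,
        'color': 'black',
        'weight': 1,
        'dashArray': '5, 5',
        'fillOpacity': 0.9,
    }


known = style_function({'id': 'ZH'})
unknown = style_function({'id': 'XX'})  # 'XX' is a made-up id
```

A named function like this can be passed as `style_function=style_function` in place of the lambda, which also makes it easier to test outside the map.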


We can see on this map that Geneva and Neuchâtel have the highest unemployment rates ($>5.1\%$).
On the other hand, Obwald and Uri have the lowest rates ($\leq 0.7\%$).
All in all, unemployment rates appear lower in the German- and Italian-speaking parts than in the French-speaking ones, with the exception of Zurich (the only German-speaking canton that reaches a rate above $3\%$).

### Q-2.b

As mentioned in the introduction, unemployment rates are subject to various interpretations, depending on the way they are defined. Until now, we have been considering the following definition:

$$
\text{Rate} = 100 \times \frac{\text{Job Seekers}}{\text{Active population}}
$$
However, the dataframe shows that there are two types of job seekers: those who currently have a job and are looking for a new one, and those who are not employed at the moment.
As we can easily check, the sum of the two corresponding columns (`Registered_unemployed` and `Job_seekers_Employed`) gives the total number of job seekers.
We will compute the share of each category, expressed as a percentage of the active population, and add it to our dataframe.
\begin{align}
\text{Percentage_Unemployed_Job_Seekers}&= 100\times \frac{\text{Unemployed Job Seekers}}{\text{Active population}}\\
&=100 \times  \frac{\text{Job Seekers}}{\text{Active population}}\times \frac{\text{Unemployed Job Seekers}}{\text{Job Seekers}} \\
&=\text{Unemployment rate }\times \frac{\text{Unemployed Job Seekers}}{\text{Job Seekers}}
\end{align}


In [23]:

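# Sanity check (uncomment to run): the two categories add up to the total number of job seekers.
#df.Registered_unemployed+df.Job_seekers_Employed==df.Job_seekers

# Split the rate between the two categories in proportion to their share of job seekers.
df['percentage_Employed_Job_Seekers'] = df.Unemployment_rate * df.Job_seekers_Employed / df.Job_seekers
df['percentage_Unemployed_Job_Seekers'] = df.Unemployment_rate * df.Registered_unemployed / df.Job_seekers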
df.head(4)

Unnamed: 0,Index,Canton,Unemployment_rate,Registered_unemployed,Job_seekers,Job_seekers_Employed,id,percentage_Employed_Job_Seekers,percentage_Unemployed_Job_Seekers
0,0,Zurich,3.3,27225,34156,6931,ZH,0.669642,2.630358
1,1,Berne,2.4,13658,18385,4727,BE,0.617068,1.782932
2,2,Lucerne,1.7,3885,6756,2871,LU,0.722425,0.977575
3,3,Uri,0.6,112,257,145,UR,0.338521,0.261479


To create two new choropleth maps showing these results, we follow the same steps as before. To avoid repetition, we define a function `add_layer_toMap`: given the name of the chosen column, it adds that column as a layer to the map.
In this part, we define the colormap differently, using the quantiles of the chosen variable to build a four-step color scale, which helps with the interpretation.
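To illustrate what a quantile-based scale does, here is a minimal sketch using only the standard library. It takes the first ten unemployment rates from the dataframe, computes three cut points, and assigns each value to one of four classes. The `color_class` helper is hypothetical, and branca may compute its breakpoints slightly differently, so this shows the idea rather than the library's exact behavior.

```python
import bisect
import statistics

# First ten unemployment rates from the dataframe shown earlier (in %).
rates = [3.3, 2.4, 1.7, 0.6, 1.7, 0.7, 1.0, 1.8, 2.3, 2.7]

# Three cut points splitting the sample into four roughly equally populated classes.
# The 'inclusive' method interpolates linearly between observed values.
cut_points = statistics.quantiles(rates, n=4, method='inclusive')


def color_class(value):
    """Return the class index, from 0 (lightest) to 3 (darkest), for a value."""
    return bisect.bisect_right(cut_points, value)


classes = [color_class(rate) for rate in rates]
print(cut_points)
print(classes)
```

With such a scale, a canton in the darkest class is one whose value lies above the upper quartile of the sample, regardless of how far above it is.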

In [12]:
def add_layer_toMap(definition, name_control):
    # definition: name of the dataframe column we want to represent
    # name_control: name used in the layer control panel and as the colormap caption
    # Note: the layer is added to the global map_ch defined in the calling cell.
    dic_definition = df.set_index('id')[definition]
    # Four-step colormap whose breakpoints are the quantiles of the chosen column.
    # Alternative (continuous scale): cm.linear.OrRd.scale(df[definition].min(), df[definition].max())
    colormap_definition = cm.linear.OrRd.to_step(
        n=4,
        data=df[definition],
        method='quantiles')
    folium.TopoJson(
        open('ch-cantons.topojson.json'),
        'objects.cantons',
        style_function=lambda feature: {
            'fillColor': colormap_definition(dic_definition[feature['id']]),
            'color': 'black', 'weight': 1, 'dashArray': '5, 5', 'fillOpacity': 0.9,
        },
        name=name_control,
    ).add_to(map_ch)
    colormap_definition.caption = name_control
    colormap_definition.add_to(map_ch)
    return map_ch



In [13]:
# create a blank map
map_ch = folium.Map(location =[46.75, 8.25],tiles='Mapbox Bright', zoom_start=6.5)
#view the map with the initial unemployment rate
add_layer_toMap('Unemployment_rate','Unemployment rate (%)')
map_ch.save('CH__Unemployment_Rate2.html')
map_ch 

This is the same map as before, but the colors are assigned differently: this highlights the cantons where the unemployment rate is above the $75\%$ quantile. The difference between the linguistic regions of Switzerland is also more salient.

In [24]:
# create a blank map
map_ch = folium.Map(location =[46.75, 8.25],tiles='Mapbox Bright', zoom_start=6.5)
#add a layer corresponding to the percentage of unemployed job seekers 
add_layer_toMap('percentage_Unemployed_Job_Seekers','Unemployed Job seekers (%)')
map_ch.save('CH_Unemployed_Job_seekers.html')
map_ch


We can say that the proportion of unemployed job seekers follows roughly the same trend as the total proportion of job seekers, although its distribution is shifted downward by roughly one percentage point.
As before, the French-speaking part and Zurich are above the $75\%$ quantile.
Tessin, however, is closer to the median according to this definition.


In [25]:
# create a blank map
map_ch = folium.Map(location =[46.75, 8.25],tiles='Mapbox Bright', zoom_start=6.5)
#add a layer corresponding to the percentage of employed job seekers
add_layer_toMap('percentage_Employed_Job_Seekers','Employed Job seekers (%)')
map_ch.save('CH_Employed_Job_seekers.html')
map_ch


This map shows the proportion of employed job seekers. These proportions are generally lower than their counterparts for unemployed job seekers. Still, they are higher than $1.2\%$ in the French-speaking part, Schaffhouse, Basel-Stadt and Tessin.

### Employed / unemployed job seekers ratio across the cantons

Finally, we can summarize the variation between Employed and Unemployed job seekers by representing their ratio in each canton.

In [39]:
#create a blank map
map_ch = folium.Map(location =[46.75, 8.25],tiles='Mapbox Bright', zoom_start=6.5)
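#define the ratio of employed to unemployed job seekers (stored in a column named 'difference')
df['difference']= df.Job_seekers_Employed/df.Registered_unemployed
dic_definition = df.set_index('id')['difference']
# Here we use diverging colors: red for the lowest ratios and blue for the highest ones
# (the scale runs from the minimum to the maximum ratio observed)
colormap_definition=cm.linear.RdYlBu.scale(dic_definition.min(),dic_definition.max())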

In [40]:
folium.TopoJson(open('ch-cantons.topojson.json'),'objects.cantons',style_function=lambda feature: {
    'fillColor': colormap_definition(dic_definition[feature['id']]),
    'color': 'black','weight': 1,'dashArray': '5, 5','fillOpacity': 0.9,}).add_to(map_ch)
colormap_definition.caption = 'Employed Job Seekers / Unemployed Job Seekers '
colormap_definition.add_to(map_ch)
map_ch

In this map, the bluer a canton, the higher its ratio of employed to unemployed job seekers; the red cantons are those where unemployed job seekers dominate the most.

This confirms that job seekers in the French-speaking parts of Switzerland tend to be unemployed when looking for a job. Obwald, Uri and Grisons have the highest ratios, meaning that the number of employed job seekers is up to roughly $1.3$ times the number of unemployed job seekers.

The ratio varies across the Italian- and German-speaking cantons; however, we see here again that Zurich follows the same trend as the French-speaking regions.
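Because the color scale above runs from the smallest to the largest ratio, its neutral midpoint does not necessarily correspond to a ratio of $1$. One possible refinement, sketched here as an assumption rather than as what the notebook does, is to rescale the ratios piecewise so that $1$ always lands in the middle of the palette. The `diverging_position` helper is hypothetical; the figures are taken from the dataframe rows shown earlier.

```python
# (employed job seekers, registered unemployed) for a few cantons, from the tables above.
job_seekers = {
    'ZH': (6931, 27225),
    'LU': (2871, 3885),
    'UR': (145, 112),
    'OW': (166, 153),
}
ratios = {cid: employed / unemployed for cid, (employed, unemployed) in job_seekers.items()}


def diverging_position(value, vmin, vmax, center=1.0):
    """Map value to [0, 1] so that `center` always lands on 0.5.

    Assumes vmin < center < vmax.
    """
    if value <= center:
        return 0.5 * (value - vmin) / (center - vmin)
    return 0.5 + 0.5 * (value - center) / (vmax - center)


vmin, vmax = min(ratios.values()), max(ratios.values())
positions = {cid: diverging_position(ratio, vmin, vmax) for cid, ratio in ratios.items()}
print(positions)
```

Positions below $0.5$ would then fall on the red half of the palette (more unemployed than employed job seekers) and positions above $0.5$ on the blue half. branca's `LinearColormap` accepts an `index` argument that may serve the same purpose directly.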

### The data

As in the second task, the data was collected from the Swiss Confederation website under the section 'details'.
In order to analyze the difference between Swiss and foreign nationals, we selected the following indicators:
- Current month
- Unemployment rate indicators: unemployment rates
- Various employment indicators
- Geographic characteristics: cantons and linguistic regions
- Other attributes: nationality the first time and age category the second time (we can only choose one at a time)

We proceed the same way to build a dataframe representing the variation of unemployment rates according to age, choosing *class age* instead of *nationality* in the *other attribute* list.