# PART 2 - Map visualization + BONUS

In [1]:
import warnings
warnings.filterwarnings('ignore') # to prevent undesirable warning at the end

import pandas as pd
import folium
import json

In this part we first import and adjust the data collected in notebook 1, then feed the adjusted data into Folium to generate a map visualization.
* __PART 2.1 : ADJUSTING DATA__ Modify our data so that it suits Folium's requirements
* __PART 2.2 : VISUALIZATION__ Feed data into Folium to generate the map
* __BONUS__ Generate an alternative map aggregated by spoken language, with the Röstigraben border.
* __CONCLUSION__

# 2.1 Modify our data so that it suits Folium's requirements

We'll need a list of all existing Swiss cantons

In [2]:
SWISS_CANTONS = ['ZH', 'BE', 'LU', 'UR', 'SZ', 'OW', 'NW', 'GL', 'ZG', 'FR', 'SO', 'BS', 'BL',
          'SH', 'AR', 'AI', 'SG', 'GR', 'AG', 'TG', 'TI', 'VD', 'VS', 'NE', 'GE', 'JU']

Now let's import the data we exported from the first notebook

In [3]:
universities_df = pd.read_csv('data/universities.csv', 
                              index_col='University', 
                              usecols=["University","Approved Amount","Canton"]).dropna()
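
To make the effect of `usecols`, `index_col` and `dropna()` concrete, here is a minimal sketch on an in-memory CSV. The CSV content below is hypothetical and only mimics the layout assumed for `data/universities.csv`; it is not the real data.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for data/universities.csv
csv_text = """University,Approved Amount,Canton,Latitude
Uni A,1000.0,ZH,47.37
Uni B,,VD,46.52
Uni C,250.0,,46.95
"""

# Keep only the three columns we need, index by university name,
# and drop every row that has a missing amount or a missing canton
toy_df = pd.read_csv(io.StringIO(csv_text),
                     index_col='University',
                     usecols=['University', 'Approved Amount', 'Canton']).dropna()
print(toy_df)
```

Only the row with both an amount and a canton survives, which is why universities without a known canton never reach the map.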

We check if the dataframe is well structured

In [4]:
universities_df.head()

University,Approved Amount,Canton
Allergie- und Asthmaforschung - SIAF,19169960.0,GR
Berner Fachhochschule - BFH,31028700.0,BE
Biotechnologie Institut Thurgau - BITG,2492535.0,TG
Centre de rech. sur l'environnement alpin - CREALP,1567678.0,VS
EPF Lausanne - EPFL,1175316000.0,VD


There are a few problems in our data, namely:
- Some cantons don't make sense (e.g. *Lazio* and *HE*), so we have to filter them out __(see function `remove_non_cantons()`)__.
- We would like to have all cantons represented in the dataframe, so we should append the missing cantons with a grant amount of 0 __(see function `fill_empty_cantons()`)__.

In [5]:
def remove_non_cantons(df):
    """
    Take a dataframe of universities with their grants and cantons, and keep
    only the rows whose canton is an actual Swiss canton.
    """
    
    is_a_swiss_canton = df.Canton.isin(SWISS_CANTONS)
    
    # .loc with a boolean mask replaces the removed .ix indexer
    return df.loc[is_a_swiss_canton]

In [6]:
def fill_empty_cantons(df):
    """
    In case some cantons don't appear in the DF, add them with 0 grant.
    """
    
    # get the current cantons
    
    current_cantons = list(set(df["Canton"]))

    # find which cantons are missing
    
    def difference(a, b):
        """
        For two data structures A and B, this function outputs the elements of A that are not in B
        """
        return list(set(a) - set(b))

    empty_cantons = difference(SWISS_CANTONS, current_cantons)
    
    # build a dataframe with these missing cantons with 0 grant
    
    empty_cantons_df = pd.DataFrame(data=list(zip(empty_cantons, [0]*len(empty_cantons))),
                                columns=["Canton","Approved Amount"])
    
    # sum the grants per canton, then append the missing cantons
    
    df = df.groupby("Canton").sum().reset_index()
    df = pd.concat([df, empty_cantons_df])
    df['Approved Amount'] = pd.to_numeric(df['Approved Amount'], errors='coerce')
    df = df[['Canton', 'Approved Amount']]
    
    return df
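
As a side note, the same filter-then-fill logic can be sketched more compactly with `isin()` and `reindex(fill_value=0)`. This is only an illustrative alternative on hypothetical toy data (`toy_cantons` stands in for `SWISS_CANTONS`), not the code used in the rest of this notebook.

```python
import pandas as pd

# Hypothetical toy data: two valid cantons, one invalid entry, one canton absent
toy_cantons = ['ZH', 'VD', 'GE']  # stand-in for SWISS_CANTONS
toy_df = pd.DataFrame({
    'Canton': ['ZH', 'ZH', 'VD', 'Lazio'],
    'Approved Amount': [100.0, 50.0, 30.0, 999.0],
})

# Keep only the rows whose canton is in the reference list
valid = toy_df[toy_df['Canton'].isin(toy_cantons)]

# Sum per canton, then reindex on the full list so absent cantons get 0
totals = (valid.groupby('Canton')['Approved Amount'].sum()
               .reindex(toy_cantons, fill_value=0)
               .rename_axis('Canton')
               .reset_index())
print(totals)
```

Here *Lazio* is dropped, the two *ZH* rows are summed, and *GE* appears with an amount of 0.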
    

Let's apply these functions on our dataset

In [7]:
canton_df = fill_empty_cantons(remove_non_cantons(universities_df))

In [8]:
canton_df.head()

,Canton,Approved Amount
0,AG,115269000.0
1,BE,1555048000.0
2,BL,3476142.0
3,BS,1392481000.0
4,FR,459073700.0


Now we're ready to use Folium maps

# 2.2 Generating plots

We first define the general layout of the map

In [9]:
map_ = folium.Map(location=[46.7303575,8.2950065], zoom_start=7)

Now we define a function `choropleth_map()` that adds the canton outlines and colors each canton according to the amount of grants it received. We add an optional boolean parameter `with_universities` that, when set to __True__, displays a marker at each university's position.

In [10]:
def choropleth_map(with_universities=True):
    
    map_ = folium.Map(location=[46.7303575,8.2950065], zoom_start=7)
    
    if with_universities:  # add a marker at each university's position

        # load universities with their latitude/longitude
        universities_positions = pd.read_csv('data/universities.csv', 
                                  index_col='University', 
                                  usecols=["University","Latitude", "Longitude"]).dropna()
    
        for tuple_ in universities_positions.itertuples():
            folium.Marker(location=[tuple_.Latitude, tuple_.Longitude], # marker position
                          popup=tuple_.Index).add_to(map_)               # university name shown on click
            
    
    
    # Setting the color scale
    amounts = canton_df["Approved Amount"]
    pos_amounts = amounts[amounts > 0]
    
    min_amount = int(pos_amounts.min())  # we want the min>0 so that we still isolate cantons without grants
    max_amount = int(amounts.max())
    scale = list(range(min_amount, max_amount, 
                       int((max_amount-min_amount)/5)))

    
    # Adding the choropleth layer
    map_.choropleth(geo_path='data/ch-cantons.topojson.json',
                       data=canton_df,
                       columns=['Canton','Approved Amount'],
                       key_on='feature.id',
                       fill_color='YlOrRd',  #YlGn, PuRd, YlGnBu, YlOrRd
                       legend_name='Amount',
                       topojson='objects.cantons',
                       threshold_scale = scale)

    return map_
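
The color-scale computation inside `choropleth_map()` can be looked at in isolation. The sketch below is a minimal pure-Python version of that step; the helper name `make_scale` and the toy amounts are hypothetical and not part of the notebook.

```python
def make_scale(amounts, n_steps=5):
    """
    Return evenly spaced integer thresholds, starting at the smallest
    strictly positive amount and stopping before the maximum.
    """
    positive = [a for a in amounts if a > 0]
    lo = int(min(positive))   # smallest non-zero amount
    hi = int(max(amounts))    # largest amount
    step = max(1, (hi - lo) // n_steps)  # guard against a zero step
    return list(range(lo, hi, step))

# Hypothetical amounts: two cantons without grants, three with grants
toy_amounts = [0, 0, 100, 250, 1100]
print(make_scale(toy_amounts))
```

Starting the scale at the smallest positive amount rather than at 0 means that cantons without any grant fall below the first threshold and stay visually distinct from cantons that received a small amount.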

Now we just run the function to display the map

In [11]:
map_cantons = choropleth_map(with_universities=True)
map_cantons

# (Interactive version available as `map_cantons.html`)
<a href="https://dl.dropboxusercontent.com/u/109081671/map_cantons.html"> <img src="static_map_cantons.png"></img></a>

In [12]:
map_cantons.save('map_cantons.html')

# Bonus

We now want to display the same map, but with the grants aggregated by spoken language. The boundary between the French-speaking and German-speaking cantons is known as the Röstigraben, and it looks roughly like this:

<img src="https://3.bp.blogspot.com/-82ZIXgGLZB4/U0_lDyHT_hI/AAAAAAAAGLo/tZ3l5zCn_-E/s1600/poele-rosti.jpg" style="width:30%"></img>

We define two lists of cantons: the French-speaking part (6 cantons) and the German-speaking part (20 cantons, including Ticino, which is actually Italian-speaking).

In [13]:
CANTON_ROM = ['GE', 'VD', 'FR', 'JU', 'VS', 'NE']
CANTON_ALE = ['ZH','BE','LU','UR','SZ','OW','NW','GL','ZG','SO','BS','BL','SH','AR','AI','SG','GR','AG','TG','TI']
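
Since every canton must end up in exactly one region, a quick sanity check is to verify that the two lists do not overlap and together cover all 26 cantons. The sketch below repeats the three lists so that it runs on its own; the variable names `overlap` and `missing` are introduced here for illustration.

```python
SWISS_CANTONS = ['ZH', 'BE', 'LU', 'UR', 'SZ', 'OW', 'NW', 'GL', 'ZG', 'FR', 'SO', 'BS', 'BL',
                 'SH', 'AR', 'AI', 'SG', 'GR', 'AG', 'TG', 'TI', 'VD', 'VS', 'NE', 'GE', 'JU']
CANTON_ROM = ['GE', 'VD', 'FR', 'JU', 'VS', 'NE']
CANTON_ALE = ['ZH', 'BE', 'LU', 'UR', 'SZ', 'OW', 'NW', 'GL', 'ZG', 'SO', 'BS', 'BL',
              'SH', 'AR', 'AI', 'SG', 'GR', 'AG', 'TG', 'TI']

rom, ale = set(CANTON_ROM), set(CANTON_ALE)

overlap = rom & ale                          # cantons listed in both regions
missing = set(SWISS_CANTONS) - (rom | ale)   # cantons listed in neither region

print(len(rom), len(ale), overlap, missing)
```

Both sets come out empty, so the two lists form a clean partition of `SWISS_CANTONS`.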

We add a new column called Region to specify in which region the cantons are located.

In [14]:
# Tag each canton with its region (.copy() avoids assigning to a slice of canton_df)
grants_rom = canton_df[canton_df.Canton.isin(CANTON_ROM)].copy()
grants_rom['Region'] = 'ROMANDE'
grants_ale = canton_df[canton_df.Canton.isin(CANTON_ALE)].copy()
grants_ale['Region'] = 'ALEMANIQUE'

# Concat the two dataframes
canton_withRegion_df = pd.concat([grants_rom, grants_ale], ignore_index=True)

# Group by region and sum only the approved amounts
amount_byRegion = canton_withRegion_df.groupby('Region')['Approved Amount'].sum().reset_index()
amount_byRegion

,Region,Approved Amount
0,ALEMANIQUE,7004160000.0
1,ROMANDE,5159038000.0


We assign the regional total to every canton: all cantons in the German-speaking part get the same amount, and all cantons in the French-speaking part get the same amount.

In [15]:
# Look up each canton's regional total by region name rather than by row position
region_totals = amount_byRegion.set_index('Region')['Approved Amount']
canton_withRegion_df['Amount per region'] = canton_withRegion_df['Region'].map(region_totals)

amount_perRegion = canton_withRegion_df[['Canton','Amount per region']]
amount_perRegion

,Canton,Amount per region
0,FR,5159040000.0
1,GE,5159040000.0
2,JU,5159040000.0
3,NE,5159040000.0
4,VD,5159040000.0
5,VS,5159040000.0
6,AG,7004160000.0
7,BE,7004160000.0
8,BL,7004160000.0
9,BS,7004160000.0
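
The whole bonus aggregation (tag each canton with a region, then give every canton its region's total) can also be sketched with a dictionary lookup and `groupby(...).transform('sum')`. The toy amounts and the `region_of` dictionary below are hypothetical and cover only four cantons.

```python
import pandas as pd

# Hypothetical toy amounts for four cantons
toy = pd.DataFrame({
    'Canton': ['GE', 'VD', 'ZH', 'BE'],
    'Approved Amount': [10.0, 20.0, 40.0, 5.0],
})

# Canton -> region lookup (only the cantons used in this toy example)
region_of = {'GE': 'ROMANDE', 'VD': 'ROMANDE',
             'ZH': 'ALEMANIQUE', 'BE': 'ALEMANIQUE'}
toy['Region'] = toy['Canton'].map(region_of)

# transform('sum') returns one value per row: the total of that row's region
toy['Amount per region'] = toy.groupby('Region')['Approved Amount'].transform('sum')
print(toy)
```

Because `transform` keeps the original row order and length, no explicit loop over the index is needed.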


We put this into a Folium map in the same fashion as `choropleth_map()`

In [16]:
# define min and max for the color scale

minb = int(amount_perRegion['Amount per region'].min())
maxb = int(amount_perRegion['Amount per region'].max())
scale = list(range(minb, maxb, int((maxb-minb)/2)))  # we want only 2 colors: one for the French-speaking part, one for the German-speaking part

# define the initial map layout
map_rosti = folium.Map(location=[46.7303575,8.2950065], zoom_start=7)

# add the choropleth layer
map_rosti.choropleth(geo_path='data/ch-cantons.topojson.json',
                   data=amount_perRegion,
                   columns=['Canton','Amount per region'],
                   key_on='feature.id',
                   fill_color='YlOrRd',
                   legend_name='Röstigraben frontier',
                   fill_opacity = 0.7,
                   line_opacity = 0.2,
                   topojson='objects.cantons',
                   threshold_scale = scale,
                   )

# add the Röstigraben border
border_coordinates = [
    [47.439510, 7.326860],
    [47.325022, 7.534927],
    [47.047795, 7.003491],
    [46.870970, 7.360345],
    [46.360059, 7.230347],
    [46.412430, 7.575753],
    [45.987958, 7.560415]]

border = folium.PolyLine(locations=border_coordinates, weight=5)
border.add_to(map_rosti)  # add_children() is deprecated; add_to() matches the markers above

# display
map_rosti

# (Interactive version available as `map_rosti.html`)
<a href="https://dl.dropboxusercontent.com/u/109081671/map_rosti.html"> <img src="static_map_rosti.png"></img></a>

We save the result as a standalone HTML file

In [17]:
map_rosti.save('map_rosti.html')

# Conclusion

We've seen that Zurich, and more generally the German-speaking part of Switzerland, is the big winner in terms of grants received: about 7.0 billion in approved amounts, against about 5.2 billion for the French-speaking part.