# Interactive Visualisation

This work aims at plotting a Choropleth map showing the amount of research funds granted to different cantons in Switzerland.

In [8]:
import numpy as np
import pandas as pd
import folium
import requests
import json
#import urllib
from urllib.request import urlopen
from urllib import parse

Load csv data into pandas dataframe. Retain only the 'University' and 'Approved Amount' columns.

In [18]:
df = pd.read_csv('P3_GrantExport.csv', sep=';')
df = df[['University','Approved Amount']]
df.head()

Unnamed: 0,University,Approved Amount
0,Nicht zuteilbar - NA,11619.0
1,Université de Genève - GE,41022.0
2,"NPO (Biblioth., Museen, Verwalt.) - NPO",79732.0
3,Universität Basel - BS,52627.0
4,"NPO (Biblioth., Museen, Verwalt.) - NPO",120042.0


In [19]:
df['Canton'] = ''; df['Latitude'] = ''; df['Longitude'] = ''
df.head()

Unnamed: 0,University,Approved Amount,Canton,Latitude,Longitude
0,Nicht zuteilbar - NA,11619.0,,,
1,Université de Genève - GE,41022.0,,,
2,"NPO (Biblioth., Museen, Verwalt.) - NPO",79732.0,,,
3,Universität Basel - BS,52627.0,,,
4,"NPO (Biblioth., Museen, Verwalt.) - NPO",120042.0,,,


Using Geonames Full Text Search API to map the universities to their respective columns. 'requests' library is used to get the HTTP response. But since 'University' names have spaces and other special characters, it needs to be UTF-8 encoded before using the requests.get. This is accomplished using 'parse.quote' method in 'urllib' library.

In [23]:
num_projects = len(df)

for i in range(num_projects):
    
    url = 'http://api.geonames.org/search?q=' + df['University'][i] + '&maxRows=2&username=ada_homework&type=json'
    ## Encode special characters and spaces
    query = parse.quote(url,safe=':/&=?')
    ## Parse JSON data
    d = json.loads(requests.get(query).text)
    if bool(d['geonames']):
        df['Canton'].iloc[i] = d['geonames'][0]['adminCode1']
        df['Latitude'].iloc[i] = d['geonames'][0]['lat']
        df['Longitude'].iloc[i] = d['geonames'][0]['lng']


{'geonames': [], 'totalResultsCount': 0}
{'geonames': [], 'totalResultsCount': 0}
{'geonames': [], 'totalResultsCount': 0}
{'geonames': [{'adminName1': 'Basel-City', 'countryCode': 'CH', 'fclName': 'spot, building, farm', 'toponymName': 'Universität Basel', 'countryName': 'Switzerland', 'adminCode1': 'BS', 'name': 'University of Basel', 'countryId': '2658434', 'lat': '47.55832', 'fcodeName': 'university', 'fcode': 'UNIV', 'lng': '7.58403', 'fcl': 'S', 'population': 0, 'geonameId': 6930308}, {'adminName1': 'Basel-City', 'countryCode': 'CH', 'fclName': 'spot, building, farm', 'toponymName': 'Universität', 'countryName': 'Switzerland', 'adminCode1': 'BS', 'name': 'Universität', 'countryId': '2658434', 'lat': '47.55707', 'fcodeName': 'bus station', 'fcode': 'BUSTN', 'lng': '7.58405', 'fcl': 'S', 'population': 0, 'geonameId': 7114328}], 'totalResultsCount': 2}
{'geonames': [], 'totalResultsCount': 0}
{'geonames': [{'adminName1': 'Fribourg', 'countryCode': 'CH', 'fclName': 'spot, building, f

KeyError: 'geonames'

Currently considering only those universities which were mapped to their cantons. 

In [24]:
## Removing all univerisites without canton mappings [TODO: To be changes]
df = df[df['Canton'] != ''] 
## Converting amount from string to float
df['Approved Amount'] = df['Approved Amount'].astype(float)
df.head()

Unnamed: 0,University,Approved Amount,Canton,Latitude,Longitude
3,Universität Basel - BS,52627.0,BS,47.55832,7.58403
5,Université de Fribourg - FR,53009.0,FR,46.80683,7.15317
6,Université de Fribourg - FR,25403.0,FR,46.80683,7.15317
7,Universität Zürich - ZH,47100.0,ZH,47.37092,8.53434
10,Université de Fribourg - FR,153886.0,FR,46.80683,7.15317


Plotting all the universities with their respective latitudes and longitudes (to verify the values obtained from Geonames full text Search API.)

In [29]:
## [Caution: Takes time to execute]
m = folium.Map(location=[46.76, 8.26], zoom_start=8, tiles='Mapbox Bright')
for i in range(len(df)):
    ## Add markers on all universities
    folium.Marker([df['Latitude'].iloc[i], df['Longitude'].iloc[i]], popup=df['University'].iloc[i],
                   icon = folium.Icon(icon = 'cloud')).add_to(m)
m

Calculate the total grant money for each canton using groupby on the 'Canton' column. 

In [26]:
## Compute total grant money for each canton
grant = df.groupby('Canton').apply(lambda x: x['Approved Amount'].sum())
canton_df = pd.DataFrame(grant,columns=['Grant'])
canton_df.reset_index(level=0, inplace=True)
canton_df

Unnamed: 0,Canton,Grant
0,BE,25697457.0
1,BS,15424819.0
2,FR,15030840.0
3,NE,5588158.0
4,ZH,29121071.0


Plot the map using JSON file (with coordinates for each canton) and using the dataframe containing grant money for each canton.

In [27]:
m = folium.Map(location=[46.76, 8.26], zoom_start=8, tiles='Mapbox Bright')
topo_path = r'ch-cantons.topojson.json'
m.geo_json(geo_path=topo_path, data=canton_df, columns=['Canton','Grant'], 
             fill_color='YlGn', fill_opacity=0.7, line_opacity=0.2,)

  return self.choropleth(*args, **kwargs)


Derive all the canton 'id's using the JSON file

In [28]:
with open('ch-cantons.topojson.json') as data_file:
    data = json.load(data_file)
canton_id = [d['id'] for d in data['objects']['cantons']['geometries']]
canton_id

['ZH',
 'BE',
 'LU',
 'UR',
 'SZ',
 'OW',
 'NW',
 'GL',
 'ZG',
 'FR',
 'SO',
 'BS',
 'BL',
 'SH',
 'AR',
 'AI',
 'SG',
 'GR',
 'AG',
 'TG',
 'TI',
 'VD',
 'VS',
 'NE',
 'GE',
 'JU']