# Interactive visualization homework overview
In this homework, we build an interactive visualization of the grants received from the SNSF in each canton. The data is the P3 export available on the [SNSF website](http://p3.snf.ch/), in the file P3_GrantExport.csv.
To do so,
* we first load the data with pandas;
* we keep only the columns of interest (university name and amount of money approved for each project);
* we keep only the rows of interest (those corresponding to Swiss universities, that is, any row with a non-NaN 'University' entry);
* we then map the universities to their cantons using the [Geonames Full Text Search API in JSON](http://www.geonames.org/export/web-services.html) together with some manual tuning;
* we finally display the results with folium on a map of Switzerland, using a choropleth map.

## Import libraries and load data

In [1]:
import numpy as np
import pandas as pd
import folium
import requests

In [2]:
geo = r'ch-cantons.topojson.json'  # Geolocalization of the cantons
grants_csv = r'P3_GrantExport.csv'  # P3 data
grants_df = pd.read_csv(grants_csv, delimiter=';')  # The export is a csv file with ';' as delimiter

In [3]:
grants_df.head()

Unnamed: 0,Project Number,Project Title,Project Title English,Responsible Applicant,Funding Instrument,Funding Instrument Hierarchy,Institution,University,Discipline Number,Discipline Name,Discipline Name Hierarchy,Start Date,End Date,Approved Amount,Keywords
0,1,Schlussband (Bd. VI) der Jacob Burckhardt-Biog...,,Kaegi Werner,Project funding (Div. I-III),Project funding,,Nicht zuteilbar - NA,10302,Swiss history,Human and Social Sciences;Theology & religious...,01.10.1975,30.09.1976,11619.0,
1,4,Batterie de tests à l'usage des enseignants po...,,Massarenti Léonard,Project funding (Div. I-III),Project funding,Faculté de Psychologie et des Sciences de l'Ed...,Université de Genève - GE,10104,Educational science and Pedagogy,"Human and Social Sciences;Psychology, educatio...",01.10.1975,30.09.1976,41022.0,
2,5,"Kritische Erstausgabe der ""Evidentiae contra D...",,Kommission für das Corpus philosophorum medii ...,Project funding (Div. I-III),Project funding,Kommission für das Corpus philosophorum medii ...,"NPO (Biblioth., Museen, Verwalt.) - NPO",10101,Philosophy,Human and Social Sciences;Linguistics and lite...,01.03.1976,28.02.1985,79732.0,
3,6,Katalog der datierten Handschriften in der Sch...,,Burckhardt Max,Project funding (Div. I-III),Project funding,Abt. Handschriften und Alte Drucke Bibliothek ...,Universität Basel - BS,10302,Swiss history,Human and Social Sciences;Theology & religious...,01.10.1975,30.09.1976,52627.0,
4,7,Wissenschaftliche Mitarbeit am Thesaurus Lingu...,,Schweiz. Thesauruskommission,Project funding (Div. I-III),Project funding,Schweiz. Thesauruskommission,"NPO (Biblioth., Museen, Verwalt.) - NPO",10303,Ancient history and Classical studies,Human and Social Sciences;Theology & religious...,01.01.1976,30.04.1978,120042.0,


## Choose data of interest
We are mainly interested in the 'University' and 'Approved Amount' fields, so we drop most of the other columns (the cell below also keeps 'Institution'). Moreover, some entries contain 'Nicht zuteilbar - NA' to indicate that no information was given, so we set them to NaN. The documentation of the P3 dataset gives some more information: the NaN entries in the 'University' field correspond to non-Swiss university partnerships, so these rows can be discarded without consequence for what we want to analyse.

Select the columns and replace 'Nicht zuteilbar - NA' with NaN.

In [4]:
grants_uni_df = grants_df[['Institution', 'University','Approved Amount']].replace('Nicht zuteilbar - NA', np.nan)
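The same selection could also be done at load time. The snippet below is only a minimal sketch on a small inline sample: the rows are made up for illustration and are not real P3 data. It assumes the `usecols` and `na_values` parameters of `pd.read_csv` are used in place of the later `replace` call.

```python
import io

import pandas as pd

# Illustrative sample in the same ';'-separated layout as the P3 export.
# The rows are made up for the example; they are not real P3 data.
sample_csv = io.StringIO(
    "Institution;University;Approved Amount\n"
    ";Nicht zuteilbar - NA;100.0\n"
    "Some institute;Université de Genève - GE;250.5\n"
)

sample_df = pd.read_csv(
    sample_csv,
    delimiter=';',
    usecols=['Institution', 'University', 'Approved Amount'],
    na_values=['Nicht zuteilbar - NA'],  # read the placeholder directly as NaN
)
print(sample_df)
```

Doing the replacement while reading avoids a second pass over the frame, at the cost of hiding the placeholder string from anyone inspecting the raw data afterwards.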

We count the total number of entries so that we can check afterwards how many rows were deleted because their 'University' field is empty (non-Swiss institutions).

In [5]:
total_entries = grants_uni_df.shape[0]
print(total_entries)
grants_uni_df.head()

63969


Unnamed: 0,Institution,University,Approved Amount
0,,,11619.0
1,Faculté de Psychologie et des Sciences de l'Ed...,Université de Genève - GE,41022.0
2,Kommission für das Corpus philosophorum medii ...,"NPO (Biblioth., Museen, Verwalt.) - NPO",79732.0
3,Abt. Handschriften und Alte Drucke Bibliothek ...,Universität Basel - BS,52627.0
4,Schweiz. Thesauruskommission,"NPO (Biblioth., Museen, Verwalt.) - NPO",120042.0


Check whether any 'Approved Amount' value is missing.

In [6]:
print(grants_uni_df[grants_uni_df['Approved Amount'].isnull()].shape[0])

0


Count how many entries have a null 'University' field.

In [7]:
null_uni_entries = grants_uni_df[grants_uni_df['University'].isnull()].shape[0]
print(null_uni_entries)

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

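Drop the entries with a null 'University' field.

In [None]:
# Restrict dropna to 'University': a plain dropna() would also remove
# rows where only 'Institution' is missing.
grants_uni_CH_df = grants_uni_df.dropna(subset=['University'])
grants_uni_CH_df.head()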

Check that the number of rows left in the dataframe equals the size of the original dataframe minus the number of null 'University' entries.

In [None]:
print(grants_uni_CH_df[grants_uni_CH_df['University'].isnull()].shape[0])
# Compare row counts: .size would return rows * columns, not the number of rows.
grants_uni_CH_df.shape[0] == total_entries - null_uni_entries
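The bookkeeping behind this check can be illustrated on a toy frame. This is a sketch with hypothetical values (the names `toy_df` and `toy_ch_df` are introduced only for the example). It shows why the rows should be dropped on 'University' alone and why row counts rather than `.size` should be compared.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values) with the same three columns as grants_uni_df.
toy_df = pd.DataFrame({
    'Institution': [np.nan, 'Institute A', np.nan, 'Institute B'],
    'University': [np.nan, 'Universität Basel - BS',
                   'Université de Genève - GE', np.nan],
    'Approved Amount': [10.0, 20.0, 30.0, 40.0],
})

n_total = len(toy_df)
n_null_uni = toy_df['University'].isnull().sum()

# Drop only the rows whose 'University' is missing; a missing
# 'Institution' alone does not remove a row.
toy_ch_df = toy_df.dropna(subset=['University'])

# Row counts are compared with len(); DataFrame.size counts cells
# (rows * columns) instead.
print(len(toy_ch_df) == n_total - n_null_uni)
print(len(toy_ch_df), toy_ch_df.size)
```

On this toy frame a plain `dropna()` would keep a single row, because the third row has a valid university but no institution.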

## Mapping from University to Canton

In [None]:
username = 'ochanon'  # Geonames account name
url = 'http://api.geonames.org/postalCodeSearchJSON?'
parameters = {'username': username, 'placename': 'CH', 'maxRows': 1, 'operator': 'OR'}
r = requests.get(url, params=parameters)

In [None]:
r
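One possible way to turn a Geonames answer into a canton is sketched below. It is not the notebook's final method. It assumes the full text search endpoint (`searchJSON` with the `q`, `country`, `maxRows` and `username` parameters) and assumes that, for Swiss places, the `adminCode1` field of a hit holds the canton abbreviation. The helper names `canton_from_payload` and `lookup_canton` are hypothetical, and the example payload is hand-written rather than a recorded API response.

```python
import requests

GEONAMES_SEARCH_URL = 'http://api.geonames.org/searchJSON'


def canton_from_payload(payload):
    """Return the canton code of the first hit, or None if there is no hit."""
    hits = payload.get('geonames', [])
    if not hits:
        return None
    # Assumption: for Swiss places, 'adminCode1' is the canton abbreviation.
    return hits[0].get('adminCode1')


def lookup_canton(place_name, username):
    """Query Geonames for a place name, restricted to Switzerland."""
    params = {'q': place_name, 'country': 'CH', 'maxRows': 1,
              'username': username}
    response = requests.get(GEONAMES_SEARCH_URL, params=params, timeout=10)
    response.raise_for_status()
    return canton_from_payload(response.json())


# Offline illustration with a hand-written payload (not a real API response).
example_payload = {
    'geonames': [{'name': 'Genève', 'countryCode': 'CH', 'adminCode1': 'GE'}]
}
print(canton_from_payload(example_payload))
print(canton_from_payload({'geonames': []}))
```

Keeping the parsing separate from the HTTP call makes it possible to test the mapping without network access. Names that return `None` are the ones that would need the manual tuning mentioned in the overview.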

## Interactive visualization using Folium

In [None]:
swiss_map = folium.Map(location=[46.8, 8], zoom_start=8)
# data=None is a placeholder: the per-canton amounts are not computed yet.
swiss_map.choropleth(geo_path=geo, data=None,
                     columns=['Canton', 'Amount'],
                     key_on='feature.id',
                     fill_color='YlGn', fill_opacity=0.7, line_opacity=0.2,
                     legend_name='Amount of grants (CHF)')

In [None]:
swiss_map
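The choropleth call expects a table with one row per canton and the columns 'Canton' and 'Amount'. The sketch below shows one way such a table could be built once a university-to-canton mapping exists. The mapping and the amounts are hand-made for illustration, and the names `uni_to_canton`, `toy_grants` and `canton_amounts` are hypothetical.

```python
import pandas as pd

# Hypothetical, hand-made mapping and amounts; real values would come from
# the Geonames lookup and from the P3 export.
uni_to_canton = {
    'Université de Genève - GE': 'GE',
    'Universität Basel - BS': 'BS',
}
toy_grants = pd.DataFrame({
    'University': ['Université de Genève - GE', 'Universität Basel - BS',
                   'Université de Genève - GE'],
    'Approved Amount': [100.0, 50.0, 25.0],
})

# Attach the canton to each grant, then sum the amounts per canton.
toy_grants['Canton'] = toy_grants['University'].map(uni_to_canton)
canton_amounts = (
    toy_grants.groupby('Canton', as_index=False)['Approved Amount']
    .sum()
    .rename(columns={'Approved Amount': 'Amount'})
)
print(canton_amounts)
```

A frame shaped like `canton_amounts` could then be passed as the `data` argument, provided the canton codes match the feature ids of the TopoJSON file.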