# Introduction

The following script creates a geographic map of scientific mobility. It shows scientists moving from one institute to another in a given year.

# Preprocessing

IMPORTANT: You need the geolocations (latitude and longitude) of the host and guest institutes of each scientist. If you do not have them, use 'geopy-googlemaps-batchgeocoder' from GitHub to generate them (this requires a paid Google API key). As described in the 'geopy-googlemaps-batchgeocoder' folder, create a csv input file with the names of the institutes, the address line and the city. Run the batch geocoder's Python code twice: once for the host institutes and once for the guest institutes.

When processing the csv output file, first open Excel and then import the data with the import button. Before clicking the finish button, convert the geodata columns to text so that the latitude and longitude values keep their proper format.
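If you prefer to stay in Python, the same "read as text first" idea can be sketched with pandas. This is only an illustration: the column names `institute`, `latitude` and `longitude` and the sample rows are assumptions, so check the actual header of your geocoder output file and adjust them.

```python
import io
import pandas as pd

# Hypothetical stand-in for a geocoder output file; real column names may differ.
sample_csv = io.StringIO(
    "institute,latitude,longitude\n"
    "Institute A,52.0907,5.1214\n"
    "Institute B,48.8566,2.3522\n"
)

def load_geodata(source):
    """Read the csv with every column as text, then convert the coordinates to numbers."""
    geo = pd.read_csv(source, dtype=str)
    for col in ("latitude", "longitude"):
        # Values that cannot be parsed become NaN instead of raising an error
        geo[col] = pd.to_numeric(geo[col], errors="coerce")
    return geo

geodata = load_geodata(sample_csv)
print(geodata)
```

Reading everything as text first means no value is silently reinterpreted during import; rows whose coordinates end up as NaN can then be checked by hand.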

# Further processing

The data from the csv output files of the host and guest institutes are combined with information such as applicant, title, year and name of the institution. An example Excel file is provided with this repository; its column names match those used in the script below. Alternatively, you can change the column names in the script to match your own Excel file. The output of the Python code is a map in html format.
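As a rough sketch of this combining step, the host and guest geodata can be joined to the mobility records with pandas. The target column names (`origin_lat`, `origin_lng`, `destination_lat`, `destination_lng`, `Host institute`, `Guest institute`) are the ones the script uses; the geodata column names, the helper `add_coordinates` and all sample values are hypothetical.

```python
import pandas as pd

# Hypothetical mobility records, one row per scientist
mobility = pd.DataFrame({
    "Scientist": ["Scientist A", "Scientist B"],
    "Year": [2019, 2019],
    "Host institute": ["Institute A", "Institute B"],
    "Guest institute": ["Institute C", "Institute A"],
})

# Hypothetical geocoder results for the host and the guest institutes
host_geo = pd.DataFrame({
    "institute": ["Institute A", "Institute B"],
    "latitude": [52.09, 48.86],
    "longitude": [5.12, 2.35],
})
guest_geo = pd.DataFrame({
    "institute": ["Institute C", "Institute A"],
    "latitude": [40.71, 52.09],
    "longitude": [-74.01, 5.12],
})

def add_coordinates(records, geo, key, prefix):
    """Attach latitude/longitude from geo to records as <prefix>_lat and <prefix>_lng."""
    renamed = geo.rename(columns={
        "institute": key,
        "latitude": prefix + "_lat",
        "longitude": prefix + "_lng",
    })
    # A left join keeps every mobility record, even if an institute was not geocoded
    return records.merge(renamed, on=key, how="left")

combined = add_coordinates(mobility, host_geo, "Host institute", "origin")
combined = add_coordinates(combined, guest_geo, "Guest institute", "destination")
print(combined)
```

The resulting table can then be saved as an Excel file and extended with the remaining columns that the script expects.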

# Script

In [1]:
#Importing the modules required for the analysis
import os
import tkinter as tk
from tkinter import filedialog

#pandas for reading and handling the data
import pandas as pd
#folium for plotting maps
import folium

#Showing the current working directory; the map is saved there
print(os.getcwd())


#Selecting the Excel file with the scientific mobility data through a Tkinter file dialog
#Close the Tkinter window after the import so that the notebook can continue
root= tk.Tk()

canvas1 = tk.Canvas(root, width = 300, height = 300, bg = 'lightsteelblue')
canvas1.pack()

def getExcel ():
    global df
    
    import_file_path = filedialog.askopenfilename()
    df = pd.read_excel (import_file_path)
    #print(df)
    
browseButton_Excel = tk.Button(text='Import Excel File', command=getExcel, bg='green', fg='white', font=('helvetica', 12, 'bold'))
canvas1.create_window(150, 150, window=browseButton_Excel)

root.mainloop()


In [2]:
#Converting the datatypes to strings so that folium can process them
df['Guest institute'] = df['Guest institute'].astype(str)
df['Host institute'] = df['Host institute'].astype(str)
df['Year'] = df['Year'].astype(str)
df['icon_num']=df['icon_num'].astype(str)
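If your Excel file uses different column names, the conversions above and the map code below fail with a `KeyError`. A small check like the following sketch can report which columns are missing before you run them. The list is collected from the columns referenced in this script; the helper name `missing_columns` and the example frame are illustrative only.

```python
import pandas as pd

# Columns referenced by the script in this notebook
REQUIRED_COLUMNS = [
    "Scientist", "Gender", "Year", "Title", "Summary", "Researchtime",
    "Host institute", "Guest institute", "icon_num",
    "origin_lat", "origin_lng", "destination_lat", "destination_lng",
]

def missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from frame."""
    return [name for name in required if name not in frame.columns]

# Illustrative frame with only a few of the expected columns
example = pd.DataFrame(columns=["Scientist", "Year", "Title"])
print(missing_columns(example))
```

In the notebook you would call `missing_columns(df)` on the imported data; an empty list means all expected columns are present.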

In [3]:
#import folium.plugins for Markercluster
from folium import plugins
#generating a map
mapa = folium.Map(location=[df.origin_lat.mean(axis=0), df.origin_lng.mean(axis=0)], tiles= "CartoDB Positron", zoom_start=2)
#adding layers to map, default 'off' for cluster host universities
fg0=folium.FeatureGroup(name='Host universities cluster', show=False )
mapa.add_child(fg0)
fg1=folium.FeatureGroup(name='Guest universities cluster' )
mapa.add_child(fg1)
#adding marker clusters to respective layers
marker_cluster0=plugins.MarkerCluster().add_to(fg0)
marker_cluster1=plugins.MarkerCluster().add_to(fg1)
folium.LayerControl(collapsed=False, autoZIndex=False ).add_to(mapa)
#adding markers to marker cluster for each host (origin) university
for _, row in df.iterrows():
    popup_html = ("<b>Title research :</b>" + row['Title']
                  + "<br> <b>Summary : </b>" + row['Summary']
                  + "<br> <b>Researchtime : </b>" + row['Researchtime'])
    tooltip_text = (row['Scientist'] + ' (' + row['Gender'] + ')' + ' from ' + row['Host institute']
                    + ' went ' + row['Year'] + ' to ' + row['Guest institute'] + '.'
                    + ' Click for research information.')
    folium.Marker(
        [row['origin_lat'], row['origin_lng']],
        popup=folium.Popup(popup_html, max_width=600),
        icon=plugins.BeautifyIcon(number=row['icon_num'], border_color='blue', text_color='blue',
                                  inner_icon_style='margin-top: 0px;'),
        tooltip=tooltip_text,
    ).add_to(marker_cluster0)

#adding markers to marker cluster for each guest (destination) university
for _, row in df.iterrows():
    popup_html = ("<b>Title research :</b>" + row['Title']
                  + "<br> <b>Summary : </b>" + row['Summary']
                  + "<br> <b>Researchtime : </b>" + row['Researchtime'])
    tooltip_text = (row['Scientist'] + ' (' + row['Gender'] + ')' + ' from ' + row['Host institute']
                    + ' came ' + row['Year'] + ' to ' + row['Guest institute'] + '.'
                    + ' Click for research information.')
    folium.Marker(
        [row['destination_lat'], row['destination_lng']],
        popup=folium.Popup(popup_html, max_width=600),
        icon=plugins.BeautifyIcon(number=row['icon_num'], border_color='red', text_color='red',
                                  inner_icon_style='margin-top: 0px;'),
        tooltip=tooltip_text,
    ).add_to(marker_cluster1)

#adding lines going from host institutes to guest institutes
for _, row in df.iterrows():
    folium.PolyLine(
        [[row['origin_lat'], row['origin_lng']],
         [row['destination_lat'], row['destination_lng']]],
        weight=1.0, opacity=0.25, color="#008B9F",
        tooltip=row['Host institute'] + ' -> ' + row['Guest institute'],
    ).add_to(mapa)

#adding title to the map
loc = 'Scientific mobility year xxxx'
title_html = '''
             <h3 align="center" style="font-size:16px"><b>{}</b></h3>
             '''.format(loc)   
mapa.get_root().html.add_child(folium.Element(title_html))
#saving the map in the folder where your current jupyter notebook is.
mapa.save('Scientific mobility year xxxx.html')
mapa

# User manual html file

The map shows the number of scientists per region, depending on the zoom level. The cluster markers are colored by quantity: clusters with more than 10 scientists are yellow to orange-red, clusters with 2 to 10 scientists are green. If you hover over a cluster marker, you can see the area it covers. There are two layers of information, host institutes and guest institutes, with the guest institute layer switched on by default.

Zooming in shows the number of scientists per region. Zooming in further gives the number of scientists per university. At the last zoom level each scientist is listed by address, with the number 1 in the circle. If you hover over a circle, you get information about the scientist. If you click on it, you see the specific research information. The guest addresses are shown in red and the host addresses in blue.