Create a map with the countries colored according to the number of abstracts received.

# Import 

In [1]:
import folium
import json
import pandas as pd
from geopy.geocoders import Nominatim
from geoip import geolite2
import pycountry
import re
from collections import Counter 

* [Folium](https://pypi.python.org/pypi/folium) to visualise with Leaflet,
* [json](https://docs.python.org/2/library/json.html) to read country information,
* [pandas](http://pandas.pydata.org/) to read the abstract data,
* [geopy](https://pypi.python.org/pypi/geopy) to get coordinates from location,
* [geoip](https://pypi.python.org/pypi/GeoIP/) to get the location from the IP address,
* [pycountry](https://pypi.python.org/pypi/pycountry) to change the country name into country code.

# Files I/O

## Input files

Two input files are necessary: 
* abstractfile, which contains the list of JSON files corresponding to the submitted abstracts (it can be obtained with `ls json > abstractfile`);
* geo_json_data, a file containing the country boundaries in JSON format. It can be obtained from https://github.com/johan/world.geo.json.

In [2]:
abstractfile = '../data/abstractslist2016.txt'
geo_json_data = r'countries.geo.json'
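
The list file can also be built from Python rather than from the shell. The following is a minimal sketch, roughly equivalent to `ls json > abstractfile`; the helper name `write_file_list`, the temporary directory, and the two file names are made up for the demonstration.

```python
import os
import tempfile


def write_file_list(directory, listfile):
    """Write the sorted names of the entries in `directory`, one per line."""
    names = sorted(os.listdir(directory))
    with open(listfile, "w") as f:
        for name in names:
            f.write(name + "\n")
    return names


# Demonstration in a throwaway directory containing two made-up files.
with tempfile.TemporaryDirectory() as tmp:
    jsondir = os.path.join(tmp, "json")
    os.mkdir(jsondir)
    for name in ("b.json", "a.json"):
        open(os.path.join(jsondir, name), "w").close()
    listfile = os.path.join(tmp, "abstractfile.txt")
    written = write_file_list(jsondir, listfile)
    with open(listfile) as f:
        listed = f.read().splitlines()
```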

## Output file

* mapfile, an HTML file showing the final map.

In [3]:
match = re.search(r'\d{4}', abstractfile)
year = match.group()
mapfile = 'countryabstract' + year + '.html'
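
The cell above assumes that the file name contains a four-digit year; if it does not, `re.search` returns `None` and `match.group()` raises an `AttributeError`. A slightly more defensive variant could look like the sketch below, where the helper name `extract_year` and the fallback value are hypothetical.

```python
import re


def extract_year(path, default="unknown"):
    """Return the first run of four digits found in `path`, or `default`."""
    match = re.search(r"\d{4}", path)
    return match.group() if match else default


# Same input as in the notebook.
abstractfile = "../data/abstractslist2016.txt"
year = extract_year(abstractfile)
mapfile = "countryabstract" + year + ".html"
```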

# Coordinates

Get the coordinates of a few places. Only `location` (the venue) is used below, to center the map; the hotel coordinates are not used in this notebook.

In [4]:
geolocator = Nominatim()
location = geolocator.geocode("Place du XX Aout, Liege, Belgium")
hotelibis1 = geolocator.geocode("41 pl de la République Francaise, 4000 Liege, Belgium")
hotelibis2 = geolocator.geocode("Rue de l'Arbre Courte Joie 380, 4000 Liège, Belgium")
hotelramada = geolocator.geocode("Quai St Léonard 36, 4000 Liège, Belgium")

# Create a list of countries from the list of abstracts

See plot_abstracts_map.ipynb for more details on this part.

In [5]:
countrylist = []
nlines = 0
with open(abstractfile, 'r') as f:
    for lines in f.readlines():
        nlines += 1
        # The dot before "json" is escaped so that it only matches a literal dot.
        match = re.search(r'(\d{4})-(\d{2})-(\d{2})_(\d{2}):(\d{2}):(\d{2})_IP_(\d+\.\d+\.\d+\.\d+)\.json', lines)
        if match:
            IP = match.group(7)
            matchIP = geolite2.lookup(IP)
            if matchIP: 
                countrylist.append(matchIP.country)
abstract_countrycount = Counter(countrylist)
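
To make the parsing step easier to follow, here is a self-contained sketch of the same idea: extract the IP address from each file name, map it to a country, and count. It does not call `geolite2`; the dictionary `FAKE_GEOIP`, the function `count_countries`, and the sample file names are all invented for illustration.

```python
import re
from collections import Counter

# Same file-name layout as in the cell above: date, time, then the submitter's IP.
FILENAME_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}_IP_(\d+\.\d+\.\d+\.\d+)\.json"
)

# Hypothetical stand-in for geolite2.lookup(IP).country, with made-up addresses.
FAKE_GEOIP = {"192.0.2.1": "BE", "192.0.2.2": "BE", "198.51.100.7": "FR"}


def count_countries(lines, lookup=FAKE_GEOIP.get):
    """Count country codes for the lines that match the file-name pattern."""
    counts = Counter()
    for line in lines:
        match = FILENAME_RE.search(line)
        if not match:
            continue  # not an abstract file name
        country = lookup(match.group(1))
        if country:  # skip IPs the lookup cannot resolve
            counts[country] += 1
    return counts


sample_lines = [
    "2016-01-15_10:22:31_IP_192.0.2.1.json\n",
    "2016-01-16_08:05:12_IP_192.0.2.2.json\n",
    "2016-01-17_19:40:00_IP_198.51.100.7.json\n",
    "2016-01-18_12:00:00_IP_203.0.113.9.json\n",  # IP absent from the lookup table
    "README.txt\n",
]
abstract_countrycount = count_countries(sample_lines)
```

Lines that do not match the pattern, and addresses that cannot be resolved, are silently skipped, as in the original loop.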

Write the counts to a data file in CSV format by looping over the keys and values of the dictionary (there is probably a more direct way to do it).

In [6]:
countryfile = 'countryabstract' + year + '.csv'
with open(countryfile, 'w') as f:
    f.write('Country,Abstracts\n')
    for ii, jj in abstract_countrycount.items():
        countrycode = pycountry.countries.get(alpha2=ii).alpha3
        f.write(countrycode + ',' + str(jj) + '\n')
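
As one possible alternative to writing the lines by hand, the counts could be written with the standard `csv` module. The sketch below is not the notebook's method: it uses made-up counts and a small hypothetical alpha-2 to alpha-3 table in place of `pycountry`, so that it runs on its own.

```python
import csv
import io
from collections import Counter

# Made-up counts standing in for abstract_countrycount.
counts = Counter({"BE": 12, "FR": 7, "DE": 3})

# Hypothetical lookup table replacing pycountry in this illustration.
ALPHA2_TO_ALPHA3 = {"BE": "BEL", "FR": "FRA", "DE": "DEU"}


def write_country_csv(counts, handle):
    """Write a 'Country,Abstracts' table, most frequent country first."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Country", "Abstracts"])
    for alpha2, n in counts.most_common():
        writer.writerow([ALPHA2_TO_ALPHA3[alpha2], n])


# An in-memory buffer is used here; a file opened with open(..., "w") works the same way.
buffer = io.StringIO()
write_country_csv(counts, buffer)
csv_text = buffer.getvalue()
```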

Read the abstract counts back into a pandas DataFrame:

In [7]:
abstract_data = pd.read_csv(countryfile)

# Create the map

We initialize the map centered on the colloquium venue in Liège, Belgium, with a global view (zoom = 1.5).

In [8]:
map_clq = folium.Map(location=[location.latitude, location.longitude], 
                     zoom_start=1.5)
map_clq.simple_marker([location.latitude, location.longitude], popup='Colloquium Venue')
#map_clq.geo_json('countries.geo.json')

Then we add the information on the abstracts:

In [11]:
map_clq.geo_json(geo_path=geo_json_data, data=abstract_data,
             columns=['Country','Abstracts'],
             key_on='feature.id',
             fill_color='YlGn', fill_opacity=0.5, line_opacity=0.2,
             legend_name='Number of submitted abstracts',
             reset=True)
# map_clq.lat_lng_popover()
map_clq.create_map(mapfile)
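
The `key_on='feature.id'` argument joins each GeoJSON feature to a row of `abstract_data` through the feature's `id` field, which is presumably why the CSV file stores three-letter country codes. A quick sanity check is to list the codes that have no matching feature. The sketch below uses a tiny inline GeoJSON fragment with empty geometries instead of the real `countries.geo.json`; the function name `unmatched_codes` and the sample codes are hypothetical.

```python
import json

# Minimal made-up GeoJSON with two features; the real file has one per country.
geojson_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "id": "BEL", "properties": {"name": "Belgium"}, "geometry": null},
  {"type": "Feature", "id": "FRA", "properties": {"name": "France"}, "geometry": null}
]}
"""


def unmatched_codes(codes, geojson):
    """Return the codes that do not appear as the `id` of any feature."""
    feature_ids = {feature.get("id") for feature in geojson["features"]}
    return sorted(set(codes) - feature_ids)


geojson = json.loads(geojson_text)
missing = unmatched_codes(["BEL", "FRA", "XXX"], geojson)
```

Countries listed in `missing` would not be colored on the map, so an empty list is the expected result for real data.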

In [12]:
map_clq