# Folium II

### Dataset: Galician Libraries

**Rede de bibliotecas de Galicia**

Dataset from Opendata @bertos, Xunta de Galicia

https://abertos.xunta.gal/catalogo/cultura-ocio-deporte/-/dataset/0230/rede-bibliotecas-galicia

In [None]:
# Load libraries
import pandas as pd
import folium

In [None]:
# Display Galician libraries on a map

# Note: remember that you have to add markers one by one

In [None]:
bibliotecas = pd.read_csv('https://abertos.xunta.gal/catalogo/cultura-ocio-deporte/-/dataset/0230/rede-bibliotecas-galicia/001/descarga-directa-ficheiro.csv',sep=';')
bibliotecas.head()

In [None]:
bibliotecas.info()

The most important data that we will use are the coordinates, so we have to make sure that the data are clean and will work.

Some of the problems we may encounter are:
- null values
- blank spaces
- wrongly coded values
- wrong values
- offset columns

Depending on the problem and our objectives we can:
- correct the data
- clean the data
- delete the rows
- etc...
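
As a rough illustration of these options, the sketch below works on a small made-up DataFrame (the names and values are hypothetical, not taken from the real dataset). It assumes a text column in "latitude, longitude" form, converts it to numbers, and keeps only the rows that parse correctly and fall within valid ranges.

```python
import pandas as pd

# Hypothetical sample that mimics a "latitude, longitude" text column
sample = pd.DataFrame({
    'NOME': ['Library A', 'Library B', 'Library C', 'Library D'],
    'COORDENADAS': ['42.88, -8.54', None, ' 43.37,-8.40 ', 'not available'],
})

# Split the text into two parts and convert them to numbers;
# anything that cannot be parsed becomes NaN instead of raising an error
parts = sample['COORDENADAS'].str.split(',', expand=True)
sample['lat'] = pd.to_numeric(parts[0].str.strip(), errors='coerce')
sample['lon'] = pd.to_numeric(parts[1].str.strip(), errors='coerce')

# Keep only the rows where both values are numbers within valid ranges
valid = sample['lat'].between(-90, 90) & sample['lon'].between(-180, 180)
clean = sample[valid].reset_index(drop=True)
print(clean[['NOME', 'lat', 'lon']])
```

Whether to drop the invalid rows, as here, or to correct them by hand depends on how many there are and on what the map is for.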

In [None]:
# Are there null values?
bibliotecas[bibliotecas.COORDENADAS.isnull()]

# We could remove the lines with null values
# bibliotecas.drop(index= bibliotecas[bibliotecas.COORDENADAS.isnull()].index, inplace=True)
# bibliotecas.reset_index(drop=True,inplace=True)

In [None]:
# In this case, the coordinates are encoded in a single column, separated by a comma and a space

In [None]:
bibliotecas.COORDENADAS.head()

In [None]:
# Format check: this should return no rows, meaning every value matches the regex below
bibliotecas[bibliotecas['COORDENADAS'].str.match(r'^[0-9]+\.[0-9]+\,\s[+\-]*[0-9]+\.[0-9]+$')==False]

In [None]:
# Create a new column with coordinates without spaces
# The .str accessor skips null values instead of raising an error on them
bibliotecas['COORDENADAS_nospaces'] = bibliotecas.COORDENADAS.str.replace(' ', '', regex=False)

In [None]:
bibliotecas.drop('COORDENADAS',axis='columns',inplace=True)
bibliotecas.rename(columns={'COORDENADAS_nospaces':'COORDENADAS'},inplace=True)

In [None]:
# Create a new dataframe. Verify that all coordinates are correct
biblios = bibliotecas[bibliotecas['COORDENADAS'].str.match(r'^[0-9]+\.[0-9]+\,[+\-]*[0-9]+\.[0-9]+$')==True].copy()

In [None]:
m = folium.Map(location=[43, -8.20],zoom_start=8, width=600, height=600)

for index, biblioteca in biblios.iterrows():
    # Split once and convert both parts to float
    lat, lon = map(float, biblioteca['COORDENADAS'].split(','))
    folium.Marker([lat, lon]).add_to(m)
    # Variants: add a popup with the library name, and a custom icon
    #folium.Marker([lat, lon], popup=biblioteca['NOME']).add_to(m)
    #folium.Marker([lat, lon], popup=biblioteca['NOME'], icon=folium.Icon(icon='book')).add_to(m)
m

In [None]:
# There are several ways to iterate over the DataFrame and display the markers
m = folium.Map(location=[43, -8.20],zoom_start=8, width=600, height=600)

for i in range(len(biblios)):
    lat, lon = map(float, biblios.iloc[i]['COORDENADAS'].split(','))
    folium.Marker([lat, lon]).add_to(m)
m

In [None]:
len(bibliotecas)

In [None]:
# If we do not want to clean the data before drawing the map, we can instead handle the exceptions raised by bad rows.
m = folium.Map(location=[43, -8.20],zoom_start=8, width=600, height=600)

for i in range(len(bibliotecas)):
    coordenadas = bibliotecas.iloc[i]['COORDENADAS']
    try:
        lat, lon = map(float, coordenadas.split(','))
        folium.Marker([lat, lon]).add_to(m)
    except (AttributeError, ValueError):
        # AttributeError: null value; ValueError: wrong number of fields or non-numeric text
        print(f'Invalid data: library {i}, coordinates {coordenadas}')
        # To skip bad rows silently, replace the print with "continue" or "pass"
m

### Cluster of markers

It is possible to use a cluster to manage groups of markers:

1. The markers are added to the cluster.
2. The cluster is added to the map.

In [None]:
import folium.plugins

m = folium.Map(location=[43, -8.20],zoom_start=8, width=600, height=600)

marker_cluster = folium.plugins.MarkerCluster().add_to(m)

for index, biblioteca in biblios.iterrows():
    folium.Marker([biblioteca['COORDENADAS'].split(',')[0], biblioteca['COORDENADAS'].split(',')[1]],popup=biblioteca['NOME'],icon=folium.Icon(icon='book')).add_to(marker_cluster)

m

### KMZ/KML files

In the @bertos portal created by Xunta de Galicia we can also find the data in KML format.

https://abertos.xunta.gal/catalogo/cultura-ocio-deporte/-/dataset/0230/rede-bibliotecas-galicia

KMZ and KML files
- These are formats popularized by Google (Earth/Maps) to represent geographic information.
- KML is an XML file
- KMZ is a ZIP file where we find a doc.kml and other files with extra information.
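
Since a KMZ is just a ZIP archive, the standard library is enough to open one. The sketch below builds a tiny KMZ in memory from a hand-written KML document (the placemark is invented for illustration, and it assumes the archive stores its main file as `doc.kml`), then reads the KML back and lists the placemark names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Minimal hand-written KML document, used only for illustration
KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example library</name>
      <Point><coordinates>-8.54,42.88,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""

# Build a KMZ in memory: a ZIP archive that contains doc.kml
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as kmz:
    kmz.writestr('doc.kml', KML)

# Read the KML back out of the archive and parse it as XML
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as kmz:
    root = ET.fromstring(kmz.read('doc.kml'))

# KML elements live in a namespace, so searches need a prefix mapping
ns = {'kml': 'http://www.opengis.net/kml/2.2'}
names = [p.findtext('kml:name', namespaces=ns)
         for p in root.iterfind('.//kml:Placemark', ns)]
print(names)
```

For a KMZ downloaded from the web, the bytes of the response would take the place of `buffer.getvalue()`.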

The interesting fields for us in the KML files are:
```
<Placemark>
    <name>
        <Location>
        <longitude>
        <latitude>
```

In [None]:
# These are XML files, so we can parse them with BeautifulSoup
# KML File
url = 'https://abertos.xunta.gal/catalogo/cultura-ocio-deporte/-/dataset/0230/rede-bibliotecas-galicia/002/descarga-directa-ficheiro.kml'

import requests
import lxml # The lxml parser may need to be installed first
from bs4 import BeautifulSoup

response = requests.get(url)
#print(response.text)
soup = BeautifulSoup(response.content,'lxml')

In [None]:
# The information for each library is in a single "placemark"
# We create a list of all the libraries
bibliotecas = soup.find_all('placemark')
len(bibliotecas)

In [None]:
# Display one library
bibliotecas[0]

In [None]:
# The coordinates are stored as (longitude, latitude), the reverse of the (latitude, longitude) order that folium expects
bibliotecas[0].find('coordinates').text
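
One way to deal with the reversed order is a small helper that parses the text and swaps the two values. The function name `kml_to_latlon` is hypothetical, and the sketch assumes the text has the form `lon,lat` with an optional third altitude field.

```python
def kml_to_latlon(text):
    """Convert a KML 'lon,lat[,alt]' string into a [lat, lon] list of floats."""
    fields = text.strip().split(',')
    lon, lat = float(fields[0]), float(fields[1])
    return [lat, lon]

# The sample value is made up for illustration
print(kml_to_latlon('-8.54,42.88,0'))
```

Malformed input raises `ValueError` or `IndexError`, so the caller can decide whether to skip or report the record.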

In [None]:
for biblioteca in bibliotecas:
    print(biblioteca.find('name').text)
    print(biblioteca.find('coordinates').text)

In [None]:
# KML files can also contain errors in the coordinates
# We could build a DataFrame with the data and draw the map in the same way as before.
# We can also iterate over the data and add the markers as we go
# Instead of the "match" method of pandas.Series we will use the search function of the 're' module.
import re

m = folium.Map(location=[43, -8.20],zoom_start=8, width=600, height=600)

#^[0-9]+\.[0-9]+\,[+\-]*[0-9]+\.[0-9]+$

for biblioteca in bibliotecas:
    coords = biblioteca.find('coordinates').text
    # Raw string so the backslashes reach the regex engine unchanged
    if re.search(r'^[+\-]*[0-9]+\.[0-9]+,[0-9]+\.[0-9]+$', coords):
        lon, lat = coords.split(',')
        folium.Marker([float(lat), float(lon)]).add_to(m)
m

In [None]:
# We can also draw the map and handle errors with a try/except structure
m = folium.Map(location=[43, -8.20],zoom_start=8, width=600, height=600)

for biblioteca in bibliotecas:
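    try:
        # Keep only the first two fields in case an altitude is present
        lon, lat = biblioteca.find('coordinates').text.split(',')[:2]
        folium.Marker([float(lat), float(lon)]).add_to(m)
    except (AttributeError, ValueError):
        # Skip placemarks with missing or malformed coordinates
        pass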
m

### Multilayer maps

Folium, together with Leaflet, allows the creation of interactive multilayer maps.

In [None]:
import geopandas as gpd

In [None]:
# We can also create GeoDataFrames from Shapefiles to display on Folium maps

df_concellos = gpd.read_file('../datasets/Concellos/Concellos_IGN.shp')
#df_ferrocarril = gpd.read_file('/huge/datasets/Ferrocarril/ESTACION_FFCC.shp')

In [None]:

# class folium.features.GeoJson
# Creates a GeoJson object for plotting into a Map
# https://python-visualization.github.io/folium/modules.html#folium.features.GeoJson

m = folium.Map(location=[43, -8.20],zoom_start=8, width=600, height=600)
folium.GeoJson(df_concellos).add_to(m)
#folium.GeoJson(df_ferrocarril).add_to(m)
m

In [None]:
df_concellos.crs

In [None]:
m.crs