# <center>Folium Practices  </center>


### Importing Libraries

In [40]:
#basic
import pandas as pd
import numpy as np
import plotly.express as px

#read encoding
import chardet

#url requests
import urllib.request

# geospatial
import folium
from folium import plugins


<br>

## Fetching Data

### About Dataset

Two kinds of datasets are used in this notebook:

* External semi-automatic defibrillators located outside the healthcare environment
* New cases of cancer by autonomous communities in the last year

<br>

Sources: 

* https://abertos.xunta.gal/busca-de-datos

* https://www.epdata.es/datos/cancer-espana-datos-estadisticas/289?accion=3

* https://github.com/codeforgermany/click_that_hood/blob/main/public/data/spain-communities.geojson

<br> 

**External semi-automatic defibrillators located outside the healthcare environment**

From: https://abertos.xunta.gal/catalogo/saude-servicios-sociais/-/dataset/0604/desfibriladores-semiautomaticos-externos-situados

Semi-automatic external defibrillators, also known as DESA, are a medical product intended to analyze the heart rhythm, identify fatal arrhythmias requiring defibrillation and administer an electric shock with the aim of restoring a viable heart rhythm with high levels of safety.

This is the definition given in Decree 38/2017 of March 23, which regulates their installation and use in the healthcare field in Galicia and also creates the Registry of External Defibrillators in Galicia.

The file contains the information collected in this registry about:

* the registration number of each defibrillator
* the public or private entity that requested it
* the place where it is located (postal address and geographical coordinates)
* the type of facility where it is installed (public administrations or institutions, pilgrim hostels, airports, stations, educational centers, private companies, sports facilities, municipal swimming pools...).

In [2]:
#pip install chardet

In [3]:
# Detect the file encoding, then read the local CSV file with it

# Detect encoding
with open("./equipos-DESA-061.csv", 'rb') as f:
    result = chardet.detect(f.read())

encoding_detected = result['encoding']
print(f"Detected encoding: {encoding_detected}")

# Read
data_DESA = pd.read_csv("./equipos-DESA-061.csv", encoding=encoding_detected)

data_DESA.head()


Detected encoding: Windows-1252


Unnamed: 0,codequipo,solicitante,ubicacion,municipio,provincia,lat,lon,tipoInstalacion
0,1002,USC (Campus Santiago),Enfermaría Piscina Universitaria,SANTIAGO,A CORUÑA,42.878,-8.5552,En centros docentes
1,1003,Aeropuerto de Vigo,Aeroporto de Vigo,VIGO,PONTEVEDRA,42.2257,-8.6328,En aeropuertos o estaciones
2,1010,Deputación Provincial da Coruña,Teatro Colón,"CORUÑA, A",A CORUÑA,43.3702,-8.3986,Otros
3,1012,"Showa Denko Carbón Spain, SA",Showa Denko Carbón Spain (junto al servicio mé...,"CORUÑA, A",A CORUÑA,43.345,-8.4335,En empresa privada
4,1020,PC Concello A Coruña,Praia Lapas,"CORUÑA, A",A CORUÑA,43.3836,-8.4062,Protección civil (no amb.)
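
If `chardet` is not installed, or its guess looks unreliable, one alternative is to try a short list of candidate encodings in order. The following is a minimal sketch using only the standard library. The helper name `guess_encoding` and the candidate list are assumptions made for illustration; they are not part of the original notebook.

```python
from pathlib import Path


def guess_encoding(path, candidates=("utf-8-sig", "cp1252")):
    """Return the first candidate encoding that decodes the whole file.

    Order matters: UTF-8 is strict, so it is tried first. cp1252 accepts
    almost any byte sequence, so a match there is only a plausible guess.
    """
    raw = Path(path).read_bytes()
    for enc in candidates:
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"None of {candidates} could decode {path}")


# Hypothetical usage with the file from this notebook:
# encoding_detected = guess_encoding("./equipos-DESA-061.csv")
```

Because this approach only checks whether decoding succeeds, it cannot tell apart two single-byte encodings that both accept the file, so the result is worth confirming by inspecting accented names such as `CORUÑA` after loading.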


<br>

**New cases of cancer by autonomous communities in the last year**

From: https://www.epdata.es/nuevos-casos-cancer-cada-100000-personas-comunidades-autonomas/37a78337-e295-45f8-8ba1-70dea98075c3

GEOJson: https://github.com/codeforgermany/click_that_hood/blob/main/public/data/spain-communities.geojson

This dataset contains the new cases of cancer detected in the last year (2022) per 100,000 inhabitants.



In [4]:
# Detect the file encoding, then read the local CSV file with it

# Detect encoding
with open("./spain_cancer.csv", 'rb') as f:
    result = chardet.detect(f.read())

encoding_detected = result['encoding']
print(f"Detected encoding: {encoding_detected}")


# Read (the file uses ';' as the field separator)
data_cancer = pd.read_csv("./spain_cancer.csv", encoding=encoding_detected, sep=';')

data_cancer.head()

Detected encoding: UTF-8-SIG


Unnamed: 0,Año,Periodo,Parámetro,Nuevos casos detectados de cáncer
0,2022,Año,Andalucía,56727
1,2022,Año,Aragón,66567
2,2022,Año,Asturias,76467
3,2022,Año,Canarias,54518
4,2022,Año,Cantabria,68111


<br>

### Get Familiar with Data

**External semi-automatic defibrillators located outside the healthcare environment**

In [5]:
# No null values, 8 columns
data_DESA.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1606 entries, 0 to 1605
Data columns (total 8 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   codequipo        1606 non-null   int64  
 1   solicitante      1606 non-null   object 
 2   ubicacion        1606 non-null   object 
 3   municipio        1606 non-null   object 
 4   provincia        1606 non-null   object 
 5   lat              1606 non-null   float64
 6   lon              1606 non-null   float64
 7   tipoInstalacion  1606 non-null   object 
dtypes: float64(2), int64(1), object(5)
memory usage: 100.5+ KB


In [6]:
data_DESA.shape

(1606, 8)

<br>

**New cases of cancer by autonomous communities in the last year**

In [7]:
# rename columns ('province' holds the name of each autonomous community)
data_cancer.rename(columns={'Parámetro':'province', 'Nuevos casos detectados de cáncer':'cancer_cases'}, inplace=True)
data_cancer.head()

Unnamed: 0,Año,Periodo,province,cancer_cases
0,2022,Año,Andalucía,56727
1,2022,Año,Aragón,66567
2,2022,Año,Asturias,76467
3,2022,Año,Canarias,54518
4,2022,Año,Cantabria,68111


In [8]:
# Spain has 17 autonomous communities and 2 autonomous cities (19 in total), so the file contains extra rows
data_cancer.shape

(33, 4)

In [9]:
data_cancer


Unnamed: 0,Año,Periodo,province,cancer_cases
0,2022,Año,Andalucía,56727.0
1,2022,Año,Aragón,66567.0
2,2022,Año,Asturias,76467.0
3,2022,Año,Canarias,54518.0
4,2022,Año,Cantabria,68111.0
5,2022,Año,Castilla y León,75965.0
6,2022,Año,Castilla-La Mancha,60698.0
7,2022,Año,Catalunya,59449.0
8,2022,Año,Ceuta,45285.0
9,2022,Año,Comunidad Valenciana,6093.0


In [10]:
# We only need rows 0 to 18 (inclusive): the autonomous communities plus Ceuta and Melilla
df_cancer=data_cancer.loc[0:18,:]
df_cancer.shape

(19, 4)
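
The `SettingWithCopyWarning` messages printed by the following cells arise because `df_cancer` is a slice of `data_cancer`. One common way to avoid them is to take an explicit copy when slicing. A minimal sketch on a toy frame (the values are illustrative, not taken from the dataset):

```python
import pandas as pd

# Toy stand-in for data_cancer; the values are made up for illustration
data = pd.DataFrame(
    {"province": ["A", "B", "C"], "cancer_cases": ["1,5", "2,5", "3,5"]}
)

# .copy() makes the subset an independent DataFrame rather than a slice
subset = data.loc[0:1, :].copy()

# Assigning to the copy no longer triggers SettingWithCopyWarning
subset["cancer_cases"] = subset["cancer_cases"].str.replace(",", ".").astype(float)

print(subset["cancer_cases"].tolist())  # [1.5, 2.5]
print(data["cancer_cases"].tolist())    # ['1,5', '2,5', '3,5'] (unchanged)
```

Applied to this notebook, that would mean writing `data_cancer.loc[0:18, :].copy()` in the cell above.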

In [11]:
df_cancer.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19 entries, 0 to 18
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Año           19 non-null     object
 1   Periodo       19 non-null     object
 2   province      19 non-null     object
 3   cancer_cases  19 non-null     object
dtypes: object(4)
memory usage: 736.0+ bytes


In [12]:
# replace the decimal comma with a point and change the dtype to float
df_cancer.loc[:, 'cancer_cases'] = df_cancer['cancer_cases'].str.replace(',', '.').astype(float)

df_cancer.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19 entries, 0 to 18
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Año           19 non-null     object 
 1   Periodo       19 non-null     object 
 2   province      19 non-null     object 
 3   cancer_cases  19 non-null     float64
dtypes: float64(1), object(3)
memory usage: 736.0+ bytes


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_cancer.loc[:, 'cancer_cases'] = df_cancer['cancer_cases'].str.replace(',', '.').astype(float)
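
The string replacement above could in principle be avoided at load time, because `pd.read_csv` accepts a `decimal` argument. The sketch below uses a small in-memory CSV with made-up column names and values. It assumes every row in the column is numeric, which is not the case for the full `spain_cancer.csv` (it has extra rows beyond the 19 regions), so the real file would probably still need trimming first.

```python
import io

import pandas as pd

# Toy CSV with ';' separators and decimal commas (illustrative values only)
csv_text = "year;region;cases\n2022;Andalucia;567,27\n2022;Aragon;665,67\n"

# decimal=',' tells pandas to read '567,27' as the float 567.27
df = pd.read_csv(io.StringIO(csv_text), sep=";", decimal=",")

print(df["cases"].dtype)  # float64
```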


In [13]:
# list the current names so they can be changed to match the names used in the GeoJSON
df_cancer['province'].unique()

array(['Andalucía', 'Aragón', 'Asturias', 'Canarias', 'Cantabria',
       'Castilla y León', 'Castilla-La Mancha', 'Catalunya', 'Ceuta',
       'Comunidad Valenciana', 'Extremadura', 'Galicia', 'Islas Baleares',
       'La Rioja', 'Madrid', 'Melilla', 'Murcia', 'Navarra', 'País Vasco'],
      dtype=object)

In [14]:
df_cancer.replace({'Andalucía':'Andalucia', 'Aragón':'Aragon', 'Castilla y León':'Castilla-Leon', 
                     'Comunidad Valenciana':'Valencia', 'Islas Baleares':'Baleares', 'País Vasco':'Pais Vasco' }, inplace=True)
df_cancer.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_cancer.replace({'Andalucía':'Andalucia', 'Aragón':'Aragon', 'Castilla y León':'Castilla-Leon',


Unnamed: 0,Año,Periodo,province,cancer_cases
0,2022,Año,Andalucia,567.27
1,2022,Año,Aragon,665.67
2,2022,Año,Asturias,764.67
3,2022,Año,Canarias,545.18
4,2022,Año,Cantabria,681.11
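
A choropleth joins the data to the GeoJSON by name, so a single mismatched spelling leaves a region without a value. A quick check of which names still fail to match can catch this before plotting. The following is a minimal sketch: the helper `unmatched_names` and the toy GeoJSON are hypothetical, and the real file may use different names than the ones shown here.

```python
def unmatched_names(geojson, names, prop="name"):
    """Return the names that have no feature with a matching property."""
    available = {feature["properties"][prop] for feature in geojson["features"]}
    return sorted(set(names) - available)


# Toy GeoJSON with two features (hypothetical content, not the real file)
toy_geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Andalucia"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "Aragon"}, "geometry": None},
    ],
}

print(unmatched_names(toy_geo, ["Andalucia", "Aragón"]))  # ['Aragón']
```

With the real data, the GeoJSON could be loaded with `json.load` and the second argument would be `df_cancer['province']`; an empty list means every row has a matching feature.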


<br>

### Introduction to Folium 

In [15]:
#Galicia Map

Galicia_map=folium.Map(location=[42.755, -7.866111], zoom_start=8, tiles="OpenStreetMap")

Galicia_map

<br> 

### Map with Markers

Visualize locations of DESA in Galicia

In [16]:
# let's start again with a clean copy of the map
Galicia_map=folium.Map(location=[42.755, -7.866111], zoom_start=8, tiles="OpenStreetMap")


# instantiate a marker cluster object for the DESA in the dataframe
DESA = plugins.MarkerCluster().add_to(Galicia_map)

# loop through the dataframe and add each data point to the marker cluster
for lat, lng, label in zip(data_DESA.lat, data_DESA.lon, data_DESA.tipoInstalacion):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,  # the popup shows the type of facility
    ).add_to(DESA)

# display map
Galicia_map
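
The popup above shows only the type of facility. Since a popup string is rendered as HTML, a slightly richer label can be assembled from several columns. This is a minimal sketch: the helper `popup_text` is hypothetical, and the values are escaped so that characters such as `<` or `&` in the data do not break the markup.

```python
from html import escape


def popup_text(location, municipality, kind):
    """Build a small HTML snippet for a marker popup."""
    return f"<b>{escape(location)}</b><br>{escape(municipality)}<br>{escape(kind)}"


print(popup_text("Teatro Colón", "CORUÑA, A", "Otros"))
# <b>Teatro Colón</b><br>CORUÑA, A<br>Otros
```

In the loop above, this would mean also zipping over `data_DESA.ubicacion` and `data_DESA.municipio` and passing the result as `popup=`.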

<br> 

### Choropleth Maps 

Cases of cancer in 2022 by autonomous community in Spain


In [17]:
# geoJson file
spain_geo_path = './spain-communities.geojson'

In [52]:
scale=list(range(400, 801, 75))
scale

[400, 475, 550, 625, 700, 775]
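
The fixed scale above works as long as every value lies between its first and last edge. As far as the folium documentation describes, `folium.Choropleth` expects all data values to fall inside the supplied `bins`, so an alternative is to derive the edges from the data itself. A minimal sketch with a hypothetical helper `even_bins`:

```python
def even_bins(values, n_bins=6):
    """Return n_bins + 1 evenly spaced edges spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    edges = [lo + i * step for i in range(n_bins)]
    edges.append(hi)  # use the exact maximum as the last edge
    return edges


print(even_bins([400.0, 550.0, 700.0], n_bins=3))
# [400.0, 500.0, 600.0, 700.0]
```

With the notebook's data this could be called as `even_bins(df_cancer['cancer_cases'])`. Round numbers such as the hand-written scale are easier to read in a legend, so this is a trade-off rather than a strict improvement.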

In [55]:
# create a base map of Spain
spain_map = folium.Map(location=[37.5, -4], zoom_start=5, tiles='Cartodb positron')

# generate a choropleth map of the new cancer cases per 100,000 inhabitants in 2022
# (folium.Choropleth replaces the deprecated Map.choropleth method)
folium.Choropleth(
    geo_data=spain_geo_path,
    data=df_cancer,
    columns=['province', 'cancer_cases'],
    key_on='feature.properties.name',
    fill_color='OrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Cases of cancer per 100,000 inhabitants in 2022',
    # bins=scale,
).add_to(spain_map)


# display map
spain_map

In [51]:
df_cancer

Unnamed: 0,Año,Periodo,province,cancer_cases
0,2022,Año,Andalucia,567.27
1,2022,Año,Aragon,665.67
2,2022,Año,Asturias,764.67
3,2022,Año,Canarias,545.18
4,2022,Año,Cantabria,681.11
5,2022,Año,Castilla-Leon,759.65
6,2022,Año,Castilla-La Mancha,606.98
7,2022,Año,Catalunya,594.49
8,2022,Año,Ceuta,452.85
9,2022,Año,Valencia,609.3


<br>

# Author

<a>Eva Villar Álvarez</a>

2024/02/05

<br>

 **Change Log**

| Date (YYYY-MM-DD) | Version |      Change Description                                   | 
| ----------------- | ------- |---------------------------------------------------------- |