## Mapping tutorial with GeoPandas and Matplotlib

This tutorial shows how to create a simple choropleth map of the five regional health administrations of mainland Portugal (North, Center, Lisbon and Tagus Valley, Alentejo, and Algarve).

[GeoPandas](https://geopandas.org/) handles the geospatial data and [Matplotlib](https://matplotlib.org/) does the plotting, so make sure both packages are installed. [pandas](https://pandas.pydata.org/) is used to load and manage the COVID-19 data.

Besides the [shapefile](https://en.wikipedia.org/wiki/Shapefile) of mainland Portugal divided into the five regional health administrations, a similar file covering the autonomous regions is also available, and each map also comes in [GeoJSON](https://en.wikipedia.org/wiki/GeoJSON) format. You can replace the file used in this tutorial with any of these files and create new charts!

In [None]:
import geopandas
import matplotlib.pyplot as plt
import pandas as pd

%matplotlib inline

In [None]:
PATH_MAP = "../extra/mapas/portugal_continental/"
PATH_DATA = "../"

In [None]:
VARIABLE = "confirmados"
MAP_NAME = "Portugal Continental"

NOMENCLATURE = {
    MAP_NAME: "portugal_continental"
}

In [None]:
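# read_file returns a GeoDataFrame, which works like a pandas DataFrame
# with an additional geometry column.
# PATH_MAP already ends with "/", so no extra separator is added here.

df_map = geopandas.read_file(f"{PATH_MAP}{NOMENCLATURE[MAP_NAME]}.shp")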

In [None]:
df_map.head()

In [None]:
df_data = pd.read_csv(f"{PATH_DATA}data.csv")

In [None]:
df_data.tail()

In [None]:
# Let's consider confirmed cases from the most recent date available.
# The last row is the most recent one only if the CSV is in chronological order.

df_most_recent_date = df_data.tail(1)
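
Taking the last row relies on the file being sorted by date. If that is not guaranteed, the most recent row can be selected explicitly, as in this minimal sketch. The rows, the `confirmados` values, and the day-month-year date format are assumptions made for the example; check the actual format of the `data` column before reusing it.

```python
import pandas as pd

# Hypothetical rows, deliberately out of chronological order.
df_example = pd.DataFrame(
    {
        "data": ["02-01-2021", "03-01-2021", "01-01-2021"],
        "confirmados": [20, 30, 10],
    }
)

# Assumption: dates are written as day-month-year.
parsed_dates = pd.to_datetime(df_example["data"], format="%d-%m-%Y")

# Select by label inside a list to keep a one-row DataFrame,
# so that `.item()` still works on each of its columns.
df_latest = df_example.loc[[parsed_dates.idxmax()]]

print(df_latest)
```

Because `df_latest` is still a one-row DataFrame, it can stand in for `df_most_recent_date` in the following cells.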

In [None]:
df_most_recent_date

In [None]:
confirmed_col = {
    "Alentejo": df_most_recent_date["confirmados_arsalentejo"].item(),
    "Algarve": df_most_recent_date["confirmados_arsalgarve"].item(),
    "Centro": df_most_recent_date["confirmados_arscentro"].item(),
    "Norte": df_most_recent_date["confirmados_arsnorte"].item(), 
    "RLVT": df_most_recent_date["confirmados_arslvt"].item()
}
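
The dictionary above repeats the same lookup for every region. A possible alternative, sketched here with made-up numbers, is to keep only the region-to-suffix mapping and build the values with a comprehension. `REGION_SUFFIX`, `df_row`, and `values_by_region` are hypothetical names introduced for this example, and the counts are not real data.

```python
import pandas as pd

# Hypothetical one-row frame standing in for df_most_recent_date.
df_row = pd.DataFrame(
    {
        "confirmados_arsalentejo": [10],
        "confirmados_arsalgarve": [20],
        "confirmados_arscentro": [30],
        "confirmados_arsnorte": [40],
        "confirmados_arslvt": [50],
    }
)

# Region label used in the map -> column suffix used in the CSV.
REGION_SUFFIX = {
    "Alentejo": "arsalentejo",
    "Algarve": "arsalgarve",
    "Centro": "arscentro",
    "Norte": "arsnorte",
    "RLVT": "arslvt",
}

variable = "confirmados"

# Build {region: value} by looking up "<variable>_<suffix>" for each region.
values_by_region = {
    region: df_row[f"{variable}_{suffix}"].item()
    for region, suffix in REGION_SUFFIX.items()
}

print(values_by_region)
```

This only helps for other variables if their columns follow the same `<variable>_<suffix>` naming, which is worth verifying in the CSV first.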

In [None]:
# For each regional health administration, let's add the number of confirmed cases in a new column.

df_map[VARIABLE] = df_map["CCDR"].map(confirmed_col)
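
`Series.map` with a dictionary returns `NaN` for any value that has no matching key, so a region name that differs between the map and the dictionary would silently end up without data. A quick check along these lines can catch that; the region names and counts below are invented for the example, with "Lisboa" included on purpose as a mismatch.

```python
import pandas as pd

# Hypothetical region names, standing in for df_map["CCDR"].
region_names = pd.Series(["Norte", "Centro", "Lisboa"], name="CCDR")

# Hypothetical values keyed by region; "Lisboa" is deliberately missing.
values = {"Norte": 100, "Centro": 50, "RLVT": 70}

mapped = region_names.map(values)

# Regions whose name had no key in the dictionary come back as NaN.
unmatched = region_names[mapped.isna()].tolist()

print(unmatched)
```

An empty list means every region in the map received a value.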

In [None]:
# Finally, let's generate a choropleth map of a GeoDataFrame with Matplotlib.

fig, ax = plt.subplots(figsize=(15,6))

ax.set_title(f"Confirmed cases in {MAP_NAME}: {df_most_recent_date['data'].item()}", loc="left", pad=12.0)
ax.axis('off')

df_map.plot(
    column=VARIABLE, 
    cmap='Blues', 
    ax=ax,
    legend=True,
    linewidth=0.5,
    edgecolor='0.8'
)

fig.tight_layout()
# plt.savefig('map.png', dpi=300, bbox_inches='tight')
plt.show()