# Coursera - Assignment 4 - COVID-19 Impact on the unemployment rate in Chile 

This notebook contains a brief study for my final assignment in the [Applied Plotting, Charting & Data Representation in Python](https://www.coursera.org/learn/python-plotting/home/welcome) course. The assignment requires visualizing two datasets from different sources. My main question is: how is COVID-19 affecting the unemployment rate in Chile?

This question is not simple to answer. The plot below shows some correlation, but it does not show that the increase in the unemployment rate is caused only by the COVID-19 scenario. A deeper study would be needed to reach a firmer conclusion. This plot is for exercise purposes only.

For this, I will use two datasets:

1) COVID-19 information provided by the Ministerio de Salud of Chile and compiled by the Ministry of Science, Technology, Knowledge and Innovation of Chile. The compiled information is available in the [Datos-COVID19 GitHub repository](https://github.com/MinCiencia/Datos-COVID19) and is organized in Data Products. For this work, I use Data Product 46. The link to the raw dataset is provided below:

https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/master/output/producto46/activos_vs_recuperados.csv

2) Unemployment rate in 2019/2020 published by the Banco Central de Chile:

https://si3.bcentral.cl/Bdemovil/BDE/Series/MOV_BD_ML3


In [5]:
%matplotlib notebook
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import requests

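from pandas.plotting import register_matplotlib_converters

# Register pandas date converters so matplotlib can plot datetime indexes.
register_matplotlib_converters()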

In [6]:
def read_active_cases_per_day():
    """ 
    Read dataset which contains the number of active cases per day in Chile. 
    
    Returns
    -------
    pd.DataFrame
    """
    url = ("https://raw.githubusercontent.com/MinCiencia/Datos-COVID19/"
           "master/output/producto46/activos_vs_recuperados.csv")
    
    df = pd.read_csv(url, index_col=0)
    df.index = pd.to_datetime(df.index, errors="coerce", format="%Y-%m-%d")
    
    return df


active_cases = read_active_cases_per_day()
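
Because `pd.to_datetime` is called with `errors="coerce"`, any index label that does not match `%Y-%m-%d` silently becomes `NaT`. Below is a minimal sketch of how such rows could be detected and dropped. It uses a small inline CSV as a stand-in for the real file: the column names, the sample values and the `clean_date_index` helper are all hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample with a date-like first column followed by numeric
# columns. The header names and values are made up for illustration.
SAMPLE_CSV = """fecha,activos,recuperados
2020-03-01,1,0
2020-03-02,3,0
not-a-date,5,1
2020-03-04,8,2
"""


def clean_date_index(df):
    """Parse the index as dates and drop rows whose label could not be parsed."""
    out = df.copy()
    out.index = pd.to_datetime(out.index, errors="coerce", format="%Y-%m-%d")
    # Rows with an unparseable label now have NaT as index; keep valid dates only.
    return out[out.index.notna()]


sample = pd.read_csv(io.StringIO(SAMPLE_CSV), index_col=0)
cleaned = clean_date_index(sample)
print(f"dropped {len(sample) - len(cleaned)} row(s) with unparseable dates")
```

Whether dropping is the right choice depends on the data. For a plot, a handful of discarded rows is probably harmless, but it is worth printing how many were lost.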

In [7]:
def read_unemployment_rate():
    """
    Read the unemployment rate straight from the Banco Central de Chile. 
    The information is not published as a CSV file but as a simple HTML 
    table, so this function parses the table and cleans it up before 
    returning the DataFrame.
    
    Returns
    -------
    pd.DataFrame
        Unemployment rate in Chile per month, from late 2019 up to 
        mid 2020.
    """    
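    url = "https://si3.bcentral.cl/Bdemovil/BDE/Series/MOV_BD_ML3"
    
    # Fail early on network or HTTP errors instead of parsing an error page.
    r = requests.get(url, timeout=30)
    r.raise_for_status()
    
    # The series is rendered as the HTML table with id "datosSeries".
    df = pd.read_html(r.text, attrs={'id': 'datosSeries'})[0]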
    df = df.rename(columns={0: 'month', 1: 'Unemployment Rate'})
    
    months = {'ene': 'Jan', 'feb': 'Feb', 'mar': 'Mar', 'abr': 'Apr', 'may': 'May', 'jun': 'Jun',
              'jul': 'Jul', 'ago': 'Aug', 'sep': 'Sep', 'oct': 'Oct', 'nov': 'Nov', 'dic': 'Dec'}

    for key, val in months.items():
        df['month'] = df['month'].str.replace(key, val, regex=False)            

    df['month'] = pd.to_datetime(df['month'].str.capitalize(), format="%b-%Y")
    df = df.set_index('month')
    
    return df
    
unemployment_rate = read_unemployment_rate()
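
The function above translates the month labels by chained string replacement before parsing them with `%b-%Y`. A more explicit alternative, sketched here, maps each Spanish abbreviation to a month number and builds the date from numeric parts, so it does not rely on how `%b` is interpreted. It assumes labels of the form `ene-2020`, which is only inferred from the `%b-%Y` format string. The `parse_spanish_months` helper and the sample labels are hypothetical.

```python
import pandas as pd

# Spanish month abbreviations mapped to month numbers.
SPANISH_MONTHS = {
    "ene": 1, "feb": 2, "mar": 3, "abr": 4, "may": 5, "jun": 6,
    "jul": 7, "ago": 8, "sep": 9, "oct": 10, "nov": 11, "dic": 12,
}


def parse_spanish_months(labels):
    """Convert labels such as 'ene-2020' to timestamps at the first of the month."""
    # Split 'mmm-YYYY' into two columns: 0 holds the month, 1 holds the year.
    parts = labels.str.strip().str.lower().str.split("-", expand=True)
    months = parts[0].map(SPANISH_MONTHS)
    if months.isna().any():
        raise ValueError("unrecognized month abbreviation in input")
    date_parts = pd.DataFrame({
        "year": parts[1].astype(int),
        "month": months.astype(int),
        "day": 1,
    })
    # to_datetime assembles dates from year/month/day columns.
    return pd.to_datetime(date_parts)


# Hypothetical labels, assumed to follow the 'mmm-YYYY' pattern.
labels = pd.Series(["dic-2019", "ene-2020", "Abr-2020"])
parsed = parse_spanish_months(labels)
print(parsed)
```

Raising on an unknown abbreviation makes a change in the upstream table format visible immediately, rather than surfacing later as a confusing parsing error.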

In [11]:
def plot_active_cases_vs_unemployment_rate():
    """
    Plot the number of active cases of COVID-19 in Chile per day and the 
    unemployment rate per month.
    """
    fig, ax1 = plt.subplots(
        num="COVID-19 in Chile\n The pandemic scenario impact on unemployment rate",
        dpi=120)
    
    ax1.set_title("COVID-19 in Chile\n The pandemic scenario impact on unemployment rate")
    
    color = 'tab:red'
    ax1.plot(active_cases['activos'], color=color)
    ax1.set_xlabel('Year - Month')
    ax1.set_ylabel('Number of active cases per day', color=color)
    ax1.tick_params(axis='y', labelcolor=color, color=color, width=0.5)
    ax1.grid(axis='y', alpha=0.2, color=color, lw=0.5)
    ax1.set_ylim(-1000, 1.07 * np.max(active_cases['activos']))
    
    ax2 = ax1.twinx()

    color = 'tab:blue'

    bars = ax2.bar(unemployment_rate.index, unemployment_rate['Unemployment Rate'], 
                   width=20, color=color, alpha=0.5, label='Monthly Unemployment Rate')
    
    ax2.set_ylim(-0.17, 15.6)
    ax2.set_yticks([])
    
    # Hide the frame on both axes.
    for ax in (ax1, ax2):
        for side in ['bottom', 'left', 'top', 'right']:
            ax.spines[side].set_visible(False)
    
    for bar in bars:
        ax2.text(
            bar.get_x() + 0.5 * bar.get_width(), bar.get_height() + 0.2, 
            f"{bar.get_height()} %", fontsize=6, ha="center", color=color)
        
    fig.legend(loc='center', bbox_to_anchor=(0.5,0.85), ncol=2, frameon=False)
        
    # Save before showing: on some backends the figure is empty after show().
    fig.savefig("Assignment4_bquint.png")
    plt.show()
    
    
plot_active_cases_vs_unemployment_rate()

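
The plot suggests that the two series move together, but it does not quantify that impression. One way to put a number on it is to aggregate the daily active cases to monthly means and compute a Pearson correlation against the monthly unemployment rate. The sketch below uses made-up numbers in place of the real datasets, and `monthly_correlation` is a hypothetical helper. With only a handful of months, the coefficient is descriptive at best and says nothing about causation.

```python
import numpy as np
import pandas as pd


def monthly_correlation(daily_cases, monthly_rate):
    """Pearson correlation between monthly-mean cases and a monthly rate."""
    # Group the daily values by calendar month and average them.
    cases_by_month = daily_cases.groupby(daily_cases.index.to_period("M")).mean()
    # Re-index the monthly rate by calendar month so both series align.
    rate_by_month = monthly_rate.copy()
    rate_by_month.index = rate_by_month.index.to_period("M")
    # Keep only the months present in both series.
    joined = pd.concat(
        [cases_by_month.rename("cases"), rate_by_month.rename("rate")],
        axis=1, join="inner")
    return joined["cases"].corr(joined["rate"])


# Made-up data: steadily rising daily cases and a rising monthly rate.
days = pd.date_range("2020-03-01", "2020-06-30", freq="D")
cases = pd.Series(np.linspace(0, 1000, len(days)), index=days)
rate = pd.Series(
    [8.0, 9.0, 11.0, 12.0],
    index=pd.date_range("2020-03-01", periods=4, freq="MS"))

r = monthly_correlation(cases, rate)
print(f"Pearson r over {len(rate)} months: {r:.3f}")
```

Applied to the notebook's data, the call would presumably be `monthly_correlation(active_cases['activos'], unemployment_rate['Unemployment Rate'])`, provided both indexes were parsed as dates.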