In this project, I analysed the impact of government measures (e.g. social distancing) on the evolution of the COVID-19 pandemic, caused by the SARS-CoV-2 virus, including possible differences between countries in how effectively their interventions controlled the spread of the disease.
I used the country-level data on the number of COVID-19 cases and deaths from the WHO situation reports and the data on the measures each country took to tackle the pandemic from the ACAPS COVID-19 government measures dataset.

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns

# List every input file available to the notebook under /kaggle/input

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))


In [None]:
who_data = pd.read_csv("/kaggle/input/uncover/WHO/world-health-organization-who-situation-reports.csv", index_col=0)

In [None]:
interventions_data = pd.read_csv("/kaggle/input/uncover/HDE/acaps-covid-19-government-measures-dataset.csv", index_col=0)

In [None]:
n_cases = who_data.groupby('location').new_cases.sum()

In [None]:
new_cases = who_data.groupby('date').new_cases.sum()

In [None]:
n_deaths = who_data.groupby('location').new_deaths.sum()

In [None]:
new_deaths = who_data.groupby('date').new_deaths.sum()

In [None]:
n_interventions = interventions_data.groupby('country').country.count()

In [None]:
n_interventions_date = interventions_data.groupby('date_implemented').date_implemented.count()

In [None]:
plt.figure(figsize=(10,5))
interventions_vs_cases = sns.scatterplot(x=n_interventions, y=n_cases)
interventions_vs_cases.set(xlabel='nº of interventions', ylabel='nº of cases')
plt.show()  # plt.show() does not take the Axes object as an argument

In [None]:
plt.figure(figsize=(10,5))
interventions_vs_deaths = sns.scatterplot(x=n_interventions, y=n_deaths)
interventions_vs_deaths.set(xlabel='nº of interventions', ylabel='nº of deaths')
plt.show()  # plt.show() does not take the Axes object as an argument

In [None]:
plt.figure(figsize=(10,5))
plt.title("Evolution of the number of new Covid-19 cases vs number of interventions")
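# A Series is plotted against its index (the date), so no x/y arguments are needed
ax = new_cases.plot(legend=True, color="b")
ax2 = ax.twinx()  # second y-axis for the number of interventions
n_interventions_date.plot(ax=ax2, legend=False, color="y")
# Label the primary axes: the x-axis of the twin axes is hidden
ax.set_xlabel("Date")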

In [None]:
plt.figure(figsize=(10,5))
plt.title("Evolution of the number of new Covid-19 deaths vs number of interventions")
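# A Series is plotted against its index (the date), so no x/y arguments are needed
ax = new_deaths.plot(legend=True, color="b")
ax2 = ax.twinx()  # second y-axis for the number of interventions
n_interventions_date.plot(ax=ax2, legend=False, color="y")
# Label the primary axes: the x-axis of the twin axes is hidden
ax.set_xlabel("Date")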

To understand the relationship between the measures that governments implemented to control the COVID-19 pandemic and its impact in terms of confirmed cases and deaths, I constructed scatter plots of the number of implemented measures against the number of confirmed cases and against the number of deaths per country. These plots suggest a possible positive correlation between the number of implemented measures and both confirmed cases and deaths, i.e. in general, the countries that implemented more control measures are the ones with more confirmed cases and deaths.
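The per-country association described above could also be summarised with a single number. The following is a minimal sketch, not part of the original analysis: it assumes the two per-country series are aligned on country name and uses a rank (Spearman) correlation, which is less sensitive to the strong skew of case counts than a Pearson correlation. The country labels and values are hypothetical toy data, not figures from the WHO or ACAPS files.

```python
import pandas as pd

# Hypothetical per-country totals (made-up labels and numbers, for illustration only)
n_interventions = pd.Series({"A": 5, "B": 12, "C": 30, "D": 8}, name="n_interventions")
n_cases = pd.Series({"A": 40, "B": 900, "C": 15000, "E": 70}, name="n_cases")

# Keep only the countries that appear in both sources
per_country = pd.concat([n_interventions, n_cases], axis=1, join="inner")

# Spearman correlation computed as the Pearson correlation of the ranks
rho = per_country["n_interventions"].rank().corr(per_country["n_cases"].rank())

print(per_country)
print("Spearman rank correlation:", rho)
```

With the real data, the same steps would apply to the `n_interventions` and `n_cases` series computed earlier, provided the country names match between the two datasets.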
However, we should be cautious with this first observation, as correlation does not imply causation. One likely explanation in this case is that the countries hit most severely by the outbreak responded by enforcing more control measures.
To get a clearer idea of whether this is the case, we should compare the evolution of the number of control measures implemented by governments with the number of confirmed new cases and deaths worldwide, which I did in the subsequent steps.
Indeed, comparing the evolution of confirmed new cases with that of implemented measures shows that the number of implemented measures closely follows the number of cases, with the exception of a huge peak in early February, likely corresponding to the initial peak in cases in Wuhan, China. This strongly suggests that the high number of control measures, such as social distancing and the use of PPE, was a consequence of the spread of the disease and not a contributing factor. This tendency is more visible in the comparison with the number of deaths, probably because this indicator tends to increase more slowly than the number of cases, as deaths from the disease often occur several weeks after diagnosis.
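One way to check whether measures follow cases in time, rather than relying on the plots alone, is a lagged correlation between the two daily series. The sketch below is an illustration under stated assumptions, not the method used in this notebook: it assumes both series share a gap-free daily datetime index, the helper `lagged_correlation` is a hypothetical name, and the data are synthetic, constructed so that the measures mirror the case curve three days later.

```python
import numpy as np
import pandas as pd

# Synthetic toy data: a bump-shaped case curve over 30 days, and daily measure
# counts that repeat the same curve 3 days later (not WHO or ACAPS values).
dates = pd.date_range("2020-02-01", periods=30, freq="D")
t = np.arange(30)
cases = pd.Series(100 * np.exp(-((t - 12) / 4) ** 2), index=dates, name="new_cases")
measures = cases.shift(3).fillna(0.0).rename("n_measures")

def lagged_correlation(leader, follower, max_lag):
    """Pearson correlation of leader[t] with follower[t + lag], for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        # shift(-lag) moves the follower's value from day t + lag back to day t;
        # Series.corr then drops the days left without a pair
        result[lag] = leader.corr(follower.shift(-lag))
    return pd.Series(result, name="correlation")

lag_corr = lagged_correlation(cases, measures, max_lag=7)
best_lag = int(lag_corr.idxmax())

print(lag_corr.round(3))
print("Lag (in days) with the highest correlation:", best_lag)
```

A correlation that peaks at a positive lag would be consistent with measures reacting to cases, although it would not prove that direction of causality on its own.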
To draw more definitive conclusions on the impact of the existing containment strategies, we will need more recent data on the evolution of new COVID-19 cases and deaths beyond mid-March, to determine whether the measures that governments took (e.g. social distancing and the use of PPE) succeeded in flattening the curve of the pandemic at the worldwide level and at the country or regional level.