## Disclaimer

This workbook is my personal playground for data from the 2020 SARS-CoV-2 pandemic.

I am neither a statistician nor a doctor, and no values or results in this workbook should be used for publication or dissemination in any form.

The code in this workbook makes some assumptions that are likely to introduce errors into the data (for example, when merging and matching country names). I am aware of these issues and have not fixed them, because my main goal with this workbook is to experiment with Python, Pandas and the data from the virus outbreak. The results are not intended to have a high degree of correctness. It is a personal project, intended for personal use.

In [None]:
# Imports
import importlib

import requests
import pandas as pd
import seaborn as sns
import numpy as np
from functools import lru_cache
import matplotlib.pyplot as plt

# Helper functions are in lib.py (see the repository)
import lib
_ = importlib.reload(lib)

In [None]:
COUNTRY_POP = lib.fetch_populations()
pop_getter = lib.population_getter(COUNTRY_POP)

# Read the initial data-frames from the Johns Hopkins data set
BASE_URL = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series"
confirmed = pd.read_csv(BASE_URL + "/time_series_covid19_confirmed_global.csv")
deaths = pd.read_csv(BASE_URL + "/time_series_covid19_deaths_global.csv")
recovered = pd.read_csv(BASE_URL + "/time_series_covid19_recovered_global.csv")

# Convert to row-based time-series
confirmed_ts = lib.to_timeseries(confirmed, "confirmed")
deaths_ts = lib.to_timeseries(deaths, "deaths")
recovered_ts = lib.to_timeseries(recovered, "recovered")

# Merge "deaths" into the confirmed cases, matching on date and location
with_deaths = pd.merge(confirmed_ts, deaths_ts, on=["Date", "Province/State", "Country/Region"])

# Merge "recovered" into the data-set. This new data-frame will contain all values
# (confirmed, recovered and deaths) for each country
with_recovered = pd.merge(with_deaths, recovered_ts, on=["Date", "Province/State", "Country/Region"])

# Add a new column containing the population of the country for that row
with_recovered["population"] = with_recovered.apply(lambda row: pop_getter(row, "Country/Region"), axis=1)

# Remove "Province/State" and sum the values, giving us only one entry per country.
# drop() returns a new data-frame, so the result has to be assigned back.
with_recovered = with_recovered.drop("Province/State", axis="columns")
with_recovered = with_recovered.groupby(by=["Date", "Country/Region", "population"]).sum()
with_recovered = with_recovered.reset_index()
with_recovered = with_recovered.set_index("Date")

# Calculate the values "per-capita" (per 1'000 inhabitants) for each country
with_recovered['confirmed_per_capita'] = with_recovered['confirmed'] / with_recovered['population'] * 1e3
with_recovered['deaths_per_capita'] = with_recovered['deaths'] / with_recovered['population'] * 1e3
with_recovered['recovered_per_capita'] = with_recovered['recovered'] / with_recovered['population'] * 1e3

# Split into one data-frame per country so we can manipulate each one separately
recombined = lib.prepare_for_plot(with_recovered, COUNTRY_POP)
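
The helper `lib.to_timeseries` lives in `lib.py` and is not shown here. As a rough idea of what the "row-based time-series" step does, here is a minimal sketch of a wide-to-long conversion with `melt`. The function name `to_timeseries_sketch` and the tiny input frame are made up for illustration, and the real helper may well work differently.

```python
import pandas as pd

# Made-up stand-in for the wide Johns Hopkins layout: one column per date.
wide = pd.DataFrame({
    "Province/State": [None, None],
    "Country/Region": ["Luxembourg", "Italy"],
    "Lat": [49.8, 41.9],
    "Long": [6.1, 12.6],
    "1/22/20": [0, 0],
    "1/23/20": [1, 2],
})


def to_timeseries_sketch(df, value_name):
    """Hypothetical stand-in for lib.to_timeseries: wide -> long."""
    id_cols = ["Province/State", "Country/Region"]
    # Drop the coordinates, then turn every date column into its own row.
    long = df.drop(columns=["Lat", "Long"]).melt(
        id_vars=id_cols, var_name="Date", value_name=value_name
    )
    # The date columns use a month/day/two-digit-year format.
    long["Date"] = pd.to_datetime(long["Date"], format="%m/%d/%y")
    return long


long_confirmed = to_timeseries_sketch(wide, "confirmed")
print(long_confirmed)
```

Each (country, date) pair ends up on its own row, which is the shape the merges on `Date`, `Province/State` and `Country/Region` rely on.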

In [None]:
# Plot confirmed cases per 1'000 inhabitants for a few selected countries
countries = ["Luxembourg", "US", "Germany", "Italy"]
for_plot = recombined[
    recombined["Country/Region"].isin(countries)
].sort_values("days_after_cutoff")
lib.plot(for_plot, "confirmed_per_capita")
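
The `days_after_cutoff` column is produced inside `lib.prepare_for_plot`, which is not shown here. One plausible reading is "days since a country first reached some case threshold", which lets outbreaks that started on different dates be compared on one axis. The sketch below illustrates that idea only; the threshold of 100 and the toy data are assumptions, not the values used in `lib.py`.

```python
import pandas as pd

# Made-up data for two countries whose outbreaks start on different days.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
    "Country/Region": ["A"] * 3 + ["B"] * 3,
    "confirmed": [50, 120, 300, 5, 80, 150],
})

CUTOFF = 100  # hypothetical threshold, not taken from lib.py

# Keep only the rows at or above the cutoff ...
above = df[df["confirmed"] >= CUTOFF].copy()
# ... and count the days since each country first reached it.
first_day = above.groupby("Country/Region")["Date"].transform("min")
above["days_after_cutoff"] = (above["Date"] - first_day).dt.days
print(above)
```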

In [None]:
lib.plot(for_plot, "smooth_delta_confirmed_per_capita")
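
The `smooth_delta_confirmed_per_capita` column also comes from `lib.py`. Judging by the name, it is probably a day-to-day difference that has been smoothed in some way. A minimal sketch of that pattern, assuming a rolling mean over 3 days (the window size and the column names `delta` and `smooth_delta` are invented for this example):

```python
import pandas as pd

# Made-up cumulative values for a single country.
df = pd.DataFrame({
    "Country/Region": ["A"] * 5,
    "confirmed_per_capita": [0.0, 1.0, 3.0, 6.0, 10.0],
})

# Day-to-day change, computed separately for each country.
df["delta"] = df.groupby("Country/Region")["confirmed_per_capita"].diff()

# Rolling mean of the change, again per country. The first row stays NaN
# because there is no previous day to diff against.
df["smooth_delta"] = df.groupby("Country/Region")["delta"].transform(
    lambda s: s.rolling(window=3, min_periods=1).mean()
)
print(df)
```

Smoothing like this hides reporting artifacts such as weekend dips, at the cost of reacting a bit later to real changes.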

# Playground

Everything below is a work-in-progress playground.

In [None]:
lux = recombined[recombined["Country/Region"] == "Luxembourg"].copy()
# SIR compartments as fractions of the population
removed = lux["recovered"] + lux["deaths"]
# Infected = active cases only, so that s + i + r adds up to 1
lux["i"] = (lux["confirmed"] - removed) / lux["population"]
lux["s"] = (lux["population"] - lux["confirmed"]) / lux["population"]
lux["r"] = removed / lux["population"]
transmission_rate = 1.0  # placeholder value
lux["rate_of_infection"] = transmission_rate * lux["s"] * lux["i"]
lux
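
In the textbook SIR model the susceptible, infected and removed fractions add up to 1. That only works out if "infected" counts active cases (confirmed minus recovered minus deaths) rather than all confirmed cases. A small sanity check with made-up numbers, where `beta` is an arbitrary placeholder just like `transmission_rate`:

```python
import pandas as pd

# Made-up cumulative counts for a population of 1000.
toy = pd.DataFrame({
    "population": [1000, 1000, 1000],
    "confirmed": [10, 40, 90],
    "recovered": [0, 5, 30],
    "deaths": [0, 1, 2],
})

removed = toy["recovered"] + toy["deaths"]
toy["s"] = (toy["population"] - toy["confirmed"]) / toy["population"]
toy["i"] = (toy["confirmed"] - removed) / toy["population"]  # active cases only
toy["r"] = removed / toy["population"]

beta = 1.0  # arbitrary placeholder, not fitted to anything
toy["rate_of_infection"] = beta * toy["s"] * toy["i"]

# Every row should add up to 1 (up to floating point noise).
totals = toy[["s", "i", "r"]].sum(axis=1)
print(totals)
```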

In [None]:
# Set the palette before creating the axes so that it takes effect on them
sns.set_palette("bright")
fig, ax = plt.subplots(figsize=(15, 10))
sns.lineplot(x=lux.index, y=lux["s"], ax=ax, label="S")
sns.lineplot(x=lux.index, y=lux["i"], ax=ax, label="I")
sns.lineplot(x=lux.index, y=lux["r"], ax=ax, label="R")
ax.legend()