# Corona Cases

This Jupyter notebook contains a bar chart that visualizes corona cases either per country or worldwide. The underlying dataset is provided by [Johns Hopkins University](https://github.com/CSSEGISandData/COVID-19/). Click ```Cell -> Run All``` in the top menu to refresh the dataset.

## Import packages

- After the initial run, comment out this line to reduce the execution time:
```bash
#!jupyter nbextension enable --py widgetsnbextension
```

In [1]:
!jupyter nbextension enable --py widgetsnbextension

Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK


In [2]:
import pandas as pd
import plotly.graph_objects as go
from ipywidgets import widgets

## Prepare dataset

### Data cleansing

- Replaces NA/NaN values
- Drops columns that are not required (Province/State, Lat, Long)
- Aggregates the corona cases per country
- Calculates the worldwide corona cases
- Adds a status per case (Confirmed, Death, Recovered)

In [3]:
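def data_cleansing(df, status):
    # Replace NA/NaN values (e.g. empty "Province/State" entries)
    df = df.fillna("")
    # Drop the columns that are not required
    df = df.drop(["Province/State", "Lat", "Long"], axis=1)
    df = df.rename(columns={"Country/Region": "Country"})
    # Aggregate per country; "Country" becomes the new index
    df = df.groupby(["Country"], sort=True).sum()
    # Append the worldwide total as an additional row
    df.loc["Worldwide"] = df.sum(numeric_only=True)
    # Add the status and index by "Country" + "Status"
    df.insert(0, "Status", status, allow_duplicates=True)
    df = df.reset_index().set_index(["Country", "Status"])
    return df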
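
To make the effect of these steps easier to follow, here is a minimal sketch that applies the same sequence to a tiny hand-made frame. The name `cleanse`, the variable names and all numbers are made up for illustration; only the column layout mirrors the CSV files. The `fillna` step is left out because the toy numeric columns have no gaps.

```python
import pandas as pd

# Hypothetical miniature of the raw CSV layout (all values are made up).
raw = pd.DataFrame({
    "Province/State": ["Hubei", "Beijing", None],
    "Country/Region": ["China", "China", "Italy"],
    "Lat": [30.97, 40.18, 43.0],
    "Long": [112.27, 116.41, 12.0],
    "1/22/20": [444, 14, 0],
    "1/23/20": [444, 22, 0],
})

def cleanse(df, status):
    # Same sequence of steps as data_cleansing, minus fillna.
    df = df.drop(["Province/State", "Lat", "Long"], axis=1)
    df = df.rename(columns={"Country/Region": "Country"})
    df = df.groupby("Country", sort=True).sum()      # one row per country
    df.loc["Worldwide"] = df.sum(numeric_only=True)  # total over all countries
    df.insert(0, "Status", status)
    return df.reset_index().set_index(["Country", "Status"])

toy = cleanse(raw, "Confirmed")
print(toy)
```

The two Chinese provinces collapse into a single "China" row, and the "Worldwide" row holds the column-wise total of all countries.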

### Load confirmed cases

- Load confirmed corona cases
- Perform data cleaning
- Index "Country" + "Status"
- Print out preview

In [4]:
df_confirmed=pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv")
df_confirmed=data_cleansing(df_confirmed, "Confirmed")
df_confirmed.head(2)

Country,Status,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,...,3/10/20,3/11/20,3/12/20,3/13/20,3/14/20,3/15/20,3/16/20,3/17/20,3/18/20,3/19/20
Afghanistan,Confirmed,0,0,0,0,0,0,0,0,0,0,...,5,7,7,7,11,16,21,22,22,22
Albania,Confirmed,0,0,0,0,0,0,0,0,0,0,...,10,12,23,33,38,42,51,55,59,64


### Load death cases

- Load death corona cases
- Perform data cleaning
- Index "Country" + "Status"
- Print out preview

In [5]:
df_death=pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Deaths.csv")
df_death=data_cleansing(df_death, "Death")
df_death.head(2)

Country,Status,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,...,3/10/20,3/11/20,3/12/20,3/13/20,3/14/20,3/15/20,3/16/20,3/17/20,3/18/20,3/19/20
Afghanistan,Death,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Albania,Death,0,0,0,0,0,0,0,0,0,0,...,0,1,1,1,1,1,1,1,2,2


### Load recovered cases

- Load recovered corona cases
- Perform data cleaning
- Index "Country" + "Status"
- Print out preview

In [6]:
df_recovered=pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Recovered.csv")
df_recovered=data_cleansing(df_recovered, "Recovered")
df_recovered.head(2)

Country,Status,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,...,3/10/20,3/11/20,3/12/20,3/13/20,3/14/20,3/15/20,3/16/20,3/17/20,3/18/20,3/19/20
Afghanistan,Recovered,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,1,1,1
Albania,Recovered,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### Calculate active cases

- Calculate active corona cases (active = confirmed - death - recovered)
- Index "Country" + "Status"
- Print out preview

In [7]:
index_active = []
data_active = []
for country in df_confirmed.index.get_level_values(0).unique().tolist():
    index_active.append((country, "Active"))
    data_active.append(
        df_confirmed.loc[country,"Confirmed"]
            .subtract(df_death.loc[country,"Death"])
                .subtract(df_recovered.loc[country,"Recovered"]))

df_active = pd.DataFrame(
    data=data_active,
    columns=df_confirmed.columns,
    # Name the index levels so they match the other data frames
    index=pd.MultiIndex.from_tuples(index_active, names=["Country", "Status"])
).astype(int)
df_active.head(2)

Country,Status,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,...,3/10/20,3/11/20,3/12/20,3/13/20,3/14/20,3/15/20,3/16/20,3/17/20,3/18/20,3/19/20
Afghanistan,Active,0,0,0,0,0,0,0,0,0,0,...,5,7,7,7,11,16,20,21,21,21
Albania,Active,0,0,0,0,0,0,0,0,0,0,...,10,11,22,32,37,41,50,54,57,62
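
The per-country loop above can also be expressed as a single vectorized subtraction. The following is only a sketch of that alternative, not the notebook's method: the helper `frame` is hypothetical, and the small frames are built from the last two date columns of the previews shown in this notebook.

```python
import pandas as pd

dates = ["3/18/20", "3/19/20"]

def frame(status, rows):
    # Hypothetical helper: build a small frame indexed by (Country, Status).
    index = pd.MultiIndex.from_tuples(
        [(country, status) for country in rows], names=["Country", "Status"])
    return pd.DataFrame(list(rows.values()), index=index, columns=dates)

confirmed = frame("Confirmed", {"Afghanistan": [22, 22], "Albania": [59, 64]})
death = frame("Death", {"Afghanistan": [0, 0], "Albania": [2, 2]})
recovered = frame("Recovered", {"Afghanistan": [1, 1], "Albania": [0, 0]})

# Drop the "Status" level so the three frames align on "Country" alone,
# then subtract: active = confirmed - death - recovered.
active = (confirmed.droplevel("Status")
          - death.droplevel("Status")
          - recovered.droplevel("Status"))

# Re-attach "Active" as the second index level.
active.index = pd.MultiIndex.from_product(
    [active.index, ["Active"]], names=["Country", "Status"])
print(active)
```

Because pandas aligns the frames on the country index, no explicit loop is needed, and the index level names are preserved for the later merge.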


### Merge all cases

- Merge all data frames via index "Country" + "Status"
- Sort ascending by "Country" and "Status"
- Print out a preview

In [8]:
df=pd.concat([df_confirmed,df_death,df_recovered,df_active]).sort_values(by=["Country", "Status"])
df.head(8)

Country,Status,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,...,3/10/20,3/11/20,3/12/20,3/13/20,3/14/20,3/15/20,3/16/20,3/17/20,3/18/20,3/19/20
Afghanistan,Active,0,0,0,0,0,0,0,0,0,0,...,5,7,7,7,11,16,20,21,21,21
Afghanistan,Confirmed,0,0,0,0,0,0,0,0,0,0,...,5,7,7,7,11,16,21,22,22,22
Afghanistan,Death,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Afghanistan,Recovered,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,1,1,1
Albania,Active,0,0,0,0,0,0,0,0,0,0,...,10,11,22,32,37,41,50,54,57,62
Albania,Confirmed,0,0,0,0,0,0,0,0,0,0,...,10,12,23,33,38,42,51,55,59,64
Albania,Death,0,0,0,0,0,0,0,0,0,0,...,0,1,1,1,1,1,1,1,2,2
Albania,Recovered,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
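
As a small illustration of how the merged frame is organized, the sketch below concatenates two toy frames and looks up one row by its ("Country", "Status") pair. It uses `sort_index()` as one possible way to order by both index levels; the frame names are illustrative and the values are taken from the previews above.

```python
import pandas as pd

dates = ["3/18/20", "3/19/20"]
names = ["Country", "Status"]

confirmed = pd.DataFrame(
    [[22, 22], [59, 64]], columns=dates,
    index=pd.MultiIndex.from_tuples(
        [("Afghanistan", "Confirmed"), ("Albania", "Confirmed")], names=names))
death = pd.DataFrame(
    [[0, 0], [2, 2]], columns=dates,
    index=pd.MultiIndex.from_tuples(
        [("Afghanistan", "Death"), ("Albania", "Death")], names=names))

# Stack the frames vertically, then order the rows by Country and Status.
merged = pd.concat([confirmed, death]).sort_index()

# A full (Country, Status) key selects exactly one row as a Series.
albania_deaths = merged.loc[("Albania", "Death")]
print(merged)
print(albania_deaths)
```

This is the same kind of lookup the chart relies on when it reads `df.loc[country, "Recovered"]` and its siblings.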


## Visualize dataset

- Create stacked bar chart to visualize recovered, active and death corona cases
- Confirmed corona cases are the sum of recovered, active and death ones (not visualized)
- Add a drop-down to switch between corona cases per country or worldwide
- Hover over a bar to get the exact number of recovered, active or death corona cases per day
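
The second point can be checked with a little arithmetic: the height of each stacked bar (recovered + active + death) equals the confirmed count for that day. A minimal sketch, using the Albania values from the previews above (the frame and variable names are illustrative):

```python
import pandas as pd

dates = ["3/18/20", "3/19/20"]

# Albania, last two days of the previews shown in this notebook.
albania = pd.DataFrame(
    {
        "Recovered": [0, 0],
        "Active": [57, 62],
        "Death": [2, 2],
        "Confirmed": [59, 64],
    },
    index=dates,
)

# Height of the stacked bar per day.
stack_height = albania[["Recovered", "Active", "Death"]].sum(axis=1)

# The stack reproduces the confirmed cases, so a fourth bar is not needed.
matches = bool((stack_height == albania["Confirmed"]).all())
print(stack_height)
print(matches)
```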

In [9]:
country="Worldwide"
fig = go.FigureWidget(data=[
                    go.Bar(name="Recovered", x=df.columns, y=df.loc[country,"Recovered"], marker_color="green"),
                    go.Bar(name="Active", x=df.columns, y=df.loc[country,"Active"], marker_color="red"),
                    go.Bar(name="Death", x=df.columns, y=df.loc[country,"Death"], marker_color="black")
                ],
                layout=go.Layout(plot_bgcolor = "#EEEEEE"))
fig.update_layout(barmode='stack')

dropdown = widgets.Dropdown(
    description='Corona cases:',
    value="Worldwide",
    options=df.index.get_level_values(0).unique().tolist(),
    style={'description_width': 'initial'}
)

def response(change):
    # Redraw all three traces for the selected country in one batch
    country = dropdown.value
    with fig.batch_update():
        fig.data[0].y = df.loc[country, "Recovered"]
        fig.data[1].y = df.loc[country, "Active"]
        fig.data[2].y = df.loc[country, "Death"]

dropdown.observe(response, names="value")

widgets.VBox([dropdown, fig])

VBox(children=(Dropdown(description='Corona cases:', index=154, options=('Afghanistan', 'Albania', 'Algeria', …