# Interactive Plots of COVID-19 Data
This notebook explores COVID-19 data interactively using [Jupyter](https://jupyter.org/) and [hvPlot](https://hvplot.holoviz.org/). We currently focus on data from the US, but may expand our analyses in the near future.

## Load Johns Hopkins COVID-19 Data
Here we load the COVID-19 confirmed case, death, and recovery data from the [Center for Systems Science and Engineering (CSSE)](https://systems.jhu.edu) at Johns Hopkins University. The CSSE COVID-19 [GitHub Repo](https://github.com/CSSEGISandData/COVID-19) has more information about these data and their sources.

In [None]:
import numpy as np
import pandas as pd
pd.set_option('display.max_rows', 1000)
import hvplot.pandas

In [None]:
dr='https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/'

In [None]:
src = dr + 'time_series_covid19_confirmed_global.csv'

In [None]:
src2 = dr + 'time_series_covid19_deaths_global.csv'

In [None]:
src3 = dr + 'time_series_covid19_recovered_global.csv'

In [None]:
df = pd.read_csv(src)
df.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                   'Lat': 'lat', 'Long': 'lon'}, inplace = True)
df = df[(df.state!='Diamond Princess') & 
        (df.state!='Grand Princess')].reset_index(drop=True)
df.columns = df.columns[0:4].append(pd.to_datetime(df.columns[4:]))
df

In [None]:
df2 = pd.read_csv(src2)
df2.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                   'Lat': 'lat', 'Long': 'lon'}, inplace = True)
df2 = df2[(df2.state!='Diamond Princess') & 
        (df2.state!='Grand Princess')].reset_index(drop=True)
df2.columns = df2.columns[0:4].append(pd.to_datetime(df2.columns[4:]))

In [None]:
df3 = pd.read_csv(src3)
df3.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                   'Lat': 'lat', 'Long': 'lon'}, inplace = True)
df3 = df3[(df3.state!='Diamond Princess') & 
        (df3.state!='Grand Princess')].reset_index(drop=True)
df3.columns = df3.columns[0:4].append(pd.to_datetime(df3.columns[4:]))
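
The three loading cells above repeat the same steps for each file. As a minimal sketch, they could be folded into one helper. The function name `load_series` is hypothetical (it is not part of the original notebook), and a tiny in-memory CSV with made-up values stands in for the real URLs so the example runs offline.

```python
import io

import pandas as pd


# Hypothetical helper (not in the original notebook): load one CSSE
# time-series file and apply the same cleanup as the cells above.
def load_series(source):
    frame = pd.read_csv(source)
    frame = frame.rename(columns={'Country/Region': 'country',
                                  'Province/State': 'state',
                                  'Lat': 'lat', 'Long': 'lon'})
    # Drop the cruise-ship rows, as in the original cells.
    ships = ['Diamond Princess', 'Grand Princess']
    frame = frame[~frame['state'].isin(ships)].reset_index(drop=True)
    # Keep the four metadata labels and convert the date labels to datetimes.
    frame.columns = frame.columns[:4].append(pd.to_datetime(frame.columns[4:]))
    return frame


# Made-up sample in the same layout as the CSSE files.
sample_csv = io.StringIO(
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    ",Italy,41.9,12.6,0,2\n"
    "Grand Princess,Canada,0.0,0.0,1,1\n"
)
sample = load_series(sample_csv)
print(sample)
```

With such a helper, the real data could be loaded as `load_series(src)`, `load_series(src2)`, and `load_series(src3)`.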

## Plot all cases on log scale
Below is a quick plot of all cases on a logarithmic scale.

hvPlot creates HoloViews objects, and the `*` operator combines them into an [overlay](http://holoviews.org/reference/containers/bokeh/Overlay.html) drawn on shared axes. See [HoloViews plot customization](http://holoviews.org/user_guide/Customizing_Plots.html) for available options.
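
One practical consequence of the logarithmic scale is that zero counts have no finite position on the axis, which is presumably why the plots below set a lower `ylim` of 1.0. The sketch below uses made-up numbers, and the series name is only illustrative.

```python
import numpy as np
import pandas as pd

# Made-up daily totals; early days often have zero reported cases.
counts = pd.Series([0, 0, 3, 10, 100], name='toy conf')

# log10(0) is -inf, so zeros cannot be drawn on a log axis.
with np.errstate(divide='ignore'):
    logs = np.log10(counts.astype(float))

# Masking values below 1 leaves only the points a log axis can show.
plottable = counts.where(counts >= 1)

print(logs)
print(plottable)
```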

In [None]:
def country(name='US'):
    """Return (confirmed, deaths, recovered) hvPlot curves for one country."""
    # Select the rows for this country from each data set.
    conf = df[(df.country==name)]
    death = df2[(df2.country==name)]
    reco = df3[(df3.country==name)]
    opts = {'legend': True, 'logy': True, 'grid': True, 'width':950, 'height': 300,
        'title': f'Cases of COVID-19 in {name}', 'padding':0.1, 'xticks':10,
        'ylim':(1.0,1.0e3)}
    # Sum the date columns over all provinces/states to get one national series.
    s = conf.iloc[:,4:].sum()
    s2 = death.iloc[:,4:].sum()
    s3 = reco.iloc[:,4:].sum()
    # The series name becomes the legend label.
    s.name = name + ' conf'
    s2.name = name + ' death'
    s3.name = name + ' reco'
    linec = s.hvplot(**opts)
    lined = s2.hvplot(**opts)
    liner = s3.hvplot(**opts)

    return linec, lined, liner

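The `iloc[:, 4:]` expressions in `country()` select every column from the fifth onward, that is, the date columns. See this discussion of how to [slice columns](https://stackoverflow.com/questions/10665889/how-to-take-column-slices-of-dataframe-in-pandas) in pandas.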
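
As a small illustration of column slicing, the sketch below uses a made-up frame (all values invented) with the same four leading metadata columns. `iloc` slices by position, while `loc` slices by label and includes the starting label.

```python
import pandas as pd

# Made-up toy frame with the same leading columns as the notebook's data.
toy = pd.DataFrame({
    'state': ['A', 'B'],
    'country': ['X', 'X'],
    'lat': [0.0, 1.0],
    'lon': [2.0, 3.0],
    'day1': [1, 2],
    'day2': [3, 5],
})

# Positional slice: skip the first four metadata columns.
counts = toy.iloc[:, 4:]

# Summing over rows gives one total per remaining column.
totals = counts.sum()

# Label-based slice: everything from 'lat' (inclusive) to the end.
from_lat = toy.loc[:, 'lat':]

print(totals)
print(list(from_lat.columns))
```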

In [None]:
usa = country(name='US')
china = country(name='China')
italy = country(name='Italy')
turkey = country(name='Turkey')
japan = country(name='Japan')

In [None]:
(usa[0] * china[0] * italy[0] * turkey[0] * japan[0]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [None]:
(usa[1] * china[1] * italy[1] * turkey[1] * japan[1]).opts(title='Deaths from COVID-19', ylim=(1.0,1.0e4), legend_position='top_left')

In [None]:
(china[0] * china[1] * china[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [None]:
(italy[0] * italy[1] * italy[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [None]:
(usa[0] * usa[1] * usa[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [None]:
(turkey[0] * turkey[1] * turkey[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [None]:
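# Label country-level rows (no province/state) as "Total".
# Only the state column is filled, so numeric columns keep their dtype.
df['state'] = df['state'].fillna('Total')
df2['state'] = df2['state'].fillna('Total')
df3['state'] = df3['state'].fillna('Total')
df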

In [None]:
dfsub = df.loc[:, 'country':]
dfsub2 = df2.loc[:, 'country':]
dfsub3 = df3.loc[:, 'country':]
dfsub

In [None]:
df['country'] = df.apply(lambda x: (x.country,x.state), axis=1)
df2['country'] = df2.apply(lambda x: (x.country,x.state), axis=1)
df3['country'] = df3.apply(lambda x: (x.country,x.state), axis=1)
del df["state"]
del df2["state"]
del df3["state"]
df

In [None]:
dfm=pd.melt(df, id_vars=df.columns.values[0:3], var_name="Date", value_name="Value")
dfm2=pd.melt(df2, id_vars=df2.columns.values[0:3], var_name="Date", value_name="Value")
dfm3=pd.melt(df3, id_vars=df3.columns.values[0:3], var_name="Date", value_name="Value")
dfm
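
To make the reshaping step concrete, here is a minimal sketch of `pd.melt` on a made-up wide frame (values invented for illustration). The identifier columns are repeated on every row, and each date column becomes a set of `Date`/`Value` rows.

```python
import pandas as pd

# Made-up wide frame: one row per location, one column per date.
wide = pd.DataFrame({
    'country': ['X', 'Y'],
    'lat': [0.0, 1.0],
    'lon': [2.0, 3.0],
    '2020-01-22': [1, 0],
    '2020-01-23': [4, 2],
})

# Wide to long: one row per (location, date) pair.
long = pd.melt(wide, id_vars=['country', 'lat', 'lon'],
               var_name='Date', value_name='Value')
print(long)
```

The long layout has one observation per row, which is the shape written to the CSV files at the end of the notebook.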

In [None]:
dfm.rename(columns = {'country':'id'}, inplace = True)
dfm2.rename(columns = {'country':'id'}, inplace = True)
dfm3.rename(columns = {'country':'id'}, inplace = True)

In [None]:
dfm.to_csv('covid_confirmed2.csv', index=False)
dfm2.to_csv('covid_deaths2.csv', index=False)
dfm3.to_csv('covid_recovered2.csv', index=False)
