# Interactive Plots of COVID-19 Data
This notebook explores COVID-19 data interactively using [Jupyter](https://jupyter.org/) and [hvPlot](https://hvplot.holoviz.org/). Currently we are focused on data from the US but may expand our analyses in the near future.

## Load Johns Hopkins COVID-19 Data
Here we load the COVID-19 confirmed case data from the [Center for Systems Science and Engineering (CSSE)](https://systems.jhu.edu) at Johns Hopkins University. The CSSE COVID-19 [GitHub Repo](https://github.com/CSSEGISandData/COVID-19) has more information about these data and their sources.

In [74]:
import numpy as np
import pandas as pd
pd.set_option('display.max_rows', 1000)
import hvplot.pandas

In [75]:
dr='https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/'

In [76]:
src = dr + 'time_series_covid19_confirmed_global.csv'

In [77]:
src2 = dr + 'time_series_covid19_deaths_global.csv'

In [78]:
src3 = dr + 'time_series_covid19_recovered_global.csv'

In [79]:
df = pd.read_csv(src)
df.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                   'Lat': 'lat', 'Long': 'lon'}, inplace = True)
df = df[(df.state!='Diamond Princess') & 
        (df.state!='Grand Princess')].reset_index(drop=True)
df.columns = df.columns[0:4].append(pd.to_datetime(df.columns[4:]))
df

Unnamed: 0,state,country,lat,lon,2020-01-22 00:00:00,2020-01-23 00:00:00,2020-01-24 00:00:00,2020-01-25 00:00:00,2020-01-26 00:00:00,2020-01-27 00:00:00,...,2020-03-24 00:00:00,2020-03-25 00:00:00,2020-03-26 00:00:00,2020-03-27 00:00:00,2020-03-28 00:00:00,2020-03-29 00:00:00,2020-03-30 00:00:00,2020-03-31 00:00:00,2020-04-01 00:00:00,2020-04-02 00:00:00
0,,Afghanistan,33.0,65.0,0,0,0,0,0,0,...,74,84,94,110,110,120,170,174,237,273
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,123,146,174,186,197,212,223,243,259,277
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,264,302,367,409,454,511,584,716,847,986
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,164,188,224,267,308,334,370,376,390,428
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,3,3,4,4,5,7,7,7,8,8
5,,Antigua and Barbuda,17.0608,-61.7964,0,0,0,0,0,0,...,3,3,7,7,7,7,7,7,7,9
6,,Argentina,-38.4161,-63.6167,0,0,0,0,0,0,...,387,387,502,589,690,745,820,1054,1054,1133
7,,Armenia,40.0691,45.0382,0,0,0,0,0,0,...,249,265,290,329,407,424,482,532,571,663
8,Australian Capital Territory,Australia,-35.4735,149.0124,0,0,0,0,0,0,...,39,39,53,62,71,77,78,80,84,87
9,New South Wales,Australia,-33.8688,151.2093,0,0,0,0,3,4,...,818,1029,1219,1405,1617,1791,2032,2032,2182,2298


In [80]:
df2 = pd.read_csv(src2)
df2.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                   'Lat': 'lat', 'Long': 'lon'}, inplace = True)
df2 = df2[(df2.state!='Diamond Princess') & 
        (df2.state!='Grand Princess')].reset_index(drop=True)
df2.columns = df2.columns[0:4].append(pd.to_datetime(df2.columns[4:]))

In [81]:
df3 = pd.read_csv(src3)
df3.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                   'Lat': 'lat', 'Long': 'lon'}, inplace = True)
df3 = df3[(df3.state!='Diamond Princess') & 
        (df3.state!='Grand Princess')].reset_index(drop=True)
df3.columns = df3.columns[0:4].append(pd.to_datetime(df3.columns[4:]))
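
The three loading cells above repeat the same steps, so they could be folded into one helper. The sketch below is an assumption about how that might look, not part of the original notebook: `load_series` and the inline `SAMPLE_CSV` are hypothetical names, and the made-up sample stands in for the remote files so the cell runs without network access.

```python
import io
import pandas as pd

# Hypothetical sample in the same layout as the CSSE time-series files
# (values are made up for illustration).
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
,Afghanistan,33.0,65.0,0,1
Diamond Princess,Canada,0.0,0.0,2,3
New South Wales,Australia,-33.8688,151.2093,4,5
"""

def load_series(source):
    """Read one time-series CSV and apply the same cleanup as the cells above."""
    frame = pd.read_csv(source)
    frame = frame.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                                  'Lat': 'lat', 'Long': 'lon'})
    # Drop the cruise-ship rows; rows with a missing state are kept.
    keep = ~frame['state'].isin(['Diamond Princess', 'Grand Princess'])
    frame = frame[keep].reset_index(drop=True)
    # Convert the date column labels (m/d/yy) to timestamps.
    dates = pd.to_datetime(frame.columns[4:], format='%m/%d/%y')
    frame.columns = frame.columns[0:4].append(dates)
    return frame

sample = load_series(io.StringIO(SAMPLE_CSV))
print(sample)
```

With the real data, `load_series(src)`, `load_series(src2)` and `load_series(src3)` would produce the same frames as `df`, `df2` and `df3`.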

## Plot all cases on log scale
Below is a quick plot of all cases on a logarithmic scale.

hvPlot creates HoloViews objects, and the `*` operator [overlays](http://holoviews.org/reference/containers/bokeh/Overlay.html) them on a shared set of axes. See the [HoloViews plot customization guide](http://holoviews.org/user_guide/Customizing_Plots.html) for available options.

In [82]:
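def country(name='US'):
    """Return log-scale line plots (confirmed, deaths, recovered) for one country."""
    # Select every row (province/state) that belongs to the country.
    conf = df[(df.country==name)]
    death = df2[(df2.country==name)]
    reco = df3[(df3.country==name)]
    # Shared plot options; ylim is a default that the cells below override.
    opts = {'legend': True, 'logy': True, 'grid': True, 'width':950, 'height': 300,
        'title': f'Cases of COVID-19 in {name}', 'padding':0.1, 'xticks':10,
        'ylim':(1.0,1.0e3)}
    # Sum the date columns (everything after state, country, lat, lon).
    s = conf.iloc[:,4:].sum()
    s2 = death.iloc[:,4:].sum()
    s3 = reco.iloc[:,4:].sum()
    # The series name becomes the legend label.
    s.name = name + ' conf'
    s2.name = name + ' death'
    s3.name = name + ' reco'
    linec = s.hvplot(**opts)
    lined = s2.hvplot(**opts)
    liner = s3.hvplot(**opts)

    return linec, lined, liner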

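The `iloc[:, 4:]` expressions above [slice columns](https://stackoverflow.com/questions/10665889/how-to-take-column-slices-of-dataframe-in-pandas) in pandas: they skip the four metadata columns (`state`, `country`, `lat`, `lon`) and keep only the date columns.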
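
As a small illustration of that column slice, the sketch below sums the date columns over all provinces of one country. The frame and its numbers are made up for the example; only the layout mirrors the notebook's data.

```python
import pandas as pd

# Made-up numbers in the same layout: four metadata columns, then one column per date.
frame = pd.DataFrame({
    'state': ['Hubei', 'Beijing', None],
    'country': ['China', 'China', 'Italy'],
    'lat': [30.97, 40.18, 43.0],
    'lon': [112.27, 116.41, 12.0],
    pd.Timestamp('2020-01-22'): [444, 14, 0],
    pd.Timestamp('2020-01-23'): [444, 22, 0],
})

rows = frame[frame.country == 'China']   # all provinces of one country
totals = rows.iloc[:, 4:].sum()          # skip the 4 metadata columns, sum each date
print(totals)
```

The result is a Series indexed by date, which is the shape `country()` passes to `hvplot`.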

In [83]:
usa = country(name='US')
china = country(name='China')
italy = country(name='Italy')
turkey = country(name='Turkey')
japan = country(name='Japan')
germany = country(name='Germany')
korea = country(name='Korea, South')

In [84]:
(usa[0] * china[0] * italy[0] * turkey[0] * japan[0]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e6), legend_position='top_left')

In [85]:
(usa[1] * china[1] * italy[1] * turkey[1] * japan[1]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [86]:
(china[0] * china[1] * china[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [87]:
(italy[0] * italy[1] * italy[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [88]:
(usa[0] * usa[1] * usa[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [89]:
(turkey[0] * turkey[1] * turkey[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [90]:
(germany[0] * germany[1] * germany[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [91]:
(korea[0] * korea[1] * korea[2]).opts(title='Cases of COVID-19', ylim=(1.0,1.0e5), legend_position='top_left')

In [92]:
df=df.fillna("Total")
df2=df2.fillna("Total")
df3=df3.fillna("Total")
df

Unnamed: 0,state,country,lat,lon,2020-01-22 00:00:00,2020-01-23 00:00:00,2020-01-24 00:00:00,2020-01-25 00:00:00,2020-01-26 00:00:00,2020-01-27 00:00:00,...,2020-03-24 00:00:00,2020-03-25 00:00:00,2020-03-26 00:00:00,2020-03-27 00:00:00,2020-03-28 00:00:00,2020-03-29 00:00:00,2020-03-30 00:00:00,2020-03-31 00:00:00,2020-04-01 00:00:00,2020-04-02 00:00:00
0,Total,Afghanistan,33.0,65.0,0,0,0,0,0,0,...,74,84,94,110,110,120,170,174,237,273
1,Total,Albania,41.1533,20.1683,0,0,0,0,0,0,...,123,146,174,186,197,212,223,243,259,277
2,Total,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,264,302,367,409,454,511,584,716,847,986
3,Total,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,164,188,224,267,308,334,370,376,390,428
4,Total,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,3,3,4,4,5,7,7,7,8,8
5,Total,Antigua and Barbuda,17.0608,-61.7964,0,0,0,0,0,0,...,3,3,7,7,7,7,7,7,7,9
6,Total,Argentina,-38.4161,-63.6167,0,0,0,0,0,0,...,387,387,502,589,690,745,820,1054,1054,1133
7,Total,Armenia,40.0691,45.0382,0,0,0,0,0,0,...,249,265,290,329,407,424,482,532,571,663
8,Australian Capital Territory,Australia,-35.4735,149.0124,0,0,0,0,0,0,...,39,39,53,62,71,77,78,80,84,87
9,New South Wales,Australia,-33.8688,151.2093,0,0,0,0,3,4,...,818,1029,1219,1405,1617,1791,2032,2032,2182,2298
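
Calling `fillna("Total")` on the whole frame replaces missing values in every column, not only in `state`. A minimal sketch of a narrower variant follows, assuming that only the `state` column is meant to receive the label; the sample frame is made up.

```python
import numpy as np
import pandas as pd

# Made-up frame with a missing state and a missing latitude.
frame = pd.DataFrame({
    'state': [np.nan, 'New South Wales'],
    'country': ['Afghanistan', 'Australia'],
    'lat': [33.0, np.nan],   # a gap that should stay numeric
})

# Fill only the state column, leaving the other columns untouched.
frame['state'] = frame['state'].fillna('Total')
print(frame)
```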


In [93]:
dfsub = df.loc[:, 'country':]
dfsub2 = df2.loc[:, 'country':]
dfsub3 = df3.loc[:, 'country':]
dfsub

Unnamed: 0,country,lat,lon,2020-01-22 00:00:00,2020-01-23 00:00:00,2020-01-24 00:00:00,2020-01-25 00:00:00,2020-01-26 00:00:00,2020-01-27 00:00:00,2020-01-28 00:00:00,...,2020-03-24 00:00:00,2020-03-25 00:00:00,2020-03-26 00:00:00,2020-03-27 00:00:00,2020-03-28 00:00:00,2020-03-29 00:00:00,2020-03-30 00:00:00,2020-03-31 00:00:00,2020-04-01 00:00:00,2020-04-02 00:00:00
0,Afghanistan,33.0,65.0,0,0,0,0,0,0,0,...,74,84,94,110,110,120,170,174,237,273
1,Albania,41.1533,20.1683,0,0,0,0,0,0,0,...,123,146,174,186,197,212,223,243,259,277
2,Algeria,28.0339,1.6596,0,0,0,0,0,0,0,...,264,302,367,409,454,511,584,716,847,986
3,Andorra,42.5063,1.5218,0,0,0,0,0,0,0,...,164,188,224,267,308,334,370,376,390,428
4,Angola,-11.2027,17.8739,0,0,0,0,0,0,0,...,3,3,4,4,5,7,7,7,8,8
5,Antigua and Barbuda,17.0608,-61.7964,0,0,0,0,0,0,0,...,3,3,7,7,7,7,7,7,7,9
6,Argentina,-38.4161,-63.6167,0,0,0,0,0,0,0,...,387,387,502,589,690,745,820,1054,1054,1133
7,Armenia,40.0691,45.0382,0,0,0,0,0,0,0,...,249,265,290,329,407,424,482,532,571,663
8,Australia,-35.4735,149.0124,0,0,0,0,0,0,0,...,39,39,53,62,71,77,78,80,84,87
9,Australia,-33.8688,151.2093,0,0,0,0,3,4,4,...,818,1029,1219,1405,1617,1791,2032,2032,2182,2298


In [94]:
df['country'] = df.apply(lambda x: (x.country,x.state), axis=1)
df2['country'] = df2.apply(lambda x: (x.country,x.state), axis=1)
df3['country'] = df3.apply(lambda x: (x.country,x.state), axis=1)
del df["state"]
del df2["state"]
del df3["state"]
df

Unnamed: 0,country,lat,lon,2020-01-22 00:00:00,2020-01-23 00:00:00,2020-01-24 00:00:00,2020-01-25 00:00:00,2020-01-26 00:00:00,2020-01-27 00:00:00,2020-01-28 00:00:00,...,2020-03-24 00:00:00,2020-03-25 00:00:00,2020-03-26 00:00:00,2020-03-27 00:00:00,2020-03-28 00:00:00,2020-03-29 00:00:00,2020-03-30 00:00:00,2020-03-31 00:00:00,2020-04-01 00:00:00,2020-04-02 00:00:00
0,"(Afghanistan, Total)",33.0,65.0,0,0,0,0,0,0,0,...,74,84,94,110,110,120,170,174,237,273
1,"(Albania, Total)",41.1533,20.1683,0,0,0,0,0,0,0,...,123,146,174,186,197,212,223,243,259,277
2,"(Algeria, Total)",28.0339,1.6596,0,0,0,0,0,0,0,...,264,302,367,409,454,511,584,716,847,986
3,"(Andorra, Total)",42.5063,1.5218,0,0,0,0,0,0,0,...,164,188,224,267,308,334,370,376,390,428
4,"(Angola, Total)",-11.2027,17.8739,0,0,0,0,0,0,0,...,3,3,4,4,5,7,7,7,8,8
5,"(Antigua and Barbuda, Total)",17.0608,-61.7964,0,0,0,0,0,0,0,...,3,3,7,7,7,7,7,7,7,9
6,"(Argentina, Total)",-38.4161,-63.6167,0,0,0,0,0,0,0,...,387,387,502,589,690,745,820,1054,1054,1133
7,"(Armenia, Total)",40.0691,45.0382,0,0,0,0,0,0,0,...,249,265,290,329,407,424,482,532,571,663
8,"(Australia, Australian Capital Territory)",-35.4735,149.0124,0,0,0,0,0,0,0,...,39,39,53,62,71,77,78,80,84,87
9,"(Australia, New South Wales)",-33.8688,151.2093,0,0,0,0,3,4,4,...,818,1029,1219,1405,1617,1791,2032,2032,2182,2298


In [95]:
dfm=pd.melt(df, id_vars=df.columns.values[0:3], var_name="Date", value_name="Value")
dfm2=pd.melt(df2, id_vars=df2.columns.values[0:3], var_name="Date", value_name="Value")
dfm3=pd.melt(df3, id_vars=df3.columns.values[0:3], var_name="Date", value_name="Value")
dfm

Unnamed: 0,country,lat,lon,Date,Value
0,"(Afghanistan, Total)",33.000000,65.000000,2020-01-22,0
1,"(Albania, Total)",41.153300,20.168300,2020-01-22,0
2,"(Algeria, Total)",28.033900,1.659600,2020-01-22,0
3,"(Andorra, Total)",42.506300,1.521800,2020-01-22,0
4,"(Angola, Total)",-11.202700,17.873900,2020-01-22,0
...,...,...,...,...,...
18427,"(Botswana, Total)",-22.328500,24.684900,2020-04-02,4
18428,"(Burundi, Total)",-3.373100,29.918900,2020-04-02,3
18429,"(Sierra Leone, Total)",8.460555,-11.779889,2020-04-02,2
18430,"(Netherlands, Bonaire, Sint Eustatius and Saba)",12.178400,-68.238500,2020-04-02,2
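
To make the reshaping step concrete, the sketch below applies `pd.melt` to a two-row frame with made-up values. The identifier columns are repeated for every date, and each date column becomes a (`Date`, `Value`) pair of rows.

```python
import pandas as pd

# Made-up wide frame: three identifier columns, then one column per date.
wide = pd.DataFrame({
    'country': ['Afghanistan', 'Albania'],
    'lat': [33.0, 41.1533],
    'lon': [65.0, 20.1683],
    pd.Timestamp('2020-04-01'): [237, 259],
    pd.Timestamp('2020-04-02'): [273, 277],
})

# Wide to long: one row per (country, date) combination.
long = pd.melt(wide, id_vars=['country', 'lat', 'lon'],
               var_name='Date', value_name='Value')
print(long)
```

Two rows and two date columns give four rows in the long format, which is why the full `dfm` above is so much longer than `df`.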


In [96]:
dfm.rename(columns = {'country':'id'}, inplace = True)
dfm2.rename(columns = {'country':'id'}, inplace = True)
dfm3.rename(columns = {'country':'id'}, inplace = True)

In [97]:
dfm.to_csv('covid_conf.csv', index=False)
dfm2.to_csv('covid_deat.csv', index=False)
dfm3.to_csv('covid_reco.csv', index=False)


In [98]:
dfx = pd.read_csv(src)
dfx.rename(columns={'Country/Region': 'country', 'Province/State': 'state',
                   'Lat': 'lat', 'Long': 'lon'}, inplace = True)
dfx = dfx[(dfx.state!='Diamond Princess') & 
        (dfx.state!='Grand Princess')].reset_index(drop=True)
dfx.columns = dfx.columns[0:4].append(pd.to_datetime(dfx.columns[4:]))

In [99]:
dfi=dfx.set_index('country')
dfi

country,state,lat,lon,2020-01-22 00:00:00,2020-01-23 00:00:00,2020-01-24 00:00:00,2020-01-25 00:00:00,2020-01-26 00:00:00,2020-01-27 00:00:00,2020-01-28 00:00:00,...,2020-03-24 00:00:00,2020-03-25 00:00:00,2020-03-26 00:00:00,2020-03-27 00:00:00,2020-03-28 00:00:00,2020-03-29 00:00:00,2020-03-30 00:00:00,2020-03-31 00:00:00,2020-04-01 00:00:00,2020-04-02 00:00:00
Afghanistan,,33.0,65.0,0,0,0,0,0,0,0,...,74,84,94,110,110,120,170,174,237,273
Albania,,41.1533,20.1683,0,0,0,0,0,0,0,...,123,146,174,186,197,212,223,243,259,277
Algeria,,28.0339,1.6596,0,0,0,0,0,0,0,...,264,302,367,409,454,511,584,716,847,986
Andorra,,42.5063,1.5218,0,0,0,0,0,0,0,...,164,188,224,267,308,334,370,376,390,428
Angola,,-11.2027,17.8739,0,0,0,0,0,0,0,...,3,3,4,4,5,7,7,7,8,8
Antigua and Barbuda,,17.0608,-61.7964,0,0,0,0,0,0,0,...,3,3,7,7,7,7,7,7,7,9
Argentina,,-38.4161,-63.6167,0,0,0,0,0,0,0,...,387,387,502,589,690,745,820,1054,1054,1133
Armenia,,40.0691,45.0382,0,0,0,0,0,0,0,...,249,265,290,329,407,424,482,532,571,663
Australia,Australian Capital Territory,-35.4735,149.0124,0,0,0,0,0,0,0,...,39,39,53,62,71,77,78,80,84,87
Australia,New South Wales,-33.8688,151.2093,0,0,0,0,3,4,4,...,818,1029,1219,1405,1617,1791,2032,2032,2182,2298


In [100]:
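a = dfi.loc['US'].iloc[-1:-5:-1]  # last four days, most recent first
(a.iloc[0] - a.iloc[1]) / a.iloc[1]  # relative increase over the previous day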

0.1409791350317755

In [101]:
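a = dfi.loc['Turkey'].iloc[-1:-5:-1]  # last four days, most recent first
(a.iloc[0] - a.iloc[1]) / a.iloc[1]  # relative increase over the previous day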

0.15664264302570316

In [102]:
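a = dfi.loc['Indonesia'].iloc[-1:-5:-1]  # last four days, most recent first
(a.iloc[0] - a.iloc[1]) / a.iloc[1]  # relative increase over the previous day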

0.06738223017292785
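
The last three cells compute the relative increase between the two most recent days, one country at a time. As an alternative sketch, `Series.pct_change` gives the same ratio for every day at once. The counts below are made up, and the variable names are only illustrative.

```python
import pandas as pd

# Made-up cumulative counts for one country, indexed by date.
counts = pd.Series(
    [100, 120, 150, 180],
    index=pd.to_datetime(['2020-03-30', '2020-03-31', '2020-04-01', '2020-04-02']),
)

# Relative day-over-day increase: (today - yesterday) / yesterday.
growth = counts.pct_change()
latest = growth.iloc[-1]   # the same quantity the cells above compute by hand
print(latest)
```

The first entry of `growth` is `NaN` because the first day has no predecessor.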