## Objectives
1. Summarize how interactive plots can be useful to decision makers
2. Differentiate between exploratory data visualization and data visualization that illustrates analysis results
3. Use Dash to create interactive plots

## Description
In this case study, you will provide timely, useful feedback to global leaders regarding the spread of COVID-19. Every country's leadership is trying to decide national policy on quarantine, social distancing, face masks, and potential national shutdowns. Using the daily-updated time series data from the Johns Hopkins Center for Systems Science and Engineering's GitHub [site](https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series), you will create visualizations of the number of confirmed COVID-19 cases and deaths, similar to this [study](https://91-divoc.com/pages/covid-visualization/).

# 1. Set Up COVID Dataset

In [114]:
# Import packages for data manipulation and visualization
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

### Load and Inspect Data
[Read](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) the two COVID-19 global .csv files from the URLs above into DataFrames named `cases` and `deaths`, respectively. Also read the `population_global.csv` file into a DataFrame named `population`. Remember, this file must be in the same directory as this Jupyter Notebook, or you must specify its full file path. Inspect the first five rows of `cases`.

In [115]:
cases = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
deaths = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')
population = pd.read_csv('population_global.csv')
cases.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,2/28/21,3/1/21,3/2/21,3/3/21,3/4/21,3/5/21,3/6/21,3/7/21,3/8/21,3/9/21
0,,Afghanistan,33.93911,67.709953,0,0,0,0,0,0,...,55714,55733,55759,55770,55775,55827,55840,55847,55876,55876
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,107167,107931,108823,109674,110521,111301,112078,112897,113580,114209
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,113092,113255,113430,113593,113761,113948,114104,114234,114382,114543
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,10866,10889,10908,10948,10976,10998,11019,11042,11069,11089
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,20807,20854,20882,20923,20981,21026,21055,21086,21108,21114


# 2. Reshaping Our Data into Tidy Data

In [116]:
cases = cases.rename(columns={"Country/Region": "country"})
deaths = deaths.rename(columns={"Country/Region": "country"})

In [117]:
country_cases = cases.drop(['Province/State', 'Lat', 'Long'], axis=1)
country_deaths = deaths.drop(['Province/State', 'Lat', 'Long'], axis=1)

In [118]:
# Sum the rows that share a country name so each country has a single row
country_cases = country_cases.groupby('country').sum()
country_deaths = country_deaths.groupby('country').sum()
country_population = population.groupby('country').sum()
country_cases.head()

country,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,...,2/28/21,3/1/21,3/2/21,3/3/21,3/4/21,3/5/21,3/6/21,3/7/21,3/8/21,3/9/21
Afghanistan,0,0,0,0,0,0,0,0,0,0,...,55714,55733,55759,55770,55775,55827,55840,55847,55876,55876
Albania,0,0,0,0,0,0,0,0,0,0,...,107167,107931,108823,109674,110521,111301,112078,112897,113580,114209
Algeria,0,0,0,0,0,0,0,0,0,0,...,113092,113255,113430,113593,113761,113948,114104,114234,114382,114543
Andorra,0,0,0,0,0,0,0,0,0,0,...,10866,10889,10908,10948,10976,10998,11019,11042,11069,11089
Angola,0,0,0,0,0,0,0,0,0,0,...,20807,20854,20882,20923,20981,21026,21055,21086,21108,21114
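
The grouping step matters because some countries appear on several rows of the Johns Hopkins files, one per province or state. A minimal sketch of the same drop-then-sum pattern on a toy table (the `toy` frame, its country names, and its values are made up for illustration):

```python
import pandas as pd

# Toy data (made-up values): Country X is split across two provinces
toy = pd.DataFrame({
    "Province/State": ["Province A", "Province B", None],
    "country": ["Country X", "Country X", "Country Y"],
    "1/22/20": [1, 2, 5],
    "1/23/20": [3, 4, 6],
})

# Drop the province label, then add up each date column per country
toy_country = toy.drop(columns=["Province/State"]).groupby("country").sum()
print(toy_country)
```

After grouping, `country` becomes the index and each country has exactly one row of summed counts per date.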


In [119]:
country_cases = country_cases.join(country_population.population)
country_deaths = country_deaths.join(country_population.population)

In [120]:
cases_tidy = country_cases.reset_index().melt(id_vars=['country', 'population'],
                                              var_name='date',
                                              value_name='cases'
                                              )
deaths_tidy = country_deaths.reset_index().melt(id_vars=['country', 'population'],
                                               var_name='date',
                                               value_name='deaths'
                                               )
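
To see what `melt` does here, consider a small sketch on a hypothetical two-country table (names and values are made up). Each date column becomes its own row, while `country` and `population` are repeated as identifier columns:

```python
import pandas as pd

# Hypothetical wide table: one row per country, one column per date
wide = pd.DataFrame(
    {"population": [100, 200], "1/22/20": [0, 1], "1/23/20": [2, 3]},
    index=pd.Index(["Country X", "Country Y"], name="country"),
)

# reset_index turns the country index into a column so melt can keep it
tidy = wide.reset_index().melt(
    id_vars=["country", "population"],
    var_name="date",
    value_name="cases",
)
print(tidy)
```

The result has one row per country and date, stacked date by date, which is the long format that Plotly Express works with most easily.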

In [121]:
# Change the date column dtype from object to datetime (labels look like 1/22/20)
cases_tidy.date = pd.to_datetime(cases_tidy.date, format='%m/%d/%y')
deaths_tidy.date = pd.to_datetime(deaths_tidy.date, format='%m/%d/%y')

In [122]:
# Index join: assumes both tidy frames list the same country/date rows in the same order
df = cases_tidy.join(deaths_tidy['deaths'])
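
Joining on the index works only while both tidy frames keep identical row order. If that cannot be guaranteed, one alternative is to merge on the `country` and `date` keys instead. The sketch below uses toy frames (`cases_toy` and `deaths_toy` are hypothetical, with made-up values) whose rows are deliberately ordered differently:

```python
import pandas as pd

# Toy tidy frames (made-up values); the deaths rows are in a different order
cases_toy = pd.DataFrame({
    "country": ["Country X", "Country Y"],
    "date": pd.to_datetime(["2020-01-22", "2020-01-22"]),
    "cases": [10, 20],
})
deaths_toy = pd.DataFrame({
    "country": ["Country Y", "Country X"],
    "date": pd.to_datetime(["2020-01-22", "2020-01-22"]),
    "deaths": [2, 1],
})

# Match rows on the shared keys; validate raises if a key pair is duplicated
merged = cases_toy.merge(
    deaths_toy, on=["country", "date"], how="left", validate="one_to_one"
)
print(merged)
```

A key-based merge costs a little more than an index join, but it does not silently pair the wrong rows if one table is sorted or filtered differently.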

# 3. Interactive Plots

In [123]:
import plotly.express as px

In [124]:
fig = px.scatter(df.loc[df.date == "2021-03-09"], x="cases", y="deaths", size="population", 
                 hover_name="country", log_x=True)
fig.show()

In [125]:
df.loc[df.country == "US"]

Unnamed: 0,country,population,date,cases,deaths
178,US,329584842,2020-01-22,1,0
370,US,329584842,2020-01-23,1,0
562,US,329584842,2020-01-24,2,0
754,US,329584842,2020-01-25,2,0
946,US,329584842,2020-01-26,5,0
...,...,...,...,...,...
78514,US,329584842,2021-03-05,28894908,522877
78706,US,329584842,2021-03-06,28952970,524362
78898,US,329584842,2021-03-07,28993873,525033
79090,US,329584842,2021-03-08,29038631,525752


In [126]:
fig = px.line(df.loc[df.country == "US"], x="date", y="cases", title="US Cases Trend")
fig.show()

# 4. Dash Examples
See the [Dash gallery](https://dash-gallery.plotly.host/Portal/) for example applications.