# Objectives
1. Summarize how interactive plots can be useful to decision makers
2. Differentiate between exploratory data visualization and data visualization that illustrates analysis results
3. Use Dash to create interactive plots

# COVID Case Study

In [1]:
# Import packages for data manipulation and visualization
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px

### Load and Inspect Data
[Read](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) the two COVID-19 global .csv files from the URLs in the cell below into DataFrames named `cases` and `deaths`, respectively. Additionally, read the `population.csv` file into a DataFrame named `population`. Remember, the file must be in the same directory as this Jupyter Notebook, or you must specify the entire file path. Inspect the first five rows of `cases`.

**Part 1 has already been completed for you.  Run all the cells up to Part 2.**

In [2]:
cases = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
deaths = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')
population = pd.read_csv('population.csv')
cases.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,4/6/21,4/7/21,4/8/21,4/9/21,4/10/21,4/11/21,4/12/21,4/13/21,4/14/21,4/15/21
0,,Afghanistan,33.93911,67.709953,0,0,0,0,0,0,...,56779,56873,56943,57019,57144,57160,57242,57364,57492,57534
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,126936,127192,127509,127795,128155,128393,128518,128752,128959,129128
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,117879,118004,118116,118251,118378,118516,118645,118799,118975,119142
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,12328,12363,12409,12456,12497,12545,12581,12614,12641,12641
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,22885,23010,23108,23242,23331,23457,23549,23697,23841,23951


# 1. Manipulating Our Data Into Tidy Data

In [3]:
cases = cases.rename(columns={"Country/Region": "country"})
deaths = deaths.rename(columns={"Country/Region": "country"})

In [4]:
country_cases = cases.drop(['Province/State', 'Lat', 'Long'], axis=1)
country_deaths = deaths.drop(['Province/State', 'Lat', 'Long'], axis=1)

In [5]:
# sum the province-level rows so that each country has a single row
country_cases = country_cases.groupby('country').sum()
country_deaths = country_deaths.groupby('country').sum()
country_population = population.groupby('country').sum()
country_cases.head()

country,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,...,4/6/21,4/7/21,4/8/21,4/9/21,4/10/21,4/11/21,4/12/21,4/13/21,4/14/21,4/15/21
Afghanistan,0,0,0,0,0,0,0,0,0,0,...,56779,56873,56943,57019,57144,57160,57242,57364,57492,57534
Albania,0,0,0,0,0,0,0,0,0,0,...,126936,127192,127509,127795,128155,128393,128518,128752,128959,129128
Algeria,0,0,0,0,0,0,0,0,0,0,...,117879,118004,118116,118251,118378,118516,118645,118799,118975,119142
Andorra,0,0,0,0,0,0,0,0,0,0,...,12328,12363,12409,12456,12497,12545,12581,12614,12641,12641
Angola,0,0,0,0,0,0,0,0,0,0,...,22885,23010,23108,23242,23331,23457,23549,23697,23841,23951


In [6]:
country_cases = country_cases.join(country_population.population)
country_deaths = country_deaths.join(country_population.population)

In [7]:
cases_tidy = country_cases.reset_index().melt(id_vars=['country', 'population'],
                                              var_name='date',
                                              value_name='cases'
                                              )
deaths_tidy = country_deaths.reset_index().melt(id_vars=['country', 'population'],
                                               var_name='date',
                                               value_name='deaths'
                                               )

In [8]:
# change date column datatype from object to datetime
cases_tidy.date = pd.to_datetime(cases_tidy.date)
deaths_tidy.date = pd.to_datetime(deaths_tidy.date)

In [9]:
df = cases_tidy.join(deaths_tidy['deaths'])
df

Unnamed: 0,country,population,date,cases,deaths
0,Afghanistan,32225560,2020-01-22,0,0
1,Albania,2845955,2020-01-22,0,0
2,Algeria,43000000,2020-01-22,0,0
3,Andorra,77543,2020-01-22,0,0
4,Angola,31127674,2020-01-22,0,0
...,...,...,...,...,...
86395,Vietnam,96208984,2021-04-15,2758,35
86396,West Bank and Gaza,4976684,2021-04-15,276407,2937
86397,Yemen,29825968,2021-04-15,5657,1097
86398,Zambia,17885422,2021-04-15,90532,1230
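
The reshaping steps in this part can be traced on a much smaller table. The sketch below is only an illustration: the countries `A` and `B`, the provinces, and all the numbers are made up, and the variable names `wide`, `by_country`, and `tidy` are not part of the lab. It follows the same pattern as the cells above: drop the province column, sum by country, join the population, and melt the date columns into rows.

```python
import pandas as pd

# Hypothetical wide table: one row per province, one column per date
wide = pd.DataFrame({
    "Province/State": ["North", "South", None],
    "country": ["A", "A", "B"],
    "1/22/20": [1, 2, 5],
    "1/23/20": [3, 4, 6],
})
# Hypothetical population table
population = pd.DataFrame({"country": ["A", "B"], "population": [100, 200]})

# Collapse provinces into one row per country
by_country = wide.drop(columns=["Province/State"]).groupby("country").sum()

# Attach the population column, matching on the country index
by_country = by_country.join(population.set_index("country")["population"])

# Melt the date columns into a single 'date' column (one row per country per date)
tidy = by_country.reset_index().melt(
    id_vars=["country", "population"],
    var_name="date",
    value_name="cases",
)

# Convert the date strings to datetime values
tidy["date"] = pd.to_datetime(tidy["date"], format="%m/%d/%y")
print(tidy)
```

Each row of `tidy` now describes one country on one date, which is the shape the plotting functions later in the lab expect.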


# 2. Static Interactive Plots

Before we start creating our visualizations, we have to do a little more prep work.  For the remainder of this lab, we will use the top 10 countries by COVID cases that we found in the Pandas Case Study.  First, we'll make a copy of our original DataFrame.  Next, so that our interactive plots work later on, we convert the dates in the copy from datetime objects to strings.  Finally, we keep only the rows for the top 10 countries that fall on the first day of each month.  **Run the cell below.**

In [None]:
top_countries = ['Brazil', 'France', 'Germany', 'India', 'Italy', 'Russia', 'Spain', 'Turkey', 'US', 'United Kingdom']
df2 = df.copy()
df2.date = df2.date.dt.strftime('%Y-%m-%d')
top10 = df2.loc[df2.country.isin(top_countries) & (df2.date.str[-2:] == '01')]
top10
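
To see what the filter in the cell above does, here is a small sketch on made-up data. The DataFrame `demo` and its values are hypothetical. Once the dates are strings in `YYYY-MM-DD` form, the last two characters are the day of the month, so comparing them to `'01'` keeps only the first day of each month.

```python
import pandas as pd

# Made-up rows for two hypothetical countries
demo = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "date": pd.to_datetime(["2021-03-31", "2021-04-01", "2021-04-01", "2021-04-02"]),
    "cases": [10, 11, 20, 21],
})

# Convert datetime values to 'YYYY-MM-DD' strings
demo["date"] = demo["date"].dt.strftime("%Y-%m-%d")

# Keep the selected countries, and only rows whose day part is '01'
first_of_month = demo.loc[demo["country"].isin(["A", "B"]) & (demo["date"].str[-2:] == "01")]
print(first_of_month)
```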

**Q2.1** Filter `top10` by the date `2022-04-01` (note that this is now a string value) and assign this DataFrame to `top10_date`.  Using `top10_date`, create a static interactive bubble plot with `population` on the x-axis, `cases` on the y-axis, color set to `country`, and size set to `deaths`.

In [None]:
# bubble chart of cases


**Q2.2** Using `top10_date`, create two geographical scatter plots with the following parameter-column pairings:
- locations - `country`
- locationmode - `country names`
- size - `cases` for the first plot, `deaths` for the second plot
- color - `country`

In [None]:
# geo scatter plot of cases


In [None]:
# geo scatter plot of deaths


# 3. Dynamic Interactive Plots

While the visualizations above give some pretty good insight, we can improve on them by leveraging our `date` column and observing how the counts of cases and deaths evolve over time in our top 10 countries.

For the interactive bar plots, you'll notice that I have included some extra code.  These two lines adjust the duration of each frame and the duration of the transition between frames, both measured in milliseconds.  You can view these parameters for yourself by entering `fig.layout` after you have assigned a plot to `fig`.

**Q3.1**  Using the `top10` DataFrame, create two interactive bar plots with the following parameter-column/value pairs, assigning each plot to `fig`:
- x-axis - `country`
- y-axis - `cases` for the first plot, `deaths` for the second plot
- animation_frame - `date`
- animation_group - `country`
- range_y - 0 to 90000000 for the first plot, 0 to 1000000 for the second plot

In [None]:
# interactive bar plot of cases per country


# changing animation settings
fig.layout.updatemenus[0].buttons[0].args[1]['frame']['duration'] = 200
fig.layout.updatemenus[0].buttons[0].args[1]['transition']['duration'] = 0
fig.show()

In [None]:
# interactive bar plot of deaths per country


# changing animation settings
fig.layout.updatemenus[0].buttons[0].args[1]['frame']['duration'] = 200
fig.layout.updatemenus[0].buttons[0].args[1]['transition']['duration'] = 0
fig.show()

**Q3.2** Using `top10`, create two interactive geographical plots with the following parameter-column pairs:
- locations, animation_group, color, and hover_name - `country`
- locationmode - `country names`
- size - `cases` for the first plot, `deaths` for the second plot

In [None]:
# interactive geographical plot of cases per country


In [None]:
# interactive geographical plot of deaths per country


**Q3.3**  A choropleth map is essentially a heatmap with a geographical map overlay.  It allows analysts to show statistical information within the context of a visual representation of the global region of concern.  Using `top10`, we'll now create a choropleth map.

Create two choropleth maps using the following parameter-column pairs:
- locations, hover_name - `country`
- locationmode - `country names`
- color - `cases` for the first map, `deaths` for the second map
- animation_frame - `date`

In [None]:
# choropleth map of cases per country


In [None]:
# choropleth map of deaths per country
