Create an animated choropleth plot using Plotly that shows a seven-day moving average of cases for a geographic unit and its sub-units (e.g., the USA and its states).

Create a second, non-animated choropleth plot that shows cumulative cases or vaccinations per 100,000 people for the most recent date in the data file. Important: the choice is yours. Analyze either cases or vaccinations, depending on your data source.

Requirements:

* Find an appropriate data source that includes new COVID-19 cases per day for the geographic region. (Use a direct link, not a downloaded file.)

* Find a data source that estimates the population for the geographic region. (Use a direct link, not a downloaded file.)

* Load both to a pandas dataframe

* Calculate cumulative cases per 100,000 population for the sub-region (i.e., state)

* Calculate 7-day moving average of new cases. (You might need to research methods in pandas.)

* PLOT 1: Plot 7-day moving average of cases on Plotly plot and animate by day (older dates on left of slider)

* PLOT 2: Create a separate plot of cumulative cases per 100,000 population. This should be for the maximum date in the dataframe and should not be animated.

* Plots will include relevant title and hover text.

* Colors will be a continuous scale of your choice.

In [None]:
# installing necessary packages
#! pip install pandas
#! pip install plotly

In [5]:
# import necessary python packages
import pandas as pd
import plotly.express as px

In [18]:
# Load Covid data 
# Data set is from https://github.com/nytimes/covid-19-data

us_df = pd.read_csv('https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv')

In [19]:
# View the first few rows
us_df.head()

Unnamed: 0,date,state,fips,cases,deaths
0,2020-01-21,Washington,53,1,0
1,2020-01-22,Washington,53,1,0
2,2020-01-23,Washington,53,1,0
3,2020-01-24,Illinois,17,1,0
4,2020-01-24,Washington,53,1,0


In [20]:
# Information about dataframe
us_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32094 entries, 0 to 32093
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   date    32094 non-null  object
 1   state   32094 non-null  object
 2   fips    32094 non-null  int64 
 3   cases   32094 non-null  int64 
 4   deaths  32094 non-null  int64 
dtypes: int64(3), object(2)
memory usage: 1.2+ MB


In [21]:
# Convert date type into datetime

us_df["date"] = pd.to_datetime(us_df["date"])

In [22]:
us_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32094 entries, 0 to 32093
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   date    32094 non-null  datetime64[ns]
 1   state   32094 non-null  object        
 2   fips    32094 non-null  int64         
 3   cases   32094 non-null  int64         
 4   deaths  32094 non-null  int64         
dtypes: datetime64[ns](1), int64(3), object(1)
memory usage: 1.2+ MB


In [23]:
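# Label each row with its year and week number (%U: weeks start on Sunday).
# Note: this buckets rows by calendar week; it is not itself a rolling 7-day average.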

us_df["week"] = us_df["date"].dt.strftime("%Y-%U")
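A calendar-week label is not the same as a moving average. As a minimal sketch, assuming `cases` is a cumulative count (which is how the NYT file reports it), daily new cases can be recovered with a per-state `diff()` and then smoothed with a 7-day `rolling()` mean. The function name, the `new_cases` and `cases_7day_avg` columns, and the toy numbers are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

def add_seven_day_average(df):
    """Return a copy with daily new cases and a 7-day rolling mean per state."""
    out = df.sort_values(["state", "date"]).copy()
    # Cumulative -> daily: difference within each state.
    # The first row of each state has no predecessor, so it keeps its cumulative value.
    out["new_cases"] = out.groupby("state")["cases"].diff().fillna(out["cases"])
    # Assumption: a downward revision of the cumulative count is treated as zero new cases.
    out["new_cases"] = out["new_cases"].clip(lower=0)
    # Rolling mean over up to 7 days, computed separately for each state.
    out["cases_7day_avg"] = out.groupby("state")["new_cases"].transform(
        lambda s: s.rolling(window=7, min_periods=1).mean()
    )
    return out

# Toy data (made-up numbers) standing in for us_df.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
    "state": ["A"] * 3 + ["B"] * 3,
    "cases": [1, 3, 6, 10, 10, 14],
})
result = add_seven_day_average(toy)
```

With `min_periods=1` the first six days of each state are averaged over fewer than seven values; dropping that argument would leave them as `NaN` instead.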

In [24]:
us_df.head()

Unnamed: 0,date,state,fips,cases,deaths,week
0,2020-01-21,Washington,53,1,0,2020-03
1,2020-01-22,Washington,53,1,0,2020-03
2,2020-01-23,Washington,53,1,0,2020-03
3,2020-01-24,Illinois,17,1,0,2020-03
4,2020-01-24,Washington,53,1,0,2020-03


In [17]:
us_df.shape

(1790620, 7)

In [27]:
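# Sort so that .first() picks the earliest recorded day within each state and week
us_df = us_df.sort_values(by=['state', 'date'])
# One row per state and week: the (cumulative) values from the first day of that week
df_us_week = us_df.groupby(['state', 'fips', 'week']).first().reset_index()
df_us_week.head()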

Unnamed: 0,state,fips,week,date,cases,deaths
0,Alabama,1,2020-10,2020-03-13,6,0
1,Alabama,1,2020-11,2020-03-15,23,0
2,Alabama,1,2020-12,2020-03-22,157,0
3,Alabama,1,2020-13,2020-03-29,830,5
4,Alabama,1,2020-14,2020-04-05,1840,45


In [28]:
df_us_week['cases'].max(), df_us_week['cases'].min()

(4749498, 1)

In [31]:
from urllib.request import urlopen
import json
with urlopen('https://raw.githubusercontent.com/jgoodall/us-maps/master/geojson/state.geo.json') as response:
    states = json.load(response)
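`px.choropleth` matches each value in `locations` against the `id` of a GeoJSON feature by default, or against a property named through `featureidkey`. An integer FIPS such as `1` will not match a string such as `"01"`, so it can help to zero-pad the codes and to check which keys the file actually carries. The sketch below is a hedged illustration: both helper names and the toy GeoJSON (including its `CODE` and `LABEL` properties) are made up, and the real property names in the file loaded above should be inspected rather than assumed.

```python
import pandas as pd

def fips_as_text(fips):
    """Zero-pad integer state FIPS codes to two-character strings (1 -> '01')."""
    return fips.astype(int).astype(str).str.zfill(2)

def property_keys(geojson):
    """List the property names carried by the first feature of a FeatureCollection."""
    return sorted(geojson["features"][0]["properties"])

# Hypothetical one-feature GeoJSON, only to exercise the helpers.
toy_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": "01",
            "properties": {"CODE": "01", "LABEL": "Alabama"},
            "geometry": None,
        }
    ],
}

codes = fips_as_text(pd.Series([1, 17, 53]))
keys = property_keys(toy_geojson)
```

In the notebook this might be used as `property_keys(states)` to find the right key, which is then passed as `featureidkey="properties.<name>"`.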

In [32]:
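# Sort by week so the oldest frame sits on the left of the slider
df_us_week = df_us_week.sort_values(by=['week'])
# NOTE: 'cases' here is the cumulative count on the first recorded day of each week,
# not a rolling 7-day average of new cases.
fig = px.choropleth(df_us_week, geojson=states, locations='fips', color='cases',
                           color_continuous_scale=px.colors.sequential.OrRd,
                           title="Seven-day moving average of cases",
                           hover_name="state",  # show the state name in the hover text
                           scope="usa",
                           animation_frame="week",
                          )
# Remove the Play/Pause buttons and keep only the slider
fig["layout"].pop("updatemenus")
fig.show()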

KeyboardInterrupt: 

In [None]:
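# NOTE: us100k is not defined anywhere above. It is assumed to hold one row per state
# for the most recent date, with 'cases' already scaled to cases per 100,000 people.
us100k = us100k.loc[us100k['cases'] <= 100000]
# The GeoJSON loaded earlier is named `states`; `counties` was never defined.
fig = px.choropleth(us100k, geojson=states, locations='fips', color='cases',
                           color_continuous_scale=px.colors.sequential.OrRd,
                           title="Cumulative cases per 100,000 people",
                           scope="usa",
                          )
# No animation here, so there are no Play/Pause buttons to remove
fig.show()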
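The last cell relies on `us100k`, which the notebook never builds, and no population data is loaded. A minimal sketch of how such a table might be derived: take the rows for the maximum date, merge them with a population table on `state`, and scale. The function name, the `population` and `cases_per_100k` columns, and the toy numbers are assumptions for illustration; a real run would read population estimates from a direct link, as the requirements ask.

```python
import pandas as pd

def latest_cases_per_100k(cases_df, pop_df):
    """Cumulative cases per 100,000 people for the most recent date in cases_df."""
    latest_date = cases_df["date"].max()
    latest = cases_df.loc[
        cases_df["date"] == latest_date, ["date", "state", "fips", "cases"]
    ]
    # Inner merge: states missing from the population table are dropped.
    merged = latest.merge(pop_df, on="state", how="inner")
    merged["cases_per_100k"] = merged["cases"] / merged["population"] * 100_000
    return merged

# Toy inputs (made-up numbers) standing in for us_df and a population table.
toy_cases = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-02"]),
    "state": ["A", "A", "B", "B"],
    "fips": [1, 1, 2, 2],
    "cases": [100, 150, 400, 500],
})
toy_pop = pd.DataFrame({"state": ["A", "B"], "population": [50_000, 1_000_000]})

per_100k = latest_cases_per_100k(toy_cases, toy_pop)
```

The resulting `cases_per_100k` column, rather than raw `cases`, would then be the natural choice for `color` in the non-animated map.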