Create an animated choropleth plot with Plotly that shows the seven-day moving average of cases for a geographic unit and its sub-units (e.g., the USA and its states).

Create a second, non-animated choropleth plot that shows cumulative cases or vaccinations per 100,000 people for the most recent date in the data file. Important: the choice is yours. Analyze either cases or vaccinations, depending on your data source.

Requirements:

* Find an appropriate data source that includes new COVID-19 cases per day for the geographic region. (Use a direct link, not a downloaded file.)

* Find a data source that estimates the population for the geographic region. (Use a direct link, not a downloaded file.)

* Load both into pandas DataFrames

* Calculate cumulative cases per 100,000 population for the sub-region (i.e., state)

* Calculate 7-day moving average of new cases. (You might need to research methods in pandas.)

* PLOT 1: Plot 7-day moving average of cases on Plotly plot and animate by day (older dates on left of slider)

* PLOT 2: Create a separate plot of cumulative cases per 100,000 population. This should be for the maximum date in the dataframe and should not be animated.

* Plots will include a relevant title and hover text.

* Colors will use a continuous scale of your choice.
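
As a rough illustration of the per-100,000 requirement above, the sketch below works on toy data rather than the real files. The frame names, column names (`population`, `cases_per_100k`) and all numbers are made up for the example; the real sources may use different columns.

```python
import pandas as pd

# Toy cumulative case counts (hypothetical numbers, for illustration only)
cases = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-02"]),
    "state": ["A", "A", "B", "B"],
    "cases": [100, 150, 20, 50],
})

# Toy population estimates (hypothetical)
population = pd.DataFrame({"state": ["A", "B"], "population": [1_000_000, 200_000]})

# Keep only the most recent date, then attach each state's population
latest = cases.loc[cases["date"] == cases["date"].max()]
per_capita = latest.merge(population, on="state", how="inner")

# Scale cumulative cases to a rate per 100,000 residents
per_capita["cases_per_100k"] = per_capita["cases"] / per_capita["population"] * 100_000
print(per_capita[["state", "cases_per_100k"]])
```

An inner merge silently drops any sub-region that is missing from the population table, so it can be worth comparing row counts before and after the merge.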

In [None]:
# installing necessary packages
#! pip install pandas
#! pip install plotly

In [5]:
# import necessary python packages
import pandas as pd
import plotly.express as px

In [41]:
# Load COVID-19 case data (cumulative counts by state and date)
# Data set is from https://github.com/nytimes/covid-19-data

covid_cases_df = pd.read_csv('https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv')

population_df = pd.read_csv('https://github.com/thePortus/wikidata-historic-state-populations/raw/master/data_files/2_wikidata_processed_results.csv')

In [34]:
covid_cases_df.head()

Unnamed: 0,date,state,fips,cases,deaths
0,2020-01-21,Washington,53,1,0
1,2020-01-22,Washington,53,1,0
2,2020-01-23,Washington,53,1,0
3,2020-01-24,Illinois,17,1,0
4,2020-01-24,Washington,53,1,0


In [43]:
# Keep only the 2018 population estimates
population_df = population_df.loc[population_df['Year'] == 2018]

In [45]:
population_df.shape

(50, 5)

In [59]:
# View the first few rows
covid_cases_df.head()

Unnamed: 0,date,state,fips,cases,deaths
0,2020-01-21,Washington,53,1,0
1,2020-01-22,Washington,53,1,0
2,2020-01-23,Washington,53,1,0
3,2020-01-24,Illinois,17,1,0
4,2020-01-24,Washington,53,1,0


In [60]:
# Information about dataframe
covid_cases_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32094 entries, 0 to 32093
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   date    32094 non-null  object
 1   state   32094 non-null  object
 2   fips    32094 non-null  int64 
 3   cases   32094 non-null  int64 
 4   deaths  32094 non-null  int64 
dtypes: int64(3), object(2)
memory usage: 1.2+ MB


In [63]:
# Convert the date column to datetime
covid_cases_df["date"] = pd.to_datetime(covid_cases_df["date"])

In [64]:
covid_cases_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32094 entries, 0 to 32093
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   date    32094 non-null  datetime64[ns]
 1   state   32094 non-null  object        
 2   fips    32094 non-null  int64         
 3   cases   32094 non-null  int64         
 4   deaths  32094 non-null  int64         
dtypes: datetime64[ns](1), int64(3), object(1)
memory usage: 1.2+ MB


In [65]:
# Add a year-week label (e.g. 2020-03) for each date; this is a calendar-week label, not the moving average itself
covid_cases_df["week"] = covid_cases_df["date"].dt.strftime("%Y-%U")

In [67]:
covid_cases_df.shape

(32094, 6)

In [70]:
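# Calculate a 7-day moving average per state.
# Note: `cases` in the NYT file is a cumulative count, so this averages cumulative totals;
# take the per-state daily difference first if an average of new cases is wanted.
covid_cases_df['7ma'] = covid_cases_df.groupby('state').cases.transform(lambda c: c.rolling(7).mean())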
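
Because the case counts are cumulative, one way to get a moving average of new cases is to difference the counts per state before applying the rolling window. The sketch below shows this on toy data; the column names `new_cases` and `new_cases_7ma` are made up for the example and are not part of the notebook.

```python
import pandas as pd

# Toy cumulative counts for one hypothetical state, eight consecutive days
toy = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=8),
    "state": ["A"] * 8,
    "cases": [1, 3, 6, 10, 15, 21, 28, 36],
})
toy = toy.sort_values(["state", "date"])

# Daily new cases = day-over-day change in the cumulative count, per state.
# The first day of each state has no previous row, so fall back to its cumulative count.
toy["new_cases"] = toy.groupby("state")["cases"].diff().fillna(toy["cases"])

# 7-day moving average of new cases; the first six rows of each state stay NaN
toy["new_cases_7ma"] = toy.groupby("state")["new_cases"].transform(
    lambda s: s.rolling(7).mean()
)
print(toy[["date", "cases", "new_cases", "new_cases_7ma"]])
```

Sorting by state and date first matters: `diff` and `rolling` work on row order, so an unsorted frame would give meaningless differences.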

In [71]:
covid_cases_df.head()

Unnamed: 0,date,state,fips,cases,deaths,week,7ma
0,2020-01-21,Washington,53,1,0,2020-03,
1,2020-01-22,Washington,53,1,0,2020-03,
2,2020-01-23,Washington,53,1,0,2020-03,
3,2020-01-24,Illinois,17,1,0,2020-03,
4,2020-01-24,Washington,53,1,0,2020-03,


In [72]:
covid_cases_df['cases'].max(), covid_cases_df['cases'].min()

(4775779, 1)

In [73]:
covid_cases_df['7ma'].max(), covid_cases_df['7ma'].min()

(4756293.714285715, 1.0)

In [74]:
# Map each state name to its two-letter state code (for visualization)
us_state_to_abbrev = {
    "Alabama": "AL",
    "Alaska": "AK",
    "Arizona": "AZ",
    "Arkansas": "AR",
    "California": "CA",
    "Colorado": "CO",
    "Connecticut": "CT",
    "Delaware": "DE",
    "Florida": "FL",
    "Georgia": "GA",
    "Hawaii": "HI",
    "Idaho": "ID",
    "Illinois": "IL",
    "Indiana": "IN",
    "Iowa": "IA",
    "Kansas": "KS",
    "Kentucky": "KY",
    "Louisiana": "LA",
    "Maine": "ME",
    "Maryland": "MD",
    "Massachusetts": "MA",
    "Michigan": "MI",
    "Minnesota": "MN",
    "Mississippi": "MS",
    "Missouri": "MO",
    "Montana": "MT",
    "Nebraska": "NE",
    "Nevada": "NV",
    "New Hampshire": "NH",
    "New Jersey": "NJ",
    "New Mexico": "NM",
    "New York": "NY",
    "North Carolina": "NC",
    "North Dakota": "ND",
    "Ohio": "OH",
    "Oklahoma": "OK",
    "Oregon": "OR",
    "Pennsylvania": "PA",
    "Rhode Island": "RI",
    "South Carolina": "SC",
    "South Dakota": "SD",
    "Tennessee": "TN",
    "Texas": "TX",
    "Utah": "UT",
    "Vermont": "VT",
    "Virginia": "VA",
    "Washington": "WA",
    "West Virginia": "WV",
    "Wisconsin": "WI",
    "Wyoming": "WY",
    "District of Columbia": "DC",
    "American Samoa": "AS",
    "Guam": "GU",
    "Northern Mariana Islands": "MP",
    "Puerto Rico": "PR",
    "United States Minor Outlying Islands": "UM",
    "U.S. Virgin Islands": "VI",
}

In [75]:
covid_cases_df["state_code"] = covid_cases_df["state"].map(us_state_to_abbrev)
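
`Series.map` returns `NaN` for any name that is missing from the dictionary, so a quick check for unmapped names can catch spelling differences between the data and the mapping. A minimal sketch on toy data follows; the shortened mapping and the third state name are hypothetical and only illustrate the check.

```python
import pandas as pd

# Hypothetical subset of the mapping and of the state names in the data
abbrev = {"Washington": "WA", "Illinois": "IL"}
df = pd.DataFrame({"state": ["Washington", "Illinois", "Virgin Islands"]})

# Names absent from the dictionary become NaN after map()
df["state_code"] = df["state"].map(abbrev)

# List every state name that did not receive a code
unmapped = sorted(df.loc[df["state_code"].isna(), "state"].unique())
print(unmapped)
```

Rows without a code cannot be placed on the map, so they are candidates for either a corrected dictionary entry or removal before plotting.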

In [76]:
covid_cases_df.head()

Unnamed: 0,date,state,fips,cases,deaths,week,7ma,state_code
0,2020-01-21,Washington,53,1,0,2020-03,,WA
1,2020-01-22,Washington,53,1,0,2020-03,,WA
2,2020-01-23,Washington,53,1,0,2020-03,,WA
3,2020-01-24,Illinois,17,1,0,2020-03,,IL
4,2020-01-24,Washington,53,1,0,2020-03,,WA


In [77]:
# Downloading geojson for the states
from urllib.request import urlopen
import json
with urlopen('https://raw.githubusercontent.com/jgoodall/us-maps/master/geojson/state.geo.json') as response:
    states = json.load(response)

#### Plot 7-day moving average of cases on Plotly plot and animate by day

In [None]:
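# Drop rows without a complete 7-day window or a state code, then order frames chronologically
df = covid_cases_df.dropna(subset=['7ma', 'state_code']).sort_values(by=['date'])
df['day'] = df['date'].dt.strftime('%Y-%m-%d')  # string labels keep the slider readable
fig = px.choropleth(df, geojson=states, locations='state_code', color='7ma',
                        locationmode="USA-states",
                        color_continuous_scale=px.colors.sequential.OrRd,
                        range_color=(0, df['7ma'].max()),  # same color scale in every frame
                        hover_name='state',
                        hover_data={'state_code': False, '7ma': ':.1f'},
                        title = "7-day moving average of COVID-19 cases by US state",
                        scope="usa",
                        animation_frame="day",  # one frame per day, oldest on the left
                    )
fig["layout"].pop("updatemenus")  # drop the play/pause buttons, keep the slider
fig.show()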

In [1]:
# Calculate cumulative cases per 100,000 people

In [None]:
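# Cumulative cases per 100,000 people on the most recent date (not animated).
# ASSUMPTION: population_df has 'State' and 'Population' columns; these names are
# placeholders, so adjust them to the actual columns of the population file.
latest = covid_cases_df.loc[covid_cases_df['date'] == covid_cases_df['date'].max()]
us100k = latest.merge(population_df, left_on='state', right_on='State', how='inner')
us100k['cases_per_100k'] = us100k['cases'] / us100k['Population'] * 100_000
fig = px.choropleth(us100k, locations='state_code', locationmode="USA-states",
                           color='cases_per_100k',
                           color_continuous_scale=px.colors.sequential.OrRd,
                           hover_name='state',
                           hover_data={'state_code': False, 'cases_per_100k': ':.0f'},
                           title = "Cumulative COVID-19 cases per 100,000 people by US state",
                           scope="usa",
                          )
fig.show()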