## Minnesota State COVID Response Analysis
This notebook looks for associations between Minnesota's state government response and the COVID-19 case count over the course of the pandemic.


## Data Cleanup
As with most data mining projects, we first clean the source file so the analysis can focus on the question at hand. The "all-states-history.csv" file contains U.S. COVID-19 case and death counts from the start of the pandemic through 11/29/20 and was sourced from [The Covid Tracking Project](https://covidtracking.com/data). We analyze three periods within this timeline:

- Early Breakout (Early March -> May)
- Summer (June -> August)
- Fall/Present (September -> Late November)

We split the data into three DataFrames, one per period.
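As a minimal sketch of a per-period split, the example below slices a date-indexed frame by month. The `toy` frame and its values are made up for illustration, and the start date is an assumption; only the 11/29/20 end date comes from the text above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Minnesota rows: one row per day.
# The start date is an assumption; the end date (11/29/20) is from the text.
dates = pd.date_range("2020-03-06", "2020-11-29", freq="D")
toy = pd.DataFrame({"positiveIncrease": np.arange(len(dates))}, index=dates)

# Partial-string slicing on a sorted DatetimeIndex includes both endpoint months.
early_breakout = toy.loc["2020-03":"2020-05"]
summer = toy.loc["2020-06":"2020-08"]
fall = toy.loc["2020-09":"2020-11"]

print(len(early_breakout), len(summer), len(fall))
```

Slicing by date label rather than by row position keeps the split correct regardless of how many rows the file contains, as long as the index is sorted by date.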

In [1]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import squarify
import seaborn as sns

In [2]:
data = pd.read_csv('all-states-history.csv')
data.head()

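|   | date | state | dataQualityGrade | death | deathConfirmed | deathIncrease | deathProbable | hospitalized | hospitalizedCumulative | hospitalizedCurrently | ... | totalTestResults | totalTestResultsIncrease | totalTestsAntibody | totalTestsAntigen | totalTestsPeopleAntibody | totalTestsPeopleAntigen | totalTestsPeopleViral | totalTestsPeopleViralIncrease | totalTestsViral | totalTestsViralIncrease |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-11-29 | AK | A | 121.0 | 121.0 | 0 | NaN | 722.0 | 722.0 | 159.0 | ... | 1006180.0 | 7126 | NaN | NaN | NaN | NaN | NaN | 0 | 1006180.0 | 7126 |
| 1 | 2020-11-29 | AL | A | 3577.0 | 3245.0 | 5 | 332.0 | 24670.0 | 24670.0 | 1609.0 | ... | 1579713.0 | 5811 | NaN | NaN | 71698.0 | NaN | NaN | 0 | 1579713.0 | 5811 |
| 2 | 2020-11-29 | AR | A+ | 2470.0 | 2265.0 | 21 | 205.0 | 8843.0 | 8843.0 | 1030.0 | ... | 1675828.0 | 10243 | NaN | 21856.0 | NaN | 135709.0 | NaN | 0 | 1675828.0 | 10243 |
| 3 | 2020-11-29 | AS | D | 0.0 | NaN | 0 | NaN | NaN | NaN | NaN | ... | 1988.0 | 0 | NaN | NaN | NaN | NaN | NaN | 0 | 1988.0 | 0 |
| 4 | 2020-11-29 | AZ | A+ | 6634.0 | 6148.0 | 10 | 486.0 | 25568.0 | 25568.0 | 2458.0 | ... | 2236325.0 | 18441 | 363824.0 | NaN | NaN | NaN | 2236325.0 | 18441 | NaN | 0 |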


Restrict the data to Minnesota rows and the columns relevant to this analysis, then sum the daily increases into weekly totals.

In [65]:
columns_to_show = ['date', 'state', 'death', 'deathConfirmed', 'deathIncrease',
                   'hospitalizedCurrently', 'hospitalizedIncrease', 'negative',
                   'negativeIncrease', 'positive', 'positiveIncrease',
                   'totalTestResults', 'totalTestResultsIncrease']

# Keep only Minnesota rows and the columns of interest (copy to avoid chained-assignment warnings)
clean_data = data.loc[data['state'] == 'MN', columns_to_show].copy()
# The file lists the most recent date first; reverse so rows run oldest to newest
clean_data = clean_data.iloc[::-1]
clean_data['date'] = pd.to_datetime(clean_data['date'])
clean_data = clean_data.set_index('date')

# Sum the daily increase columns into weekly totals (each week ends on Sunday)
columns_to_sum = clean_data[['deathIncrease', 'hospitalizedIncrease', 'negativeIncrease',
                             'positiveIncrease', 'totalTestResultsIncrease']]
weekly_data = columns_to_sum.resample('W', label='right', closed='right').sum()
weekly_data = weekly_data.reset_index()
weekly_data.head(39)

|   | date | deathIncrease | hospitalizedIncrease | negativeIncrease | positiveIncrease | totalTestResultsIncrease |
|---|---|---|---|---|---|---|
| 0 | 2020-03-08 | 0 | 0 | 12 | 1 | 13 |
| 1 | 2020-03-15 | 0 | 0 | 1339 | 126 | 1465 |
| 2 | 2020-03-22 | 1 | 12 | 3124 | 221 | 3345 |
| 3 | 2020-03-29 | 8 | 63 | 14055 | 376 | 14431 |
| 4 | 2020-04-05 | 20 | 127 | 9524 | 492 | 10016 |
| 5 | 2020-04-12 | 41 | 159 | 9940 | 647 | 10587 |
| 6 | 2020-04-19 | 64 | 213 | 8258 | 1050 | 9308 |
| 7 | 2020-04-26 | 138 | 255 | 12873 | 2598 | 15471 |
| 8 | 2020-05-03 | 147 | 370 | 24272 | 3899 | 28171 |
| 9 | 2020-05-10 | 159 | 458 | 27676 | 4141 | 31817 |
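The weekly totals above come from `resample('W', label='right', closed='right')`. As a rough illustration of how that call groups days, the sketch below applies it to a made-up frame with one case per day; the `daily` frame and its values are hypothetical, not taken from the dataset.

```python
import pandas as pd

# Ten consecutive days starting on a Friday, one synthetic case per day.
days = pd.date_range("2020-03-06", "2020-03-15", freq="D")
daily = pd.DataFrame({"positiveIncrease": [1] * len(days)}, index=days)

# 'W' bins end on Sunday; closed='right' puts each Sunday in the week it ends,
# and label='right' names each bin by that Sunday.
weekly = daily.resample("W", label="right", closed="right").sum()
print(weekly)
```

The first bin holds only the three days from Friday through Sunday, which is why the first week of a series is usually a partial one.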


In [4]:
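# Split clean_data into the three periods by row position.
# These offsets assume the rows are in the file's original order
# (most recent date first), so the earliest days sit at the bottom.
# Note: row 182 is not covered by any of the three slices.

early_breakout_data = clean_data[183:]

summer_data = clean_data[90:182]

fall_data = clean_data[0:90]

# Bin the daily positive-case increases into 4 equal-width intervals
bins = pd.cut(early_breakout_data['positiveIncrease'], 4)

print(bins)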


10273    (-0.973, 243.25]
10329     (243.25, 486.5]
10385     (486.5, 729.75]
10441     (486.5, 729.75]
10497     (729.75, 973.0]
               ...       
14782    (-0.973, 243.25]
14833    (-0.973, 243.25]
14884    (-0.973, 243.25]
14935    (-0.973, 243.25]
14981    (-0.973, 243.25]
Name: positiveIncrease, Length: 86, dtype: category
Categories (4, interval[float64]): [(-0.973, 243.25] < (243.25, 486.5] < (486.5, 729.75] < (729.75, 973.0]]
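The interval edges printed above follow from how `pd.cut` builds equal-width bins. The sketch below reproduces them on a small made-up series; only the minimum (0) and maximum (973) are inferred from the printed intervals, and the values in between are hypothetical.

```python
import pandas as pd

# Made-up daily counts spanning the same range as the printed intervals (0 to 973).
values = pd.Series([0, 100, 300, 500, 973])

# With an integer bin count, pd.cut splits [min, max] into equal-width bins
# and pushes the lowest edge slightly below the minimum so that it is included.
binned = pd.cut(values, 4)

edges = [interval.right for interval in binned.cat.categories]
counts = binned.value_counts(sort=False)
print(edges)
print(counts)
```

Each bin is (973 - 0) / 4 = 243.25 wide, and the first left edge of -0.973 is the minimum lowered by 0.1% of the range.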


## Analysis

Important MN Stats:

- Population (mn.gov estimate): 5,680,337
- Land Area (estimate): 79,610.08 sq. mi.
- Population Density: 71.35 people/sq. mi.
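The density figure follows directly from the other two numbers. A quick check, using only the values listed above (the variable names are illustrative):

```python
population = 5_680_337        # mn.gov estimate, from the list above
land_area_sq_mi = 79_610.08   # estimated land area, from the list above

# Population density = population / land area
density = population / land_area_sq_mi
print(round(density, 2))  # 71.35
```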




Early Breakout Period: