# Is the Indiana COVID-19 Positive Test Count Exponential?

## Package Dependencies

In addition to the `math` module from the Python standard library, we use pandas and `statsmodels`.

In [1]:
import math

In [2]:
import pandas as pd

In [3]:
import statsmodels.api as sm

## Import data

The Excel spreadsheet was downloaded on November 11, 2020 at 3:12 pm EST.
    
https://hub.mph.in.gov/dataset/covid-19-case-data/resource/46b310b9-2f29-4a51-90dc-3886d9cf4ac1?view_id=6f6a3bc5-7901-4f5e-89a4-b60b9f6160be

We use `openpyxl` instead of the default `xlrd` because of a compatibility bug with `ElementTree` in Python 3.9.

In [4]:
indiana_covid = pd.read_excel('covid_report.xlsx', engine='openpyxl')

## Preprocess Data

Data are broken out by county, gender, and age range.  Roll them back up by date to get statewide totals.

In [5]:
daily_count = indiana_covid[['DATE', 'COVID_COUNT']].groupby('DATE').sum()

Compute a seven-day rolling average over the full data set, then keep only the last 30 days, about a month's worth of data.  To test for exponential growth, we use the old tried-and-true method of taking the log of both sides: if the counts follow an exponential such as $y = a e^{bx}$, their logarithm is a straight line in $x$.

In [6]:
roll_avg_log = daily_count.rolling(min_periods=0, window=7).mean()[-30:]['COVID_COUNT'].apply(math.log)
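To illustrate what the rolling step does, here is a minimal pure-Python sketch of a trailing seven-day mean that, like `min_periods=0`, averages over however many values are available at the start of the series. The function name `trailing_mean` and the sample counts are made up for this example and are not part of the notebook.

```python
def trailing_mean(values, window=7):
    """Trailing mean; early entries average over the values seen so far."""
    out = []
    for i in range(len(values)):
        # Take up to `window` values ending at position i.
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

counts = [10, 20, 30, 40, 50, 60, 70, 80]  # made-up daily totals
averages = trailing_mean(counts)
print(averages)
```

Only the first six entries use a shortened window. Since the notebook keeps just the last 30 rows, those partial windows should not reach the fitted data as long as the full series is longer than 36 days.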

## Fitting the Model

Take as $x$ the number of days since the first day in the list, October 12, 2020.

In [7]:
X = list(range(len(roll_avg_log)))

We pull out the y-values from the Pandas series.

In [8]:
y = roll_avg_log.values

We do an ordinary least squares regression on the data and get a summary of the model.

In [9]:
model = sm.OLS(y, sm.add_constant(X)).fit()
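To spell out what this regression is fitting, the relationship can be written as follows. The symbols $a$ (scale), $b$ (daily growth rate), $\hat\beta_0$, and $\hat\beta_1$ are introduced here for illustration and do not appear in the notebook.

$$
y = a e^{bx} \quad\Longrightarrow\quad \ln y = \ln a + b x
$$

$$
\widehat{\ln y} = \hat\beta_0 + \hat\beta_1 x \quad\Longrightarrow\quad \hat y(x) = e^{\hat\beta_0 + \hat\beta_1 x}
$$

Under this reading, the fitted intercept $\hat\beta_0$ estimates $\ln a$ and the fitted slope $\hat\beta_1$ estimates $b$.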

In [10]:
model.summary()

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Dep. Variable: | y | R-squared: | 0.975 |
| Model: | OLS | Adj. R-squared: | 0.974 |
| Method: | Least Squares | F-statistic: | 1078.0 |
| Date: | Wed, 11 Nov 2020 | Prob (F-statistic): | 6.75e-24 |
| Time: | 22:16:33 | Log-Likelihood: | 46.139 |
| No. Observations: | 30 | AIC: | -88.28 |
| Df Residuals: | 28 | BIC: | -85.48 |
| Df Model: | 1 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 7.2758 | 0.019 | 379.642 | 0.000 | 7.237 | 7.315 |
| x1 | 0.0373 | 0.001 | 32.827 | 0.000 | 0.035 | 0.040 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| Omnibus: | 6.776 | Durbin-Watson: | 0.145 |
| Prob(Omnibus): | 0.034 | Jarque-Bera (JB): | 2.962 |
| Skew: | 0.486 | Prob(JB): | 0.227 |
| Kurtosis: | 1.806 | Cond. No. | 33.0 |
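One way to read the `x1` coefficient is as a daily growth rate. The sketch below converts it to a percentage increase per day and a doubling time, assuming the log-linear model holds. The variable names are chosen for this example, and the slope is the rounded value copied from the summary.

```python
import math

slope = 0.0373  # x1 coefficient, rounded, copied from the summary

# A slope b on the log scale means the fitted curve is multiplied by e^b each day.
daily_growth = math.exp(slope) - 1

# The fitted curve doubles when b * t = ln 2.
doubling_time = math.log(2) / slope

print(f"daily growth: {daily_growth:.1%}")
print(f"doubling time: {doubling_time:.1f} days")
```

With this slope, the fitted curve grows by roughly 3.8% per day and doubles about every 18.6 days.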


## Extrapolating with the Model

Now define a Python function that converts the fitted line back to exponential form, using the `const` and `x1` coefficients from the summary.

In [11]:
def mymodel(x):
    return math.exp(0.0373 * x + 7.2758)
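As an alternative to hard-coding the coefficients, a small factory function can build the model from any intercept and slope. This is a sketch only: `make_exp_model` is a hypothetical helper, not part of the notebook. In the notebook the two values could presumably be read from the fitted results (for example from `model.params`) instead of being typed in.

```python
import math

def make_exp_model(intercept, slope):
    """Return f(x) = exp(intercept + slope * x) for the given coefficients."""
    def predict(x):
        return math.exp(intercept + slope * x)
    return predict

# Rounded coefficients copied from the summary (an assumption of this sketch;
# the full-precision values would come from the fitted results object).
fitted = make_exp_model(7.2758, 0.0373)
print(fitted(30))
```

Building the function from parameters keeps the extrapolation in step with the fit if the notebook is rerun on newer data.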

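Evaluate the model every third day from day 30 through day 60, that is, up to a month beyond the fitted window.  Because the model was fit to the seven-day rolling average, these values are projected rolling averages rather than single-day counts.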

In [12]:
[(i, mymodel(i)) for i in range(30, 61, 3)]

[(30, 4424.002020874059),
 (33, 4947.80835029241),
 (36, 5533.6338807518405),
 (39, 6188.821748602075),
 (42, 6921.584524989604),
 (45, 7741.107157819993),
 (48, 8657.66210215311),
 (51, 9682.738082153954),
 (54, 10829.184098589138),
 (57, 12111.37048695723),
 (60, 13545.369045064877)]