# Examples for covidApp

This notebook walks through examples of the analytical tools that covidApp provides on the Python side.

### 1. Statistical Analysis - OLS operations

`Analytics.py` contains several predefined OLS operations. Each function takes the following parameters:

`categories, dateTo, region, rollingAverageInDays, lagInDays, dateFrom`.

The parameters are as follows:

`categories`: A list of the categories to analyze. Run `Analytics.categories()` to list the available categories; only the numerical ones are applicable here.

`dateTo`: The end date of the operation, in dateInt form (`YYYYMMDD`). For example, `20210111` is January 11, 2021.

`region`: The code of the region to analyze, e.g. `'NY'`.

`rollingAverageInDays`: The window, in days, of the rolling average applied in the analysis.

`lagInDays`: The lag, in days, between the two categories being compared.

`dateFrom`: Optional start date, also in dateInt form. Passing `None` defaults to the earliest COVID date on record.

`return`: A regressionResults object; regression statistics can be read from this object.
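
Since `dateTo` and `dateFrom` are given in dateInt form, it can help to convert between that form and regular Python dates. The sketch below is not part of covidApp; the helper names `date_int_to_date` and `date_to_date_int` are hypothetical and only use the standard library.

```python
from datetime import date, datetime


def date_int_to_date(date_int: int) -> date:
    """Parse a YYYYMMDD integer such as 20210111 into a datetime.date."""
    return datetime.strptime(str(date_int), "%Y%m%d").date()


def date_to_date_int(d: date) -> int:
    """Format a datetime.date back into a YYYYMMDD integer."""
    return int(d.strftime("%Y%m%d"))


# Hypothetical helpers, shown only to illustrate the dateInt convention.
print(date_int_to_date(20210111))            # 2021-01-11
print(date_to_date_int(date(2021, 1, 11)))   # 20210111
```

An invalid value such as `20211301` raises a `ValueError` in `date_int_to_date`, which is a cheap way to validate a dateInt before passing it on.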


In [1]:
import Analytics

results = Analytics.OLSBase(['death','positive'], 20210110, 'NY', 7, 3, dateFrom=None)

results.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | positive | R-squared: | 0.666 |
| Model: | OLS | Adj. R-squared: | 0.665 |
| Method: | Least Squares | F-statistic: | 604.7 |
| Date: | Mon, 11 Jan 2021 | Prob (F-statistic): | 3.62e-74 |
| Time: | 19:38:49 | Log-Likelihood: | -4004.3 |
| No. Observations: | 305 | AIC: | 8013.0 |
| Df Residuals: | 303 | BIC: | 8020.0 |
| Df Model: | 1 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -2.238e+04 | 1.97e+04 | -1.137 | 0.256 | -6.11e+04 | 1.64e+04 |
| death | 20.8416 | 0.848 | 24.591 | 0.000 | 19.174 | 22.509 |

| | | | |
|---|---|---|---|
| Omnibus: | 107.214 | Durbin-Watson: | 0.001 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 249.277 |
| Skew: | 1.757 | Prob(JB): | 7.41e-55 |
| Kurtosis: | 5.695 | Cond. No. | 65400.0 |


In [2]:
results = Analytics.OLSBase(['death','positive'], 20210110, 'NY', 3, 10, dateFrom=20201118)

results.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | positive | R-squared: | 0.985 |
| Model: | OLS | Adj. R-squared: | 0.985 |
| Method: | Least Squares | F-statistic: | 2614.0 |
| Date: | Mon, 11 Jan 2021 | Prob (F-statistic): | 2.34e-37 |
| Time: | 19:38:49 | Log-Likelihood: | -454.76 |
| No. Observations: | 41 | AIC: | 913.5 |
| Df Residuals: | 39 | BIC: | 917.0 |
| Df Model: | 1 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -2.519e+06 | 6.58e+04 | -38.257 | 0.000 | -2.65e+06 | -2.39e+06 |
| death | 121.7203 | 2.381 | 51.129 | 0.000 | 116.905 | 126.536 |

| | | | |
|---|---|---|---|
| Omnibus: | 2.79 | Durbin-Watson: | 0.034 |
| Prob(Omnibus): | 0.248 | Jarque-Bera (JB): | 2.267 |
| Skew: | -0.445 | Prob(JB): | 0.322 |
| Kurtosis: | 2.268 | Cond. No. | 716000.0 |
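
The two calls above differ in `rollingAverageInDays`, `lagInDays` and `dateFrom`. How `Analytics.py` applies the rolling average and the lag internally is not shown in this notebook, so the following is only a sketch of one plausible reading: smooth both series with a rolling mean, then shift one of them by the lag before regressing. The helper `smooth_and_lag` and the choice of which column to shift are assumptions.

```python
import pandas as pd


def smooth_and_lag(df, x_col, y_col, rolling_days, lag_days):
    """Hypothetical helper: rolling-mean both columns, then shift x_col by lag_days rows."""
    smoothed = df[[x_col, y_col]].rolling(window=rolling_days).mean()
    # Assumption: the explanatory series is the one that gets lagged.
    smoothed[x_col] = smoothed[x_col].shift(lag_days)
    # Rows without a full window or without a lagged partner are dropped.
    return smoothed.dropna()


# Toy data, for illustration only.
toy = pd.DataFrame({"death": range(10), "positive": range(0, 100, 10)})
prepared = smooth_and_lag(toy, "death", "positive", rolling_days=3, lag_days=2)
print(prepared)
```

Under this reading, a larger window or lag leaves fewer rows for the regression, since the first `rolling_days - 1 + lag_days` rows are dropped.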


### 2. Dataframe Operations

Using many of the same parameters, `Main.formatDataFrame(categories, dateTo, region, dateFrom=None)` returns a dataframe of the requested data.

In [6]:
import Main

df = Main.formatDataFrame(['death','positive'], 20210110, 'NY', dateFrom=None)

df

| | date | positive | death |
|---|---|---|---|
| 0 | 2020-03-02 | 0 | 0.0 |
| 1 | 2020-03-03 | 1 | 0.0 |
| 2 | 2020-03-04 | 1 | 0.0 |
| 3 | 2020-03-05 | 3 | 0.0 |
| 4 | 2020-03-06 | 25 | 0.0 |
| ... | ... | ... | ... |
| 309 | 2021-01-05 | 1041028 | 30802.0 |
| 310 | 2021-01-06 | 1057676 | 30965.0 |
| 311 | 2021-01-07 | 1075312 | 31164.0 |
| 312 | 2021-01-08 | 1094144 | 31329.0 |
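
The `positive` and `death` columns in this output are cumulative counts, so a common next step is to derive daily changes or to slice by date. The sketch below rebuilds the first five rows shown above by hand so that it runs without covidApp. The column name `positiveDaily` is hypothetical, and treating `date` as a datetime column is an assumption about what `Main.formatDataFrame` returns.

```python
import pandas as pd

# First five rows of the output above, re-entered by hand.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-03-02", "2020-03-03", "2020-03-04", "2020-03-05", "2020-03-06"]
    ),
    "positive": [0, 1, 1, 3, 25],
    "death": [0.0, 0.0, 0.0, 0.0, 0.0],
})

# Day-over-day change of the cumulative count; the first row has no predecessor (NaN).
df["positiveDaily"] = df["positive"].diff()

# Keep only the rows on or after a given date.
march_5_onward = df[df["date"] >= pd.Timestamp("2020-03-05")]

print(df)
print(march_5_onward)
```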


In [7]:
import Main

df = Main.formatDataFrame(['all'], 20210110, 'NY', dateFrom=None)

df

| | index | date | state | positive | probableCases | negative | pending | totalTestResultsSource | totalTestResults | hospitalizedCurrently | ... | posNeg | deathIncrease | hospitalizedIncrease | hash | commercialScore | negativeRegularScore | negativeScore | positiveScore | score | grade |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 315 | 2020-03-02 | NY | 0 | 0 | 0.0 | 0.0 | totalTestEncountersViral | 0 | 0.0 | ... | 0 | 0 | 0 | babe4eb188d2b1f7c4a7132f6aa7014dea998439 | 0 | 0 | 0 | 0 | 0 | |
| 1 | 314 | 2020-03-03 | NY | 1 | 0 | 0.0 | 0.0 | totalTestEncountersViral | 1 | 0.0 | ... | 1 | 0 | 0 | 81e8b47758fdbdaf680958f231ad1a9818ee2f73 | 0 | 0 | 0 | 0 | 0 | |
| 2 | 313 | 2020-03-04 | NY | 1 | 0 | 9.0 | 24.0 | totalTestEncountersViral | 10 | 0.0 | ... | 10 | 0 | 0 | 2b937a6a36868be82607e1b3f7bdd9ae80284233 | 0 | 0 | 0 | 0 | 0 | |
| 3 | 312 | 2020-03-05 | NY | 3 | 0 | 27.0 | 24.0 | totalTestEncountersViral | 30 | 0.0 | ... | 30 | 0 | 0 | bf95092ee12a921939937e34150c8cd3495b52d6 | 0 | 0 | 0 | 0 | 0 | |
| 4 | 311 | 2020-03-06 | NY | 25 | 0 | 97.0 | 236.0 | totalTestEncountersViral | 122 | 0.0 | ... | 122 | 0 | 0 | 80c201ed89285d6f7d806fe1f33bd4172578ebd0 | 0 | 0 | 0 | 0 | 0 | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 309 | 6 | 2021-01-05 | NY | 1041028 | 0 | 25094838.0 | 0.0 | totalTestEncountersViral | 26135866 | 8590.0 | ... | 26135866 | 154 | 0 | 0c31decd8559097d27b6da7cef458ef193f86e09 | 0 | 0 | 0 | 0 | 0 | |
| 310 | 5 | 2021-01-06 | NY | 1057676 | 0 | 25276006.0 | 0.0 | totalTestEncountersViral | 26333682 | 8665.0 | ... | 26333682 | 163 | 0 | aebaa06b8389cca3d87f0c4427a32ba031cc025b | 0 | 0 | 0 | 0 | 0 | |
| 311 | 4 | 2021-01-07 | NY | 1075312 | 0 | 25496920.0 | 0.0 | totalTestEncountersViral | 26572232 | 8548.0 | ... | 26572232 | 199 | 0 | eb2d08a08127cca443dbd723c6c59ded6283d4f6 | 0 | 0 | 0 | 0 | 0 | |
| 312 | 3 | 2021-01-08 | NY | 1094144 | 0 | 25721991.0 | 0.0 | totalTestEncountersViral | 26816135 | 8561.0 | ... | 26816135 | 165 | 0 | 01c3bfa6c6ca58fd783895a678684295a694cae8 | 0 | 0 | 0 | 0 | 0 | |
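
Only the numerical categories are applicable to the OLS functions, and the `'all'` frame mixes numeric columns with text columns such as `state`, `totalTestResultsSource` and `hash`. One way to pick out the numeric ones is pandas' `select_dtypes`. The sketch below uses a hand-built two-row excerpt of the output above rather than calling covidApp, and it assumes `date` is held as text; whether `Analytics.categories()` filters columns this way is not known.

```python
import pandas as pd

# Two-row excerpt of a few columns from the output above, re-entered by hand.
df = pd.DataFrame({
    "date": ["2020-03-05", "2020-03-06"],
    "state": ["NY", "NY"],
    "positive": [3, 25],
    "negative": [27.0, 97.0],
    "totalTestResultsSource": ["totalTestEncountersViral", "totalTestEncountersViral"],
})

# Keep only integer and float columns; text columns are left out.
numeric_columns = df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)  # ['positive', 'negative']
```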


Here's a list of all the columns in this df: