# Cohort Analyses

## References

|Link|Notes|
|---|---|
|https://finance.yahoo.com/news/hell-coming-mathematical-proof-185019616.html|Basic spread models|
|https://www.insidermonkey.com/blog/donald-trump-has-3-3-percent-chance-of-dying-from-coronavirus-817781/|Basic risk model|

## Imports and Setup

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import geopandas as gpd
from ipypb import track

## Data

Data Sources (keep it simple and focus on building the models)
- https://covidtracking.com/api/
- New York Times

In [2]:
nytCountyData = "C:\\Users\\garfink2\\Documents\\data\\covid\\covid-19-data\\us-counties.csv"
shp_USCounties = "C:\\Users\\garfink2\\Documents\\data\\basemap\\tl_2017_us_county\\tl_2017_us_county.shp"
shp_USStates = "C:\\Users\\garfink2\\Documents\\data\\basemap\\tl_2017_us_state\\tl_2017_us_state.shp"

In [3]:
df_nytCountyData = pd.read_csv(nytCountyData)
gdf_USCounties = gpd.read_file(shp_USCounties)
gdf_USStates = gpd.read_file(shp_USStates)

In [12]:
df_nytCountyData.head()

Unnamed: 0,date,county,state,fips,cases,deaths
0,2020-01-21,Snohomish,Washington,53061.0,1,0
1,2020-01-22,Snohomish,Washington,53061.0,1,0
2,2020-01-23,Snohomish,Washington,53061.0,1,0
3,2020-01-24,Cook,Illinois,17031.0,1,0
4,2020-01-24,Snohomish,Washington,53061.0,1,0


In [10]:
pd.to_numeric(df_nytCountyData["fips"],errors="coerce")

0        53061.0
1        53061.0
2        53061.0
3        17031.0
4        53061.0
          ...   
15831    56029.0
15832    56033.0
15833    56037.0
15834    56039.0
15835    56043.0
Name: fips, Length: 15836, dtype: float64

In [7]:
df_nytCountyData["fips"] = df_nytCountyData["fips"].astype("int")

ValueError: Cannot convert non-finite values (NA or inf) to integer
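
The `ValueError` above is raised because a plain integer dtype cannot hold missing values, and the `fips` column evidently contains at least one NaN (or infinite) entry. One possible workaround, sketched here on a small made-up frame rather than the real NYT file, is pandas' nullable `Int64` dtype. The zero-padded `GEOID` string column is an assumption added for illustration, on the guess that the county FIPS codes will later be matched against the `GEOID` column of the shapefile.

```python
import pandas as pd

# Toy stand-in for df_nytCountyData; the values are illustrative only.
df = pd.DataFrame({"fips": [53061.0, 17031.0, None]})

# Nullable integer dtype: NaN becomes <NA> instead of raising ValueError.
df["fips"] = df["fips"].astype("Int64")

# Hypothetical join key: 5-character, zero-padded string, missing values kept as <NA>.
df["GEOID"] = df["fips"].astype("string").str.zfill(5)

print(df.dtypes)
print(df)
```

Rows with a missing code stay in the frame as `<NA>`, so whether to drop them or handle them separately remains an explicit decision.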

In [5]:
gdf_USCounties.head()

Unnamed: 0,STATEFP,COUNTYFP,COUNTYNS,GEOID,NAME,NAMELSAD,LSAD,CLASSFP,MTFCC,CSAFP,CBSAFP,METDIVFP,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON,geometry
0,31,39,835841,31039,Cuming,Cuming County,6,H1,G4020,,,,A,1477641638,10701538,41.9158651,-96.7885168,"POLYGON ((-97.01952 42.00410, -97.01952 42.004..."
1,53,69,1513275,53069,Wahkiakum,Wahkiakum County,6,H1,G4020,,,,A,680956787,61588406,46.2946377,-123.4244583,"POLYGON ((-123.43639 46.23820, -123.44759 46.2..."
2,35,11,933054,35011,De Baca,De Baca County,6,H1,G4020,,,,A,6016761648,29147345,34.3592729,-104.3686961,"POLYGON ((-104.56739 33.99757, -104.56772 33.9..."
3,31,109,835876,31109,Lancaster,Lancaster County,6,H1,G4020,339.0,30700.0,,A,2169252486,22867561,40.7835474,-96.6886584,"POLYGON ((-96.91060 40.95841, -96.91060 40.958..."
4,31,129,835886,31129,Nuckolls,Nuckolls County,6,H1,G4020,,,,A,1489645186,1718484,40.1764918,-98.0468422,"POLYGON ((-98.27367 40.08940, -98.27367 40.089..."


## Assumptions

In [None]:
# Assumed fatality rate (0.8%), from https://finance.yahoo.com/news/hell-coming-mathematical-proof-185019616.html
rate_fatality = 0.8/100
# Assumed time lags, in days
days_to_onset = 5
days_to_hospitalization = 10
days_to_outcome = 24
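
As a rough illustration of how a fatality rate and an outcome lag might be combined, the sketch below back-calculates implied infections from cumulative deaths. It assumes that `days_to_outcome` measures the time from infection to death, which the notebook does not state. The function name `back_calculate_infections` and the death counts are hypothetical.

```python
import pandas as pd

rate_fatality = 0.8 / 100   # assumed fatality rate from the cell above
days_to_outcome = 24        # assumed here to mean days from infection to death

def back_calculate_infections(deaths_cumulative, rate_fatality, days_to_outcome):
    """Implied cumulative infections, dated days_to_outcome before each death count."""
    implied = deaths_cumulative / rate_fatality
    implied.index = implied.index - pd.Timedelta(days=days_to_outcome)
    return implied

# Hypothetical cumulative death counts, for illustration only.
deaths = pd.Series(
    [0, 1, 4],
    index=pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"]),
)

infections = back_calculate_infections(deaths, rate_fatality, days_to_outcome)
print(infections)
```

This ignores reporting delays and any variation in the fatality rate, so it is a first-order estimate at best.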

## Analyses

Estimating cohort risk

- Assume $P_{\text{infection}}$ is homogeneous, both spatially and across population subsets
- Specify assumed rates for $P_{\text{hospitalization}_{\text{age}}}$ and $P_{\text{death}_{\text{age}}}$
- Specify assumed rates for $P_{\text{absence}_{\text{non-illness}}}$
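
The assumptions listed above could be turned into expected counts per age group along the following lines. This is a minimal sketch, not the notebook's model: the function name, the column names, and every number are hypothetical placeholders, and the hospitalization and death rates are treated as conditional on infection, which is an assumption.

```python
import pandas as pd

def estimate_cohort_risk(cohort, p_infection, p_absence_non_illness):
    """Expected counts per age group under a single, homogeneous infection probability."""
    out = cohort.copy()
    out["expected_infections"] = out["headcount"] * p_infection
    # Age-specific rates are assumed to be conditional on infection.
    out["expected_hospitalizations"] = out["expected_infections"] * out["p_hospitalization"]
    out["expected_deaths"] = out["expected_infections"] * out["p_death"]
    # Absences unrelated to illness are applied to the whole headcount.
    out["expected_non_illness_absences"] = out["headcount"] * p_absence_non_illness
    return out

# Hypothetical cohort; the rates below are placeholders, not estimates.
cohort = pd.DataFrame(
    {
        "age_group": ["18-44", "45-64", "65+"],
        "headcount": [100, 50, 10],
        "p_hospitalization": [0.02, 0.10, 0.30],
        "p_death": [0.001, 0.01, 0.10],
    }
)

risk = estimate_cohort_risk(cohort, p_infection=0.1, p_absence_non_illness=0.05)
print(risk)
```

Keeping the rates in a table keyed by age group makes it easy to swap in different assumed values without changing the calculation.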