### Estimate of Covid-19 Hospitalizations and Hospital Capacity

#### Aaron McAdie

The folks at the New York Times just open-sourced their [county-level Covid-19 case dataset](https://github.com/nytimes/covid-19-data), and i

In [1]:
import requests
import re
from bs4 import BeautifulSoup
import pandas as pd
import altair as alt

#### Dataset 1 - NYTimes Cases, thanks NYT!

In [2]:
cases = pd.read_csv('https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv')

In [3]:
cases.head()

| | date | county | state | fips | cases | deaths |
|---|---|---|---|---|---|---|
| 0 | 2020-01-21 | Snohomish | Washington | 53061.0 | 1 | 0 |
| 1 | 2020-01-22 | Snohomish | Washington | 53061.0 | 1 | 0 |
| 2 | 2020-01-23 | Snohomish | Washington | 53061.0 | 1 | 0 |
| 3 | 2020-01-24 | Cook | Illinois | 17031.0 | 1 | 0 |
| 4 | 2020-01-24 | Snohomish | Washington | 53061.0 | 1 | 0 |


In [4]:
cases.dtypes

date       object
county     object
state      object
fips      float64
cases       int64
deaths      int64
dtype: object

In [5]:
cases = cases.assign(
    date = pd.to_datetime(cases['date']),
    fips = cases['fips'].astype(pd.Int32Dtype())
)
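
The `fips` column is read as `float64`, presumably because some rows have no FIPS code and a plain NumPy integer column cannot hold missing values. The sketch below uses a made-up two-row frame (not the NYT data) to show what the cast to the nullable `Int32` dtype does in that situation.

```python
import pandas as pd

# Toy frame with illustrative values only; the second row has no FIPS code
toy = pd.DataFrame({
    "date": ["2020-03-01", "2020-03-02"],
    "fips": [53061.0, None],
})

# Same conversions as the cell above, applied to the toy frame
toy = toy.assign(
    date=pd.to_datetime(toy["date"]),
    fips=toy["fips"].astype(pd.Int32Dtype()),
)

# fips is now an integer column that can still represent a missing value
print(toy.dtypes)
print(toy)
```

The missing code stays missing (shown as `<NA>`) rather than forcing the whole column back to floats.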

Case counts are cumulative; we want new cases each day to estimate the hospital case load.

In [6]:
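# Previous day's cumulative count within each county.
# This relies on rows being in date order within each county.
cases['cases_shifted'] = (
    cases.groupby(['county', 'state'])
    .cases
    .shift(1)
    .fillna(0)   # first reported day has no previous value
    .astype(int)
)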

In [7]:
cases['cases_new'] = cases['cases'] - cases['cases_shifted']
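
To see the shift-and-subtract logic on something small, here is a sketch with a made-up two-county frame (the county names and counts are illustrative, not from the NYT data). It also shows `groupby(...).diff()` as one possible shortcut that should give the same result, provided the first day of each county is filled with its own cumulative count.

```python
import pandas as pd

# Toy cumulative counts for two hypothetical counties
toy = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
    "county": ["A", "A", "A", "B", "B", "B"],
    "state": ["X"] * 6,
    "cases": [1, 4, 9, 2, 2, 5],
})

# Sort so that shift/diff compare each day with the previous day
toy = toy.sort_values(["state", "county", "date"])
grouped = toy.groupby(["county", "state"])["cases"]

# Same approach as the cells above: subtract the previous day's total
shifted = grouped.shift(1).fillna(0).astype(int)
toy["cases_new"] = toy["cases"] - shifted

# Alternative: diff(), filling each county's first day with its own count
toy["cases_new_diff"] = grouped.diff().fillna(toy["cases"]).astype(int)

print(toy[["county", "date", "cases", "cases_new", "cases_new_diff"]])
```

Both columns should agree row for row; the explicit shift keeps the intermediate `cases_shifted` column around for inspection, which is handy while checking the logic.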

Check logic by plotting King County data

In [28]:
king = cases.query('county == "King" & state == "Washington"')

In [46]:
base = alt.Chart(king[['date', 'cases', 'cases_new']]).encode(alt.X('monthdate(date):O', title = 'Date'))
bars = base.mark_bar(color = '#65799b').encode(y = 'cases_new', tooltip = 'cases_new')
line = base.mark_line(color = '#e23e57').encode(y = 'cases', tooltip = 'cases')

(bars + line).properties(title = 'King County Cumulative and Daily Covid-19 Cases').interactive()
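
The plot is a visual check; a numeric check can complement it. The sketch below is an assumption about how one might do that, not part of the original analysis: `daily_counts_consistent` and `negative_days` are hypothetical helpers, and the toy frame holds made-up values. The idea is that the running sum of `cases_new` within each county should reproduce `cases`, and that negative daily counts (which may appear if a cumulative total is revised downward) are worth flagging.

```python
import pandas as pd

def daily_counts_consistent(df):
    """Return True if the per-county running sum of cases_new equals cases."""
    rebuilt = df.groupby(["county", "state"])["cases_new"].cumsum()
    return bool((rebuilt == df["cases"]).all())

def negative_days(df):
    """Return the rows where the daily count is negative."""
    return df[df["cases_new"] < 0]

# Toy frame with illustrative values, shaped like the columns used above
toy = pd.DataFrame({
    "county": ["King", "King", "King"],
    "state": ["Washington"] * 3,
    "cases": [1, 4, 9],
    "cases_new": [1, 3, 5],
})

print(daily_counts_consistent(toy))
print(len(negative_days(toy)))
```

On the real data, this would be called as `daily_counts_consistent(cases)` after `cases_new` has been computed.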