<a href="https://colab.research.google.com/github/matthewberry/uiuc_com_dsp/blob/master/DSP_COVID_hospital_resources.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Simulating impact of COVID-19 on hospital resources in British Columbia

*This notebook is based on [work](https://github.com/nammnjoshii/Simulating-impact-of-COVID-19-restrictions-relaxation-on-hospital-resources-in-British-Columbia/blob/master/Sim%20Covid19.ipynb) published by Adnan Beg, Nammn Joshii, Teguh Samudra, and Pratibha Thakur.*


The authors of this notebook model many interrelated aspects of the pandemic. As you proceed, pay particular attention to the assumptions employed. Some are called out explicitly in the text, whereas others are evident only in reading the code. Consider the strength of the assumptions--which might be worth improving, and how? Also consider how the simulation could be adapted to examine other ways COVID-19 has stressed the healthcare system.

## Introduction 

The COVID-19 pandemic has stressed healthcare systems around the world in many ways. This notebook considers one stressor, the demand for ventilators.

British Columbia registered its first COVID-19 case on January 21, 2020. By April 11, 2020, when this notebook was created, the province's case count had grown exponentially to 1,445. During that same period, Italy's exponential case growth had overwhelmed hospitals, worsening patient outcomes.

## Objective

To predict the number of COVID-19 patients who will require critical care in British Columbia, Canada, in order to support resource planning for management of the pandemic. 

## Assumptions

- The simulation uses data from April 11, 2020.
- An infected person might or might not become critically ill. Pre-existing conditions increase the likelihood of critical illness.
- The limiting hospital resource will be ventilator-capable critical care beds.
- All critical patients will need a ventilator-capable critical care bed for 14 days.
- A COVID-19 case is closed when the infected person recovers or dies.
- A person who recovers from COVID-19 cannot be infected again.

## Approach

- We first examine the current situation in British Columbia with respect to confirmed coronavirus cases and hospital resources available. 
- We then simulate age and pre-existing health conditions to determine the criticality of the patients. 
- Finally, we assess the critical-patient demand on hospital resources under three scenarios. 

## Sources

[1] https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection.html

[2] https://resources-covid19canada.hub.arcgis.com/pages/demographics

[3] https://www.indexmundi.com/canada/age_structure.html

[4] https://www.worldometers.info/coronavirus/coronavirus-age-sex-demographics/

[5] https://ici.radio-canada.ca/info/2020/coronavirus-covid-19-pandemie-cas-carte-maladie-symptomes-propagation/

[6] https://www.theglobeandmail.com/canada/british-columbia/article-bc-should-have-enough-beds-and-ventilators-for-covid-19-patients/

# Import Packages

The cell below will import packages used throughout the simulation. You can ignore its output.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import random
from scipy.stats import truncnorm
from collections import Counter
import math
import seaborn as sns
import requests
import IPython.display as Disp

# Import Dataset

The cell below loads case data from a CSV file and drops portions of the table that won't be used in the simulation.

The `covid_bc.info()` command at the end will print summary information about the trimmed table.

In [None]:
raw = pd.read_csv("https://drive.google.com/uc?id=10cTQ0fNLWI4k0K8JG7QaTMa84YWYBHvj")
drop_col = ['pruid', 'prnameFR', 'numprob', 'percentrecover', 'ratetested']
covid_bc = raw[raw['prname'] == "British Columbia"].drop(drop_col, axis = 1)
covid_bc.info()

As of April 11, 2020, there were 1,445 confirmed COVID-19 cases in British Columbia. Of these, 963 were closed (905 recoveries and 58 deaths).

On that same day, British Columbia had 348 ventilator-capable critical care beds available for COVID-19 patients.

The next few cells will plot the progression of some of these quantities over the first weeks of the pandemic.

_Note: If you're viewing this notebook in dark mode and can't read labels on the plots, try switching to light mode. From the menu near the top-left corner of the screen, click on Tools and select Settings. In the Site section of the settings window, select "light" as the Theme and then press the save button._ 


In [None]:
plt.figure(figsize=(15,6))
plt.plot(covid_bc['date'], covid_bc['numconf'])
plt.title("Number of Confirmed Cases")
plt.xticks(rotation = 45)
plt.show()

In [None]:
plt.figure(figsize=(15,6))
plt.plot(covid_bc['date'], covid_bc['numdeaths'])
plt.title("Number of Deaths")
plt.xticks(rotation = 45)
plt.show()

In [None]:
plt.figure(figsize=(15,6))
plt.plot(covid_bc['date'], covid_bc['numtested'])
plt.title("Number of Tested Patients")
plt.xticks(rotation=45)
plt.show()

In [None]:
plt.figure(figsize=(15,6))
plt.plot(covid_bc['date'], covid_bc['numtoday'])
plt.title("Number of New Cases")
plt.xticks(rotation=45)
plt.show()

In [None]:
plt.figure(figsize=(15,6))
plt.plot(covid_bc['numconf'], covid_bc['numtoday'])
plt.title('Number of Total Cases vs Number of Daily Cases')
plt.xlabel("Total Cases")
plt.ylabel("Daily Cases")
plt.show()

The next cell extends the table with a new column named 'todaytest', which will contain the number of tests each day. It also calculates basic statistics of the new column, which will be used later to simulate testing. Finally, it prints the basic statistics.

In [None]:
covid_bc['todaytest'] = covid_bc['numtested'].diff()
DAILY_TEST_AVG = np.mean(covid_bc['todaytest'])
DAILY_TEST_STD = np.floor(np.std(covid_bc['todaytest']))
DAILY_TEST_MAX = np.max(covid_bc['todaytest'])
print("On average, there are {} tests administered every day".format(np.floor(DAILY_TEST_AVG)))
print("with a standard deviation of {}".format(DAILY_TEST_STD))
print("and a maximum of {} tests".format(DAILY_TEST_MAX))
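
To see what `diff()` does to a cumulative column, here is a small sketch with made-up totals (the numbers are illustrative and not taken from the dataset). The first row has no predecessor, so its difference is `NaN`, and pandas reductions such as `mean()` skip that value by default.

```python
import pandas as pd

# Hypothetical cumulative test totals for four consecutive days
cumulative = pd.Series([100, 250, 300, 600])

# diff() subtracts the previous row; the first row has nothing to subtract
daily = cumulative.diff()
print(daily.tolist())

# The mean is taken over the three valid daily values only
daily_mean = daily.mean()
print(daily_mean)
```

This is why the statistics above describe the daily counts from the second reported day onward.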

# Create Population Pyramid

COVID-19 is found in people of all ages, but critical cases are more likely among people who are older or have pre-existing conditions. Code in this section will build a model population with a range of ages and pre-existing conditions. 

First, using the age group information found in [2] as a reference, we will approximate the age distribution of British Columbia residents. The image below shows a population pyramid for Canada as a whole.

In [None]:
img_url = 'https://drive.google.com/uc?id=1UNR1-7-6RVuWcPo9Yky3-yUfhkjr5fQr'
Disp.Image(requests.get(img_url).content)

## Define Age Groups

Note we will not separate our simulated population by gender.

In [None]:
np.random.seed(60)
# np.random.uniform draws from [low, high), so [0, 5) floors to ages 0-4
# Age Group 0-4
AGE_GRP1_SIZE = 228784
group1_dist = list(np.floor(np.random.uniform(0, 5, size=AGE_GRP1_SIZE)))

# Age Group 5-9
AGE_GRP2_SIZE = 236493
group2_dist = list(np.floor(np.random.uniform(5, 10, size=AGE_GRP2_SIZE)))

# Age Group 10-14
AGE_GRP3_SIZE = 240928
group3_dist = list(np.floor(np.random.uniform(10, 15, size=AGE_GRP3_SIZE)))

# Age Group 15-19
AGE_GRP4_SIZE = 255862
group4_dist = list(np.floor(np.random.uniform(15, 20, size=AGE_GRP4_SIZE)))

# Age Group 20-24
AGE_GRP5_SIZE = 324323
group5_dist = list(np.floor(np.random.uniform(20, 25, size=AGE_GRP5_SIZE)))

# Age Group 25-29
AGE_GRP6_SIZE = 339551
group6_dist = list(np.floor(np.random.uniform(25, 30, size=AGE_GRP6_SIZE)))

# Age Group 30-34
AGE_GRP7_SIZE = 343054
group7_dist = list(np.floor(np.random.uniform(30, 35, size=AGE_GRP7_SIZE)))

# Age Group 35-39
AGE_GRP8_SIZE = 341241
group8_dist = list(np.floor(np.random.uniform(35, 40, size=AGE_GRP8_SIZE)))

# Age Group 40-44
AGE_GRP9_SIZE = 314422
group9_dist = list(np.floor(np.random.uniform(40, 45, size=AGE_GRP9_SIZE)))

# Age Group 45-49
AGE_GRP10_SIZE = 322176
group10_dist = list(np.floor(np.random.uniform(45, 50, size=AGE_GRP10_SIZE)))

# Age Group 50-54
AGE_GRP11_SIZE = 329941
group11_dist = list(np.floor(np.random.uniform(50, 55, size=AGE_GRP11_SIZE)))

# Age Group 55-59
AGE_GRP12_SIZE = 358588
group12_dist = list(np.floor(np.random.uniform(55, 60, size=AGE_GRP12_SIZE)))

# Age Group 60-64
AGE_GRP13_SIZE = 337108
group13_dist = list(np.floor(np.random.uniform(60, 65, size=AGE_GRP13_SIZE)))

# Age Group 65-69
AGE_GRP14_SIZE = 292760
group14_dist = list(np.floor(np.random.uniform(65, 70, size=AGE_GRP14_SIZE)))

# Age Group 70-74
AGE_GRP15_SIZE = 244029
group15_dist = list(np.floor(np.random.uniform(70, 75, size=AGE_GRP15_SIZE)))

# Age Group 75-79
AGE_GRP16_SIZE = 168673
group16_dist = list(np.floor(np.random.uniform(75, 80, size=AGE_GRP16_SIZE)))

# Age Group 80-84
AGE_GRP17_SIZE = 114197
group17_dist = list(np.floor(np.random.uniform(80, 85, size=AGE_GRP17_SIZE)))

# Age Group 85+ (ages 85-89 stand in for the open-ended group)
AGE_GRP18_SIZE = 120270
group18_dist = list(np.floor(np.random.uniform(85, 90, size=AGE_GRP18_SIZE)))

# Total Population
TOTAL_POPULATION =  AGE_GRP1_SIZE + AGE_GRP2_SIZE + AGE_GRP3_SIZE + AGE_GRP4_SIZE + AGE_GRP5_SIZE + AGE_GRP6_SIZE + AGE_GRP7_SIZE + AGE_GRP8_SIZE + AGE_GRP9_SIZE + AGE_GRP10_SIZE + AGE_GRP11_SIZE + AGE_GRP12_SIZE + AGE_GRP13_SIZE + AGE_GRP14_SIZE + AGE_GRP15_SIZE + AGE_GRP16_SIZE + AGE_GRP17_SIZE + AGE_GRP18_SIZE 
TOTAL_POPULATION

In [None]:
age_combined_list = group1_dist + group2_dist + group3_dist + group4_dist + group5_dist + group6_dist + group7_dist + group8_dist + group9_dist + group10_dist + group11_dist + group12_dist + group13_dist + group14_dist + group15_dist + group16_dist + group17_dist + group18_dist
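
The near-identical blocks above could also be driven by a table of group sizes. The following is a minimal sketch of that idea, not the notebook's method: it assumes integer ages drawn with `np.random.randint` instead of flooring uniform draws, and the names `group_sizes` and `ages` as well as the tiny group sizes are hypothetical.

```python
import numpy as np

np.random.seed(60)

# Hypothetical, tiny group sizes keyed by each group's lowest age
group_sizes = {0: 3, 5: 4, 10: 2}

ages = []
for lower, size in group_sizes.items():
    # randint excludes the upper bound, so this draws lower..lower+4 inclusive
    ages.extend(np.random.randint(lower, lower + 5, size=size).tolist())

print(len(ages), min(ages), max(ages))
```

Keeping the sizes in one dictionary also makes the total population a single `sum(group_sizes.values())`.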

In [None]:
plt.hist(age_combined_list, bins=list(range(0, 105, 5)))
plt.xlabel("Age")
plt.ylabel("Population")
plt.title("British Columbia Population by Age")
plt.show()

# Simulate Pre-Existing Conditions

Using the information found in [4], we will simulate a binomial draw for pre-existing conditions. 

Each condition will be represented by a boolean array (using values 0 and 1) that will be merged with our age data.
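
As a quick illustration of the draw described above, the sketch below runs one Bernoulli trial per person for a single hypothetical condition. The prevalence and population size are made up for the example. The share of ones should land close to the prevalence.

```python
import numpy as np

np.random.seed(60)

# Hypothetical prevalence and population size, for illustration only
prevalence = 0.1
population = 100_000

# n=1 means one trial per person: 1 = has the condition, 0 = does not
has_condition = np.random.binomial(1, prevalence, size=population)

# Fraction of the simulated population with the condition
observed_share = has_condition.mean()
print(observed_share)
```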

## Probabilities

Note that the probabilities used in our model are not dependent on age.

In [None]:
CANCER_PROB = 0.0046
HYPER_PROB = 0.3
DIA_PROB = 0.1
CARDIO_PROB = 0.069
RESPIR_PROB = 0.1
np.random.seed(60)

In [None]:
have_cancer = list(np.random.binomial(1, CANCER_PROB, TOTAL_POPULATION))
Counter(have_cancer)

In [None]:
have_hypertension = list(np.random.binomial(1, HYPER_PROB, TOTAL_POPULATION))
Counter(have_hypertension)

In [None]:
have_diabetes = list(np.random.binomial(1, DIA_PROB, TOTAL_POPULATION))
Counter(have_diabetes)

In [None]:
have_cardio = list(np.random.binomial(1, CARDIO_PROB, TOTAL_POPULATION))
Counter(have_cardio)

In [None]:
have_respir = list(np.random.binomial(1, RESPIR_PROB, TOTAL_POPULATION))
Counter(have_respir)

## Combined Simulated Data

The cell below will create a single data frame with one row for each person and one column for each of the risk factors.

In [None]:
df = pd.DataFrame({
    'age': age_combined_list,
    'have_cancer': have_cancer,
    'have_cardio': have_cardio,
    'have_diabetes': have_diabetes,
    'have_hypertension': have_hypertension,
    'have_respir': have_respir
})

After merging the ages with the pre-existing conditions, the first few rows of the data frame look like this:

In [None]:
df.head()

# Mortality Approximation

We begin by plotting the probability of death as a function of age, based on [4].

In [None]:
# Create a list containing integers from 0 to 99 to represent age of patients
age_range = list(np.arange(100))
# Create a list that represents mortality rate for different age groups
# Source: [4]
age_death_rate = np.repeat(np.array([0, 0.2, 0.4, 1.3, 3.6, 8.0, 14.8]),
                           [10, 30, 10, 10, 10, 10, 20])
plt.plot(age_range,age_death_rate)
plt.xlabel("Age")
plt.ylabel("% Chance of Death")
plt.title("Probability of Death by Age")
plt.show()

In our COVID-19 simulation, we will define a critical patient as someone who has a high probability of dying as a function of their age and number of pre-existing conditions. 

The figure above only shows this probability as a function of age. We will next define a function that does two things:

1.   Approximates the graph above for the case of zero pre-existing conditions.
2.   Increases the probability of death for each additional pre-existing condition present.

## Approximation with an exponential function

The cell below defines a function that returns a probability of dying based on age and number of pre-existing conditions.

In [None]:
import math
def get_death_prob(age, num_preexisting_conditions):
    """
    Calculates the probability of dying based on an individual's age and number
    of pre-existing conditions.
    
    Arguments:
        age (float): The individual's age.
        num_preexisting_conditions (int): The number of pre-existing conditions
            the individual has.
    
    Returns:
        float: The probability of dying, expressed as a percentage (the
            same scale as the plots in this section).
    """
    return (2.5*(1+num_preexisting_conditions))*math.exp(0.02*age-0.12) - 2.183
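
To get a feel for the formula, the sketch below evaluates the same expression at a few ages and condition counts. The helper name `death_percent` is ours, introduced only so the example runs on its own. The result is on the percent scale, which is why the simulation later divides it by 100 before using it as a probability.

```python
import math

def death_percent(age, n_conditions):
    # Same expression as get_death_prob; redefined here so the sketch is standalone
    return 2.5 * (1 + n_conditions) * math.exp(0.02 * age - 0.12) - 2.183

for age in (0, 40, 80):
    # Percent chance of death for 0, 1, and 2 pre-existing conditions
    row = [round(death_percent(age, n), 2) for n in (0, 1, 2)]
    print(age, row)
```

Each additional condition adds one more copy of the exponential term, so the gap between the curves widens with age.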

## Probability of Death as a Function of Age Only

The cell below will plot the death probability from [4] along with the death probability from our function for the case of zero pre-existing conditions.

In [None]:
death_rate_model = [get_death_prob(age, 0) for age in age_range]
plt.plot(age_range, age_death_rate)
plt.plot(age_range, death_rate_model)
plt.ylabel("% Chance of Death")
plt.title("Probability of Death by Age")
plt.legend(('Actual Percentage', 'Approximation'))
plt.show()

## Probability of Death Including Number of Pre-Existing Conditions

The cell below adds curves for one, two, three, four, and five pre-existing conditions.

In [None]:
death_rate_model = [get_death_prob(age, 0) for age in age_range]
death_rate_model1 = [get_death_prob(age, 1) for age in age_range]
death_rate_model2 = [get_death_prob(age, 2) for age in age_range]
death_rate_model3 = [get_death_prob(age, 3) for age in age_range]
death_rate_model4 = [get_death_prob(age, 4) for age in age_range]
death_rate_model5 = [get_death_prob(age, 5) for age in age_range]
plt.plot(age_range, age_death_rate)
plt.plot(age_range, death_rate_model)
plt.plot(age_range, death_rate_model1)
plt.plot(age_range, death_rate_model2)
plt.plot(age_range, death_rate_model3)
plt.plot(age_range, death_rate_model4)
plt.plot(age_range, death_rate_model5)
plt.xlabel("Age")
plt.ylabel("% Chance of Death")
plt.legend(('Actual Percentage', 
            'Pre-Existing Cond = 0', 'Pre-Existing Cond = 1', 
            'Pre-Existing Cond = 2', 'Pre-Existing Cond = 3', 
            'Pre-Existing Cond = 4', 'Pre-Existing Cond = 5'))
plt.title("Probability of Death by Age")
plt.show()

As the number of pre-existing conditions increases, so does the probability that the patient will die.

# Data Sampling

Before running a simulation on the full dataset, which will take some time, let's work with a reduced set containing 1000 rows drawn at random from the simulated data frame.

In [None]:
df_sampled = df.sample(n=1000, random_state=60)
df_sampled.head()

The function we defined in the previous section assumes that the probability of death depends not on the specific pre-existing condition or conditions that an individual has, but rather only on the **number** of pre-existing conditions. In this step, we will sum the number of pre-existing conditions for each individual.

In [None]:
# sum the number of pre-existing conditions for each person
df_sampled['total_preCond'] = df_sampled['have_cancer'] + df_sampled[
    'have_cardio'] + df_sampled['have_diabetes'] + df_sampled[
        'have_hypertension'] + df_sampled['have_respir']
sns.catplot(x='total_preCond', kind='count', data=df_sampled)
plt.xlabel('Number of Pre-Existing Conditions')
plt.ylabel('Number of People')
plt.show()

Keep in mind these counts are for our reduced set of 1000 rows.

The next cell defines a function that simulates critical patients in a population, based on each person's age and number of pre-existing conditions.

In [None]:

def calculate_critical_patients(df, age_col, preexist_cond_col):
    """
    Given a data frame that contains age and number of pre-existing conditions,
    returns a data frame that adds a new column named 'is_critical' that has value
    0 for individuals who will not die in the simulation and 1 for individuals
    at risk of dying in the simulation.

    Arguments:
        df (pandas.DataFrame): A data frame containing the population data.
        age_col (str): The name of the column in `df` containing ages.
        preexist_cond_col (str): The name of the column in `df` containing the
            number of pre-existing conditions.
    """
    
    return_val = df.copy()
    return_val['is_critical'] = return_val.apply(lambda row: \
        np.random.binomial(1, get_death_prob(row[age_col], row[preexist_cond_col]) / 100, size=1)[0], \
        axis=1)

    return return_val

The next cell uses the newly defined function to simulate critical patients among our sample of the population.

In [None]:
df_sampled_critical = calculate_critical_patients(df_sampled, 'age', 'total_preCond')

## Sampled Data Result

The next cell will plot our critical and non-critical patients by age.

In [None]:
sns.swarmplot(x='is_critical', y='age', data=df_sampled_critical)
plt.title('1000 Samples')
plt.show()

In the swarm plot above, the dots represent single patients in the sampled group. Blue dots represent people who were simulated to recover without ICU care if positive for COVID-19, while yellow dots represent people who were simulated to need ICU care if positive for COVID-19. 

The people in yellow constitute our critical group.

We now need to extend this analysis to the entire population of British Columbia.

# Analysis on the Entire Population

Working with the entire population now, we will sum the pre-existing conditions for each person and plot the data.

In [None]:
# create a new column containing the number of pre-existing conditions
df['total_preCond'] = df['have_cancer'] + df['have_cardio'] + df[
    'have_diabetes'] + df['have_hypertension'] + df['have_respir']

sns.catplot(x='total_preCond', kind='count', data=df)
plt.xlabel('Number of Pre-Existing Conditions')
plt.ylabel('Number of People (Millions)')
plt.show()

The exact counts from the histogram above are as follows:

In [None]:
Counter(df['total_preCond'])

The next cell models the critical patients within the entire population. It will take about 3 minutes to complete.

In [None]:
np.random.seed(60)
df_critical = calculate_critical_patients(df, 'age', 'total_preCond')
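
Most of the run time comes from calling the probability function once per row. The sketch below shows a vectorized variant that evaluates the same formula on whole columns and draws all outcomes in one call. The function name `calculate_critical_vectorized` and the three-row demo frame are hypothetical, and because the random numbers may be consumed differently, individual outcomes need not match the row-by-row version.

```python
import numpy as np
import pandas as pd

def calculate_critical_vectorized(df, age_col, preexist_cond_col):
    """Hypothetical vectorized variant of calculate_critical_patients."""
    result = df.copy()
    # Same formula as get_death_prob, applied to entire columns at once
    percent = (2.5 * (1 + result[preexist_cond_col])
               * np.exp(0.02 * result[age_col] - 0.12) - 2.183)
    # Convert percent to probability; clip guards against values outside [0, 1]
    prob = (percent / 100).clip(0, 1).to_numpy()
    # One Bernoulli draw per row
    result['is_critical'] = np.random.binomial(1, prob)
    return result

np.random.seed(60)
demo = pd.DataFrame({'age': [5.0, 45.0, 88.0], 'total_preCond': [0, 1, 3]})
demo_critical = calculate_critical_vectorized(demo, 'age', 'total_preCond')
print(demo_critical)
```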

## Critical Patients

Let's plot the number of non-critical and critical patients in our simulation of the entire population.

In [None]:
sns.catplot(x='is_critical', kind='count', data=df_critical)
plt.title("Non-Critical and Critical Patients")
plt.show()

Here are the precise counts:

In [None]:
Counter(df_critical['is_critical'])

The next five cells define functions that will be used in the subsequent modeling. Read the comments and the code as you run each of them.

In [None]:
def simulate_counts(mean, stdev, maximum, size):
    """This function simulates a truncated distribution that has a range of [minimum, maximum],
    with mean and std. dev"""

    minimum = 0
    # truncnorm requires shape parameters a and b
    param_a = (minimum - mean) / stdev
    param_b = (maximum - mean) / stdev

    variates = truncnorm.rvs(param_a, param_b, loc=mean, scale=stdev, size=size)

    return [int(x) for x in variates]
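
The shape parameters are the part of `truncnorm` that most often causes confusion: the bounds are given in units of standard deviations from the mean, not as raw values. The sketch below uses hypothetical parameters (not the statistics computed from the dataset) to show the conversion and to check that the samples stay inside the intended range.

```python
from scipy.stats import truncnorm

# Hypothetical parameters for illustration
mean, stdev, minimum, maximum = 100.0, 40.0, 0.0, 150.0

# Bounds expressed as distances from the mean in standard deviations
a = (minimum - mean) / stdev   # -2.5
b = (maximum - mean) / stdev   # 1.25

samples = truncnorm.rvs(a, b, loc=mean, scale=stdev, size=1000, random_state=0)
# int() truncates toward zero, as simulate_counts does
counts = [int(x) for x in samples]
print(min(counts), max(counts))
```

Because the truncation here is not symmetric about the mean, the average of the samples will generally differ somewhat from `mean`.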

In [None]:
def simulate_testing(df, period, infected_fraction_initial, infected_fraction_final,
                     daily_test_avg, daily_test_std, daily_test_max):
    """This function simulates testing over a period of days.

    Arguments:
        df (pandas.DataFrame): A data frame as returned by calculate_critical_patients.
        period (int): The number of days to simulate.
        infected_fraction_initial (float): The fraction of the population that is
            infected at the beginning of the simulation.
        infected_fraction_final (float): The fraction of the population that is
            infected at the end of the simulation. The fraction will increase linearly,
            day by day.
        daily_test_avg (float): The mean of the number of tests conducted per day.
        daily_test_std (float): The standard deviation of the number of tests
            conducted per day.
        daily_test_max (float): The max of the number of tests conducted per day.

    Returns:
        pandas.DataFrame: A data frame copied from the input `df` with additional
            columns 'testing_day' and 'is_positive'.
    """
    df_output = df.copy()
    df_output['testing_day'] = None
    df_output['is_positive'] = None
    # set column types
    df_output.testing_day = df_output.testing_day.astype('Int64')
    df_output.is_positive = df_output.is_positive.astype('Int64')

    infected_fraction = infected_fraction_initial
    infected_fraction_step = (infected_fraction_final - infected_fraction_initial) / (1.0 * period)
    
    test_counts = simulate_counts(daily_test_avg, daily_test_std, daily_test_max, period)
    for testing_day in range(period):
        test_count = test_counts[testing_day]
        
        # in this simulation, a person will be tested only once during `period`
        # for this day, select a random subset of the untested people to
        # be tested now
        df_untested = df_output[df_output['testing_day'].isna()]
        patient_ids_today = df_untested.sample(test_count).index.values

        # simulate the test results
        test_results = np.random.binomial(1, infected_fraction, size=test_count)

        # record the test results
        df_output.loc[patient_ids_today, 'testing_day'] = testing_day
        df_output.loc[patient_ids_today, 'is_positive'] = test_results
    
        # advance the infected fraction for the next day
        infected_fraction += infected_fraction_step
    
    return df_output
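
The linear ramp of the infected fraction can be written out on its own. The sketch below uses the values that appear later in Scenario 1 purely as an example; the list name `fractions` is ours. Since the fraction is advanced after each day's tests, the last simulated day uses a value one step short of the final fraction.

```python
period = 45
initial, final = 0.03, 0.06   # example values only

step = (final - initial) / period
# Fraction applied on each simulated day; day 0 uses the initial value
fractions = [initial + day * step for day in range(period)]

print(round(fractions[0], 4), round(fractions[-1], 4))
```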

In [None]:
def print_summary(df_simulated):
    """
    Given a data frame as returned by simulate_testing, prints a summary of
    test results and critical cases.
    """
    # testing_day is zero-based, so the number of simulated days is max() + 1
    print("Over the next {} days, there will be a total of {} tests administered.".format(\
        df_simulated['testing_day'].max() + 1, df_simulated['testing_day'].count()))

    df_positive = df_simulated[df_simulated['is_positive'] == 1]
    df_critical = df_positive[df_positive['is_critical'] == 1]

    print("Of the total administered tests, {} ({}%) will be positive.".format(\
        df_positive.shape[0],\
        round(df_positive.shape[0] / df_simulated['testing_day'].count() * 100, 3)))

    print("Of the total positive tests, {} ({}%) will be critical.".format(\
        df_critical.shape[0],\
        round(df_critical.shape[0] / df_positive.shape[0] *100, 3)))

In [None]:
def calculate_daily_aggregations(df_simulated):
    """
    Given a data frame as returned by simulate_testing, creates and returns a new
    data frame with daily aggregations.
    """
    df_positive = df_simulated[df_simulated['is_positive'] == 1]
    df_aggs = df_positive.groupby(['testing_day'])[['is_positive', 'is_critical']].sum()
    # at this point, there will be no nans in df_aggs, and we'll need to
    # convert from Int64 to int64 for cumsum()
    df_aggs.is_positive = df_aggs.is_positive.astype('int64')
    df_aggs.is_critical = df_aggs.is_critical.astype('int64')
    # rename the aggregation columns
    df_aggs.rename(columns={'is_positive': 'today_positive', 'is_critical': 'today_critical'}, inplace=True)
    df_aggs['cumulative_positive'] = df_aggs['today_positive'].cumsum()
    df_aggs['cumulative_critical'] = df_aggs['today_critical'].cumsum()
    return df_aggs
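
The aggregation is easier to follow on a handful of rows. The sketch below applies the same group, sum, and cumulative-sum steps to a hypothetical table of six positive tests over three days.

```python
import pandas as pd

# Hypothetical positive-test records: one row per person
positives = pd.DataFrame({
    'testing_day': [0, 0, 1, 2, 2, 2],
    'is_positive': [1, 1, 1, 1, 1, 1],
    'is_critical': [0, 1, 0, 1, 1, 0],
})

# Sum the 0/1 flags within each day to get daily counts
daily = positives.groupby('testing_day')[['is_positive', 'is_critical']].sum()
daily = daily.rename(columns={'is_positive': 'today_positive',
                              'is_critical': 'today_critical'})
# Running totals across days
daily['cumulative_positive'] = daily['today_positive'].cumsum()
daily['cumulative_critical'] = daily['today_critical'].cumsum()
print(daily)
```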

In [None]:
def simulate_case_closures_and_icu_burden(df_daily, death_avg, death_std, death_max,
                                          recov_avg, recov_std, recov_max,
                                          initial_icu_cases):
    """
    Given a data frame as returned by calculate_daily_aggregations, creates a copy
    of the data frame that also includes new columns with simulated daily deaths,
    recoveries, and ICU burden.

    Arguments:
        df_daily (pandas.DataFrame): A data frame as returned by
            calculate_daily_aggregations.
        death_avg (float): The mean of the number of deaths per day.
        death_std (float): The standard deviation of the number of deaths per day.
        death_max (float): The max of the number of deaths per day.
        recov_avg (float): The mean of the number of recoveries per day.
        recov_std (float): The standard deviation of the number of recoveries per day.
        recov_max (float): The max of the number of recoveries per day.
        initial_icu_cases (int): The number of ICU cases on day 0.

    Returns:
        pandas.DataFrame: A data frame copied from the input `df_daily` with additional
            columns 'today_recovered', 'today_dead', 'today_closed',
            'cumulative_closed', and 'today_icu'.
    """
    df_output = df_daily.copy()
    # use the function's own arguments rather than the module-level constants
    num_days = df_daily.shape[0]
    df_output['today_recovered'] = simulate_counts(recov_avg, recov_std, recov_max, num_days)
    df_output['today_dead'] = simulate_counts(death_avg, death_std, death_max, num_days)

    df_output['today_closed'] = df_output['today_recovered'] + df_output['today_dead']
    df_output['cumulative_closed'] = df_output['today_closed'].cumsum()
    df_output['today_icu'] = initial_icu_cases + \
        df_output['cumulative_critical'] - df_output['cumulative_closed']

    return df_output
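
The ICU burden is a running balance: the starting ICU count, plus all critical cases so far, minus all closed cases so far. The sketch below works through three days with hypothetical daily counts; only the starting value of 63 matches the constant defined later in the notebook.

```python
from itertools import accumulate

initial_icu_cases = 63        # starting ICU count
today_critical = [4, 6, 5]    # hypothetical new critical cases per day
today_closed = [3, 5, 4]      # hypothetical recoveries plus deaths per day

cumulative_critical = list(accumulate(today_critical))  # [4, 10, 15]
cumulative_closed = list(accumulate(today_closed))      # [3, 8, 12]

# Daily balance: start + critical cases so far - closures so far
today_icu = [initial_icu_cases + c - d
             for c, d in zip(cumulative_critical, cumulative_closed)]
print(today_icu)  # [64, 65, 66]
```

Note that the closures are drawn independently of who is actually in the ICU, so nothing in this balance prevents it from falling below zero when closures outpace new critical cases.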

# Simulate scenarios

Now we will simulate three scenarios representing different levels of social distancing.

All simulations will share the constants defined in the next cell.

In [None]:
# constants used in all scenarios
DEATH_AVG = 3.29
DEATH_STD = 1.5
DEATH_MAX = 5
RECOV_AVG = 0.99
RECOV_STD = 0.62
RECOV_MAX = 2.37
INITIAL_ICU_CASES = 63

#### Scenario 1: Restrictions are relaxed but not completely removed

In this scenario, we assume 3% of the population is infected now and 6% of the population will be infected on day 45.

The next cell will simulate the results of testing over the 45-day period. It will then print a table showing positivity and criticality counts.

In [None]:
# Simulate the daily cases for Scenario 1

np.random.seed(123)

df_simulated1 = simulate_testing(df_critical, 45, 0.03, 0.06, DAILY_TEST_AVG,
                                 DAILY_TEST_STD, DAILY_TEST_MAX)

df_simulated1.groupby(['is_positive', 'is_critical']).size()

The next cell will print a summary of the scenario's results.

In [None]:
print_summary(df_simulated1)

The next cell will aggregate the number of cases and critical cases for each day.

In [None]:
df_daily1 = calculate_daily_aggregations(df_simulated1)

The next cell will simulate daily ICU cases, recoveries, and deaths.

In [None]:
np.random.seed(123)

df_daily1 = simulate_case_closures_and_icu_burden(\
    df_daily1, DEATH_AVG, DEATH_STD, DEATH_MAX, RECOV_AVG, RECOV_STD, RECOV_MAX,\
    INITIAL_ICU_CASES)
df_daily1

#### Scenario 2: Restrictions stay the same for the next 20 days

In this scenario, we assume 3% of the population is infected now and 0.5% of the population will be infected on day 45.

It will follow the same steps as scenario 1 but use different parameters.

In [None]:
# Simulate the daily cases for Scenario 2

np.random.seed(123)

df_simulated2 = simulate_testing(df_critical, 45, 0.03, 0.005, DAILY_TEST_AVG,
                                 DAILY_TEST_STD, DAILY_TEST_MAX)

df_simulated2.groupby(['is_positive', 'is_critical']).size()

In [None]:
print_summary(df_simulated2)

In [None]:
df_daily2 = calculate_daily_aggregations(df_simulated2)

In [None]:
np.random.seed(123)

df_daily2 = simulate_case_closures_and_icu_burden(\
    df_daily2, DEATH_AVG, DEATH_STD, DEATH_MAX, RECOV_AVG, RECOV_STD, RECOV_MAX,\
    INITIAL_ICU_CASES)
df_daily2

#### Scenario 3: Restrictions are eliminated

In this scenario, we assume 3% of the population is infected now and 20% of the population will be infected on day 45.

This scenario will also use the same steps as scenario 1 with new parameters.

In [None]:
# Simulate the daily cases for Scenario 3

np.random.seed(123)

df_simulated3 = simulate_testing(df_critical, 45, 0.03, 0.2, DAILY_TEST_AVG,
                                 DAILY_TEST_STD, DAILY_TEST_MAX)

df_simulated3.groupby(['is_positive', 'is_critical']).size()

In [None]:
print_summary(df_simulated3)

In [None]:
df_daily3 = calculate_daily_aggregations(df_simulated3)

In [None]:
np.random.seed(123)

df_daily3 = simulate_case_closures_and_icu_burden(\
    df_daily3, DEATH_AVG, DEATH_STD, DEATH_MAX, RECOV_AVG, RECOV_STD, RECOV_MAX,\
    INITIAL_ICU_CASES)
df_daily3

#### Comparison of scenarios

The next two cells generate plots comparing the three scenarios we have modeled.


In [None]:

plt.figure(figsize=(8,6))
plt.plot(df_daily1.index, df_daily1['cumulative_positive'], label="Scenario 1")
plt.plot(df_daily2.index, df_daily2['cumulative_positive'], label="Scenario 2")
plt.plot(df_daily3.index, df_daily3['cumulative_positive'], label="Scenario 3")
plt.xlabel("Days from Today")
plt.ylabel("Cumulative Positive Patients")
plt.title("Positive Patients")
plt.legend(loc="center left", bbox_to_anchor=(1,0.5))
plt.show()

In [None]:

plt.figure(figsize=(8,6))
plt.plot(df_daily1.index, df_daily1['today_icu'], label="Scenario 1")
plt.plot(df_daily2.index, df_daily2['today_icu'], label="Scenario 2")
plt.plot(df_daily3.index, df_daily3['today_icu'], label="Scenario 3")
# plot horizontal line representing available hospital resources
plt.plot(df_daily3.index, [348] * df_daily3.shape[0], 'k-', lw=1.5)
plt.annotate('Available Hospital Resources', xy=(1, 360), xycoords='data')
plt.title('Scenario Analysis: Critical Patients vs Hospital Resources Available')
plt.ylabel('Number of Critical Patients')
plt.xlabel("Days from Today")
plt.legend(loc="center left", bbox_to_anchor=(1,0.5))
plt.show()

In scenario 3, the number of critical patients exceeds the number of currently available ventilator-capable critical care beds. This modeling could guide decisions on public policy--for example, rules about social distancing--and on equipment acquisition and allocation.