## Quick demo of sim_chime_scenario_runner.py

Location: https://github.com/misken/c19/tree/master/mychime/scenario_runner

**sim_chime_scenario_runner.py** is a simple Python module for working with the penn_chime model
that: 

* assumes that you've pip installed `penn_chime` per https://github.com/CodeForPhilly/chime/pull/249 from a local clone of the chime repo
* allows running simulations from the command line (like **cli.py** in penn_chime)
* is importable, so simulations can also be run via a function call
* accepts a few command line (or passable) arguments, including:
  - the standard CHIME input config filename (required)
  - a scenario name (prepended to output filenames)
  - output path
* after a simulation scenario is run, a results dictionary is created that contains:
  - the scenario name
  - the standard admits, census, and sim_sir_w_date dataframes
  - the dispositions dataframe
  - a dictionary containing the input parameters
  - a dictionary containing important intermediate variable values such as beta, r_t, doubling_time_t, ...
* writes out the results:
  - dataframes to csv
  - dictionaries to json
* (WIP) runs multiple scenarios corresponding to user-specified ranges for one or more input variables.

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns; sns.set()
import matplotlib.pyplot as plt

In [2]:
%matplotlib inline

## Example 1 - run from command line
Note that the config filename is a required argument. Here's what that file looks like for this scenario:

    --current-hospitalized 658
    --doubling-time 3.61
    --hospitalized-day 7
    --hospitalized-rate 0.025
    --icu-days 9
    --icu-rate 0.0075
    --market_share 0.32
    --infectious-days 14
    --n-days 120
    --relative-contact-rate 0.31
    --population 5026226
    --ventilated-day 10
    --ventilated-rate 0.005
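
For reference, a file in this shape can be read with a few lines of standard-library Python. The sketch below is only an illustration: `parse_cfg_text` is a hypothetical helper, not part of penn_chime or `sim_chime_scenario_runner`, and it keeps every value as a string rather than guessing types.

```python
import shlex


def parse_cfg_text(text):
    """Map '--flag value' lines to a {name: value} dict (hypothetical helper)."""
    params = {}
    for line in text.splitlines():
        # shlex tolerates blank lines and strips '#' comments
        tokens = shlex.split(line, comments=True)
        if not tokens:
            continue
        # '--doubling-time' and '--market_share' both become snake_case keys
        name = tokens[0].lstrip("-").replace("-", "_")
        params[name] = tokens[1] if len(tokens) > 1 else None
    return params


cfg_text = """
--current-hospitalized 658
--doubling-time 3.61
--market_share 0.32
"""
params = parse_cfg_text(cfg_text)
print(params["doubling_time"])
```

Normalizing hyphens to underscores means the mixed flag spellings in the file end up as uniform dictionary keys.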

In [5]:
# scenario = 'test_from_command_line'
!python sim_chime_scenario_runner.py tests/cli_inputs_semi_dt3.61.cfg --scenario test_from_command_line --output-path ./output/

2020-04-01 14:32:51,541 - penn_chime.models - INFO - Using doubling_time: 3.61
2020-04-01 14:32:51,557 - penn_chime.models - INFO - Set i_day = 34
2020-04-01 14:32:51,557 - penn_chime.models - INFO - Estimated date_first_hospitalized: 2020-02-27; current_date: 2020-04-01; i_day: 34
2020-04-01 14:32:51,557 - penn_chime.models - INFO - len(np.arange(-i_day, n_days+1)): 155
2020-04-01 14:32:51,557 - penn_chime.models - INFO - len(raw_df): 155
Scenario: test_from_command_line


Input parameters
--------------------------------------------------
{
    "current_hospitalized": 658,
    "relative_contact_rate": 0.31,
    "hospitalized": [
        0.025,
        7
    ],
    "icu": [
        0.0075,
        9
    ],
    "ventilated": [
        0.005,
        10
    ],
    "region": null,
    "population": 5026226,
    "current_date": "2020-04-01",
    "date_first_hospitalized": "2020-02-27",
    "doubling_time": 3.61,
    "infectious_days": 14.0,
    "market_share": 0.32,
    "max_y_axis": null

## Example 2 - run from function call
The basic steps are:

* import the `sim_chime_scenario_runner` module
* specify a scenario name (if you don't, the default is the current datetime)
* create a `penn_chime.Parameters` object from the input config file using `create_params_from_file`
* call `sim_chime` to run the simulation; it returns the model and a results dictionary
* do whatever you want with the results
  - csv and json outputs are only written automatically for command line use, as in penn_chime cli.py
  - `write_results` function will write out all dataframes (to csv) and dicts (to json)
  - or selectively do whatever you want with components of the results dictionary

In [6]:
import sim_chime_scenario_runner as runner

In [7]:
scenario = 'test_from_jupyter_import'
p = runner.create_params_from_file("tests/cli_inputs_semi_dt3.61.cfg")

Let's look at the parameter values.

In [8]:
vars(p)

{'current_hospitalized': 658,
 'relative_contact_rate': 0.31,
 'hospitalized': Disposition(rate=0.025, days=7),
 'icu': Disposition(rate=0.0075, days=9),
 'ventilated': Disposition(rate=0.005, days=10),
 'region': None,
 'population': 5026226,
 'current_date': datetime.date(2020, 4, 1),
 'date_first_hospitalized': None,
 'doubling_time': 3.61,
 'infectious_days': 14.0,
 'market_share': 0.32,
 'max_y_axis': None,
 'n_days': 120,
 'recovered': 0,
 'labels': {'hospitalized': 'Hospitalized',
  'icu': 'ICU',
  'ventilated': 'Ventilated',
  'day': 'Day',
  'date': 'Date',
  'susceptible': 'Susceptible',
  'infected': 'Infected',
  'recovered': 'Recovered'},
 'dispositions': {'hospitalized': Disposition(rate=0.025, days=7),
  'icu': Disposition(rate=0.0075, days=9),
  'ventilated': Disposition(rate=0.005, days=10)}}

Run the simulation and capture the results.

In [9]:
model, results = runner.sim_chime(scenario, p)

2020-04-01 14:32:51,741 - penn_chime.models - INFO - Using doubling_time: 3.61
2020-04-01 14:32:51,773 - penn_chime.models - INFO - Set i_day = 34
2020-04-01 14:32:51,773 - penn_chime.models - INFO - Estimated date_first_hospitalized: 2020-02-27; current_date: 2020-04-01; i_day: 34
2020-04-01 14:32:51,774 - penn_chime.models - INFO - len(np.arange(-i_day, n_days+1)): 155
2020-04-01 14:32:51,774 - penn_chime.models - INFO - len(raw_df): 155


Here are the keys in the `results` dictionary.

In [10]:
results.keys()

dict_keys(['scenario', 'input_params_dict', 'intermediate_variables_dict', 'sim_sir_w_date_df', 'dispositions_df', 'admits_df', 'census_df'])

Let's check out a few of the dataframes to make sure they contain what we think they contain.

In [11]:
results['admits_df'].head()

|   | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 0 | -34 | 2020-02-27 | NaN | NaN | NaN |
| 1 | -33 | 2020-02-28 | 0.283108 | 0.084932 | 0.056622 |
| 2 | -32 | 2020-02-29 | 0.343034 | 0.10291 | 0.068607 |
| 3 | -31 | 2020-03-01 | 0.415643 | 0.124693 | 0.083129 |
| 4 | -30 | 2020-03-02 | 0.503619 | 0.151086 | 0.100724 |


In [12]:
results['admits_df'][30:45]

|   | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 30 | -4 | 2020-03-28 | 72.84383 | 21.853149 | 14.568766 |
| 31 | -3 | 2020-03-29 | 87.924841 | 26.377452 | 17.584968 |
| 32 | -2 | 2020-03-30 | 106.042456 | 31.812737 | 21.208491 |
| 33 | -1 | 2020-03-31 | 127.768758 | 38.330627 | 25.553752 |
| 34 | 0 | 2020-04-01 | 153.76552 | 46.129656 | 30.753104 |
| 35 | 1 | 2020-04-02 | 127.504933 | 38.25148 | 25.500987 |
| 36 | 2 | 2020-04-03 | 142.287727 | 42.686318 | 28.457545 |
| 37 | 3 | 2020-04-04 | 158.634826 | 47.590448 | 31.726965 |
| 38 | 4 | 2020-04-05 | 176.674125 | 53.002237 | 35.334825 |
| 39 | 5 | 2020-04-06 | 196.534309 | 58.960293 | 39.306862 |


In [13]:
results['census_df'].head()

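|   | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 0 | -34 | 2020-02-27 | NaN | NaN | NaN |
| 1 | -33 | 2020-02-28 | 1.0 | 1.0 | 1.0 |
| 2 | -32 | 2020-02-29 | 1.0 | 1.0 | 1.0 |
| 3 | -31 | 2020-03-01 | 2.0 | 1.0 | 1.0 |
| 4 | -30 | 2020-03-02 | 2.0 | 1.0 | 1.0 |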


In [14]:
results['census_df'][30:45]

|   | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 30 | -4 | 2020-03-28 | 310.0 | 104.0 | 72.0 |
| 31 | -3 | 2020-03-29 | 375.0 | 126.0 | 87.0 |
| 32 | -2 | 2020-03-30 | 453.0 | 152.0 | 105.0 |
| 33 | -1 | 2020-03-31 | 547.0 | 183.0 | 127.0 |
| 34 | 0 | 2020-04-01 | 659.0 | 221.0 | 153.0 |
| 35 | 1 | 2020-04-02 | 737.0 | 249.0 | 173.0 |
| 36 | 2 | 2020-04-03 | 819.0 | 279.0 | 194.0 |
| 37 | 3 | 2020-04-04 | 904.0 | 312.0 | 218.0 |
| 38 | 4 | 2020-04-05 | 993.0 | 347.0 | 243.0 |
| 39 | 5 | 2020-04-06 | 1084.0 | 384.0 | 270.0 |


In [15]:
results['sim_sir_w_date_df'].head()

|   | day | date | susceptible | infected | recovered |
|---|---|---|---|---|---|
| 0 | -34 | 2020-02-27 | 5026101.0 | 125.0 | 0.0 |
| 1 | -33 | 2020-02-28 | 5026066.0 | 151.459955 | 8.928571 |
| 2 | -32 | 2020-02-29 | 5026023.0 | 183.520642 | 19.74714 |
| 3 | -31 | 2020-03-01 | 5025971.0 | 222.367416 | 32.855757 |
| 4 | -30 | 2020-03-02 | 5025908.0 | 269.43644 | 48.739144 |


In [16]:
results['sim_sir_w_date_df'][30:45]

|   | day | date | susceptible | infected | recovered |
|---|---|---|---|---|---|
| 30 | -4 | 2020-03-28 | 4973717.0 | 39230.091324 | 13279.337147 |
| 31 | -3 | 2020-03-29 | 4962726.0 | 47418.547125 | 16081.486527 |
| 32 | -2 | 2020-03-30 | 4949471.0 | 57286.815104 | 19468.525608 |
| 33 | -1 | 2020-03-31 | 4933500.0 | 69165.994505 | 23560.440972 |
| 34 | 0 | 2020-04-01 | 4914279.0 | 83446.256336 | 28500.869151 |
| 35 | 1 | 2020-04-02 | 4898341.0 | 93423.9261 | 34461.316033 |
| 36 | 2 | 2020-04-03 | 4880555.0 | 104536.754348 | 41134.453611 |
| 37 | 3 | 2020-04-04 | 4860725.0 | 116899.196593 | 48601.364636 |
| 38 | 4 | 2020-04-05 | 4838641.0 | 130633.519598 | 56951.30725 |
| 39 | 5 | 2020-04-06 | 4814074.0 | 145869.342577 | 66282.272935 |


In [17]:
results['dispositions_df'].head()

|   | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 0 | -34 | 2020-02-27 | 1.0 | 0.3 | 0.2 |
| 1 | -33 | 2020-02-28 | 1.283108 | 0.384932 | 0.256622 |
| 2 | -32 | 2020-02-29 | 1.626142 | 0.487843 | 0.325228 |
| 3 | -31 | 2020-03-01 | 2.041785 | 0.612536 | 0.408357 |
| 4 | -30 | 2020-03-02 | 2.545405 | 0.763621 | 0.509081 |
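
As a sanity check on how these dataframes fit together, the numbers above are consistent with the following reading: the hospitalized dispositions are (infected + recovered) scaled by the hospitalization rate and market share, and the daily admits are the day-over-day change in dispositions. This is a sketch inferred from the printed values, not code taken from penn_chime; the input rows are copied by hand from the `sim_sir_w_date_df` output.

```python
market_share = 0.32
hospitalized_rate = 0.025

# First five (infected, recovered) pairs copied from sim_sir_w_date_df above
sir_rows = [
    (125.0, 0.0),
    (151.459955, 8.928571),
    (183.520642, 19.74714),
    (222.367416, 32.855757),
    (269.43644, 48.739144),
]

# Cumulative hospitalizations at this hospital: everyone ever infected,
# scaled by the hospitalization rate and the market share
dispositions = [(i + r) * hospitalized_rate * market_share for i, r in sir_rows]

# Daily admits are the day-over-day change; the first day has no predecessor
admits = [None] + [b - a for a, b in zip(dispositions, dispositions[1:])]

for d, a in zip(dispositions, admits):
    print(round(d, 6), None if a is None else round(a, 6))
```

The missing first admits value lines up with the empty first row in `admits_df`.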


Here's the intermediate variables dictionary.

In [18]:
results['intermediate_variables_dict']

{'intrinsic_growth_rate': 0.21167963995855832,
 'gamma': 0.07142857142857142,
 'beta': 5.6327600935024925e-08,
 'r_naught': 3.963514959419816,
 'r_t': 2.734825321999673,
 'doubling_time_t': 5.933509014640464}
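
These values can be reproduced by hand from the inputs. The sketch below is my reading of how they relate, checked against the numbers printed above rather than copied from penn_chime. One assumption to flag: the run is taken to start from a single hospitalized patient at this hospital, which matches the 125 infected in the first `sim_sir_w_date_df` row.

```python
import math

# Inputs from the config file used in this scenario
population = 5026226
doubling_time = 3.61
infectious_days = 14
relative_contact_rate = 0.31
market_share = 0.32
hospitalized_rate = 0.025

# Assumption: one hospitalized patient at day 0, scaled up to the region
infected = 1.0 / market_share / hospitalized_rate
susceptible = population - infected

intrinsic_growth_rate = 2.0 ** (1.0 / doubling_time) - 1.0
gamma = 1.0 / infectious_days  # recovery rate per day
beta = (intrinsic_growth_rate + gamma) / susceptible
beta_t = beta * (1.0 - relative_contact_rate)  # after social distancing
r_naught = (intrinsic_growth_rate + gamma) / gamma
r_t = beta_t / gamma * susceptible
doubling_time_t = 1.0 / math.log2(beta_t * susceptible - gamma + 1.0)

print(intrinsic_growth_rate, gamma, beta, r_naught, r_t, doubling_time_t)
```

Note that the reported `beta` is the value before social distancing is applied, while `r_t` and `doubling_time_t` reflect the 31% reduction in contact.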

Finally, here are the inputs we used. Note that, since we input the doubling time, the first hospitalized date is estimated by `penn_chime.SimSirModel`. You'll also see that it's a `datetime.date`, which json can't serialize as is. So, when the dictionary gets written to a json file, the date is stringified.

In [19]:
results['input_params_dict']

{'current_hospitalized': 658,
 'relative_contact_rate': 0.31,
 'hospitalized': Disposition(rate=0.025, days=7),
 'icu': Disposition(rate=0.0075, days=9),
 'ventilated': Disposition(rate=0.005, days=10),
 'region': None,
 'population': 5026226,
 'current_date': datetime.date(2020, 4, 1),
 'date_first_hospitalized': datetime.date(2020, 2, 27),
 'doubling_time': 3.61,
 'infectious_days': 14.0,
 'market_share': 0.32,
 'max_y_axis': None,
 'n_days': 120,
 'recovered': 0,
 'labels': {'hospitalized': 'Hospitalized',
  'icu': 'ICU',
  'ventilated': 'Ventilated',
  'day': 'Day',
  'date': 'Date',
  'susceptible': 'Susceptible',
  'infected': 'Infected',
  'recovered': 'Recovered'},
 'dispositions': {'hospitalized': Disposition(rate=0.025, days=7),
  'icu': Disposition(rate=0.0075, days=9),
  'ventilated': Disposition(rate=0.005, days=10)}}
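
The date stringification mentioned above can be illustrated with the standard `json` module. This is a minimal sketch, not the code in `write_results`; the `Disposition` defined here is a stand-in namedtuple, on the assumption that penn_chime's `Disposition` behaves like one.

```python
import json
from collections import namedtuple
from datetime import date

# Stand-in for penn_chime's Disposition (assumed to be namedtuple-like)
Disposition = namedtuple("Disposition", ["rate", "days"])

inputs = {
    "hospitalized": Disposition(rate=0.025, days=7),
    "date_first_hospitalized": date(2020, 2, 27),
    "doubling_time": 3.61,
}

# json.dumps(inputs) alone raises TypeError on the date. default=str is
# called only for objects json cannot encode, so the date becomes a string
# while the namedtuple is written as a plain JSON array.
as_json = json.dumps(inputs, indent=4, default=str)
print(as_json)
```

This also explains why the dispositions show up as two-element lists such as `[0.025, 7]` in the command line output of Example 1.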

Write out all the results. Dataframes go to csv and dictionaries to json.

In [20]:
output_path = './output/' # default is current working directory
print("Writing out all results to {} for scenario --> {}".format(output_path, scenario))
runner.write_results(results, scenario, output_path)

Writing out all results to ./output/ for scenario --> test_from_jupyter_import


## Example 3 - run several scenarios for range of input values
I'm still working on this, but see the function `sim_chimes()` (plural) for the basic idea. I loop over an array of values for the social distancing parameter (`relative_contact_rate`), run `sim_chime()` (singular) for each, and gather the outputs in a big list of results dictionaries.
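
The general shape of that loop might look something like the sketch below. It is not the `sim_chimes()` implementation: `run_scenarios` and `fake_run` are hypothetical names, the parameters are a plain dict rather than a `penn_chime.Parameters` object, and `fake_run` stands in for the real simulation call so the example runs without penn_chime installed.

```python
from copy import deepcopy


def run_scenarios(base_params, param_name, values, run_one):
    """Run run_one once per value of param_name and collect the results."""
    all_results = []
    for value in values:
        params = deepcopy(base_params)  # keep the base inputs untouched
        params[param_name] = value
        scenario = "{}_{:.2f}".format(param_name, value)
        all_results.append(run_one(scenario, params))
    return all_results


def fake_run(scenario, params):
    """Placeholder for the real simulation call; just echoes its inputs."""
    return {"scenario": scenario, "input_params_dict": params}


base = {"relative_contact_rate": 0.31, "doubling_time": 3.61}
sweep = run_scenarios(base, "relative_contact_rate", [0.1, 0.2, 0.3], fake_run)
print([r["scenario"] for r in sweep])
```

Copying the base parameters for each run keeps the scenarios independent, and building the scenario name from the varied value makes the output files easy to tell apart.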