## Quick demo of sim_chime_scenario_runner

Location: https://github.com/misken/c19/tree/master/mychime/sim_chime_scenario_runner

**sim_chime_scenario_runner.py** is a simple Python module for working with the penn_chime model. It:

* assumes that you've pip installed `penn_chime` per https://github.com/CodeForPhilly/chime/pull/249 from a local clone of the chime repo
* can optionally be installed itself by running `pip install .` from the directory containing setup.py, for example into a virtual environment
* allows running simulations from the command line (like cli.py in penn_chime)
* is importable, so simulations can also be run via function call
* includes a few additional command line (or passable) arguments, including:
  - the standard CHIME input config filename (required)
  - a scenario name (prepended to output filenames)
  - an output path
* creates a results dictionary after each simulation scenario is run, containing:
  - the scenario name
  - the standard admits, census, and sim_sir_w_date dataframes
  - the dispositions dataframe
  - a dictionary containing the input parameters
  - a dictionary containing important intermediate variable values such as beta, doubling_time, ...
* writes out the results 
  - dataframes to csv
  - dictionaries to json
* (WIP) runs multiple scenarios corresponding to user specified ranges for one or more input variables.

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns; sns.set()
import matplotlib.pyplot as plt

In [2]:
%matplotlib inline

## Example 1 - run script from command line
If you just want to run the sim_chime_scenario_runner.py script directly, put the file wherever you want. Note that the config filename is a required argument.

**Note** - Since I'm trying to demo both running this as a local script and running it as an installed command line app, I put sim_chime_scenario_runner.py in the demos/src/ subdirectory to avoid namespace conflicts.

Here's what the config file looks like for this scenario:

    --current-hospitalized 802
    --mitigation-date 2020-03-21
    --current-date 2020-03-27
    --doubling-time 3.61
    --hospitalized-day 7
    --hospitalized-rate 0.025
    --icu-days 9
    --icu-rate 0.0075
    --market-share 0.32
    --infectious-days 14
    --n-days 120
    --relative-contact-rate 0.31
    --population 5026226
    --ventilated-day 10
    --ventilated-rate 0.005

In [3]:
# scenario = 'test_from_command_line'
!python src/sim_chime_scenario_runner.py dt361.cfg --scenario test_script --output-path ./output/

2020-04-07 07:57:20,102 - penn_chime.models - INFO - Using doubling_time: 3.61
2020-04-07 07:57:20,117 - penn_chime.models - INFO - Set i_day = 35
2020-04-07 07:57:20,117 - penn_chime.models - INFO - Estimated date_first_hospitalized: 2020-02-21; current_date: 2020-03-27; i_day: 35
2020-04-07 07:57:20,117 - penn_chime.models - INFO - len(np.arange(-i_day, n_days+1)): 156
2020-04-07 07:57:20,117 - penn_chime.models - INFO - len(raw_df): 156
Scenario: test_script


Input parameters
--------------------------------------------------
{
    "current_hospitalized": 802,
    "mitigation_date": "2020-03-21",
    "current_date": "2020-03-27",
    "date_first_hospitalized": "2020-02-21",
    "doubling_time": 3.61,
    "infectious_days": 14,
    "market_share": 0.32,
    "n_days": 120,
    "relative_contact_rate": 0.31,
    "population": 5026226,
    "hospitalized": [
        0.025,
        7
    ],
    "icu": [
        0.0075,
        9
    ],
    "ventilated": [
        0.005,
        10
    ],

Now let's run the CHIME CLI and make sure we get the same outputs. We should, because I'm just calling CHIME functions.

In [4]:
!penn_chime --file dt361.cfg

2020-04-07 07:57:33,711 - penn_chime.models - INFO - Using doubling_time: 3.61
2020-04-07 07:57:33,726 - penn_chime.models - INFO - Set i_day = 35
2020-04-07 07:57:33,726 - penn_chime.models - INFO - Estimated date_first_hospitalized: 2020-02-21; current_date: 2020-03-27; i_day: 35
2020-04-07 07:57:33,726 - penn_chime.models - INFO - len(np.arange(-i_day, n_days+1)): 156
2020-04-07 07:57:33,726 - penn_chime.models - INFO - len(raw_df): 156


In [5]:
!diff ./output/test_script_admits.csv 2020-03-27_projected_admits.csv
!diff ./output/test_script_census.csv 2020-03-27_projected_census.csv
!diff ./output/test_script_sim_sir_w_date.csv 2020-03-27_sim_sir_w_date.csv

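`diff` prints nothing when two files are identical, so the empty output confirms that the runner and the CHIME CLI produced the same files.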
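
If `diff` isn't available (on Windows, for example), the same check can be done from Python with the standard library. This is only a sketch: `files_identical` is a made-up helper name, and the demo uses throwaway files so that it runs anywhere.

```python
import filecmp
import tempfile
from pathlib import Path


def files_identical(path_a, path_b):
    """Return True if the two files have identical contents."""
    # shallow=False forces a content comparison instead of trusting os.stat()
    return filecmp.cmp(path_a, path_b, shallow=False)


# Self-contained demonstration with throwaway files
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "a.csv"
    b = Path(tmp) / "b.csv"
    c = Path(tmp) / "c.csv"
    a.write_text("day,hospitalized\n0,88.5\n")
    b.write_text("day,hospitalized\n0,88.5\n")
    c.write_text("day,hospitalized\n0,99.0\n")
    same = files_identical(a, b)
    different = files_identical(a, c)

print(same, different)
```

In this notebook the call would be something like `files_identical('./output/test_script_admits.csv', '2020-03-27_projected_admits.csv')`.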

## Example 2 - use runner CLI from command line
If you pip installed sim_chime_scenario_runner, you can run it like this. 

In [6]:
# scenario = 'test_from_command_line'
!sim_chime_scenario_runner dt361.cfg --scenario test_runner_cli --output-path ./output/

2020-04-07 07:57:57,687 - penn_chime.models - INFO - Using doubling_time: 3.61
2020-04-07 07:57:57,702 - penn_chime.models - INFO - Set i_day = 35
2020-04-07 07:57:57,702 - penn_chime.models - INFO - Estimated date_first_hospitalized: 2020-02-21; current_date: 2020-03-27; i_day: 35
2020-04-07 07:57:57,702 - penn_chime.models - INFO - len(np.arange(-i_day, n_days+1)): 156
2020-04-07 07:57:57,702 - penn_chime.models - INFO - len(raw_df): 156
Scenario: test_runner_cli


Input parameters
--------------------------------------------------
{
    "current_hospitalized": 802,
    "mitigation_date": "2020-03-21",
    "current_date": "2020-03-27",
    "date_first_hospitalized": "2020-02-21",
    "doubling_time": 3.61,
    "infectious_days": 14,
    "market_share": 0.32,
    "n_days": 120,
    "relative_contact_rate": 0.31,
    "population": 5026226,
    "hospitalized": [
        0.025,
        7
    ],
    "icu": [
        0.0075,
        9
    ],
    "ventilated": [
        0.005,
        10
  

## Example 3 - run from function call
The basic steps are:

* import the `sim_chime_scenario_runner` module
* specify scenario name (if you don't, default is current datetime)
* create a `penn_chime.Parameters` object from the input config file using `create_params_from_file`
* call `sim_chime` to run the simulation and return results dictionary
* do whatever you want with the results
  - csv and json outputs are written automatically only for command line use, as in penn_chime cli.py
  - `write_results` function will write out all dataframes (to csv) and dicts (to json)
  - or selectively do whatever you want with components of the results dictionary

In [7]:
import sim_chime_scenario_runner as runner

In [8]:
scenario = 'test_from_import'
p = runner.create_params_from_file("dt361.cfg")

Let's look at the parameter values.

In [9]:
vars(p)

{'current_hospitalized': 802,
 'mitigation_date': datetime.date(2020, 3, 21),
 'current_date': datetime.date(2020, 3, 27),
 'infectious_days': 14,
 'market_share': 0.32,
 'n_days': 120,
 'relative_contact_rate': 0.31,
 'population': 5026226,
 'hospitalized': Disposition(rate=0.025, days=7),
 'icu': Disposition(rate=0.0075, days=9),
 'ventilated': Disposition(rate=0.005, days=10),
 'date_first_hospitalized': None,
 'doubling_time': 3.61,
 'max_y_axis': None,
 'recovered': 0,
 'region': None,
 'labels': {'hospitalized': 'Hospitalized',
  'icu': 'ICU',
  'ventilated': 'Ventilated',
  'day': 'Day',
  'date': 'Date',
  'susceptible': 'Susceptible',
  'infected': 'Infected',
  'recovered': 'Recovered'},
 'dispositions': {'hospitalized': Disposition(rate=0.025, days=7),
  'icu': Disposition(rate=0.0075, days=9),
  'ventilated': Disposition(rate=0.005, days=10)}}

Run the simulation and capture the results.

In [10]:
model, results = runner.sim_chime(scenario, p)

2020-04-07 07:58:26,253 - penn_chime.models - INFO - Using doubling_time: 3.61
2020-04-07 07:58:26,281 - penn_chime.models - INFO - Set i_day = 35
2020-04-07 07:58:26,281 - penn_chime.models - INFO - Estimated date_first_hospitalized: 2020-02-21; current_date: 2020-03-27; i_day: 35
2020-04-07 07:58:26,282 - penn_chime.models - INFO - len(np.arange(-i_day, n_days+1)): 156
2020-04-07 07:58:26,283 - penn_chime.models - INFO - len(raw_df): 156


Here are the keys in the `results` dictionary.

In [11]:
results.keys()

dict_keys(['scenario', 'input_params_dict', 'intermediate_variables_dict', 'sim_sir_w_date_df', 'dispositions_df', 'admits_df', 'census_df'])

Let's check out a few of the dataframes to make sure they contain what we think they contain.

In [12]:
results['admits_df'].head()

| | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 0 | -35 | 2020-02-21 | | | |
| 1 | -34 | 2020-02-22 | 0.283108 | 0.084932 | 0.056622 |
| 2 | -33 | 2020-02-23 | 0.343034 | 0.10291 | 0.068607 |
| 3 | -32 | 2020-02-24 | 0.415643 | 0.124693 | 0.083129 |
| 4 | -31 | 2020-02-25 | 0.503619 | 0.151086 | 0.100724 |


In [13]:
results['admits_df'][30:45]

| | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 30 | -5 | 2020-03-22 | 50.262243 | 15.078673 | 10.052449 |
| 31 | -4 | 2020-03-23 | 56.334877 | 16.900463 | 11.266975 |
| 32 | -3 | 2020-03-24 | 63.11772 | 18.935316 | 12.623544 |
| 33 | -2 | 2020-03-25 | 70.68776 | 21.206328 | 14.137552 |
| 34 | -1 | 2020-03-26 | 79.128756 | 23.738627 | 15.825751 |
| 35 | 0 | 2020-03-27 | 88.531406 | 26.559422 | 17.706281 |
| 36 | 1 | 2020-03-28 | 98.993392 | 29.698018 | 19.798678 |
| 37 | 2 | 2020-03-29 | 110.619252 | 33.185776 | 22.12385 |
| 38 | 3 | 2020-03-30 | 123.520022 | 37.056007 | 24.704004 |
| 39 | 4 | 2020-03-31 | 137.812585 | 41.343776 | 27.562517 |


In [14]:
results['census_df'].head()

| | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 0 | -35 | 2020-02-21 | | | |
| 1 | -34 | 2020-02-22 | 0.283108 | 0.084932 | 0.056622 |
| 2 | -33 | 2020-02-23 | 0.626142 | 0.187843 | 0.125228 |
| 3 | -32 | 2020-02-24 | 1.041785 | 0.312536 | 0.208357 |
| 4 | -31 | 2020-02-25 | 1.545405 | 0.463621 | 0.309081 |


In [15]:
results['census_df'][30:45]

| | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 30 | -5 | 2020-03-22 | 287.386451 | 96.763016 | 67.135454 |
| 31 | -4 | 2020-03-23 | 320.415374 | 108.89251 | 75.775653 |
| 32 | -3 | 2020-03-24 | 355.328051 | 122.051714 | 85.218551 |
| 33 | -2 | 2020-03-25 | 391.890664 | 136.266256 | 95.505362 |
| 34 | -1 | 2020-03-26 | 429.744468 | 151.54337 | 106.669922 |
| 35 | 0 | 2020-03-27 | 468.371979 | 167.865248 | 118.735195 |
| 36 | 1 | 2020-03-28 | 507.056154 | 185.18078 | 131.708844 |
| 37 | 2 | 2020-03-29 | 567.413164 | 203.395387 | 145.577704 |
| 38 | 3 | 2020-03-30 | 634.598309 | 222.358629 | 160.300929 |
| 39 | 4 | 2020-03-31 | 709.293175 | 248.623731 | 175.801603 |


In [16]:
results['sim_sir_w_date_df'].head()

| | day | date | susceptible | infected | recovered |
|---|---|---|---|---|---|
| 0 | -35 | 2020-02-21 | 5026101.0 | 125.0 | 0.0 |
| 1 | -34 | 2020-02-22 | 5026066.0 | 151.459955 | 8.928571 |
| 2 | -33 | 2020-02-23 | 5026023.0 | 183.520642 | 19.74714 |
| 3 | -32 | 2020-02-24 | 5025971.0 | 222.367416 | 32.855757 |
| 4 | -31 | 2020-02-25 | 5025908.0 | 269.43644 | 48.739144 |


In [17]:
results['sim_sir_w_date_df'][30:45]

| | day | date | susceptible | infected | recovered |
|---|---|---|---|---|---|
| 30 | -5 | 2020-03-22 | 4976539.0 | 36407.392909 | 13279.337147 |
| 31 | -4 | 2020-03-23 | 4969497.0 | 40848.724435 | 15879.865212 |
| 32 | -3 | 2020-03-24 | 4961608.0 | 45820.673366 | 18797.631243 |
| 33 | -2 | 2020-03-25 | 4952772.0 | 51383.73817 | 22070.536484 |
| 34 | -1 | 2020-03-26 | 4942881.0 | 57604.565698 | 25740.803496 |
| 35 | 0 | 2020-03-27 | 4931814.0 | 64556.37961 | 29855.415331 |
| 36 | 1 | 2020-03-28 | 4919440.0 | 72319.383652 | 34466.585303 |
| 37 | 2 | 2020-03-29 | 4905613.0 | 80981.119945 | 39632.255564 |
| 38 | 3 | 2020-03-30 | 4890173.0 | 90636.757032 | 45416.621275 |
| 39 | 4 | 2020-03-31 | 4872946.0 | 101389.27613 | 51890.675348 |


In [18]:
results['dispositions_df'].head()

| | day | date | hospitalized | icu | ventilated |
|---|---|---|---|---|---|
| 0 | -35 | 2020-02-21 | 1.0 | 0.3 | 0.2 |
| 1 | -34 | 2020-02-22 | 1.283108 | 0.384932 | 0.256622 |
| 2 | -33 | 2020-02-23 | 1.626142 | 0.487843 | 0.325228 |
| 3 | -32 | 2020-02-24 | 2.041785 | 0.612536 | 0.408357 |
| 4 | -31 | 2020-02-25 | 2.545405 | 0.763621 | 0.509081 |


Here's the intermediate variables dictionary.

In [19]:
results['intermediate_variables_dict']

{'intrinsic_growth_rate': 0.21167963995855832,
 'gamma': 0.07142857142857142,
 'beta': 5.6327600935024925e-08,
 'r_naught': 3.963514959419816,
 'r_t': 2.734825321999673,
 'doubling_time_t': 5.933509014640464}
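
As a sanity check, these intermediate values can be reproduced by hand from the inputs in dt361.cfg. The formulas below are my reading of how the quantities relate, not code copied from `penn_chime`, so treat them as assumptions. They do reproduce the numbers shown above, including the initial 125 infected and 5026101 susceptible in the first row of `sim_sir_w_date_df`.

```python
import math

# Inputs taken from dt361.cfg
doubling_time = 3.61
infectious_days = 14
relative_contact_rate = 0.31
population = 5026226
market_share = 0.32
hospitalized_rate = 0.025

# Assumed relationships (not copied from penn_chime source)
intrinsic_growth_rate = 2.0 ** (1.0 / doubling_time) - 1.0
gamma = 1.0 / infectious_days

# One hospitalized patient on day -35 implies this many infected regionally
infected = 1.0 / market_share / hospitalized_rate
susceptible = population - infected

beta = (intrinsic_growth_rate + gamma) / susceptible
r_naught = (intrinsic_growth_rate + gamma) / gamma

# After mitigation, the contact rate is scaled by (1 - relative_contact_rate)
r_t = r_naught * (1.0 - relative_contact_rate)
growth_t = beta * susceptible * (1.0 - relative_contact_rate) - gamma
doubling_time_t = 1.0 / math.log2(growth_t + 1.0)

for name in ("intrinsic_growth_rate", "gamma", "beta",
             "r_naught", "r_t", "doubling_time_t"):
    print(name, globals()[name])
```

Note that `r_t` is just `r_naught` scaled by 0.69, so a 31% reduction in contact stretches the doubling time from 3.61 days to roughly 5.9 days.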

Finally, here are the inputs we used. Note that, since we input the doubling time, the first hospitalized date is estimated by `penn_chime.SimSirModel`. It is stored as a `datetime.date`, which the `json` module cannot serialize directly, so the date is converted to a string when the dictionary gets written to a json file.

In [20]:
results['input_params_dict']

{'current_hospitalized': 802,
 'mitigation_date': datetime.date(2020, 3, 21),
 'current_date': datetime.date(2020, 3, 27),
 'infectious_days': 14,
 'market_share': 0.32,
 'n_days': 120,
 'relative_contact_rate': 0.31,
 'population': 5026226,
 'hospitalized': Disposition(rate=0.025, days=7),
 'icu': Disposition(rate=0.0075, days=9),
 'ventilated': Disposition(rate=0.005, days=10),
 'date_first_hospitalized': datetime.date(2020, 2, 21),
 'doubling_time': 3.61,
 'max_y_axis': None,
 'recovered': 0,
 'region': None,
 'labels': {'hospitalized': 'Hospitalized',
  'icu': 'ICU',
  'ventilated': 'Ventilated',
  'day': 'Day',
  'date': 'Date',
  'susceptible': 'Susceptible',
  'infected': 'Infected',
  'recovered': 'Recovered'},
 'dispositions': {'hospitalized': Disposition(rate=0.025, days=7),
  'icu': Disposition(rate=0.0075, days=9),
  'ventilated': Disposition(rate=0.005, days=10)}}
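
Here is a minimal sketch of why the dates need special handling when this dictionary goes to json. It is not the runner's actual code. The `Disposition` namedtuple below is a stand-in that I'm assuming behaves like the real one; namedtuples serialize as plain json arrays, which is consistent with the `[0.025, 7]` in the json output shown in Example 1.

```python
import json
from collections import namedtuple
from datetime import date

# Stand-in for penn_chime's Disposition (assumed to be namedtuple-like)
Disposition = namedtuple("Disposition", ["rate", "days"])

# A small subset of the input parameters shown above
params = {
    "current_hospitalized": 802,
    "mitigation_date": date(2020, 3, 21),
    "doubling_time": 3.61,
    "hospitalized": Disposition(rate=0.025, days=7),
}

try:
    json.dumps(params)
except TypeError as err:
    print("Plain json.dumps fails:", err)

# default=str is called for any object json cannot serialize natively,
# so the date becomes its ISO string
as_json = json.dumps(params, indent=4, default=str)
print(as_json)
```

Using `default=str` is the bluntest fix. It stringifies anything unknown, which is fine for dates but would also silently stringify objects you might prefer to handle explicitly.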

Write out all the results. Dataframes go to csv and dictionaries to json.

In [21]:
output_path = './output/' # default is current working directory
print("Writing out all results to {} for scenario --> {}".format(output_path, scenario))
runner.write_results(results, scenario, output_path)

Writing out all results to ./output/ for scenario --> test_from_import


## Example 4 - run several scenarios for range of input values
I'm still working on this, but see the function `sim_chimes()` (plural) for the basic idea. I loop over an array of values for the social distancing parameter, run `sim_chime()` (singular) for each, and gather outputs in a big list of results dictionaries.
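
The looping pattern described above looks roughly like this. It is not the actual `sim_chimes()` implementation: `run_scenarios` and `fake_run` are made-up names, and `fake_run` merely echoes its inputs where a real simulation call would go.

```python
def run_scenarios(base_name, values, run_one):
    """Call run_one(scenario_name, value) for each value; collect the results."""
    all_results = []
    for value in values:
        # Give each run a distinct scenario name so output files don't collide
        scenario = "{}_{:.2f}".format(base_name, value)
        all_results.append(run_one(scenario, value))
    return all_results


def fake_run(scenario, relative_contact_rate):
    """Hypothetical stand-in for a real simulation call."""
    return {"scenario": scenario,
            "relative_contact_rate": relative_contact_rate}


rates = [0.10, 0.20, 0.30]
results_list = run_scenarios("demo", rates, fake_run)
print([r["scenario"] for r in results_list])
```

Keeping each run's output as its own dictionary in a list makes it easy to write them all out afterwards or to stack the dataframes for comparison plots.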

<font size="2">This material is made available under the [MIT License](https://opensource.org/licenses/MIT).</font>