## Quick demo of sim_chime_scenario_runner

Location: https://github.com/misken/c19/tree/master/mychime/sim_chime_scenario_runner

**sim_chime_scenario_runner.py** is a simple Python module for working with the penn_chime model. It:

* assumes that you've pip installed `penn_chime` per https://github.com/CodeForPhilly/chime/pull/249 from a local clone of the chime repo
  - do a `pip install .` from the directory containing setup.py to install it into a virtual environment
* allows running simulations from the command line (like cli.py in penn_chime)
* is importable, so you can also run simulations via a function call
* includes a few additional command line (or passable) arguments, including:
  - the standard CHIME input config filename (required)
  - a scenario name (prepended to output filenames)
  - output path
* after a simulation scenario is run, a results dictionary is created that contains:
  - the scenario name
  - the standard admits, census, and sim_sir_w_date dataframes
  - the dispositions dataframe
  - a dictionary containing the input parameters
  - a dictionary containing important intermediate variable values such as beta, doubling_time, ...
* writes out the results:
  - dataframes to csv
  - dictionaries to json
* (WIP) runs multiple scenarios corresponding to user-specified ranges for one or more input variables.

In [27]:
import pandas as pd
import numpy as np
import seaborn as sns; sns.set()
import matplotlib.pyplot as plt

In [28]:
%matplotlib inline

## Example 1 - use runner CLI from command line
After you've pip installed `sim_chime_scenario_runner`, you can run it from the command line, as shown in the cell below. First, here's what the `semi_0408_fddt_inf14_1mit.cfg` file looks like. It's a standard `penn_chime` config (aka parameters) file.

        --population 5026226
        --market-share 0.30
        --current-hospitalized 1096
        --doubling-time 3.290295710120783
        --mitigation-date 2020-03-21
        --current-date 2020-04-08
        --relative-contact-rate 0.54
        --hospitalized-rate 0.025
        --icu-rate 0.0075
        --ventilated-rate 0.005
        --infectious-days 14
        --hospitalized-day 7
        --icu-days 10
        --ventilated-day 10
        --n-days 120
        --recovered 0


In [29]:
# scenario = 'test_from_command_line'
!sim_chime_scenario_runner semi_0408_fddt_inf14_1mit.cfg --scenario test_runner_cli --output-path ./output/

2020-04-29 08:47:46,352 - penn_chime.model.parameters - INFO - Using file: semi_0408_fddt_inf14_1mit.cfg
2020-04-29 08:47:46,354 - penn_chime.model.sir - INFO - Using doubling_time: 3.290295710120783
2020-04-29 08:47:46,374 - penn_chime.model.sir - INFO - Estimated date_first_hospitalized: 2020-02-21; current_date: 2020-04-08; i_day: 47
2020-04-29 08:47:46,376 - penn_chime.model.sir - INFO - len(np.arange(-i_day, n_days+1)): 168
2020-04-29 08:47:46,376 - penn_chime.model.sir - INFO - len(raw_df): 168
Scenario: test_runner_cli


Input parameters
--------------------------------------------------
{
    "current_date": "2020-04-08",
    "current_hospitalized": 1096,
    "date_first_hospitalized": null,
    "doubling_time": 3.290295710120783,
    "hospitalized": [
        7,
        0.025
    ],
    "icu": [
        10,
        0.0075
    ],
    "infectious_days": 14,
    "market_share": 0.3,
    "max_y_axis": null,
    "mitigation_date": "2020-03-21",
    "n_days": 120,
    "population": 

Now let's run the CHIME CLI and make sure we get the same outputs. We should, because I'm just calling CHIME functions.

In [30]:
!penn_chime --parameters semi_0408_fddt_inf14_1mit.cfg

2020-04-29 08:47:46,855 - penn_chime.model.parameters - INFO - Using file: semi_0408_fddt_inf14_1mit.cfg
2020-04-29 08:47:46,856 - penn_chime.model.sir - INFO - Using doubling_time: 3.290295710120783
2020-04-29 08:47:46,876 - penn_chime.model.sir - INFO - Estimated date_first_hospitalized: 2020-02-21; current_date: 2020-04-08; i_day: 47
2020-04-29 08:47:46,879 - penn_chime.model.sir - INFO - len(np.arange(-i_day, n_days+1)): 168
2020-04-29 08:47:46,879 - penn_chime.model.sir - INFO - len(raw_df): 168


In [31]:
!diff ./output/test_runner_cli_admits.csv 2020-04-08_projected_admits.csv
!diff ./output/test_runner_cli_census.csv 2020-04-08_projected_census.csv
!diff ./output/test_runner_cli_sim_sir_w_date.csv 2020-04-08_sim_sir_w_date.csv

The `diff` commands print nothing, which confirms that the runner's output files are identical to the ones produced by the CHIME CLI.
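If you'd rather do the same check from Python (for example on a system without `diff`), the standard library's `filecmp` module can compare file contents. This is a small sketch on throwaway files; `files_identical` is a hypothetical helper name, and you would pass the real csv paths instead.

```python
import filecmp
import tempfile
from pathlib import Path


def files_identical(path_a, path_b):
    """Return True when the two files have identical contents."""
    # shallow=False forces a content comparison instead of trusting os.stat()
    return filecmp.cmp(path_a, path_b, shallow=False)


# Demo on temporary files standing in for the real output csv files
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "a.csv"
    b = Path(tmp) / "b.csv"
    c = Path(tmp) / "c.csv"
    a.write_text("day,admits\n0,1.0\n")
    b.write_text("day,admits\n0,1.0\n")
    c.write_text("day,admits\n0,2.0\n")
    same = files_identical(a, b)
    different = files_identical(a, c)

print(same, different)
```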

## Example 2 - run from function call
The basic steps are:

* import the `sim_chime_scenario_runner` module
* specify scenario name (if you don't, default is current datetime)
* create a `penn_chime.Parameters` object from the input config file using `create_params_from_file`
* call `sim_chime` to run the simulation and return results dictionary
* do whatever you want with the results
  - csv and json outputs are only written automatically for command line use, as in penn_chime cli.py
  - the `write_results` function writes out all dataframes (to csv) and dicts (to json)
  - or selectively do whatever you want with components of the results dictionary

In [32]:
import sim_chime_scenario_runner as runner

In [33]:
scenario = 'test_from_import'
p = runner.create_params_from_file("semi_0408_fddt_inf14_1mit.cfg")

2020-04-29 08:47:47,497 - penn_chime.model.parameters - INFO - Using file: semi_0408_fddt_inf14_1mit.cfg


Let's look at the parameter values.

In [34]:
vars(p)

{'current_date': datetime.date(2020, 4, 8),
 'current_hospitalized': 1096,
 'date_first_hospitalized': None,
 'doubling_time': 3.290295710120783,
 'hospitalized': Disposition(days=7, rate=0.025),
 'icu': Disposition(days=10, rate=0.0075),
 'infectious_days': 14,
 'market_share': 0.3,
 'max_y_axis': None,
 'mitigation_date': datetime.date(2020, 3, 21),
 'n_days': 120,
 'population': 5026226,
 'region': None,
 'relative_contact_rate': 0.54,
 'recovered': 0,
 'ventilated': Disposition(days=10, rate=0.005),
 'labels': {'admits_hospitalized': 'admits_hospitalized',
  'admits_icu': 'admits_icu',
  'admits_ventilated': 'admits_ventilated',
  'census_hospitalized': 'census_hospitalized',
  'census_icu': 'census_icu',
  'census_ventilated': 'census_ventilated',
  'day': 'day',
  'date': 'date',
  'susceptible': 'susceptible',
  'infected': 'infected',
  'recovered': 'recovered'},
 'dispositions': {'hospitalized': Disposition(days=7, rate=0.025),
  'icu': Disposition(days=10, rate=0.0075),
  've

Run the simulation and capture the results.

In [35]:
model, results = runner.sim_chime(scenario, p)

2020-04-29 08:47:47,525 - penn_chime.model.sir - INFO - Using doubling_time: 3.290295710120783
2020-04-29 08:47:47,556 - penn_chime.model.sir - INFO - Estimated date_first_hospitalized: 2020-02-21; current_date: 2020-04-08; i_day: 47
2020-04-29 08:47:47,561 - penn_chime.model.sir - INFO - len(np.arange(-i_day, n_days+1)): 168
2020-04-29 08:47:47,562 - penn_chime.model.sir - INFO - len(raw_df): 168


Here are the keys in the `results` dictionary.

In [36]:
results.keys()

dict_keys(['result_type', 'scenario', 'input_params_dict', 'important_variables_dict', 'sim_sir_w_date_df', 'dispositions_df', 'admits_df', 'census_df', 'adm_cen_wide_df', 'adm_cen_long_df'])

Let's check out a few of the dataframes to make sure they contain what we think they contain.

In [37]:
results['admits_df'].head()

Unnamed: 0,day,date,admits_hospitalized,admits_icu,admits_ventilated
0,-47,2020-02-21,,,
1,-46,2020-02-22,0.305926,0.091778,0.061185
2,-45,2020-02-23,0.377662,0.113299,0.075532
3,-44,2020-02-24,0.466217,0.139865,0.093243
4,-43,2020-02-25,0.575534,0.17266,0.115107


In [38]:
results['admits_df'][30:45]

Unnamed: 0,day,date,admits_hospitalized,admits_icu,admits_ventilated
30,-17,2020-03-22,61.361992,18.408598,12.272398
31,-16,2020-03-23,65.373651,19.612095,13.07473
32,-15,2020-03-24,69.624887,20.887466,13.924977
33,-14,2020-03-25,74.126852,22.238056,14.82537
34,-13,2020-03-26,78.890764,23.667229,15.778153
35,-12,2020-03-27,83.927837,25.178351,16.785567
36,-11,2020-03-28,89.249191,26.774757,17.849838
37,-10,2020-03-29,94.865748,28.459724,18.97315
38,-9,2020-03-30,100.788122,30.236437,20.157624
39,-8,2020-03-31,107.026486,32.107946,21.405297


In [39]:
results['census_df'].head()

Unnamed: 0,day,date,census_hospitalized,census_icu,census_ventilated
0,-47,2020-02-21,0.0,0.0,0.0
1,-46,2020-02-22,0.305926,0.091778,0.061185
2,-45,2020-02-23,0.683588,0.205076,0.136718
3,-44,2020-02-24,1.149806,0.344942,0.229961
4,-43,2020-02-25,1.72534,0.517602,0.345068
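As a sanity check on the census numbers, here is a rough sketch of how census can be rebuilt from admits. It assumes census on a given day is cumulative admits minus the cumulative admits from `los` days earlier (my reading of how CHIME does it, so treat the formula as an assumption). The admits values are copied from the `admits_df` output above, with the initial `NaN` treated as 0, and `census_from_admits` is a hypothetical helper.

```python
from itertools import accumulate


def census_from_admits(admits, los):
    """Census = cumulative admits minus cumulative admits `los` days earlier."""
    cumulative = list(accumulate(admits))
    census = []
    for day, total in enumerate(cumulative):
        # Patients admitted `los` or more days ago are assumed discharged
        discharged = cumulative[day - los] if day >= los else 0.0
        census.append(total - discharged)
    return census


# First five admits_hospitalized values (NaN on the first day replaced by 0.0)
admits_hospitalized = [0.0, 0.305926, 0.377662, 0.466217, 0.575534]

# Hospital length of stay is 7 days in this scenario
census = census_from_admits(admits_hospitalized, los=7)
print([round(c, 6) for c in census])
```

Within the first 7 days nobody has been discharged yet, so the result is just the running total of admits and should agree with `census_hospitalized` above up to rounding of the displayed values.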


In [40]:
results['census_df'][30:45]

Unnamed: 0,day,date,census_hospitalized,census_icu,census_ventilated
30,-17,2020-03-22,475.057661,165.689871,110.459914
31,-16,2020-03-23,501.875957,179.128048,119.418699
32,-15,2020-03-24,524.004485,192.402339,128.268226
33,-14,2020-03-25,539.649424,205.254915,136.83661
34,-13,2020-03-26,546.57562,217.355538,144.903692
35,-12,2020-03-27,542.01418,228.284982,152.189988
36,-11,2020-03-28,522.555175,237.515165,158.343443
37,-10,2020-03-29,556.05893,244.385519,162.923679
38,-9,2020-03-30,591.473402,248.075172,165.383448
39,-8,2020-03-31,628.875,247.570659,165.047106


In [41]:
results['sim_sir_w_date_df'].head()

Unnamed: 0,day,date,susceptible,infected,recovered
0,-47,2020-02-21,5026093.0,133.333333,0.0
1,-46,2020-02-22,5026052.0,164.599682,9.52381
2,-45,2020-02-23,5026002.0,203.197507,21.28093
3,-44,2020-02-24,5025939.0,250.84571,35.795037
4,-43,2020-02-25,5025863.0,309.666091,53.712588
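To see where the second row comes from, here is a sketch of a single discrete SIR update applied to the first row. The update rule (including the rescaling to keep the total population constant) is my understanding of what CHIME does, so treat it as an assumption; `sir_step` is a hypothetical name. The `beta` value is the one reported in `important_variables_dict` later in this demo, and the initial infected count is assumed to be `1 / market_share / hospitalized_rate`.

```python
def sir_step(s, i, r, beta, gamma, n):
    """Advance the S, I, R compartments by one day."""
    s_next = s - beta * s * i
    i_next = i + beta * s * i - gamma * i
    r_next = r + gamma * i
    # Rescale so the compartments still sum to the population n
    scale = n / (s_next + i_next + r_next)
    return s_next * scale, i_next * scale, r_next * scale


population = 5026226
infected = 1 / 0.30 / 0.025          # about 133.33, matches the first row
susceptible = population - infected
recovered = 0.0
gamma = 1 / 14                       # 1 / infectious_days
beta = 6.086759793291545e-08         # from important_variables_dict

s1, i1, r1 = sir_step(susceptible, infected, recovered, beta, gamma, population)
print(round(s1), round(i1, 6), round(r1, 6))
```

The result should line up with the day -46 row of `sim_sir_w_date_df`.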


In [42]:
results['sim_sir_w_date_df'][30:45]

Unnamed: 0,day,date,susceptible,infected,recovered
30,-17,2020-03-22,4940895.0,63007.448073,22323.9244
31,-16,2020-03-23,4932178.0,67223.402872,26824.456405
32,-15,2020-03-24,4922895.0,71705.049544,31626.128039
33,-14,2020-03-25,4913011.0,76466.840526,36747.917292
34,-13,2020-03-26,4902492.0,81523.691892,42209.834472
35,-12,2020-03-27,4891302.0,86890.949367,48032.955322
36,-11,2020-03-28,4879402.0,92584.345087,54239.451705
37,-10,2020-03-29,4866753.0,98619.94397,60852.619211
38,-9,2020-03-30,4853315.0,105014.078554,67896.900924
39,-8,2020-03-31,4839045.0,111783.271055,75397.906535


In [43]:
results['dispositions_df'].head()

Unnamed: 0,day,date,ever_hospitalized,ever_icu,ever_ventilated
0,-47,2020-02-21,1.0,0.3,0.2
1,-46,2020-02-22,1.305926,0.391778,0.261185
2,-45,2020-02-23,1.683588,0.505076,0.336718
3,-44,2020-02-24,2.149806,0.644942,0.429961
4,-43,2020-02-25,2.72534,0.817602,0.545068
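The `ever_*` columns tie the SIR output to the hospital numbers. A sketch of the relationship, assuming `ever = (infected + recovered) * rate * market_share` and that daily admits are the day-over-day differences (both are my reading of CHIME, not something this demo states). The infected and recovered values are copied from the first two rows of `sim_sir_w_date_df`.

```python
market_share = 0.30
rates = {"hospitalized": 0.025, "icu": 0.0075, "ventilated": 0.005}

# (infected, recovered) for day -47 and day -46
sir_rows = [(133.333333, 0.0), (164.599682, 9.52381)]

# Cumulative number of people who have ever needed each level of care
ever = {
    name: [(i + r) * rate * market_share for i, r in sir_rows]
    for name, rate in rates.items()
}

# New admits on day -46 are the change in the cumulative counts
admits = {name: values[1] - values[0] for name, values in ever.items()}

for name in rates:
    print(name, [round(v, 6) for v in ever[name]], round(admits[name], 6))
```

The `ever` values should match the first two rows of `dispositions_df`, and the differences should match the day -46 row of `admits_df`.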


Here's the intermediate variables dictionary.

In [44]:
results['important_variables_dict']

OrderedDict([('result_type', 'simsir'),
             ('scenario', 'test_from_import'),
             ('intrinsic_growth_rate', 0.23449761617967013),
             ('doubling_time', 3.290295710120783),
             ('gamma', 0.07142857142857142),
             ('beta', 6.086759793291545e-08),
             ('r_naught', 4.282966626515382),
             ('r_t', 1.9701646481970752),
             ('doubling_time_t', 10.345191984919985)])
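These values can be rebuilt by hand from the inputs. The sketch below assumes the usual CHIME relationships (growth rate from doubling time, `gamma = 1 / infectious_days`, `beta` scaled by the initial susceptible count, and an initial infected count of `1 / market_share / hospitalized_rate`). The formulas are my understanding of the model rather than something taken from this demo, so check them against the `penn_chime` source if it matters.

```python
import math

# Inputs from the config file
doubling_time = 3.290295710120783
infectious_days = 14
relative_contact_rate = 0.54
population = 5026226
market_share = 0.30
hospitalized_rate = 0.025

# Initial state (assumed): one hospitalized patient scaled up to the region
infected = 1 / market_share / hospitalized_rate
susceptible = population - infected

intrinsic_growth_rate = 2 ** (1 / doubling_time) - 1
gamma = 1 / infectious_days
beta = (intrinsic_growth_rate + gamma) / susceptible
r_naught = beta / gamma * susceptible

# After mitigation, contacts are reduced by relative_contact_rate
beta_t = beta * (1 - relative_contact_rate)
r_t = r_naught * (1 - relative_contact_rate)
doubling_time_t = 1 / math.log2(beta_t * susceptible - gamma + 1)

for name in ("intrinsic_growth_rate", "gamma", "beta",
             "r_naught", "r_t", "doubling_time_t"):
    print(name, globals()[name])
```

If the assumptions hold, the printed values should agree with the dictionary above to many decimal places.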

Finally, here are the inputs we used. Note that `date_first_hospitalized` is `None`: since we input the doubling time, the first hospitalized date is estimated by `penn_chime.SimSirModel` (see the log output above). You'll also see that the dates are `datetime.date` objects, which can't be serialized to json directly. So, when the dictionary gets written to a json file, the dates are stringified.

In [45]:
results['input_params_dict']

{'current_date': datetime.date(2020, 4, 8),
 'current_hospitalized': 1096,
 'date_first_hospitalized': None,
 'doubling_time': 3.290295710120783,
 'hospitalized': Disposition(days=7, rate=0.025),
 'icu': Disposition(days=10, rate=0.0075),
 'infectious_days': 14,
 'market_share': 0.3,
 'max_y_axis': None,
 'mitigation_date': datetime.date(2020, 3, 21),
 'n_days': 120,
 'population': 5026226,
 'region': None,
 'relative_contact_rate': 0.54,
 'recovered': 0,
 'ventilated': Disposition(days=10, rate=0.005),
 'labels': {'admits_hospitalized': 'admits_hospitalized',
  'admits_icu': 'admits_icu',
  'admits_ventilated': 'admits_ventilated',
  'census_hospitalized': 'census_hospitalized',
  'census_icu': 'census_icu',
  'census_ventilated': 'census_ventilated',
  'day': 'day',
  'date': 'date',
  'susceptible': 'susceptible',
  'infected': 'infected',
  'recovered': 'recovered'},
 'dispositions': {'hospitalized': Disposition(days=7, rate=0.025),
  'icu': Disposition(days=10, rate=0.0075),
  've

Write out all the results. Dataframes go to csv and dictionaries to json.

In [46]:
output_path = './output/' # default is current working directory
print("Writing out all results to {} for scenario --> {}".format(output_path, scenario))
runner.write_results(results, scenario, output_path)

Writing out all results to ./output/ for scenario --> test_from_import
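On the json side, the dates in the parameter dictionary have to be converted to strings before they can be written. One common way to do that is the `default=str` argument of `json.dumps`; this is a sketch of that approach and not necessarily how `write_results` does it. The local `Disposition` namedtuple is a stand-in for the `penn_chime` one; namedtuples come out as json arrays.

```python
import json
from collections import namedtuple
from datetime import date

# Stand-in for penn_chime's Disposition (assumed to be a namedtuple)
Disposition = namedtuple("Disposition", ["days", "rate"])

params = {
    "current_date": date(2020, 4, 8),
    "mitigation_date": date(2020, 3, 21),
    "date_first_hospitalized": None,
    "hospitalized": Disposition(days=7, rate=0.025),
    "n_days": 120,
}

# default=str is called for anything json can't handle natively (the dates)
text = json.dumps(params, default=str, indent=4, sort_keys=True)
print(text)

# Reading it back gives plain strings, lists and None
loaded = json.loads(text)
```

Note that the round trip is lossy: dates come back as strings and the namedtuple comes back as a plain list.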


## Example 3 - run several scenarios for range of input values
I'm still working on this, but see the function `sim_chimes()` (plural) for the basic idea. I loop over an array of values for the social distancing parameter, run `sim_chime()` (singular) for each, and gather outputs in a big list of results dictionaries.
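The looping pattern looks roughly like the sketch below. To keep it self-contained, `run_scenario` is a hypothetical stand-in for `sim_chime()` that just computes an effective reproduction number from an assumed `r_naught`; the real function runs the full model and returns the much larger results dictionary shown earlier. The names `run_scenarios` and `rcr_sweep` are made up for the example.

```python
def run_scenario(scenario, relative_contact_rate, r_naught=4.28):
    """Hypothetical stand-in for sim_chime(); returns a small results dict."""
    return {
        "scenario": scenario,
        "relative_contact_rate": relative_contact_rate,
        "r_t": r_naught * (1 - relative_contact_rate),
    }


def run_scenarios(base_name, contact_rates):
    """Run one scenario per contact rate and gather the results in a list."""
    results_list = []
    for idx, rate in enumerate(contact_rates):
        scenario = f"{base_name}_{idx}"  # unique name per scenario
        results_list.append(run_scenario(scenario, rate))
    return results_list


all_results = run_scenarios("rcr_sweep", [0.2, 0.4, 0.6])
for res in all_results:
    print(res["scenario"], res["relative_contact_rate"], round(res["r_t"], 3))
```

Keeping each scenario's output in its own dictionary makes it easy to write them out one at a time or to stack selected dataframes later.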

<font size="2">This material is made available under the [MIT License](https://opensource.org/licenses/MIT).</font>

## Additional features added since this demo was created

I've since added a few more modeling features, such as:

* the ability to use a file of mitigation dates and associated relative contact rates,
* similarly, the ability to use dynamic market share values,
* the ability to include actual census and admit data for easy comparison to projections.

See the other demos for details.