<h1 style="text-align: center;">Modelling Disease Dynamics with <span style="font-family:Courier; color: blue">vivarium</span></h1>

<p style="text-align: center;">James Collins</p>
<p style="text-align: center;">May 9, 2019</p>

Thanks Christine for ...

Hi everyone. 

I'm ... 

I'm the lead engineer on ...

Today we'll be talking about ...

This is an open demo, so feel free to ask questions.

# Agenda

Let's lay out the agenda

**Slide**

We'll start with a quick overview of what the vivarium framework is.

I won't go into any detail. In the next few months I'll give some longer talks on the framework architecture and the underlying methodology. Today, we want to get to a quick demo of some of the things we've been doing.

**Slide**

We sit on top of an incredibly rich data source for public health modeling and we've done a ton of work to integrate it into our simulations. I'll give a brief overview of some of the public health modeling tools we've built using vivarium.

**Slide**

Finally, we'll build up an individual-based disease model.  We'll start with a simple birth-death model.  Then throw in a disease and a risk and see how that changes things.  Then I'll show you some example outputs from full scale models.

1. What is <span style="font-family:Courier; color: blue">vivarium</span>?

2. How do you use <span style="font-family:Courier; color: blue">vivarium</span> to model public health dynamics?

3. A "relatively" simple disease model example.

<h1>What is <span style="font-family:Courier; color: blue">vivarium</span>?</h1>

<h2><span style="font-family:Courier; color: blue">vivarium</span> is a discrete-time Monte Carlo simulation framework</h2>

First, what is vivarium?

**Slide**

That's a lot of things.  Let's look at it in pieces.

<h2><span style="font-family:Courier; color: blue">vivarium</span> is a <span style="color: red">discrete-time</span> Monte Carlo simulation framework</h2>

First, it's a discrete-time framework. This means time in the simulation proceeds in discrete chunks (e.g. a second, a day, a month).

**Slide**

Think of X as a snapshot of the world and our model as some (possibly time-dependent) function.  To progress to the next snapshot we take a step forward in time by evaluating our function on the current state of the world.

**Slide**

This is the fundamental nature of many kinds of simulation models. What's important is that we decide what delta t is before we evaluate our model and move to the next step. We'll ask questions like: does a person get sick in the next month? Do they die in the next month? And so on.

$$
X_{t+\Delta t} = f(X_t, t, \Delta t)
$$

<img src="model_loop.png" style="display: block; margin-left: auto; margin-right: auto; width: 40%">
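The update rule can be sketched in a few lines of plain Python. This is only an illustration of the idea, not vivarium code: the `World` snapshot, the `step` function, and the 30-day step are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    """A snapshot X_t: the current time plus everyone's age in years (hypothetical)."""
    time_days: float
    ages: tuple


def step(world, dt_days):
    """The model f: build the next snapshot from the current one and a fixed dt."""
    dt_years = dt_days / 365.25
    return World(
        time_days=world.time_days + dt_days,
        ages=tuple(age + dt_years for age in world.ages),
    )


# Delta t is chosen up front, before any step is evaluated.
DT_DAYS = 30.0

world = World(time_days=0.0, ages=(0.5, 34.0, 71.2))
for _ in range(12):  # twelve steps of 30 days each
    world = step(world, DT_DAYS)

print(world.time_days)  # 360.0
```

Each pass through the loop only ever looks at the current snapshot, which is what makes the step size a modeling decision rather than something the model discovers.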

<h2><span style="font-family:Courier; color: blue">vivarium</span> is a discrete-time <span style="color: red">Monte Carlo simulation</span> framework</h2>

How many of you are familiar with Monte Carlo methods?  Someone want to take a stab at defining them?

Using random numbers to inform stuff we don't know for sure about.

**Slide**

I had a professor in a science class tell me that any measurement without an estimate of uncertainty is totally useless.  We spend a ton of time trying to inform policy makers to help them make good decisions about what to do.  That gives us a tremendous amount of power, so accurately representing how certain or uncertain we are is an ethical responsibility.

**Slide**

In the context of our simulations, we use Monte Carlo methods to capture all the exogenous randomness in the model.  E.g. I may know your age/sex/bmi/smoking history, but I don't necessarily know how frequently you eat hamburgers or whether you go to political protests or whether there's mold in your workplace or eight million other things.  We capture all that exogenous heterogeneity with random numbers.


Note: We also deal with parameter uncertainty, and I'll talk about that a bit at the end if there's time.

- We care about uncertainty.

- We care about <span style="color: green">stochastic</span> uncertainty.
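As a rough sketch of what a stochastic draw looks like for a question such as "does a person get sick in the next month?", the snippet below compares one uniform random number per person against an event probability. The rate, the population size, and the rate-to-probability conversion `1 - exp(-rate * dt)` are assumptions chosen for illustration; they are not taken from vivarium or from GBD.

```python
import math
import random


def event_probability(annual_rate, dt_years):
    """Probability of at least one event in dt_years, assuming a constant rate."""
    return 1.0 - math.exp(-annual_rate * dt_years)


def who_gets_sick(n_people, annual_rate, dt_years, rng):
    """Draw one uniform number per person; the event happens when the draw is below p."""
    p = event_probability(annual_rate, dt_years)
    return [rng.random() < p for _ in range(n_people)]


rng = random.Random(12345)  # seeded so the run is reproducible
sick = who_gets_sick(n_people=100_000, annual_rate=0.02, dt_years=1 / 12, rng=rng)

expected = event_probability(0.02, 1 / 12)
observed = sum(sick) / len(sick)
print(f"expected {expected:.5f}, observed {observed:.5f}")
```

With a different seed the observed fraction moves a little, and that run-to-run spread is the stochastic uncertainty the slide refers to.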

<h2><span style="font-family:Courier; color: blue">vivarium</span> is a discrete-time Monte Carlo simulation <span style="color: red">framework</span></h2>

Frameworks are funny things in software.

**Slide**

Web and GUI frameworks allow developers to build applications for end users.

**Slide**

vivarium provides a bunch of features that are generally useful in simulation modeling, like state management, data interpolation, a clock, etc., as well as an environment in which to run those simulations.

If that seems abstract, it's because it is.  Let's get concrete.

- Frameworks are tools for building tools

- <span style="font-family:Courier; color: blue">vivarium</span> is a tool for building tools for building simulation models.
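To make "tools for building tools" a little more concrete, here is a toy sketch of the division of labor. It is not vivarium's API: `Engine`, `Aging`, `setup`, and `on_time_step` are hypothetical names invented for this example. The engine owns the clock and the state and knows how to run things; the component holds the model-specific logic.

```python
class Engine:
    """Hypothetical toy framework: owns the clock and the state, runs components."""

    def __init__(self, components, step_days):
        self.components = components
        self.step_days = step_days
        self.clock_days = 0
        self.state = {}

    def setup(self):
        # Give every component a chance to initialize its part of the state.
        for component in self.components:
            component.setup(self.state)

    def run(self, n_steps):
        # On each step, let every component update the state, then advance the clock.
        for _ in range(n_steps):
            for component in self.components:
                component.on_time_step(self.state, self.step_days)
            self.clock_days += self.step_days


class Aging:
    """A model-specific piece, written by a user of the toy framework."""

    def setup(self, state):
        state["ages"] = [0.0, 30.0, 60.0]

    def on_time_step(self, state, step_days):
        state["ages"] = [age + step_days / 365.0 for age in state["ages"]]


engine = Engine(components=[Aging()], step_days=73)
engine.setup()
engine.run(n_steps=5)  # 5 steps of 73 days = 365 days
print(engine.clock_days)  # 365
```

Swapping in a different list of components gives a different model without touching the engine, which is the sense in which a framework is a tool for building modeling tools.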

<h1>How do you use <span style="font-family:Courier; color: blue">vivarium</span> to model public health dynamics?</h1>

Well, according to what we just learned, we use vivarium to build public health modeling tools.

<h2>Use <span style="font-family:Courier; color: blue">vivarium_public_health</span>!</h2>

`vivarium_public_health` (vph) is a suite of public-health-specific modeling components built to think about public health phenomena in the same way that the Global Burden of Disease (GBD) study does. It includes:

- Components that capture demographic aspects of a modeled population (the starting population characteristics, birth, death, disability).

- Components that capture how diseases affect the population and how those diseases affect mortality and disability.

- Components that model risk exposure and how that risk exposure contributes to disease incidence and mortality.

- Components that measure and report what's going on in the simulated population.

<h1 style="text-align: center;">Let's build a model!</h1>

In [None]:
!cat birth_death.yaml

Here's an example model specification.  

Key pieces: 

components
- population
- mortality
- fertility

configuration
- input data
- time span and step size
- population characteristics

Data that informs this model:

- GBD population estimates
- GBD live births by sex -> Crude birth rate
- GBD all cause mortality rate
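One plausible reading of "live births by sex -> crude birth rate" is total live births in a year divided by total population, which can then be scaled to the length of a time step. The sketch below works through that arithmetic. The numbers are placeholders for illustration, not GBD estimates, and the function names are hypothetical rather than part of `vivarium_public_health`.

```python
def crude_birth_rate(live_births_by_sex, population):
    """Births per person-year: total live births in a year over total population."""
    return sum(live_births_by_sex.values()) / population


def expected_births(rate, population, step_days):
    """Expected number of births during one time step of step_days."""
    return rate * population * step_days / 365.0


# Placeholder numbers for illustration only; these are not GBD estimates.
live_births = {"Male": 510, "Female": 490}
population = 50_000

rate = crude_birth_rate(live_births, population)
print(rate)  # 0.02
print(expected_births(rate, population, step_days=73))
```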


In [None]:
%matplotlib inline
from vivarium.interface import setup_simulation_from_model_specification
import matplotlib.pyplot as plt
import pandas as pd

sim = setup_simulation_from_model_specification('birth_death.yaml')

Let's do a little exploration.

## Examining the population

In [None]:
pop = sim.get_population()
pop.head()

In [None]:
def plot_population(population):
    """Plot a histogram of ages for the given population table."""
    plt.figure(figsize=(15, 8))

    population.age.hist(bins=50)
    plt.xlabel('age', fontsize=16)
    plt.ylabel('count', fontsize=16)
    plt.show()

plot_population(pop)

In [None]:
pop.sex.value_counts()

In [None]:
pop.alive.value_counts()

# Running the simulation

In [None]:
sim.run()

# Examining the final population

In [None]:
pop_final = sim.get_population()
plot_population(pop_final)

In [None]:
plot_population(pop_final[pop_final.alive == 'dead'])

For experts in the room, anything surprising about this death pattern?

# Let's add a risk and a disease

In [None]:
!cat disease_model.yaml

This is a closed cohort, and the starting age is moved up since this is an adult disease.

In [None]:
sim = setup_simulation_from_model_specification("disease_model.yaml")


In [None]:
pop = sim.get_population()

pop.head()

In [None]:
plot_population(pop)

In [None]:
def hist_bmi(ctx):
    """Plot a histogram of BMI exposure among people who are still alive."""
    population = ctx.get_population()
    population = population.loc[population.alive == 'alive']
    # Look up the exposure value by name; calling it on an index returns per-person values.
    bmi = ctx.get_value('high_body_mass_index_in_adults.exposure')

    plt.figure(figsize=(15, 8))

    bmi(population.index).hist(bins=50)
    plt.xlabel('bmi', fontsize=16)
    plt.ylabel('count', fontsize=16)
    plt.show()

hist_bmi(sim)

In [None]:
def plot_ihd_vs_bmi(ctx):
    """Scatter each living person's IHD incidence rate against their BMI."""
    population = ctx.get_population()
    population = population.loc[population.alive == 'alive']
    bmi = ctx.get_value('high_body_mass_index_in_adults.exposure')
    ihd = ctx.get_value('ischemic_heart_disease.incidence_rate')

    plt.figure(figsize=(15, 8))
    # The 365/28 factor presumably converts a per-28-day-step rate to an annual rate.
    plt.scatter(bmi(population.index), ihd(population.index) * 365/28)
    plt.xlabel('BMI', fontsize=20)
    plt.ylabel('IHD incidence rate', fontsize=20)
    plt.xlim(10, 80)
    plt.ylim(0, 0.04)

    plt.show()

plot_ihd_vs_bmi(sim)

In [None]:
sim.run()

In [None]:
final_pop = sim.get_population()
plot_population(final_pop)

In [None]:
hist_bmi(sim)

In [None]:
plot_ihd_vs_bmi(sim)

# What do results actually look like?

<img src="ischemic_stroke.png">

<img src="individual.png">

<img src="maternal_intervention.png">

<h1 style="text-align: center;">Thanks!</h1>

<h2>Vivarium Engineering Team</h2>

Kate Wilson

Cody Horst

<h2>Simulation Science Research Team</h2>

Abie Flaxman

Christine Allen

Nathaniel Blair-Stahn

Kelly Compton

Yongquan Xie

Yaqi Wang

