![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

## To use this slideshow:
- Run All, using the menu item: Kernel/Restart & Run All
- Return to this top cell
- Click on the "Slideshow" menu item above, which looks like this:
![](images/SlideIcon.png)

## Mathematical Modeling

### August 4, 2020 with Laura G Funderburk, She/her

### Mentors (by last name)

- Lisa Cao, She/her
- Annie Li, She/her
- Sophie MacDonald, She/her
- Alyaa Mohamed
- Rajan Patel, He/him
- Alex Tennant, He/him 

## Session II

In this session, we’ll implement the “<b>S</b>usceptible, <b>E</b>xposed, <b>I</b>nfected and <b>R</b>ecovered” (<b>SEIR</b>) model used in epidemiology, the study of how disease occurs in populations. 


## Recap: What is a Mathematical Model

A mathematical model is a description of a system using <b>mathematical concepts</b> and <b>mathematical language</b>.

You can think of a math model as a tool to help us describe what we believe about the workings of phenomena in the world. 

<b>We use the language of mathematics to express our beliefs.</b>

<b>We use mathematics (theoretical and numerical analysis) to evaluate the model, and get insights about the original phenomenon.</b>

### Building Models: Our Road Map for The Course

|Topic | Session |
|-|-|
|<font color=#000000><b>Choose what phenomenon you want to model</b></font>|<font color=#000000><b>1</b></font>|
|<font color=#000000><b>What assumptions are you making about the phenomenon</b></font>|<font color=#000000><b>1</b></font>|
|<font color=#000000><b>Use a flow diagram to help you determine the structure of your model</b></font>|<font color=#000000><b>1</b></font>|
|<font color=#1f78b4><b>Choose equations</b></font>|<font color=#1f78b4><b>2</b></font>|
|<font color=#1f78b4><b>Implement equations using Python</b></font>|<font color=#1f78b4><b>2</b></font>|
|<font color=#1f78b4><b>Solve equations</b></font>|<font color=#1f78b4><b>2</b></font>|
|<font color=#000000><b>Study the behaviour of the model</b></font>|<font color=#000000><b>3</b></font>|
|<font color=#000000><b>Test the model</b></font>|<font color=#000000><b>3</b></font>|
|<font color=#000000><b>Use the model</b></font>|<font color=#000000><b>3</b></font>|


## Recap: Our assumptions

1. The disease is transmitted from person to person through contact ("contact transmission") between a susceptible person and an infectious person. 
    
2. Once a person comes into contact with the pathogen, there is a period of time (called the latency period) in which they are infected, but cannot infect others (yet!). 

3. The population is not constant (that is, people are born and die as time goes by).

 

## Recap: Our assumptions


4. A person in the population is either one of:
    - <b>S</b>usceptible, i.e. not infected and not yet exposed, 
    - <b>E</b>xposed to the infection, i.e. exposed to the virus, but not yet infectious, 
    - <b>I</b>nfectious, and 
    - <b>R</b>ecovered from the infection. 
    
5. People can die by "natural causes" during any of the stages. We assume an additional cause of death associated with the infectious stage.  

## Your task:

Mentors, undergraduates and high school students will be split into groups. 

High school students, with the help of mentors, will explain the flow diagram that we worked on this morning to the undergraduate students, and how it relates to our assumptions. 

This is the first of three exercises in which you will collaborate to design and implement a model for the COVID-19 outbreak. 

## Recap: Flow diagram

How does a person move from one stage into another? In other words, how does a person go from susceptible to exposed, to infected, to recovered? 

$\Delta$: Per-capita birth rate.

$\mu$: Per-capita natural death rate.

$\alpha$: Virus-induced average fatality rate.

$\beta$: Probability of disease transmission per contact (dimensionless) times the number of contacts per unit time.

$\epsilon$: Rate of progression from exposed to infectious (the reciprocal is the incubation period).

$\gamma$: Recovery rate of infectious individuals (the reciprocal is the infectious period).
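
The reciprocal relationship between a rate and a period can be sketched in a few lines of Python. The helper name `rate_from_period` is an illustrative assumption, and the example periods (3 and 8 days) are simply the reciprocals of values of $\epsilon$ and $\gamma$ tried later in this notebook, not measured quantities.

```python
def rate_from_period(period_days):
    """Convert an average duration (in days) into a per-day rate."""
    if period_days <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_days

# An incubation period of 3 days corresponds to epsilon = 1/3 per day
epsilon = rate_from_period(3)
# An infectious period of 8 days corresponds to gamma = 1/8 per day
gamma = rate_from_period(8)
print("epsilon =", epsilon, "gamma =", gamma)
```

A shorter period gives a larger rate: people who recover faster leave the infectious stage at a higher rate.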



## Recap: Flow diagram


$$\stackrel{\Delta N} {\longrightarrow} \text{S} \stackrel{\beta\frac{S}{N} I}{\longrightarrow} \text{E} \stackrel{\epsilon}{\longrightarrow} \text{I}  \stackrel{\gamma}{\longrightarrow} \text{R}$$
$$\hspace{1.1cm} \downarrow \mu \hspace{0.6cm} \downarrow \mu  \hspace{0.5cm} \downarrow \mu, \alpha  \hspace{0.1cm} \downarrow \mu $$

In [None]:
# Break into groups 
# Ask high school students to share with undergraduates what they did, i.e. flow diagram 
# Emphasis on Collaboration and team work 

## Exercise: Going from the diagram to a system of equations

Let's take a look at how we express "moving" from one stage to another. 

We note that in an outbreak, the number of people who become infected <b>changes over time</b>. If we take time $t$ as the independent variable, we can express the number of susceptible people (and likewise the number of exposed, infectious and recovered people) as a function of time. 

Given we are interested in tracking the <b> rate of change </b> from one stage to another, we will use a mathematical concept from Calculus called "differential equations". 

## Differential equations 

A differential equation is an equation with a function and one or more of its derivatives. For example, if $Y(t)$ is a function of time $t$, then its rate of change can be expressed as $\frac{dY}{dt}$. 

One example of differential equation involving $Y(t)$ is 

$$\frac{dY}{dt} = Y(t) + 5.$$

In our world things change over time, and describing how things change can be expressed using differential equations. 
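
To get a feel for what "rate of change" means, here is a minimal sketch that follows the example equation $\frac{dY}{dt} = Y(t) + 5$ forward in small time steps (this is known as Euler's method). The starting value $Y(0) = 1$, the step count and the function names are assumptions chosen for illustration. The exact solution for this starting value, $Y(t) = 6e^{t} - 5$, is printed for comparison.

```python
import math

def rate(y):
    # Right-hand side of the example equation dY/dt = Y + 5
    return y + 5

def euler(rate, y0, t_end, steps):
    """Step forward in time: new value = old value + rate * small time step."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y = y + rate(y) * dt
    return y

approx = euler(rate, y0=1.0, t_end=1.0, steps=10000)
exact = (1.0 + 5) * math.exp(1.0) - 5
print("Euler approximation of Y(1):", approx)
print("Exact value of Y(1):        ", exact)
```

Using more steps brings the approximation closer to the exact value, at the cost of more computation.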

## Exercise: Going from the diagram to a system of equations

We can express the rate of change in the number of Susceptible with the mathematical symbol $\frac{dS}{dt}$. This notation indicates the rate of change in the value of the function $S(t)$ with respect to the independent variable $t$. 

We can then use our diagram to help us define an equation. 

1. Given that $\Delta N$ is the number of people <b>moving into</b> the $S$ box per unit time, it enters the equation with a <b>positive sign</b>. 

2. Given that $\beta \frac{S}{N}I$ is the number of people <b>moving away from</b> the $S$ box as they become exposed, it enters with a <b>negative sign</b>. 

3. $\mu S$ denotes the number of susceptible people who die of natural causes. Since $\mu S$ is <b>moving away from</b> the $S$ box, it also enters with a <b>negative sign</b>. 

## Exercise: Going from the diagram to a system of equations

We can then express the rate of change in the number of Susceptible as


$$\frac{dS}{dt} = \Delta N - \beta \frac{S}{N}I - \mu S$$


## Your task:

Mentors, undergraduates and high school students will work together to use differential equations to generate the rest of the equations for Exposed, Infectious and Recovered individuals.

Your task is to discuss and agree on the equations for 

$$\frac{dE}{dt} = \text{?}, \frac{dI}{dt}= \text{?}, \frac{dR}{dt} = \text{?}$$


## Our system of equations

$N$ is updated at each time step, and infectious people die at a higher rate. 

$$ N = S + E + I + R$$

We can then express our model using differential equations

$$\frac{dS}{dt} = \Delta N - \beta \frac{S}{N}I - \mu S$$

$$\frac{dE}{dt} = \beta \frac{S}{N}I - (\mu + \epsilon )E$$

$$\frac{dI}{dt} = \epsilon E - (\gamma+ \mu + \alpha )I$$

$$\frac{dR}{dt} = \gamma I - \mu R$$


## Our system of equations


Also, we can keep track of people who die due to the infection. 

$$\frac{dD}{dt} = \alpha I $$

## Initial conditions

If $N(t)$ denotes the total population and if $S(t), E(t), I(t), R(t)$ denote the number of susceptible, exposed, infectious and recovered, then at a given time $t$,

$$N(t) = S(t) + E(t) + I(t) + R(t).$$

In particular, if for $t = 0$ (also known as "day zero") we set  

$$S(0) = S_0, E(0) = E_0, I(0) = I_0, R(0) = R_0, $$

then the population at day 0 is:

$$N(0) = S_0 + E_0 + I_0 + R_0.$$

$S_0, E_0, I_0, R_0$ are known as "initial conditions" - we will need them to solve our system.

## Implementing our set of equations using Python

In this next exercise, high school and undergraduate students will be split into groups again, and will work with mentors to implement the set of equations using Python code. 

### Exercise: Going from the system of equations to implementing in Python

Recall that we can express the rate of change for Susceptible as

$$\frac{dS}{dt} = \Delta N - \beta \frac{S}{N}I - \mu S$$


In [None]:
# What does this look like in Python?
dS = Delta*N - beta * (S/N)*I - mu*S

## Your task:

Mentors, undergraduates and high school students will work together to express the system of equations using Python.

In [None]:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
from ipywidgets import interact, interact_manual, widgets, Layout, VBox, HBox, Button
from IPython.display import display, Javascript, Markdown, HTML, clear_output
import pandas as pd
import plotly.express as px 
import plotly.graph_objects as go

# A grid of time points (in days)
t = np.linspace(0, 750, 750)


# The SEIR model differential equations.
def deriv(y, t, Delta, beta, mu, epsilon,gamma,alpha):
    S, E, I, R, D = y
    N = S + E + I + R
    dS = Delta*N  - beta*S*I/N - mu*S
    dE = beta*S*I/N - (mu + epsilon)*E
    dI = epsilon*E - (gamma + mu + alpha)*I
    dR = gamma*I - mu*R
    dD = alpha*I 
    
    return [dS,dE, dI, dR, dD]


def plot_infections(Delta, beta, mu, epsilon,gamma,alpha):
    
    # Initial numbers of susceptible, exposed, infectious, recovered and deceased individuals.
    S0, E0,I0, R0 ,D0 = 37000000,0,100,0,0
    # Total population, N.
    N = S0 + E0 + I0 + R0
    # Initial conditions vector
    y0 = S0,E0, I0, R0, D0
    # Integrate the SEIR equations over the time grid, t.
    ret = odeint(deriv, y0, t, args=(Delta, beta, mu, epsilon,gamma,alpha))
    S, E,I, R, D = ret.T

    seir_simulation = pd.DataFrame({"Susceptible":S,"Exposed":E,"Infected":I,"Recovered":R,"Deaths":D, "Time (days)":t})

    layout = dict( xaxis=dict(title='Time (days)', linecolor='#d9d9d9', mirror=True),
              yaxis=dict(title='Number of people', linecolor='#d9d9d9', mirror=True))
    
    fig = go.Figure(layout=layout)
    
#    fig.add_trace(go.Scatter(x=seir_simulation["Time (days)"], y=seir_simulation["Susceptible"],
#                        mode='lines',
#                        name='Susceptible'))
    
    fig.add_trace(go.Scatter(x=seir_simulation["Time (days)"], y=seir_simulation["Exposed"],
                        mode='lines',
                        name='Exposed'))
    
    fig.add_trace(go.Scatter(x=seir_simulation["Time (days)"], y=seir_simulation["Infected"],
                    mode='lines',
                    name='Infected'))
    
    fig.add_trace(go.Scatter(x=seir_simulation["Time (days)"], y=seir_simulation["Recovered"],
                        mode='lines', name='Recovered'))

    fig.add_trace(go.Scatter(x=seir_simulation["Time (days)"], y=seir_simulation["Deaths"],
                        mode='lines', name='Deaths'))

    fig.update_layout(title_text="Projected Susceptible, Exposed, Infectious, Recovered, Deaths")

    fig.show();


In [None]:
# Our code
# A grid of time points (in days)
t = np.linspace(0, 750, 750)

# The SEIR model differential equations.
def deriv(y, t, Delta, beta, mu, epsilon,gamma,alpha):
    S, E, I, R, D = y
    N = S + E + I + R
    dS = Delta*N  - beta*S*I/N - mu*S
    dE = beta*S*I/N - (mu + epsilon)*E
    dI = epsilon*E - (gamma + mu + alpha)*I
    dR = gamma*I - mu*R
    dD = alpha*I 
    
    return [dS,dE, dI, dR, dD]

## We have a model.... how do we solve it? 

We can use Python to help us determine how the values of $S(t),E(t),I(t),R(t),D(t)$ change over time $t$ using a function `odeint` found in the SciPy library. 

Python libraries contain code to help us solve problems, without needing us to re-create code each time. 

`odeint` is a numerical solver for systems of ordinary differential equations, and we will use it to solve our system. 

We will provide initial conditions, as well as values for the parameters $\Delta, \mu,\alpha, \beta, \epsilon, \gamma$.
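
Before applying `odeint` to the full model, it may help to see it on a single equation. The sketch below solves the earlier example $\frac{dY}{dt} = Y + 5$ and compares the result with the exact solution. The starting value $Y(0) = 1$, the time grid and the names `dY_dt` and `t_toy` are assumptions made for this illustration.

```python
import numpy as np
from scipy.integrate import odeint

def dY_dt(Y, t):
    # odeint calls this function with the current value Y and the current time t
    return Y + 5

t_toy = np.linspace(0, 2, 21)   # time points at which we want the solution
Y0 = 1.0                        # initial condition Y(0)

solution = odeint(dY_dt, Y0, t_toy)   # array with one row per time point
Y_numeric = solution[:, 0]

# Exact solution for this starting value, for comparison
Y_exact = (Y0 + 5) * np.exp(t_toy) - 5
print("Largest difference:", np.max(np.abs(Y_numeric - Y_exact)))
```

The pattern is the same for our model: a function that returns the rates of change, a set of initial conditions, and a grid of time points.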

## Solving the system of equations using Python

In [None]:
# Import libraries
from scipy.integrate import odeint
import pandas as pd

In [None]:
# Initial number of individuals in each compartment: S0, E0, I0, R0, D0.
S0, E0,I0, R0 ,D0 = 37000000,0,100,0,0
# Total population, N.
N0 = S0 + E0 + I0 + R0
# Initial conditions vector
y0 = S0,E0, I0, R0, D0

In [None]:
# Integrate the SEIR equations over the time grid, t.
# Solving the equation
t = np.linspace(0, 375, 375)
Delta = 0 # natural birth rate. Set to zero for simplicity
mu = 0 # natural death rate. Set to zero for simplicity
alpha = 0.005  # death rate due to disease
beta = 0.9 # an interaction parameter. Rate for susceptible to exposed. 
epsilon = 0.1 # rate from exposed to infectious
gamma = 0.5 # rate from infectious to recovered (We expect this to be bigger than mu)

## Solving the system of equations using Python

In [None]:
# Solving the equation
ret = odeint(deriv, y0, t, args=(Delta, beta, mu, epsilon,gamma,alpha))
S, E,I, R,D = ret.T
# Store data in a table
seir_simulation = pd.DataFrame({"Susceptible":S,
                                "Exposed":E,
                                "Infectious":I,
                                "Recovered":R,
                                "Deaths":D,
                                "Time (days)":t})
seir_simulation

## Solving the system of equations using Python - let's visualize!

In [None]:
# Plot the number of infectious people over time.
px.line(seir_simulation,"Time (days)",'Infectious',title="Number of infectious people")

## When is there a solution? 

First, it is important to note that we are interested in solutions where the rate of change of the population is non-negative, that is, where the population is not shrinking (and, of course, the population itself can never be negative: -1 people in the world makes no sense). Mathematically this can be expressed as

$$\frac{dN}{dt} =\frac{dS}{dt} + \frac{dE}{dt} + \frac{dI}{dt} + \frac{dR}{dt} \geq 0$$

If we substitute each of the equations from our system, the transmission, progression and recovery terms cancel, and this happens when

$$(\Delta - \mu) N - \alpha I \geq 0 \Leftrightarrow (\Delta - \mu)(S + E + I + R) \geq \alpha I $$

In simple terms, the population does not shrink when births make up for natural deaths plus the deaths caused by the disease. 
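
As a hedged numerical check, the sketch below adds up the four SEIR rates at one arbitrary state and compares the sum with $(\Delta - \mu)N - \alpha I$, which is what remains when the four equations of our system are added term by term. The function name `seir_rates`, the state and the parameter values are illustrative assumptions, not values used elsewhere in the session.

```python
def seir_rates(y, t, Delta, beta, mu, epsilon, gamma, alpha):
    # Same equations as the model in this notebook
    S, E, I, R, D = y
    N = S + E + I + R
    dS = Delta*N - beta*S*I/N - mu*S
    dE = beta*S*I/N - (mu + epsilon)*E
    dI = epsilon*E - (gamma + mu + alpha)*I
    dR = gamma*I - mu*R
    dD = alpha*I
    return [dS, dE, dI, dR, dD]

# Hypothetical state (S, E, I, R, D) and parameters, chosen only for the check
state = [1000.0, 50.0, 20.0, 30.0, 0.0]
params = dict(Delta=0.02, beta=0.5, mu=0.01, epsilon=1/3, gamma=1/8, alpha=0.005)

rates = seir_rates(state, 0.0, **params)
total_change = sum(rates[:4])        # dS + dE + dI + dR

N_now = sum(state[:4])
I_now = state[2]
shortcut = (params["Delta"] - params["mu"]) * N_now - params["alpha"] * I_now

print("Sum of the four rates:   ", total_change)
print("(Delta - mu)*N - alpha*I:", shortcut)
```

The two numbers agree up to rounding, because the flows between boxes only move people around and do not change the total.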

## When is there equilibrium? 

Another way to think about rate of change is in terms of slope. One value that is of interest to mathematicians is the value at which a rate changes from positive to negative. 

At equilibrium the slope is horizontal. We can find this value using mathematics by setting a derivative equal to zero. 

We can find the equilibrium for our system by setting

$$\frac{dS}{dt} =\frac{dE}{dt} =\frac{dI}{dt} =\frac{dR}{dt}  = 0 $$

Working through the equations shows that, at equilibrium, $E$ and $R$ can each be written as a multiple of $I$. So if the number of infectious people is zero, then the numbers of exposed and recovered people are zero too. This makes sense: if no one in our population is infected, then no one can catch the virus. 
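
One way to see this without any algebra is to evaluate the rates at a state with no exposed, infectious or recovered people. The sketch below does that with births balancing natural deaths ($\Delta = \mu$). The function name `seir_rates` is an assumption made for this example, and the numbers mirror the ones used in the simulation cells of this notebook.

```python
def seir_rates(y, t, Delta, beta, mu, epsilon, gamma, alpha):
    # Same equations as the model in this notebook
    S, E, I, R, D = y
    N = S + E + I + R
    dS = Delta*N - beta*S*I/N - mu*S
    dE = beta*S*I/N - (mu + epsilon)*E
    dI = epsilon*E - (gamma + mu + alpha)*I
    dR = gamma*I - mu*R
    dD = alpha*I
    return [dS, dE, dI, dR, dD]

# Everyone is susceptible: no exposed, infectious or recovered people
disease_free_state = [37000000.0, 0.0, 0.0, 0.0, 0.0]

rates = seir_rates(disease_free_state, 0.0,
                   Delta=0.1, beta=0.5, mu=0.1,
                   epsilon=1/3, gamma=1/8, alpha=0.005)

for name, value in zip(["dS", "dE", "dI", "dR", "dD"], rates):
    print(name, "=", value)
```

Every rate is zero, so nothing changes over time: this state is an equilibrium of the model.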

In [None]:
# Initial number of individuals in each compartment: one infectious person to start.
S0, E0,I0, R0 ,D0 = 37000000,0,1,0,0
# Total population, N.
N = S0 + E0 + I0 + R0
# Initial conditions vector
y0 = S0,E0, I0, R0, D0
# Integrate the SEIR equations over the time grid, t.
ret = odeint(deriv, y0, t, args=(Delta, beta, mu, epsilon,gamma,alpha))
S, E,I, R,D = ret.T

seir_simulation = pd.DataFrame({"Susceptible":S,"Exposed":E,"Infectious":I,"Recovered":R,"Time (days)":t})
px.line(seir_simulation,"Time (days)",'Infectious',title="Number of infectious people")

In [None]:
# Initial number of individuals in each compartment: no infectious people at all.
S0, E0,I0, R0 ,D0 = 37000000,0,0,0,0
# Total population, N.
N = S0 + E0 + I0 + R0
# Initial conditions vector
y0 = S0,E0, I0, R0, D0
# Integrate the SEIR equations over the time grid, t.

mu = 0.1 # natural death rate
Delta = mu # natural birth rate, set equal to the death rate so births balance natural deaths
alpha = 0.005  # death rate due to disease
beta = 0.5 # an interaction parameter. Rate for susceptible to exposed. 
epsilon = 1/3 # rate from exposed to infectious
gamma = 1/8 # rate from infectious to recovered (We expect this to be bigger than mu)

ret = odeint(deriv, y0, t, args=(Delta, beta, mu, epsilon,gamma,alpha))
S, E,I, R ,D= ret.T

seir_simulation = pd.DataFrame({"Susceptible":S,"Exposed":E,"Infectious":I,"Recovered":R,"Time (days)":t})
px.line(seir_simulation,"Time (days)",'Infectious',title="Number of infectious people")

## When is there equilibrium?

Let's suppose there is at least one infectious person in the population. 

We can do a bit of algebra to compute a very important number called $R_0$, the "basic reproduction number". Epidemiologists use it to estimate the average number of new cases that a single infectious individual produces in a population where everyone else is susceptible. (Do not confuse it with the initial condition $R(0) = R_0$ from earlier.) 

We can do a bit of math to get this number. I will show you the simulation first, then the math behind it. 


$$R_0 = \frac{\beta \epsilon}{ (\epsilon + \mu) (\gamma + \alpha + \mu)}$$



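If $R_0 < 1$, each infectious person infects fewer than one other person on average, and the infection dies out. This is the disease-free case.

If $R_0 > 1$, each infectious person infects more than one other person on average, so the number of cases grows and there is an outbreak. This is called the "endemic" case. The value $R_0 = 1$ is the threshold between the two.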
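
One informal way to read the formula for $R_0$ (a heuristic sketch, not a full derivation) is as a product of three factors. It assumes that almost everyone is susceptible, so that $\frac{S}{N} \approx 1$:

$$R_0 = \underbrace{\beta}_{\text{new exposures per infectious person per unit time}} \times \underbrace{\frac{\epsilon}{\epsilon + \mu}}_{\text{fraction of exposed who become infectious}} \times \underbrace{\frac{1}{\gamma + \alpha + \mu}}_{\text{average time spent infectious}}$$

An exposed person leaves the $E$ box either by becoming infectious (rate $\epsilon$) or by dying of natural causes (rate $\mu$), which gives the middle factor. An infectious person leaves the $I$ box at total rate $\gamma + \alpha + \mu$, so the average time spent infectious is the reciprocal of that sum.

For example, with $\beta = 0.9$, $\epsilon = 0.1$, $\gamma = 0.5$, $\alpha = 0.005$ and $\mu = 0$, we get

$$R_0 = \frac{0.9 \times 0.1}{0.1 \times 0.505} \approx 1.78,$$

while with $\beta = 0.1$, $\epsilon = 0.05$, $\gamma = \frac{1}{8}$, $\alpha = 0.001$ and $\mu = 0$, we get

$$R_0 = \frac{0.1 \times 0.05}{0.05 \times 0.126} \approx 0.79.$$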

In [None]:
# Solving the equation
t = np.linspace(0, 1750, 750)
mu = 0# natural death rate. Set to zero for simplicity
Delta = 0 # natural birth rate. Set to zero for simplicity
alpha = 0.001  # death rate due to disease
beta = 0.1 # an interaction parameter. Rate for susceptible to exposed. 
epsilon = 0.05 # rate from exposed to infectious
gamma = 1/8 # rate from infectious to recovered (We expect this to be bigger than mu)
numerator = beta*epsilon
denominator = (alpha + gamma + mu)*(epsilon + mu)
print("R_0 is equal to", numerator/denominator)  
plot_infections(Delta, beta, mu, epsilon,gamma,alpha)

## Playing with the parameters


In [None]:
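def f(beta,eps,gamma,alpha):
    # Births and natural deaths are switched off here, matching the plot below
    mu = 0
    # Use the slider value eps (not the global epsilon) so R_0 matches the plot
    numerator = beta*eps
    denominator = (alpha + gamma + mu)*(eps + mu)
    print("R_0 is equal to", numerator/denominator)
    plot_infections(0, beta, mu, eps, gamma, alpha)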

In [None]:
interact_manual(f, 
         beta=widgets.FloatSlider(min=0, max=1, step=0.01, value=0.5),
         eps =widgets.FloatSlider(min=.1, max=1.0, step=.1, value=.1),
         gamma=widgets.FloatSlider(min=.1, max=1.0, step=.1, value=.1),
         alpha  =widgets.FloatSlider(min=.005, max=1.0, step=.005, value=.005)
         );

## Session II Take Away

In this session we learned:

1. We can use differential equations, a tool from calculus, to express rate of change for the different stages in our model 
2. How to go from a flow diagram into a system of differential equations
3. How to implement our system of equations and solve it using Python
4. The relationship between changing the parameters $\Delta, \mu, \alpha, \beta, \epsilon, \gamma$ and the number of S,E,I,R cases in our model.

## Q & A

We can use the remainder of the time to talk about any aspect of this process you are interested in, as well as ask the speaker and any of the mentors what it is like to work in a field related to data science. 

### Speaker & Mentors (by last name)

- Lisa Cao, She/her
- Laura G Funderburk, She/her
- Annie Li, She/her
- Sophie MacDonald, She/her
- Alyaa Mohamed
- Rajan Patel, He/him
- Alex Tennant, He/him 

## Further reading 

Infectious Disease Modelling https://towardsdatascience.com/infectious-disease-modelling-beyond-the-basic-sir-model-216369c584c4

Model adapted from Carcione José M., Santos Juan E., Bagaini Claudio, Ba Jing, A Simulation of a COVID-19 Epidemic Based on a Deterministic SEIR Model. <b>Frontiers in Public Health</b> Vol 8, 2020 https://www.frontiersin.org/article/10.3389/fpubh.2020.00230   DOI=10.3389/fpubh.2020.00230    


[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)