# Notebook 2.5.1: Student Plague (The SIR Model)

---

<br>

*Modeling and Simulation in Python*

Copyright 2021 Allen Downey, (License: [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-nc-sa/4.0/))

Revised, Mike Augspurger (2021-present)

<br>

---

Every year at Augustana, a crowd of new students comes to campus from around the country and the world. Most of them arrive healthy and happy, but usually at least one brings along some kind of infectious disease. A few weeks later, predictably, some fraction of the incoming class comes down with what we call the "Freshman Plague".

<br>

<center>
<img src = https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Images/2_5/so_good.PNG width = 300>
</center>

<br>

We want to develop a model of the spread of this disease and use it to evaluate the effectiveness of possible interventions.  But the spread of disease depends on multiple variables: How many people are infected? How many have recovered? How many have not been infected at all?

<br>

The interaction of all of these factors can be modeled as a *system of differential equations*: multiple DEQs that are interdependent.  Such systems tend to be unpredictable, and analytical solutions are rare, so we'll really need to depend on our computational tools for this model.


---

## The Kermack-McKendrick Model

To implement our simulation, we are going to borrow an existing model, called the Kermack-McKendrick (KM) model. Along the way, we will consider the way it abstracts the system, and identify the capabilities and
limitations that result from these abstraction decisions.


### Representing the state of the system in the KM model

One of the beautiful things about the KM model is that it is built from the ground up by making some basic assumptions about how disease behaves in a human population.

<br>

The first of these assumptions concerns how we can describe the *state* of the system.  KM is an example of an *SIR model*,
so-named because it divides a system (i.e. a group of people) into three categories:

<br>

-   *S*: "Susceptible" people are people who are capable of contracting the disease if they come into contact with someone who is infected.  So a "susceptible" person is not sick, and has no immunity to the infection.

-   *I*: "Infectious" people are currently sick, and so capable of passing along the disease if they come into contact with a "susceptible" person.

-   *R*: "Recovered" people have been infected but recovered. In the basic version of the model, people who have recovered are considered to be immune to
reinfection and are not capable of infecting others.

<br>

It's important to note that this is a fairly intuitive way to describe the system: people are either not sick, sick, or recovered.  But recognize, too, that it is not without flaws: recovery equates to immunity for some diseases but not for others, for instance, so it should be on our mental list of simplifying abstractions.

<br>

The KM model also assumes that the population is closed; that is, no one
arrives or departs, so the size of the population is constant.

---

<br>

🟨 🟨  Active Reading

Consider your experience with Covid.  How might the category "susceptible" be too simple?  In what ways are some uninfected people more or less susceptible than others?

✅ ✅ Put your answer here

---

### Representing change in the KM model

Since epidemics grow and decline over time, we need to model the way that the number of people in each category changes over time.

<br>

The first thing to consider is how often an infectious person recovers.  Suppose we know that people with the disease are infectious for a period of 4 days, on average. If a large number of people are infectious at a particular point in time, we would expect about 1 out of 4 to recover on any particular day.  Put a different way, if the average infectious period is 4 days, the *recovery rate* is about 0.25 per day: roughly a quarter of the infectious people recover each day.  We'll denote this rate with the letter $R$ (a rate, not to be confused with the "Recovered" category *R* above).

<br>

The recovery rate $R$ will affect the rates of change in our three categories.  Let's say that the total number of people in the population is $N$, and the *fraction* currently infectious is $i$.  This means there are currently $i N$ infectious people, and the total number of recoveries we expect per day would be $R i N$.
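
<br>

As a quick check of this arithmetic, here is a minimal sketch in Python. The population size `N` and the infectious fraction `i` are hypothetical values chosen only for illustration; the 4-day infectious period is the one used in the text.

```python
# Hypothetical values for illustration; only the 4-day period comes from the text
N = 100           # total number of people in the population (assumed)
i = 0.10          # fraction of the population currently infectious (assumed)
t_recover = 4     # average infectious period, in days

R = 1 / t_recover                 # recovery rate, per day
num_infectious = i * N            # number of infectious people, i N
recoveries_per_day = R * i * N    # expected recoveries per day, R i N

print(f"Recovery rate R: {R} per day")
print(f"Infectious people: {num_infectious:.1f}")
print(f"Expected recoveries per day: {recoveries_per_day:.1f}")
```

Notice that the expected number of recoveries scales with both the recovery rate and the number of people who are currently sick.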

<br>

Now let's think about the number of new infections.  New infections come from contact between susceptible and infectious people. Suppose we know that each susceptible person comes into contact with 1 person every 3 days, on average, in a way that would cause them to become infected if the other person is infectious. We'll denote this contact rate with $C$; in this example, $C$ is about 1/3 contacts per day.

<br>

It's probably not reasonable to assume that we know $C$ ahead of
time, but later we'll see how to estimate it based on data from previous outbreaks.

<br>

With $C$ we can calculate the number of newly infectious people per day.  If $s$ is the fraction of the population that's susceptible, $s N$ is the number of susceptible people, $C s N$ is the number of contacts those susceptible people have per day, and $C s i N$ is the number of those contacts where the other person is infectious.  So $C s i N$ is the number of new infections we expect per day.
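
<br>

The same chain of reasoning can be sketched in a few lines of Python. The values of `N`, `s`, and `i` below are hypothetical, and the sketch assumes that one contact every 3 days corresponds to a contact rate of $C = 1/3$ per day.

```python
# Hypothetical values for illustration; only the 3-day contact interval comes from the text
N = 100           # total population (assumed)
s = 0.90          # fraction of the population that is susceptible (assumed)
i = 0.10          # fraction of the population that is infectious (assumed)
t_contact = 3     # average number of days between contacts

C = 1 / t_contact                        # contact rate, per day
num_susceptible = s * N                  # s N: number of susceptible people
contacts_per_day = C * s * N             # C s N: contacts made by susceptible people per day
new_infections_per_day = C * s * i * N   # C s i N: contacts where the other person is infectious

print(f"Susceptible people: {num_susceptible:.1f}")
print(f"Contacts per day: {contacts_per_day:.1f}")
print(f"New infections per day: {new_infections_per_day:.1f}")
```

Each line multiplies the previous quantity by one more factor, which mirrors the way the expression $C s i N$ is built up in the paragraph above.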

<br>

Notice again that we have built this model using only intuition and observation! No specialized knowledge of infectious diseases was necessary.



---

<br>

🟨 🟨  Active Reading:  Multiple Choice

In [None]:
import pandas as pd
import numpy as np
from urllib.request import urlretrieve

location = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/'
folder = 'Support_files/'
name = 'Embedded_Qs.ipynb'
local, _ = urlretrieve(location + folder + name, name)
%run /content/$name

#@title { run: "auto", form-width: "50%", display-mode: "form" }
home = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Embedded_Qs/'
data = display_multC('2_5_systems',home,0)
answer = "" # @param ["", "A", "B", "C", "D", "E"]
check_multC(data,answer)

What is the meaning of the variable 'i'?

A) Percentage of students who are infected
B) Recovery rate (percent chance that a sick student will recover in a given day)
C) Percent of total students who recover on a given day
D) Number of infected students


---

<br>

🟨 🟨  Active Reading:  Multiple Choice

In [None]:
#@title { run: "auto", form-width: "50%", display-mode: "form" }
home = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Embedded_Qs/'
data = display_multC('2_5_systems',home,1)
answer = "" # @param ["", "A", "B", "C", "D", "E"]
check_multC(data,answer)

What is the meaning of the variable product 'iN'?

A) Percentage of students who are infected
B) Recovery rate (percent chance that a sick student will recover in a given day)
C) Percent of total students who recover on a given day
D) Number of infected students


---

<br>

🟨 🟨  Active Reading:  Multiple Choice

In [None]:
#@title { run: "auto", form-width: "50%", display-mode: "form" }
home = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Embedded_Qs/'
data = display_multC('2_5_systems',home,2)
answer = "" # @param ["", "A", "B", "C", "D", "E"]
check_multC(data,answer)

What is the meaning of the variable product 'CsN'?

A) Number of susceptible students who have 'infectible' contact with another student on a given day
B) Contact rate (percent chance that a student will be in 'infectible' contact with another student on a given day)
C) Number of susceptible students who get sick on a given day
D) Percent of total students who recover on a given day


---

<br>

🟨 🟨  Active Reading:  Multiple Choice

In [None]:
#@title { run: "auto", form-width: "50%", display-mode: "form" }
home = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Embedded_Qs/'
data = display_multC('2_5_systems',home,3)
answer = "" # @param ["", "A", "B", "C", "D", "E"]
check_multC(data,answer)

What is the meaning of the variable product 'CsiN'?

A) Number of susceptible students who have 'infectible' contact with another student on a given day
B) Contact rate (percent chance that a student will be in 'infectible' contact with another student on a given day)
C) Number of susceptible students who get sick on a given day
D) Percent of total students who recover on a given day


---

### Representing change with differential equations

If we treat time as a continuous quantity, we can write differential
equations that describe the rates of change for $s$, $i$, and $r$ (where $r$ is the fraction of the population that has recovered):

<br>

$$\frac{ds}{dt} = -C s i$$

<br>

$$\frac{di}{dt} = C s i - R i$$

<br>

$$\frac{dr}{dt} = R i$$

<br>

This looks really complicated, right?  All those letters and differential equations!  But we can "translate" them into English, and see that they represent a fairly intuitive sense of how these populations would change.

<br>

The first equation, for instance, says this: "The fraction of susceptible people decreases at a rate equal to the product of the contact rate, the fraction of susceptible people, and the fraction of infectious people."  More colloquially: lots of uninfected people will get sick if there is a lot of contact, a lot of uninfected people, and a lot of sick people.  This is nothing more than a straightforward observation! (In practice, determining $C$ is the challenging part.)

<br>

<center>
<img src = https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Images/2_5/translate.PNG width = 300>
</center>

<br>

The second equation says essentially: "the rate of growth of infected people depends on the rate of people getting sick minus the rate of people recovering".  Not rocket science, even if it looks like it might be! 😀
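
<br>

To make the translation concrete, here is a minimal sketch of the three equations as a Python function, followed by a few simple time steps of one day each. The function names `sir_slopes` and `euler_step`, the starting fractions, and the step size are all assumptions made for illustration; this is not necessarily how the simulation will be built later.

```python
def sir_slopes(s, i, r, C, R):
    """Return (ds/dt, di/dt, dr/dt) for the three KM equations."""
    infection_rate = C * s * i    # fraction of the population newly infected per day
    recovery_flow = R * i         # fraction of the population recovering per day
    ds_dt = -infection_rate
    di_dt = infection_rate - recovery_flow
    dr_dt = recovery_flow
    return ds_dt, di_dt, dr_dt


def euler_step(s, i, r, C, R, dt):
    """Advance the fractions by one small time step dt (in days)."""
    ds_dt, di_dt, dr_dt = sir_slopes(s, i, r, C, R)
    return s + ds_dt * dt, i + di_dt * dt, r + dr_dt * dt


# Rates from the examples in the text; starting fractions are hypothetical
C = 1 / 3     # contact rate, per day
R = 1 / 4     # recovery rate, per day
s, i, r = 0.99, 0.01, 0.0

for day in range(1, 4):
    s, i, r = euler_step(s, i, r, C, R, dt=1.0)
    print(f"day {day}: s={s:.4f}, i={i:.4f}, r={r:.4f}, total={s + i + r:.4f}")
```

Because every term that leaves one equation shows up with the opposite sign in another, the three rates add to zero and the total $s + i + r$ stays constant, which matches the assumption that the population is closed.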

---

<br>

🟨 🟨  Active Reading

In [None]:
#@title { run: "auto", form-width: "50%", display-mode: "form" }
home = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Embedded_Qs/'
data = display_multC('2_5_systems',home,4)
answer = "" # @param ["", "A", "B", "C", "D", "E"]
check_multC(data,answer)

---

### Representing the model as a stock-and-flow diagram

SIR models are examples of *compartment models*, so-called because
they divide the world into discrete categories, or compartments, and
describe transitions from one compartment to another. Compartments are
also called *stocks* and transitions between them are called
*flows*.

<br>

In this example, there are three stocks (susceptible, infectious, and recovered) and two flows (new infections and recoveries).  Here is a stock and flow diagram for the KM model:

<br>

<img src = https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Images/2_5/stock_flow.PNG width = 600>

<br>

Stocks are represented by rectangles, flows by arrows. The parameter in the middle of each arrow represents a valve that controls the rate of flow.
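
<br>

One way to picture stocks and flows in code is to keep a head count for each stock and move people between stocks one day at a time. The sketch below is only an illustration: the starting counts, the dictionary layout, and the function name `step_one_day` are assumptions, and the rates are the example values used earlier.

```python
# Hypothetical starting head counts, for illustration only
stocks = {"susceptible": 89.0, "infectious": 1.0, "recovered": 0.0}

C = 1 / 3     # contact rate: the valve on the infection flow
R = 1 / 4     # recovery rate: the valve on the recovery flow


def step_one_day(stocks, C, R):
    """Move people between stocks for one day and return the new counts."""
    N = sum(stocks.values())
    # Flow from susceptible to infectious: C s i N, with s = S/N and i = I/N
    new_infections = C * stocks["susceptible"] * stocks["infectious"] / N
    # Flow from infectious to recovered: R i N
    recoveries = R * stocks["infectious"]
    return {
        "susceptible": stocks["susceptible"] - new_infections,
        "infectious": stocks["infectious"] + new_infections - recoveries,
        "recovered": stocks["recovered"] + recoveries,
    }


after_one_day = step_one_day(stocks, C, R)
print(after_one_day)
print("Total population:", sum(after_one_day.values()))
```

Each flow is subtracted from one stock and added to another, so nothing is created or lost along the way.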

<br>

Notice that in the stock and flow diagram for the KM model, the flow goes in only one direction.  In more complex models that would not necessarily be true.  For instance, consider a stock and flow diagram for a model in which immunity from the disease lasts only for a limited time.  In that case, a recovered person might become "susceptible" again at a rate governed by another parameter ($A$, in this case):

<br>

<img src = https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Images/2_5/stock_flow_add.PNG width = 600>


---

<br>

🟨 🟨  Active Reading: Multiple answer (List correct letters with a space in between them)

In [None]:
#@title { form-width: "50%", display-mode: "form" }
home = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Embedded_Qs/'
data = display_multAns('2_5_systems', home,5)
answer = "" #@param {type:"string"}
a = answer.split(sep=" ")
check_multAns(data,a)

Consider the stock-and-flow diagram above.  How would this "flow" from "recovered" to "susceptible" be represented in the differential equations?  Mark all that are correct.

A) It would require a fourth equation
B) There would need to be an additional term on the right side of the 'ds/dt' equation
C) There would need to be an additional term on the right side of the 'di/dt' equation
D) There would need to be an additional term on the right side of the 'dr/dt' equation


---