# Usage: SIR-derived models

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lisphilar/covid19-sir/blob/master/example/usage_theoretical.ipynb)

Here, we will create example datasets with simulated values of SIR-derived models. Then, we will perform scenario analysis with them.

## Preparation
Prepare the packages.

In [None]:
# !pip install covsirphy --upgrade
from pprint import pprint
import covsirphy as cs
cs.__version__

## Create example dataset with theoretical values
We will use the `ExampleData` class to run simulations with preset initial values and parameters. $\tau$ (the coefficient for non-dimensionalization) is set to $1440\ \mathrm{[min]}$, i.e. one day per time step. The first date of the records is 01Jan2020 as an example.

In [None]:
# Set tau value and start date of records
example_data = cs.ExampleData(tau=1440, start_date="01Jan2020")

No records are registered at this point.

In [None]:
# Check records
example_data.cleaned()

The `ExampleData` class is a child class of `JHUData`, i.e. we can use the example data in scenario analysis. Example code is shown in the "Impact of parameter change" section.

In [None]:
issubclass(cs.ExampleData, cs.JHUData)

In [None]:
isinstance(example_data, cs.JHUData)

## SIR model
Let's start with the simplest SIR model, proposed by [Kermack, W. O., & McKendrick, A. G. (1927)](https://royalsocietypublishing.org/doi/10.1098/rspa.1927.0118). "Susceptible" people may meet "Infected" persons and may be confirmed as "Infected". "Infected" patients will move to the "Recovered" compartment later.
\begin{align*}
\mathrm{S} \overset{\beta I}{\longrightarrow} \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}  \\
\end{align*}

Variables:  

* $\mathrm{S}$: Susceptible (= Population - Confirmed)  
* $\mathrm{I}$: Infected (=Confirmed - Recovered - Fatal)  
* $\mathrm{R}$: Recovered or Fatal (= Recovered + Fatal)  

Parameters:  

* $\beta$: Effective contact rate $\mathrm{[1/min]}$  
* $\gamma$: Recovery (+ Mortality) rate $\mathrm{[1/min]}$  

Note:  
Although $R$ in the SIR model usually means "Recovered and immune", we define $R$ as "Recovered or Fatal". This is because the mortality rate cannot be ignored in the COVID-19 outbreak.

### Non-dimensional SIR model
To simplify the model, we will remove the units of the variables from the ODE model.

Set $(S, I, R) = N \times (x, y, z)$ and $(T, \beta, \gamma) = (\tau t, \tau^{-1} \rho, \tau^{-1} \sigma)$.  

This results in the ODE  
\begin{align*}
& \frac{\mathrm{d}x}{\mathrm{d}t}= - \rho x y  \\
& \frac{\mathrm{d}y}{\mathrm{d}t}= \rho x y - \sigma y  \\
& \frac{\mathrm{d}z}{\mathrm{d}t}= \sigma y  \\
\end{align*}

Here, $N$ is the total population and $\tau$ is a coefficient in $\mathrm{[min]}$ (an integer, for simplicity).  
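
The non-dimensional ODE above can be integrated without any library. The following is a minimal sketch using a hand-written classical Runge-Kutta (RK4) step. The parameter values, the initial state and the helper names `sir_rhs` and `rk4_step` are illustrative assumptions, and this is not how CovsirPhy solves the model internally.

```python
def sir_rhs(state, rho, sigma):
    """Right-hand side of the non-dimensional SIR model."""
    x, y, z = state
    return (-rho * x * y, rho * x * y - sigma * y, sigma * y)


def rk4_step(rhs, state, dt, *params):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    k1 = rhs(state, *params)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


# Illustrative values (assumptions, not necessarily the library presets)
tau = 1440                    # minutes per non-dimensional time step
population = 1_000_000
rho, sigma = 0.2, 0.075
state = (0.999, 0.001, 0.0)   # (x, y, z) at t = 0

trajectory = [state]
for _ in range(180):          # 180 steps of length 1
    state = rk4_step(sir_rhs, state, 1.0, rho, sigma)
    trajectory.append(state)

# Convert back to dimensional quantities
beta, gamma = rho / tau, sigma / tau   # [1/min]
S, I, R = (round(population * v) for v in state)
print(f"beta={beta:.3e}/min, gamma={gamma:.3e}/min")
print(f"After 180 steps: S={S}, I={I}, R={R}")
```

Because $\tau = 1440\ \mathrm{[min]}$ in this sketch, each step of length 1 corresponds to one day.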

The model name and the presets of parameter and initial values are registered as class variables of the `SIR` class.

In [None]:
# Model name
print(cs.SIR.NAME)
# Example parameter values
pprint(cs.SIR.EXAMPLE, compact=True)

With the preset values, the `ExampleData` instance will produce an example dataset.

In [None]:
model = cs.SIR
area = {"country": "Full", "province": model.NAME}
# Add records with SIR model
example_data.add(model, **area)

We can get example records with `ExampleData.specialized()` method.

In [None]:
# Records with model variables
df = example_data.specialized(model, **area)
df.head()

With the `covsirphy.line_plot()` function, figures will be shown (or saved when the `filename` argument is specified).

In [None]:
# Line plot with the example data
cs.line_plot(df.set_index("Date"), title=f"Example data of {model.NAME} model", y_integer=True)

### Reproduction number

Reproduction number of SIR model is defined as follows.

\begin{align*}
R_0 = \rho \sigma^{-1} = \beta \gamma^{-1}
\end{align*}

$R_0$ ("R naught") means "the average number of secondary infections caused by an infected host" ([external link: Infection Modeling — Part 1](https://towardsdatascience.com/infection-modeling-part-1-87e74645568a)). When $x=\frac{1}{R_0}$, $\frac{\mathrm{d}y}{\mathrm{d}t}=0$ (the number of infected cases does not change).  
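
The threshold $x = 1/R_0$ can be checked numerically. This is a small sketch with assumed values of $\rho$ and $\sigma$; `infected_growth` is a hypothetical helper that evaluates $\mathrm{d}y/\mathrm{d}t$ of the non-dimensional SIR model.

```python
def infected_growth(x, y, rho, sigma):
    """dy/dt of the non-dimensional SIR model."""
    return rho * x * y - sigma * y


rho, sigma = 0.2, 0.075       # illustrative values (assumptions)
r0 = rho / sigma
x_threshold = 1 / r0

# Above the threshold the infected fraction grows, below it shrinks
for x in (0.9, x_threshold, 0.2):
    growth = infected_growth(x, 0.01, rho, sigma)
    print(f"x={x:.3f}: dy/dt={growth:+.6f}")
```

The sign of $\mathrm{d}y/\mathrm{d}t$ flips at $x = 1/R_0$, which is why the epidemic peaks once the susceptible fraction falls to that value.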

We can calculate the reproduction number with the `.calc_r0()` method.

In [None]:
# Calculate reproduction number
# Note: population value will be applied, but not used in calculation
param_dict = cs.SIR.EXAMPLE["param_dict"].copy()
model_instance = cs.SIR(population=100000, **param_dict)
r0 = model_instance.calc_r0()
print(f"Reproduction number of {model_instance.NAME} model: {r0}")

## SIR-D model
Because the numbers of fatal and recovered cases are measured separately, we can use two variables ("Recovered" and "Deaths") instead of "Recovered + Deaths" in the mathematical model. We call this the SIR-D model.
\begin{align*}
\mathrm{S} \overset{\beta  I}{\longrightarrow}\ & \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}  \\
& \mathrm{I} \overset{\alpha}{\longrightarrow} \mathrm{D}  \\
\end{align*}

Variables:  

* $\mathrm{S}$: Susceptible (= Population - Confirmed)  
* $\mathrm{I}$: Infected (=Confirmed - Recovered - Fatal)  
* $\mathrm{R}$: Recovered  
* $\mathrm{D}$: Fatal  

Parameters:  

* $\alpha$: Mortality rate $\mathrm{[1/min]}$  
* $\beta$: Effective contact rate $\mathrm{[1/min]}$  
* $\gamma$: Recovery rate $\mathrm{[1/min]}$  

### Non-dimensional SIR-D model
Set $(S, I, R, D) = N \times (x, y, z, w)$ and $(T, \alpha, \beta, \gamma) = (\tau t, \tau^{-1} \kappa, \tau^{-1} \rho, \tau^{-1} \sigma)$.  
This results in the ODE  
\begin{align*}
& \frac{\mathrm{d}x}{\mathrm{d}t}= - \rho x y  \\
& \frac{\mathrm{d}y}{\mathrm{d}t}= \rho x y - (\sigma + \kappa) y  \\
& \frac{\mathrm{d}z}{\mathrm{d}t}= \sigma y  \\
& \frac{\mathrm{d}w}{\mathrm{d}t}= \kappa y  \\
\end{align*}
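
One consequence of these equations is that $z$ and $w$ are both driven by $y$, so with $z(0) = w(0) = 0$ the ratio $w / z$ stays equal to $\kappa / \sigma$. The sketch below illustrates this with a simple forward Euler integration. The parameter values, the step size and the helper name `sird_rhs` are assumptions for illustration only.

```python
def sird_rhs(x, y, z, w, rho, sigma, kappa):
    """Right-hand side of the non-dimensional SIR-D model."""
    return (
        -rho * x * y,
        rho * x * y - (sigma + kappa) * y,
        sigma * y,
        kappa * y,
    )


rho, sigma, kappa = 0.2, 0.075, 0.005   # illustrative values (assumptions)
x, y, z, w = 0.999, 0.001, 0.0, 0.0     # initial fractions
dt = 0.1                                # Euler step size

for _ in range(1800):                   # integrate up to t = 180
    dx, dy, dz, dw = sird_rhs(x, y, z, w, rho, sigma, kappa)
    x, y, z, w = x + dt * dx, y + dt * dy, z + dt * dz, w + dt * dw

print(f"recovered fraction z={z:.4f}, fatal fraction w={w:.4f}")
print(f"w / z = {w / z:.4f}, kappa / sigma = {kappa / sigma:.4f}")
```

Forward Euler is used only to keep the example short; a higher-order solver would be preferable for quantitative work.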


The model name and preset values are registered in `SIRD` class.

In [None]:
# Model name
print(cs.SIRD.NAME)
# Example parameter values
pprint(cs.SIRD.EXAMPLE, compact=True)

Example data is here.

In [None]:
model = cs.SIRD
area = {"country": "Full", "province": model.NAME}
# Add records with SIR-D model
example_data.add(model, **area)
# Records with model variables
df = example_data.specialized(model, **area)
cs.line_plot(df.set_index("Date"), title=f"Example data of {model.NAME} model", y_integer=True)

### Reproduction number

Reproduction number of SIR-D model is defined as follows.

\begin{align*}
R_0 = \rho (\sigma + \kappa)^{-1} = \beta (\gamma + \alpha)^{-1}
\end{align*}

We can calculate the reproduction number with the `.calc_r0()` method.


In [None]:
# Calculate reproduction number
# Note: population value will be applied, but not used in calculation
param_dict = cs.SIRD.EXAMPLE["param_dict"].copy()
model_instance = cs.SIRD(population=100000, **param_dict)
r0 = model_instance.calc_r0()
print(f"Reproduction number of {model_instance.NAME} model: {r0}")

## SIR-F model
In the initial phase of the COVID-19 outbreak, many cases were confirmed after they died. To consider this issue, "S + I $\to$ Fatal + I" should be added. We call the resulting model the SIR-F model. This is an original model of CovsirPhy. When $\alpha_{1}=0$, it is identical to the SIR-D model.
\begin{align*}
\mathrm{S} \overset{\beta I}{\longrightarrow} \mathrm{S}^\ast \overset{\alpha_1}{\longrightarrow}\ & \mathrm{F}    \\
\mathrm{S}^\ast \overset{1 - \alpha_1}{\longrightarrow}\ & \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}    \\
& \mathrm{I} \overset{\alpha_2}{\longrightarrow} \mathrm{F}    \\
\end{align*}

Variables:  

* $\mathrm{S}$: Susceptible (= Population - Confirmed)  
* $\mathrm{S}^\ast$: Confirmed and un-categorized  
* $\mathrm{I}$: Confirmed and categorized as Infected  
* $\mathrm{R}$: Confirmed and categorized as Recovered  
* $\mathrm{F}$: Confirmed and categorized as Fatal  

Parameters:  

* $\alpha_1$: Direct fatality probability of $\mathrm{S}^\ast$ (non-dimensional) 
* $\alpha_2$: Mortality rate of Infected cases $\mathrm{[1/min]}$  
* $\beta$: Effective contact rate $\mathrm{[1/min]}$  
* $\gamma$: Recovery rate $\mathrm{[1/min]}$  

Notes on $\mathrm{S}^\ast$ variable:  
$\mathrm{S}^\ast$ describes cases who are actually carriers of the disease without anyone (including themselves) knowing about it. Some of them die and are confirmed positive after death, while the others move to "Infected" after being confirmed.

In a JHU-style dataset, we know the number of cases who were confirmed with COVID-19, but we do not know the number of cases who died without being confirmed.
Essentially, $\mathrm{S}^\ast$ serves as an auxiliary compartment in the SIR-F model to separate the two death situations and insert a probability factor of {$\alpha_1$, $1 - \alpha_1$}.  

Notes on the difference of SIR-D and SIR-F model:  
$\alpha_1$ is small at this time because the performance of PCR tests has improved, but we can still use the SIR-F model rather than the SIR-D model as an enhanced model because $\alpha_1$ can be 0 in the ODE model.  

SIR-F model was developed with [Kaggle: COVID-19 data with SIR model](https://www.kaggle.com/lisphilar/covid-19-data-with-sir-model#SIR-to-SIR-F).

### Non-dimensional SIR-F model
Set $(S, I, R, F) = N \times (x, y, z, w)$ and $(T, \alpha_1, \alpha_2, \beta, \gamma) = (\tau t, \theta, \tau^{-1} \kappa, \tau^{-1} \rho, \tau^{-1} \sigma)$.  
This results in the ODE  
\begin{align*}
& \frac{\mathrm{d}x}{\mathrm{d}t}= - \rho x y  \\
& \frac{\mathrm{d}y}{\mathrm{d}t}= \rho (1-\theta) x y - (\sigma + \kappa) y  \\
& \frac{\mathrm{d}z}{\mathrm{d}t}= \sigma y  \\
& \frac{\mathrm{d}w}{\mathrm{d}t}= \rho \theta x y + \kappa y  \\
\end{align*}
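
The statement that SIR-F coincides with SIR-D when $\alpha_1 = \theta = 0$ can be verified directly on the right-hand sides. This is a sketch with an arbitrary state and assumed parameter values; `sirf_rhs` and `sird_rhs` are hypothetical helper names.

```python
def sirf_rhs(state, theta, kappa, rho, sigma):
    """Right-hand side of the non-dimensional SIR-F model."""
    x, y, z, w = state
    return (
        -rho * x * y,
        rho * (1 - theta) * x * y - (sigma + kappa) * y,
        sigma * y,
        rho * theta * x * y + kappa * y,
    )


def sird_rhs(state, kappa, rho, sigma):
    """Right-hand side of the non-dimensional SIR-D model."""
    x, y, z, w = state
    return (
        -rho * x * y,
        rho * x * y - (sigma + kappa) * y,
        sigma * y,
        kappa * y,
    )


state = (0.9, 0.05, 0.04, 0.01)          # arbitrary (x, y, z, w)
kappa, rho, sigma = 0.005, 0.2, 0.075    # illustrative values (assumptions)

sirf_zero_theta = sirf_rhs(state, 0.0, kappa, rho, sigma)
sird = sird_rhs(state, kappa, rho, sigma)
print("SIR-F (theta=0):", sirf_zero_theta)
print("SIR-D          :", sird)

# With theta > 0, part of the new cases goes directly to the fatal compartment
sirf_small_theta = sirf_rhs(state, 0.002, kappa, rho, sigma)
print("SIR-F (theta=0.002):", sirf_small_theta)
```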


We use `SIRF` class.

In [None]:
# Model name
print(cs.SIRF.NAME)
# Example parameter values
pprint(cs.SIRF.EXAMPLE, compact=True)

Example data is here.

In [None]:
model = cs.SIRF
area = {"country": "Full", "province": model.NAME}
# Add records with SIR-F model
example_data.add(model, **area)
# Records with model variables
df = example_data.specialized(model, **area)
cs.line_plot(df.set_index("Date"), title=f"Example data of {model.NAME} model", y_integer=True)

### Reproduction number

Reproduction number of SIR-F model is defined as follows.

\begin{align*}
R_0 = \rho (1 - \theta) (\sigma + \kappa)^{-1} = \beta (1 - \alpha_1) (\gamma + \alpha_2)^{-1}
\end{align*}
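
Both forms of the formula give the same value because the factor $\tau^{-1}$ cancels between numerator and denominator. The sketch below checks this with assumed parameter values, which are not necessarily the presets in `cs.SIRF.EXAMPLE`.

```python
tau = 1440                                        # [min]
theta, kappa, rho, sigma = 0.002, 0.005, 0.2, 0.075   # illustrative values

# Dimensional parameters derived from the non-dimensional ones
alpha1 = theta                 # non-dimensional probability
alpha2 = kappa / tau           # [1/min]
beta = rho / tau               # [1/min]
gamma = sigma / tau            # [1/min]

r0_nondim = rho * (1 - theta) / (sigma + kappa)
r0_dim = beta * (1 - alpha1) / (gamma + alpha2)

print(f"R0 from (theta, kappa, rho, sigma):    {r0_nondim:.4f}")
print(f"R0 from (alpha1, alpha2, beta, gamma): {r0_dim:.4f}")
```

The result does not depend on the choice of $\tau$, so $R_0$ can be compared across datasets that use different time scales.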

We can calculate the reproduction number with the `.calc_r0()` method.

In [None]:
# Calculate reproduction number
# Note: population value will be applied, but not used in calculation
param_dict = cs.SIRF.EXAMPLE["param_dict"].copy()
model_instance = cs.SIRF(population=100000, **param_dict)
r0 = model_instance.calc_r0()
print(f"Reproduction number of {model_instance.NAME} model: {r0}")

## SIR-F with exposed/waiting cases
The next model is the SEWIR-F model.  
The numbers of exposed cases in the latent period (E) and cases waiting for confirmation (W) are un-measurable variables, but they are key variables as well as S, I, R, F. If E and W are large, an outbreak will occur in the near future. Let's replace S$\overset{\beta I}{\longrightarrow}$S$^\ast$ as follows because W also has infectivity.
\begin{align*}
\mathrm{S} \overset{\beta_1 (W+I)}{\longrightarrow} \mathrm{E} \overset{\beta_2}{\longrightarrow} \mathrm{W} \overset{\beta_3}{\longrightarrow} \mathrm{S}^\ast \overset{\alpha_1}{\longrightarrow}\ & \mathrm{F}    \\
\mathrm{S}^\ast \overset{1 - \alpha_1}{\longrightarrow}\ & \mathrm{I} \overset{\gamma}{\longrightarrow} \mathrm{R}    \\
& \mathrm{I} \overset{\alpha_2}{\longrightarrow} \mathrm{F}    \\
\end{align*}

Variables:  

* $\mathrm{S}$: Susceptible  
* $\mathrm{E}$: <u>Exposed and in latent period (without infectivity)</u>  
* $\mathrm{W}$: <u>Waiting for confirmation diagnosis (with infectivity)</u>  
* $\mathrm{S}^\ast$: Confirmed and un-categorized  
* $\mathrm{I}$: Confirmed and categorized as Infected  
* $\mathrm{R}$: Confirmed and categorized as Recovered  
* $\mathrm{F}$: Confirmed and categorized as Fatal  

Parameters:  

* $\alpha_1$: Direct fatality probability of $\mathrm{S}^\ast$ (non-dimensional) 
* $\alpha_2$: Mortality rate of Infected cases $\mathrm{[1/min]}$  
* $\beta_1$: <u>Exposure rate (the number of encounters with the virus per minute)</u> $\mathrm{[1/min]}$  
* $\beta_2$: <u>Inverse of latent period</u> $\mathrm{[1/min]}$  
* $\beta_3$: <u>Inverse of waiting time for confirmation</u> $\mathrm{[1/min]}$  
* $\gamma$: Recovery rate $\mathrm{[1/min]}$ 

### Non-dimensional SEWIR-F model
Set $(S, E, W, I, R, F) = N \times (x_1, x_2, x_3, y, z, w)$, $(T, \alpha_1) = (\tau t, \theta)$ and $(\alpha_2, \beta_i, \gamma) = \tau^{-1} \times (\kappa, \rho_i, \sigma)$.  
This results in the ODE  
\begin{align*}
& \frac{\mathrm{d}x_1}{\mathrm{d}t}= - \rho_1 x_1 (x_3 + y)  \\
& \frac{\mathrm{d}x_2}{\mathrm{d}t}= \rho_1 x_1 (x_3 + y) - \rho_2 x_2  \\
& \frac{\mathrm{d}x_3}{\mathrm{d}t}= \rho_2 x_2 - \rho_3 x_3  \\
& \frac{\mathrm{d}y}{\mathrm{d}t}= (1-\theta) \rho_3 x_3 - (\sigma + \kappa) y  \\
& \frac{\mathrm{d}z}{\mathrm{d}t}= \sigma y  \\
& \frac{\mathrm{d}w}{\mathrm{d}t}= \theta \rho_3 x_3 + \kappa y  \\
\end{align*}
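
As a quick consistency check of the six equations, the derivatives should sum to zero because the total population fraction $x_1 + x_2 + x_3 + y + z + w$ is conserved. The following sketch evaluates the right-hand side once; the state, the parameter values and the helper name `sewirf_rhs` are assumptions for illustration.

```python
def sewirf_rhs(state, theta, kappa, rho1, rho2, rho3, sigma):
    """Right-hand side of the non-dimensional SEWIR-F model."""
    x1, x2, x3, y, z, w = state
    exposure = rho1 * x1 * (x3 + y)    # W and I are both infectious
    return (
        -exposure,
        exposure - rho2 * x2,
        rho2 * x2 - rho3 * x3,
        (1 - theta) * rho3 * x3 - (sigma + kappa) * y,
        sigma * y,
        theta * rho3 * x3 + kappa * y,
    )


# Arbitrary state (S, E, W, I, R, F fractions) and illustrative parameters
state = (0.95, 0.02, 0.01, 0.01, 0.008, 0.002)
params = dict(theta=0.002, kappa=0.005, rho1=0.2, rho2=0.167, rho3=0.167, sigma=0.075)

derivatives = sewirf_rhs(state, **params)
for name, value in zip(("x1", "x2", "x3", "y", "z", "w"), derivatives):
    print(f"d{name}/dt = {value:+.6f}")
print(f"sum of derivatives = {sum(derivatives):.2e}")
```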

Note:  
**We cannot use SEWIR-F model for parameter estimation because we do not have records of Exposed and Waiting. Please use SIR-F model with covsirphy.SIRF class.**

`SEWIRF` class is for the SEWIR-F model.

In [None]:
# Model name
print(cs.SEWIRF.NAME)
# Example parameter values
pprint(cs.SEWIRF.EXAMPLE, compact=True)

Example records are here.

In [None]:
model = cs.SEWIRF
area = {"country": "Full", "province": model.NAME}
# Add records with SEWIR-F model
example_data.add(model, **area)
# Records with model variables
df = example_data.specialized(model, **area)
cs.line_plot(df.set_index("Date"), title=f"Example data of {model.NAME} model", y_integer=True)

### Reproduction number

Reproduction number of SEWIR-F model is defined as follows.

\begin{align*}
R_0 = \rho_1 \rho_2^{-1} \rho_3 (1-\theta) (\sigma + \kappa)^{-1}
\end{align*}

We can calculate the reproduction number with the `.calc_r0()` method.

In [None]:
# Calculate reproduction number
# Note: population value will be applied, but not used in calculation
param_dict = cs.SEWIRF.EXAMPLE["param_dict"].copy()
model_instance = cs.SEWIRF(population=100000, **param_dict)
r0 = model_instance.calc_r0()
print(f"Reproduction number of {model_instance.NAME} model: {r0}")

## SIR-F with vaccination
Vaccination is a key factor to prevent outbreak as you know.

In the previous version, we defined SIR-FV model with $\omega$ (vaccination rate) and
$$
\frac{\mathrm{d}S}{\mathrm{d}T}= - \beta S I - \omega N  \\
$$

However, **the SIR-FV model was removed because vaccinated persons may move to the other compartments, including "Susceptible". Please use the SIR-F model for simulation and parameter estimation with adjusted parameter values, considering the impact of vaccinations on infectivity, their effectiveness and safety.**

## SIR-F with re-infection
Re-infection (Recovered -> Susceptible) is sometimes reported and we can consider an SIR-S (SIR-FS) model. However, this is not implemented at this time because we do not have data regarding re-infection. The SIR-F model could be the final model in our data-driven approach at this time.  

Re-infection changes the parameter values of SIR-F model. There are two patterns.

1. If re-infected cases are counted as new confirmed cases and removed from "Recovered" compartment, $\sigma$ will be decreased.
2. If re-infected cases are counted as new confirmed cases and **NOT** removed from "Recovered" compartment, $\rho$ will be increased because "Susceptible" will be decreased.

## Impact of parameter change
Because `ExampleData` class is a subclass of `JHUData`, we can perform scenario analysis with example datasets easily. We evaluate the impact of parameter changes.  

Here, we will use the following scenarios. For the explanation, $\tau=1440$, the start date is 01Jan2020, the population is 1,000,000 and the country name is "Theoretical". **These scenarios are not based on actual data.**

| name | 01Jan2020 - 31Jan2020 | 01Feb2020 - 31Dec2020 |
|:---:|:---:|:---|
| Main | SIR-F | SIR-F|
| Lockdown | SIR-F | SIR-F with 50% of $\rho$ |
| Medicine | SIR-F | SIR-F with 50% of $\kappa$ and 200% of $\sigma$ |
| Vaccine | SIR-F | SIR-F with 80% of $\rho$, 60% of $\kappa$ and 120% of $\sigma$|
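
To get a feeling for the table, the multipliers can be applied to a baseline parameter set and the SIR-F reproduction number recomputed for the second period. This is a sketch with an assumed baseline (the actual presets are those stored in `cs.SIRF.EXAMPLE`); `sirf_r0` is a hypothetical helper implementing $\rho (1 - \theta) (\sigma + \kappa)^{-1}$.

```python
def sirf_r0(theta, kappa, rho, sigma):
    """Reproduction number of the SIR-F model."""
    return rho * (1 - theta) / (sigma + kappa)


# Assumed baseline values, for illustration only
baseline = {"theta": 0.002, "kappa": 0.005, "rho": 0.2, "sigma": 0.075}

# Multipliers taken from the scenario table (second period)
multipliers = {
    "Main": {},
    "Lockdown": {"rho": 0.5},
    "Medicine": {"kappa": 0.5, "sigma": 2.0},
    "Vaccine": {"rho": 0.8, "kappa": 0.6, "sigma": 1.2},
}

scenario_params = {
    name: {key: value * factors.get(key, 1.0) for key, value in baseline.items()}
    for name, factors in multipliers.items()
}
rt_values = {name: sirf_r0(**params) for name, params in scenario_params.items()}

for name, rt in rt_values.items():
    print(f"{name:<9} Rt = {rt:.2f}")
```

Halving $\rho$ halves the reproduction number exactly, while the Medicine and Vaccine scenarios lower it through the denominator $\sigma + \kappa$ as well.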

As baseline (main scenario), we use preset values of the SIR-F model.

In [None]:
# Preset of SIR-F parameters and initial values
preset_dict = cs.SIRF.EXAMPLE["param_dict"]
preset_dict

Create records from 01Jan2020 to 31Jan2020. These records will be used commonly in the scenarios.

In [None]:
area = {"country": "Theoretical"}
# Create dataset from 01Jan2020 to 31Jan2020
example_data.add(cs.SIRF, step_n=30, **area)

Create `Scenario` instance for scenario analysis.

In [None]:
# Create Scenario instance
snl = cs.Scenario(tau=1440, **area)
snl.register(example_data)

Then, confirm the records with the `Scenario.records()` method.

In [None]:
# Show records with Scenario instance
record_df = snl.records()
display(record_df.head())
display(record_df.tail())

Note:  
The record on 01Jan2020 was removed because the number of recovered cases is 0 on that date and this sometimes causes errors in estimation.

Then, set the records from 02Jan2020 to 31Jan2020 as the 0th phase. The 0th phase is shared by all scenarios.

In [None]:
# Set 0th phase from 02Jan2020 to 31Jan2020 with preset parameter values
snl.clear(include_past=True)
snl.add(end_date="31Jan2020", model=cs.SIRF, **preset_dict)
# Show summary
snl.summary()

Set the 1st phase with the same parameter values.

In [None]:
# Add main scenario
snl.add(end_date="31Dec2020", name="Main")
snl.summary()

Copy the main scenario and name the copy the Lockdown scenario. `Scenario.clear()` removes the future phase (the 1st phase here), and we then register the 1st phase with a halved $\rho$ value. Lockdown is supposed to reduce the effective contact rate.

In [None]:
# Add lockdown scenario
snl.clear(name="Lockdown")
# Get rho value of the 0th phase and halve it
rho_lock = snl.get("rho", phase="0th") * 0.5
# Add the 1st phase with the calculated rho value
snl.add(end_date="31Dec2020", name="Lockdown", rho=rho_lock)

Next, we define medicine scenario. $\kappa$ will be halved and $\sigma$ will be doubled. New medicines may reduce the severity rate and enhance recovery.

In [None]:
# Add medicine scenario
snl.clear(name="Medicine")
kappa_med = snl.get("kappa", phase="0th") * 0.5
sigma_med = snl.get("sigma", phase="0th") * 2
snl.add(end_date="31Dec2020", name="Medicine", kappa=kappa_med, sigma=sigma_med)

We define the Vaccine scenario. As noted in the "SIR-F with vaccination" section, vaccination affects $\sigma$ and $\kappa$ depending on its effectiveness and safety. If vaccination also affects infectivity, the $\rho$ value will change as well.

In [None]:
# Add vaccine scenario
snl.clear(name="Vaccine")
rho_vac = snl.get("rho", phase="0th") * 0.8
kappa_vac = snl.get("kappa", phase="0th") * 0.6
sigma_vac = snl.get("sigma", phase="0th") * 1.2
snl.add(end_date="31Dec2020", name="Vaccine", rho=rho_vac, kappa=kappa_vac, sigma=sigma_vac)

See the phase settings with `Scenario.summary()` as always.

In [None]:
# Show summary
snl.summary()

Show the parameter setting with `Scenario.history()`.

In [None]:
# Show the history of rho as a dataframe and a figure
# we can set theta/kappa/rho/sigma for SIR-F model
snl.history(target="rho").head()

### Compare the scenarios
We will compare the scenarios to discuss the impact of changing parameters. We have some methods for that.

1. `Scenario.describe()` shows representative values as a dataframe.
2. `Scenario.history(target="Rt")` shows the history of automatically calculated reproduction number.
3. `Scenario.history(target="variable name")` shows simulated number of cases for the specified variable.
4. If you have any ideas, please create [issues](https://github.com/lisphilar/covid19-sir/issues)! :)
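
The kind of comparison these methods perform can also be sketched without the library: integrate the non-dimensional SIR-F model phase by phase and compare the peak of the infected fraction. The code below is a standalone illustration with assumed parameter values and initial state, not the implementation behind `Scenario`; `simulate` and `peak_infected` are hypothetical helpers.

```python
def sirf_rhs(state, theta, kappa, rho, sigma):
    """Right-hand side of the non-dimensional SIR-F model."""
    x, y, z, w = state
    return (
        -rho * x * y,
        rho * (1 - theta) * x * y - (sigma + kappa) * y,
        sigma * y,
        rho * theta * x * y + kappa * y,
    )


def rk4_step(state, dt, params):
    """One classical Runge-Kutta step for the SIR-F model."""
    def shifted(slope, factor):
        return tuple(s + factor * dt * k for s, k in zip(state, slope))

    k1 = sirf_rhs(state, **params)
    k2 = sirf_rhs(shifted(k1, 0.5), **params)
    k3 = sirf_rhs(shifted(k2, 0.5), **params)
    k4 = sirf_rhs(shifted(k3, 1.0), **params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(phases, state, dt=1.0):
    """Integrate over a list of (number of steps, parameters) phases."""
    history = [state]
    for n_steps, params in phases:
        for _ in range(n_steps):
            state = rk4_step(state, dt, params)
            history.append(state)
    return history


def peak_infected(history):
    """Return (step index, infected fraction) at the maximum of y."""
    step, state = max(enumerate(history), key=lambda item: item[1][1])
    return step, state[1]


# Illustrative values (assumptions); one step = one day when tau = 1440
base = {"theta": 0.002, "kappa": 0.005, "rho": 0.2, "sigma": 0.075}
lockdown = {**base, "rho": base["rho"] * 0.5}
initial = (0.999, 0.001, 0.0, 0.0)

# 0th phase: 30 steps (01Jan to 31Jan), 1st phase: 335 steps (to 31Dec2020)
main_hist = simulate([(30, base), (335, base)], initial)
lock_hist = simulate([(30, base), (335, lockdown)], initial)

for name, hist in (("Main", main_hist), ("Lockdown", lock_hist)):
    step, value = peak_infected(hist)
    print(f"{name:<9} peak infected fraction {value:.4f} at step {step}, "
          f"final fatal fraction {hist[-1][3]:.4f}")
```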

#### 1. Show representative values
`Scenario.describe()` compares

- max number of infected cases and the date with the max value, 
- the number of confirmed/infected/fatal cases on the next date of the last phase, and
- reproduction numbers for the phases with different numbers.

In [None]:
# Describe the scenarios
snl.describe()

#### 2. History of reproduction number
`Scenario.history(target="Rt")` shows the history of reproduction number.

In [None]:
# Show the history of reproduction number
_ = snl.history(target="Rt")

#### 3. Simulated number of cases of the specified variable
We can also set Confirmed/Infected/Fatal/Recovered as the target of `Scenario.history()`.

In [None]:
# The number of infected cases
_ = snl.history(target="Infected")

In [None]:
# The number of fatal cases
_ = snl.history(target="Fatal")

### Simulation of each scenario
We can simulate the number of cases for all variables in a specific scenario with `Scenario.simulate()`.

In [None]:
# Main scenario
_ = snl.simulate(name="Main")

In [None]:
# Lockdown scenario
_ = snl.simulate(name="Lockdown")

In [None]:
# Medicine scenario
_ = snl.simulate(name="Medicine")

In [None]:
# Vaccine scenario
_ = snl.simulate(name="Vaccine")

**For further analysis, let's change the parameter settings and add new scenarios!**