# Basic model construction

## Objective
In this first notebook, we will use the _summer_ interface 
to demonstrate the construction of a simple model,
and illustrate several principles of infectious disease modelling.
Because this is the first notebook in the series,
it contains slightly more code-related content than those that follow.
We'll come back to some of the features of our simple model in the following notebooks,
including thinking more about flows and how we obtain numerical solutions for our system.
In our example model, we will create an SIR compartmental model for a general, unspecified, immunising infectious disease spreading through a fully susceptible population.

In this model there will be:

- Three compartments, named `susceptible`, `infectious` and `recovered`
- A starting population of 1000 people, with 10 of them infected (and infectious)
- An evaluation timespan from time zero to 20 units of time
- Inter-compartmental flows for infection, death and recovery

Susceptible, infectious, recovered is a very commonly used model structure,
although many variations on this exist,
some of which will be explored in later notebooks.
These compartments are commonly abbreviated to "S", "I" and "R",
a convention which we will use in our diagrams.
The recovered compartment is also commonly termed
the "removed" compartment because the population in this compartment
have no effect on transmission
(except potentially through the total population size).

## Preliminaries
As with any Python script, we first have to import the objects we need,
plus we'll set the visualisation of pandas objects to be interactive.
You'll see similar cells to these in all the notebooks in the series.

In [None]:
# If running on Google Colab, run the following line of code to install the summer package
# %pip install summerepi2

In [None]:
import pandas as pd
pd.options.plotting.backend = "plotly"

from summer2 import CompartmentalModel
from summer2.parameters import Parameter

## Building a simple SIR model with summer
First, let's create a base SIR model with the
basic compartments, starting populations and inter-compartmental flows implemented.
The following is a diagram to represent this.
Diagrams such as these are a common way to represent compartmental models,
with the compartments represented as boxes and the flows as arrows.
Such diagrams give a quick visual impression of the system
and may be more intuitive than the _summer_ code that actually
constitutes the system or ODEs intended to represent it.
However, unlike code and ODEs, there is no formal guide to specify
exactly how these diagrams should be constructed and so 
they should not be considered as the definitive representation.
If in doubt, always look at the actual model code itself.

![](../images/sir_structure.svg)
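
For reference, one way to write the ODE system that a diagram like this represents is given below. This is a sketch under the assumption that transmission is frequency-dependent, so that the force of infection scales with $I/N$. The symbols $\beta$ (contact rate), $\gamma$ (recovery rate), $\mu$ (infection-related death rate) and $N = S + I + R$ (the current living population) are our own notation rather than names used by _summer_.

$$
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N} \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I - \mu I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
$$

The three right-hand sides sum to $-\mu I$, which is the rate at which infection-related deaths remove people from the system.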

Now let's build this model using _summer_. Note the steps in model construction, which are included as comments throughout the following cell.

In [None]:
# First, initialise a basic summer compartmental model,
# specifying the analysis times, compartment names and the name of the infectious compartment
sir_model = CompartmentalModel(
    times=(0, 20),
    compartments=(
        "susceptible",
        "infectious",
        "recovered",
    ),
    infectious_compartments=["infectious"],
)

# Assign the starting population
sir_model.set_initial_population(
    distribution=
    {
        "susceptible": 990., 
        "infectious": 10.,
    }
)

# Add a dynamic infection flow that transitions people from susceptible to infectious
sir_model.add_infection_frequency_flow(
    name="infection", 
    contact_rate=1.,
    source="susceptible", 
    dest="infectious",
)

# Add a constant flow that transitions people from infectious to recovered
sir_model.add_transition_flow(
    name="recovery", 
    fractional_rate=0.333,
    source="infectious", 
    dest="recovered",
)

# Add a constant death flow that transitions people from infectious out of the model
sir_model.add_death_flow(
    name="infection_death", 
    death_rate=0.05,
    source="infectious",
)

# Run the model
sir_model.run()

# Display the outputs
# Note that we previously set the pandas plotting backend to plotly for interactive plots
sir_model.get_outputs_df().plot()

So we already have a basic model set up in only a few commands
that also express the code's intention.
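
To make concrete what the model above is solving, here is a minimal sketch that integrates the same three flows by hand in plain Python. It assumes the infection flow is frequency-dependent, with a force of infection of `contact_rate * I / N`, and uses a fixed-step fourth-order Runge-Kutta scheme. The function names `sir_derivatives` and `integrate_sir` are hypothetical helpers introduced for illustration, and this is not how _summer_ obtains its solution, so small numerical differences from the plot above should be expected.

```python
def sir_derivatives(state, contact_rate, recovery, infection_death):
    """Return (dS/dt, dI/dt, dR/dt) for a frequency-dependent SIR model with infection deaths."""
    s, i, r = state
    n = s + i + r  # Living population, which shrinks as deaths occur
    infection = contact_rate * s * i / n
    return (
        -infection,
        infection - recovery * i - infection_death * i,
        recovery * i,
    )


def integrate_sir(initial_state, end_time, n_steps, **rates):
    """Integrate with fixed-step fourth-order Runge-Kutta, returning one state per time point."""
    step = end_time / n_steps
    states = [tuple(initial_state)]
    for _ in range(n_steps):
        y = states[-1]
        k1 = sir_derivatives(y, **rates)
        k2 = sir_derivatives(tuple(v + 0.5 * step * k for v, k in zip(y, k1)), **rates)
        k3 = sir_derivatives(tuple(v + 0.5 * step * k for v, k in zip(y, k2)), **rates)
        k4 = sir_derivatives(tuple(v + step * k for v, k in zip(y, k3)), **rates)
        states.append(
            tuple(
                v + step / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                for v, a, b, c, d in zip(y, k1, k2, k3, k4)
            )
        )
    return states


# Same starting population, time span and rates as the summer model above
trajectory = integrate_sir(
    initial_state=(990.0, 10.0, 0.0),
    end_time=20.0,
    n_steps=2000,
    contact_rate=1.0,
    recovery=0.333,
    infection_death=0.05,
)
final_s, final_i, final_r = trajectory[-1]
print(f"At time 20: S={final_s:.1f}, I={final_i:.1f}, R={final_r:.1f}")
```

Because the death flow removes people from the system, the three compartments sum to less than the starting 1000 by the end of the run.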

## The code style we will adopt from here onward
There are a few aspects of this Python code that can be improved,
which will make life easier for us later on in the series.
Throughout the remaining notebooks, we'll adhere to these 
"better" standards of programming which should make the code more expressive,
improve the notebook interface and allow us to write more efficient code
(e.g. when we come to calibrating our models).
Specifically, we are going to:

- Package up the construction of our model into a function
- Explicitly declare the "type" of the data structures that are expected to go into this function (as "arguments") and to come back out at the end
- Include a "docstring" (a descriptive string at the top of the function body) to explain the function's purpose in more detail
- Distinguish numerical parameters that we might like to adjust later from the model "configuration" settings
- Use _summer2_ `Parameter` objects rather than `float`s for our model parameters

In [None]:
def build_sir_model(  # Declare a Python function
    model_config: dict,  # ... that expects a dictionary as the type of its one argument
) -> CompartmentalModel:  # ... and returns a summer compartmental model object
    """
    This is the function docstring
    Generate an instance of an SIR model from a set of requested configurations/settings,
    with a fixed set of compartment names, a fixed infectious compartment
    and a fixed set of flows between the compartments.
    
    Args:
        model_config: Values needed for model construction other than the parameter values   
    Returns:
        model: The summer model object
    """

    compartments = (
        "susceptible",
        "infectious",
        "recovered",
    )
    analysis_times = (
        model_config["start_time"], 
        model_config["end_time"],
    )
    model = CompartmentalModel(
        times=analysis_times,
        compartments=compartments,
        infectious_compartments=["infectious"],
    )
    model.set_initial_population(
        distribution=
        {
            "susceptible": model_config["population"] - model_config["seed"], 
            "infectious": model_config["seed"],
        }
    )
    model.add_infection_frequency_flow(
        name="infection", 
        contact_rate=Parameter("contact_rate"),
        source="susceptible", 
        dest="infectious",
    )
    model.add_transition_flow(
        name="recovery", 
        fractional_rate=Parameter("recovery"),
        source="infectious", 
        dest="recovered",
    )
    model.add_death_flow(
        name="infection_death", 
        death_rate=Parameter("infection_death"),
        source="infectious",
    )
    return model

## Getting the model, running it and examining the results
Now we can use our model building function to get an instance of this model,
run it, and have a look at the compartment size progression over time.
Note that we use the plotting functions built into pandas objects to do this.
Pandas is a very widely used library for data processing, which we will use extensively in this series.
Because considerable output wrangling may be necessary after we have run our model, and this wrangling is not specific to infectious diseases,
we'll use external libraries for these steps.

First, decide on some settings to use to set up the model object.

In [None]:
# Declare some quantities that we don't intend to change
config = {
    "population": 1000.,
    "seed": 10.,
    "start_time": 0.,
    "end_time": 20.,
}

# Build the model with this configuration
sir_model = build_sir_model(config)

Now we have a model object ready to run,
except that we still need to specify our parameter values.
Keeping the adjustable parameters separate from the fixed configuration like this
will stand us in good stead as our models get more complicated,
although there are few strict rules about what belongs where.
For example, we could have made "population" a parameter
rather than part of `config`.
Perhaps most importantly,
it's better not to have to build a new model
every time we want to run it with new parameters,
although it doesn't make much difference here.

In [None]:
# Now specify all the rate parameters we need
parameters = {
    "recovery": 0.333,
    "infection_death": 0.05,
    "contact_rate": 1.,
}

# Run with these parameters and plot the outputs
sir_model.run(parameters=parameters)
compartment_values = sir_model.get_outputs_df()
compartment_values.plot()

## Digging into the model object
Now that we have our `CompartmentalModel` object,
we can use this structure to inspect some aspects of what is going on under the surface,
for example, compartments, flows and other attributes.
This is **highly recommended**,
to ensure that the model you have created is consistent with what you intended.
Try using tab completion in this notebook to inspect the range of methods and
attributes that are available for a `CompartmentalModel` object.

In [None]:
print(f"Modelled compartments are: {sir_model.compartments}")
print("\nFlows implemented are:")
for i_flow in sir_model.flows:
    print(i_flow)
print(f"\nEvaluation times are: {sir_model.times}")
print("\nModelled compartment sizes are:")
compartment_values

## Epidemiological messages
This is clearly a very simple model of an epidemic caused by a short-lived pathogen that induces complete immunity in its host.
However, despite its simplicity, it does capture a surprising number of the actual features of an epidemic caused by an infection
of this type.
In general, models of infectious disease transmission should be only as complicated as they need to be,
which means that the additional complexity that we might need to inject into this model is highly dependent on the purpose that
we will be using it for - or the epidemiological question that we will be addressing through our analysis.
It may also be dependent on us having sufficient epidemiological understanding of the epidemic to be able to incorporate these
features - with a reasonable level of confidence that we are actually capturing the processes that we are interested in
(including empirical data to estimate the parameters that we need to build our more complicated model).

Let's think of some of the epidemiological features that this very simple model **does** capture:
- Very broadly, this model gives us the shape that epi curves often follow - looking vaguely like a bell
- There is an exponential growth phase when the population remains largely susceptible
- The growth rate of the epidemic slows as the proportion of susceptibles decreases
- Transmission per infectious person declines as the proportion of the population that is susceptible decreases
- As the effective reproduction number falls below one, the epidemic peaks and begins to decline
- As the epidemic ends and transmission declines towards zero, susceptibles are depleted, but not completely - a proportion of the population remains susceptible even after the epidemic
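
The link between the effective reproduction number and the epidemic peak can be checked numerically. The sketch below assumes the parameter values used earlier in this notebook and frequency-dependent transmission, for which the basic reproduction number is `contact_rate / (recovery + infection_death)`. It uses a crude forward-Euler loop purely for illustration (this is not _summer_'s solver), and the variable names are our own.

```python
contact_rate, recovery, infection_death = 1.0, 0.333, 0.05

# Basic reproduction number: new infections per case in a fully susceptible population
r0 = contact_rate / (recovery + infection_death)
print(f"R0 = {r0:.2f}")
print(f"Prevalence peaks once the susceptible proportion falls to {1.0 / r0:.3f}")

# Forward-Euler integration with a small fixed step (illustration only)
s, i, r = 990.0, 10.0, 0.0
step = 0.001
n_steps = 20_000  # Covers times 0 to 20
records = []  # Tuples of (time, infectious, effective reproduction number)
for k in range(n_steps):
    n = s + i + r  # Living population
    r_eff = r0 * s / n  # Effective reproduction number at this time
    records.append((k * step, i, r_eff))
    infection = contact_rate * s * i / n
    ds = -infection
    di = infection - (recovery + infection_death) * i
    dr = recovery * i
    s, i, r = s + step * ds, i + step * di, r + step * dr

# Find the record with the largest infectious compartment
peak_time, peak_infectious, r_eff_at_peak = max(records, key=lambda rec: rec[1])
print(f"Peak prevalence at t = {peak_time:.2f}, where R_eff = {r_eff_at_peak:.3f}")
print(f"Susceptible proportion remaining at t = 20: {s / (s + i + r):.3f}")
```

The effective reproduction number at the time of peak prevalence should come out very close to one, and the final line shows that some of the population is still susceptible at the end of the run.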

Let's think of some of the epidemiological features that this model **does not** capture:
- Any heterogeneity in the background population with regard to progression through infection states after exposure
- Any heterogeneity in transmission, such as greater transmission between people within the population with similar characteristics
- Any heterogeneity in the pathogen, such as multiple strains with different characteristics circulating through the population
- Any changes in how people transition through their stages over time that might be induced through changes external to the model (i.e. other than those related to the changes in the population distribution across compartments resulting from transmission of this immunising pathogen)
- Tracking any modelled quantities other than the sizes of the model's compartments

We will return to these features and how to elaborate our base model to capture them over the following notebooks.