# Infectious disease (ID) modelling
ID modelling is a fascinating field that spans many disciplines.
The field can be thought of as an extension of infectious disease epidemiology,
complex systems analysis or mathematics,
such that modellers come from a broad range of backgrounds, including science, engineering,
mathematics, clinical medicine and public health.
Despite this, there are relatively few courses available in the discipline
and essentially no coursework degrees dedicated to the area.
Moreover, there is considerable potential for greater integration of
ID modelling with the burgeoning field of data science,
which this textbook seeks to leverage.

## What do we mean by ID modelling?
Many forms of "modelling" can be used to shed light on aspects of various infectious diseases.
Any simpler system that provides insight into the pathogen or 
infectious disease we are interested in can be referred to as a model.
For example, infecting a mouse with _Mycobacterium tuberculosis_ and examining the
pathological features that are generated as a consequence of the infection is a "model" of a human infectious disease.
Similarly, statistical modelling is commonly used to understand a range of diseases,
including infectious diseases, but is distinct from the models we consider here.

This textbook is directed at models consisting of **_mechanistic simulations_** of
infectious disease transmission.
That is, we are interested in computational simulations that explicitly represent
the mechanisms underlying the system of interest.
Whereas statistical modelling may identify relationships between risk factors 
and clinical outcomes of interest, 
it does not necessarily provide any insights into the reasons underlying these relationships.
That is, statistical modelling might address the question:
_"What is the association between risk factor A and disease outcome B?"_,
whereas mechanistic modelling could additionally address the questions:
_"Why is this association present?"_, 
_"What are the underlying epidemiological drivers responsible for this association?"_ and
_"What do we predict would happen if we implemented a new intervention
to target risk factor A?"_.

## How can ID modelling help us?
Modelling can help us answer questions that cannot be explored in other ways.
For example, we can predict the trajectory of an epidemic that has not yet finished
under various assumptions about how we might respond to it.
This is called "scenario analysis" and can also be undertaken retrospectively
to consider past "counterfactuals".
In this case, we might estimate an epidemic's dynamics under an alternative
past that never occurred.
These are probably the first uses of ID models that would occur to many of us.
However, even the construction of a model itself can be an extremely useful exercise,
because it challenges us to think about how transmission is taking place,
what quantities might be important to the epidemic's dynamics
and what additional information we might need to obtain to gain greater insight.
Along these lines, if we have a certain intuition about how an epidemic is likely to progress,
a model can help us to challenge and understand our own reasons for coming to that conclusion.

## Why is ID modelling so different?
While there is doubtless some overlap between other types of simulation models
(e.g. non-communicable disease models) and ID models, 
there are also important differences.
The fundamental distinction arises from the positive feedback loop present in the system,
whereby the rate at which people become infected is driven by the number of
infectious people in the population.
This apparently minor consideration fundamentally changes the nature of the
systems we are dealing with, and draws us into the world of complex systems analysis.
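
As a minimal sketch of this feedback loop, the snippet below calculates the rate of new infections under frequency-dependent transmission. It does not use `summer`, and the function name, parameter names and values are illustrative assumptions rather than anything defined elsewhere in this book.

```python
def new_infections_per_day(contact_rate, susceptible, infectious, population):
    """Infections per day, assuming frequency-dependent transmission."""
    # The per-person risk of infection rises with the proportion infectious.
    force_of_infection = contact_rate * infectious / population
    return force_of_infection * susceptible


# Illustrative values only: 1,000 people and a contact rate of 0.5 per day.
few_infectious = new_infections_per_day(
    0.5, susceptible=900, infectious=10, population=1000
)
more_infectious = new_infections_per_day(
    0.5, susceptible=900, infectious=20, population=1000
)
print(few_infectious, more_infectious)
```

With the same number of susceptible people, doubling the number of infectious people doubles the rate at which new infections occur, and each new infection adds to the infectious pool in turn.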

Note that these systems need not necessarily be highly complicated in order to be complex.
For example, a model as simple as a three-compartment "SIR" (susceptible, infectious, recovered) 
infectious disease model can have dynamic behaviours that may be difficult to anticipate
until we run the system forward in time to examine its behaviour.
In systems such as this, or with a little additional complexity
(e.g. allowing reinfection after recovery),
we may be able to identify stable equilibrium points that the system will approach,
and to consider the effect of minor changes in perturbing the system from
an equilibrium point.
This means that our models can present us with counter-intuitive behaviours,
whereby small changes to model inputs 
can have large or unanticipated consequences for the overall dynamics, 
which is a classic feature of a complex system.
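
To make this concrete, the following sketch runs a bare-bones SIR model forward in time with simple Euler steps and compares the total number of people ever infected for two contact rates that differ only modestly. It is written in plain Python rather than `summer`, and the function name, parameter values and time step are all assumptions chosen for illustration.

```python
def epidemic_size(contact_rate, recovery_rate=0.2, population=1000.0,
                  seed=1.0, days=3000, time_step=0.1):
    """Total ever infected in a simple SIR model, using Euler integration."""
    susceptible = population - seed
    infectious = seed
    for _ in range(int(days / time_step)):
        # Infection depends on both the susceptible and infectious numbers.
        infections = contact_rate * susceptible * infectious / population * time_step
        recoveries = recovery_rate * infectious * time_step
        susceptible -= infections
        infectious += infections - recoveries
    # Everyone who is no longer susceptible has been infected at some point.
    return population - susceptible


# Two contact rates on either side of the recovery rate of 0.2 per day.
size_lower = epidemic_size(contact_rate=0.18)
size_higher = epidemic_size(contact_rate=0.22)
print(f"Contact rate 0.18: {size_lower:.0f} people ever infected")
print(f"Contact rate 0.22: {size_higher:.0f} people ever infected")
```

Under these assumptions, the lower contact rate leads to an outbreak that fades out after a handful of infections, whereas the higher rate produces an epidemic many times larger, even though the two inputs differ by only around a fifth.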

## Why use `summer`?
[`summer`](https://github.com/monash-emu/summer2)
is a Python library that supports the construction of compartmental models of ID dynamics.
A key objective of the `summer` platform is to allow for the construction of 
models that may be both complex and complicated,
while ensuring that they are reliably constructed, run quickly
and allow the details of the construction process to be abstracted away from the user interface.
In this series of notebooks, we illustrate several core features of ID modelling
in the context of relatively simple models constructed with `summer`.
However, these features can easily be composed into more complicated models as needed
for the investigation of more complicated epidemic dynamics,
such as specific public health interventions directed at the control of infectious diseases.
The models that we (the Epidemiological Modelling Unit, School of Public Health and Preventive Medicine,
Monash University) typically use to produce research manuscripts and policy advice documents
are more complicated than those introduced in this textbook.
However, becoming familiar with `summer`'s syntax should help readers of this series
to construct more complicated models suitable for more formal analyses.
As well as making the process of code construction less laborious for the user,
`summer` allows for code that clearly expresses the modeller's epidemiological intentions,
reduces the risk of error and integrates easily with established libraries
to improve code speed.

## Why Python?
Python has become a leading general-purpose programming language in a broad range of fields,
and has a wide user community, extensive online support and an expressive and easily understood syntax.
Commonly used Python libraries (such as `pandas` and `numpy`) are core tools and are hugely popular 
in fields that include data science and machine learning.
Becoming familiar with Python and these commonly used libraries will facilitate the reader's future use of `summer`, 
but will also support the development of a range of skills that are 
rapidly transferable to other fields.
In addition to Python and its popular libraries,
we use Jupyter notebooks that can be accessed easily from our repository through Google Colab and other interfaces.
In general, we have used `summer` wherever we need an ID modelling tool
to illustrate an epidemiological principle,
but have chosen commonly used external libraries otherwise.

## The scope of this book
This textbook aims to be as introductory as possible,
but also to illustrate several key principles of modelling infectious diseases
in Python using `summer`.
Therefore, some basic knowledge of the following areas will assist understanding:
- Infectious disease control and epidemiology
- Python programming and syntax
- The Python packages `pandas`, `numpy` and `plotly`
- Interaction with Jupyter notebooks

In order to keep the focus of this textbook on infectious disease modelling,
we do not provide the following:
- API documentation for `summer`, which can be found [elsewhere](https://github.com/monash-emu/summer2)
- Comprehensive examples of how `summer` methods can be implemented
- Discussion of ID epidemiology, programming with Python, 
Python packages or notebook interaction, for which many other resources are available

Nevertheless, in this series, we aim to introduce good programming practices for
the construction of `summer`-based models in Python,
while also providing code that is terse and expressive, and that provides insight
into ID epidemiology.

A wealth of online resources is available to support readers in developing
additional skills in areas such as Python, coding, data wrangling,
plotting and Jupyter notebooks.
These include tutorials, videos, source code documentation,
discussion forums (e.g. Stack Overflow) and many more.
A quick internet search will uncover a huge range of such sites,
so we do not provide a list of links in this book.

Perhaps most importantly, dive in!
Change the code, hack away, see what error messages you get,
and let us know what you think!