# Renewal models

## The renewal process
The renewal process forms the basis of the models throughout the notebooks in this folder.

The idea of a renewal process is quite a general one,
and there are many ways of constructing a renewal model in practice;
we focus on one such approach in our implementation.
For context, we will begin by describing how
our approach to renewal modelling fits with other literature
that adopts this approach.

### What is a renewal process?
The basic idea of a renewal process is that previously observed cases of
an infectious disease in an epidemic are responsible for triggering 
the cases observed on a subsequent date.
This of course captures the fundamental process by which
infectious disease epidemics propagate.
As such, the number of cases on this date of interest has two ingredients:
1. The average infectiousness of active cases on the date of interest
2. The weighted infectiousness of all previous cases

Throughout this notebook,
we are using the term "case" to mean an infection episode
and are assuming that the point of onset of the case is the same as
the point of infection.
This definition ignores the potentially important differences 
between the point of infection, the onset of symptoms and
the point of notification through the surveillance systems.
This issue should be addressed formally and will 
be considered in subsequent notebooks.

This renewal process is illustrated in the following figure:
![](renew_illustration.svg)

#### Infectiousness of cases at a point in time
This quantity is called the "instantaneous reproduction number" 
or "$R_t$", and is separated out in renewal models because it
corresponds to a widely understood epidemiological intuition.
Specifically, $R_t$ should be thought of as 
the average number of secondary cases that
an infectious individual would be expected to generate if they 
retained their current level of infectiousness throughout their 
infectious period.
It is therefore a key metric of the epidemic state and 
the current effectiveness of control interventions.

#### The weighted infectiousness of all previous cases
Having separated out $R_t$, the remaining quantities that we need
in order to relate $R_t$ to new incidence can be thought
of as the proportion of each person's total infectiousness
that occurs on a particular day of their infection episode.
That is, it is the (discrete) distribution of an individual's 
infectiousness over time, normalised such that 
its values sum to one.
By "infectiousness" we mean an individual's effective infectiousness,
which is influenced by various host, pathogen and social factors.
If we define infectiousness in this way, 
then we can consider this distribution to reflect the generation interval.

#### The equation
We can define the following relationship between 
the instantaneous reproduction number at a point in time, 
the time series of preceding cases and the generation interval:
$$ I_t = R_t\sum_{\tau<t}I_{\tau}{g(t-\tau)} $$
Here $t$ indicates the time dependence (usually in days), 
such that $I_t$ represents the observed incidence on day $t$,
and $R_t$ represents the instantaneous reproduction number on day $t$.
The $g(\cdot)$ function represents the (normalised, discrete) generation interval distribution,
that is, the distribution of the time at which index cases go on to infect their contacts.
We use the symbol $\tau$ to index discrete time from the current time $t$
back to the start of the simulation
(or sometimes this may be truncated when $g(\cdot)$ has declined to very low levels).
We take the product of the incidence on each preceding day
and the mass of the generation interval distribution for the time from that
day to the day for which we are making the calculation ($t$).
Finally, we sum these products to obtain the total contribution of all previous cases,
each weighted by its relative infectiousness at time $t$.

Note that this equation defines the relationship between
the three quantities of interest (the case incidence,
the instantaneous reproduction number and the generation interval).
Although this defines the main conceptual relationship we are interested in,
this does not imply that we will invariably be calculating 
case incidence from the other two quantities.
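
As an illustration of the equation above, the following is a minimal sketch of the renewal calculation in plain Python.
The function names (`renewal_step`, `simulate`), the seed value and the generation interval values are assumptions chosen for illustration only, and are not taken from our implementation.
The generation interval is stored as a list in which index $k$ holds $g(k)$, with $g(0)=0$ so that cases do not contribute to incidence on their own day.

```python
def renewal_step(incidence, r_t, gen_interval):
    """Return I_t = R_t * sum over tau < t of I_tau * g(t - tau).

    incidence holds I_0 .. I_{t-1}; gen_interval[k] holds g(k), with g(0) = 0.
    """
    t = len(incidence)
    force = 0.0
    for tau in range(t):
        lag = t - tau
        # Lags beyond the end of the generation interval contribute nothing
        # (this is the truncation mentioned in the text).
        if lag < len(gen_interval):
            force += incidence[tau] * gen_interval[lag]
    return r_t * force


def simulate(seed, r_values, gen_interval):
    """Run the renewal equation forward from a single seeded day."""
    incidence = [float(seed)]
    for r_t in r_values:
        incidence.append(renewal_step(incidence, r_t, gen_interval))
    return incidence


# Illustrative (assumed) generation interval over lags 0..3 days, summing to one
g = [0.0, 0.2, 0.5, 0.3]
epidemic = simulate(seed=10, r_values=[2.0] * 5, gen_interval=g)
```

Here the first simulated day is $2 \times 10 \times 0.2 = 4$ cases, and the second is $2 \times (4 \times 0.2 + 10 \times 0.5) = 11.6$ cases.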

## Applications
Having reviewed some key literature on the use of renewal models for infectious disease
inference, we divide the various renewal models we identified into those that 
explicitly model case incidence forward in time as a latent state and those that do not.

### Non-latent state models
Several approaches have been proposed for estimating 
the effective reproduction number that are based on renewal models.
These are summarised by 
[Gostic et al.](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008409) 
and we provide our perspective as follows.

#### Extensions of compartmental models
[Bettencourt and Ribeiro](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0002185)
proposed an approach to estimating the reproduction number over time
that is conceptually derived from the assumptions underpinning compartmental models.
Unless modified, this approach inherently assumes that the generation
interval distribution is exponential.
This is unlikely to hold for many infectious diseases and 
flexibility in the generation interval distribution is a key component of
many renewal modelling approaches, including the form we have implemented.

#### Likelihood-based approach
[Wallinga and Teunis](https://academic.oup.com/aje/article-lookup/doi/10.1093/aje/kwh255)
proposed a novel likelihood-based approach to estimating the reproduction number.
This method first defines the relative likelihood that a given case $i$
has been infected by another specific case in the epidemic $j$,
rather than any of the other cases in the epidemic $k$, as:
$$ p_{ij} = \frac{w(t_i - t_j)}{\sum_{k \neq i}{w(t_i - t_k)}} $$
Here $w(\cdot)$ is the generation interval distribution,
and the inequality indicates that cases cannot infect themselves 
(which could also be achieved by setting $w(t)$ to zero at $t=0$).
The reproduction number can then be considered for a specific case $j$ as:
$$ R_j = \sum_{i}{p_{ij}} $$
Although this method was reported by the authors as being relatively robust 
to incomplete reporting, the approach does inherently assume
that observed cases were triggered by preceding cases in the same epidemic.
It also assumes that the chance that any case $j$ infected a specific 
case $i$ is independent of the chance that $j$ infected any other case $k$,
although this assumption is common to many infectious disease models.

Importantly, this approach does not directly estimate the instantaneous 
reproduction number ($R_t$), but rather works forward to consider the
number of future cases that were infected by cases at a particular point in the epidemic.
This quantity is referred to as the "case" or "cohort" reproduction number.
Because it depends on transmission occurring after the date of interest,
it is less applicable to estimates of the effect of population-wide interventions.
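
The two equations for this likelihood-based approach can be sketched as follows.
This is an illustrative reading of the formulas as written here rather than the authors' code;
the function name `case_reproduction_numbers`, the onset days and the weights $w$ are all assumed values.
Weights are indexed by lag in whole days and are treated as zero for negative lags, so that later cases cannot infect earlier ones.

```python
def case_reproduction_numbers(onset_days, w):
    """Return R_j = sum_i p_ij for each case j.

    onset_days[i] is t_i (integer days); w[k] is the weight at a lag of k days.
    """
    def weight(lag):
        # Zero outside the support of w, including all negative lags
        return w[lag] if 0 <= lag < len(w) else 0.0

    n = len(onset_days)
    r = [0.0] * n
    for i in range(n):
        # Denominator: total weight over all candidate infectors k != i
        denom = sum(
            weight(onset_days[i] - onset_days[k]) for k in range(n) if k != i
        )
        if denom == 0.0:
            continue  # no plausible infector in the data (e.g. the index case)
        for j in range(n):
            if j != i:
                # p_ij: relative likelihood that case i was infected by case j
                r[j] += weight(onset_days[i] - onset_days[j]) / denom
    return r


onsets = [0, 1, 2, 2]   # assumed onset days for four cases
w = [0.0, 0.6, 0.4]     # assumed weights for lags of 0, 1 and 2 days
r_case = case_reproduction_numbers(onsets, w)
```

In this small example the first case is assigned a case reproduction number of 1.8 and the second 1.2,
while the two most recent cases are assigned zero because no later cases have yet been observed.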

#### Conjugate prior-based approach
[Cori et al.](https://academic.oup.com/aje/article/178/9/1505/89262?login=false)
leverage the fact that the gamma distribution 
is the conjugate prior for the Poisson sampling distribution.
With this elegant approach, the authors start from the same renewal equation
as presented above (although notated in the reverse direction as 
$R_t\sum_{s=1}^{t}{I_{t-s}w_s}$).
Assuming that the number of cases on day $t$ is Poisson-distributed,
the prior estimate of $R_t$ can then be updated 
to obtain a posterior distribution for $R_t$.
In practice, this approach provides estimates of $R_t$ that 
are highly variable from day to day. The authors 
address this problem by extending their technique to estimate
$R_t$ over a time window. This provides smoother and 
more epidemiologically plausible estimates of $R_t$,
although these can only be calculated once the first full time window has elapsed.
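
The conjugate update over a time window can be sketched as below.
We assume a gamma prior on $R_t$ with shape $a$ and scale $b$, in which case the posterior is also gamma,
with shape $a + \sum_s I_s$ and rate $1/b + \sum_s \Lambda_s$, where $\Lambda_s = \sum_{k \geq 1} I_{s-k} w_k$ and the sums over $s$ run across the window.
The function names, prior values and incidence series are assumptions for illustration, not the authors' code.

```python
def total_infectiousness(incidence, w, t):
    """Lambda_t = sum over s >= 1 of I_{t-s} * w_s (w[0] is unused)."""
    max_lag = min(t, len(w) - 1)
    return sum(incidence[t - s] * w[s] for s in range(1, max_lag + 1))


def posterior_rt(incidence, w, t, window=1, prior_shape=1.0, prior_scale=5.0):
    """Return (shape, rate) of the gamma posterior for R_t over a window.

    The window covers days t - window + 1 .. t and should start after day 0.
    The prior values here are assumed for illustration.
    """
    days = range(t - window + 1, t + 1)
    shape = prior_shape + sum(incidence[s] for s in days)
    rate = 1.0 / prior_scale + sum(
        total_infectiousness(incidence, w, s) for s in days
    )
    return shape, rate


incidence = [10, 4, 12, 15, 20]   # assumed daily case counts
w = [0.0, 0.2, 0.5, 0.3]          # assumed generation interval, lags 0..3
shape, rate = posterior_rt(incidence, w, t=4, window=2)
posterior_mean = shape / rate
```

Widening `window` pools more days of data into a single update, which is what smooths the estimates.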

A limitation of this approach is that it cannot be
applied directly to predict incidence or $R_t$ into the future.

### Latent state models
Since the onset of the COVID-19 pandemic, 
renewal models that include explicit representation of an incidence state 
that is calculated sequentially for each time point from the preceding 
incidence values have become more prominent.
These models have underpinned some of the most prominent
publications relating to COVID-19 epidemiology,
as well as several software packages that have been used 
by public health agencies for estimating $R_t$ to guide policy.
With these approaches, a function of time is typically constructed 
to represent the evolution of $R_t$, 
the parameters of which can then be calibrated until an accurate
fit to the empirical observations is achieved.

#### EpiNow2
The [EpiNow2 platform](https://epiforecasts.io/EpiNow2/articles/estimate_infections.html) 
has been particularly popular for estimation of $R_t$ of COVID-19.
Although it offers a non-mechanistic approach in addition to 
the renewal equation-based method, 
the default implementation is based around the renewal model.
The renewal equation implemented by EpiNow2 is denoted:
$$ I_t=R_t\sum_{\tau=1}^{g_{max}}{g(\tau|\mu_g,\sigma_g)}I_{t-\tau}$$
Conceptually, this equation captures the same principles as outlined above,
with the $g_{max}$ limit to the summation indicating that 
the generation interval is typically truncated 
when the distribution mass falls to negligible levels.
$\mu_g$ and $\sigma_g$ indicate the parameters of the distribution
used to represent the generation interval,
which may be gamma or log-normal.
By default, the $R_t$ function of time is modelled as a Gaussian process.
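
To make the roles of $\mu_g$, $\sigma_g$ and $g_{max}$ concrete, the following sketch builds a truncated, normalised generation interval from a gamma distribution.
It is a deliberately crude discretisation (the density is evaluated at whole days and renormalised) that assumes $\mu_g$ and $\sigma_g$ are the mean and standard deviation;
it is not EpiNow2's own discretisation scheme, and the function name `discretised_gamma` is hypothetical.

```python
import math


def discretised_gamma(mean, sd, g_max):
    """Return [g(0), g(1), ..., g(g_max)] with g(0) = 0 and sum equal to one.

    Crude approximation: gamma density evaluated at integer days, then
    renormalised after truncation at g_max.
    """
    # Moment matching: mean = shape / rate, variance = shape / rate**2
    shape = (mean / sd) ** 2
    rate = mean / sd ** 2

    def log_pdf(x):
        return (
            shape * math.log(rate)
            + (shape - 1.0) * math.log(x)
            - rate * x
            - math.lgamma(shape)
        )

    density = [math.exp(log_pdf(tau)) for tau in range(1, g_max + 1)]
    total = sum(density)
    # Index 0 is padded with zero so that list index equals the lag in days
    return [0.0] + [d / total for d in density]


# Assumed values: mean of 5 days, standard deviation of 2 days, 14-day cut-off
g = discretised_gamma(mean=5.0, sd=2.0, g_max=14)
```

The resulting list can be used directly as the weights in the summation from $\tau=1$ to $g_{max}$.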

#### Europe application
In this application, 
[Flaxman et al.](https://www.nature.com/articles/s41586-020-2405-7#Sec11) 
adapted the standard renewal approach to incorporate the depletion of the susceptible population.
Analyses were undertaken separately for each European country,
with additional notation used to indicate the country being addressed.
If this additional notation is ignored, the equation for 
the renewal model with susceptible depletion becomes:
$$c_t=\left(1-\frac{\sum_{i=1}^{t-1}{c_i}}{N}\right)R_t\sum_{\tau=0}^{t-1}{c_\tau g_{t-\tau}}$$
where $N$ represents the total population of the country of analysis.
With this approach, the proportion of the population remaining susceptible
at each time point scales the incidence value,
so that a greater value of $R_t$ is required to produce the same incidence as the susceptible population is depleted.
This illustrates the flexibility of this approach, 
in that such models can track additional quantities 
that emerge from the model explicitly as time evolves.
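
A minimal sketch of the renewal equation with susceptible depletion is given below.
It follows the equation above with time indexed from zero, and takes the cumulative sum over all days before $t$;
the function name, seed, generation interval and population size are assumptions for illustration and are not taken from the published model.

```python
def simulate_with_depletion(seed, r_values, gen_interval, population):
    """Run the renewal equation forward, scaling by the susceptible fraction."""
    cases = [float(seed)]
    for r_t in r_values:
        t = len(cases)
        # Proportion of the population not yet infected before day t
        susceptible_fraction = 1.0 - sum(cases) / population
        # Weighted infectiousness of all previous cases
        force = sum(
            cases[tau] * gen_interval[t - tau]
            for tau in range(t)
            if t - tau < len(gen_interval)
        )
        cases.append(susceptible_fraction * r_t * force)
    return cases


g = [0.0, 0.2, 0.5, 0.3]   # assumed generation interval, lags 0..3
depleted = simulate_with_depletion(10, [2.0] * 20, g, population=1000)
# A very large population effectively switches depletion off, for comparison
unlimited = simulate_with_depletion(10, [2.0] * 20, g, population=1e12)
```

With the same $R_t$ throughout, the `depleted` series falls progressively further below the `unlimited` series as cumulative infections accumulate.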

#### Manaus application
[Faria et al.](https://www.science.org/doi/10.1126/science.abh2644) 
used the renewal approach to consider the COVID-19 epidemics 
in the city of Manaus in Brazil's State of Amazonas.
Manaus suffered a major epidemic of wild-type virus 
followed by a major epidemic of the Gamma (or P.1) variant.
As such, the study authors addressed several questions 
pertaining to the extent of immune escape of the new variant.
To achieve this, the previous renewal equation is extended to:
$$i_{s,t}=\left(1-\frac{n_{s,t}}{N}\right)R_{s,t}\sum_{\tau<t}{i_{s,\tau}g_{t-\tau}}$$
Again, $N$ represents the total population size,
while $n_{s,t}$ represents the extent of population immunity to strain $s$
and incorporates both immunity from previous infection with strain $s$, 
as well as the partial cross-protection afforded by infection 
with the other circulating strain.
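
The structure of this two-strain equation can be sketched as follows.
This is a deliberately simplified illustration and not the study's formulation:
we assume that $n_{s,t}$ is the cumulative incidence of strain $s$ plus a fixed fraction (the hypothetical `cross_protection` parameter) of the cumulative incidence of the other strain, with a constant $R$ for each strain.
All names and values are assumptions.

```python
def two_strain_renewal(n_days, seeds, r_by_strain, gen_interval,
                       population, cross_protection):
    """Simulate strain-specific incidence i[s][t] with shared immunity.

    seeds maps strain -> (seed_day, seed_cases).
    cross_protection is an assumed fraction of other-strain infections
    that count towards immunity against strain s.
    """
    strains = list(seeds)
    incidence = {s: [0.0] * n_days for s in strains}
    for t in range(n_days):
        for s in strains:
            seed_day, seed_cases = seeds[s]
            if t == seed_day:
                incidence[s][t] = float(seed_cases)
                continue
            # n_{s,t}: own-strain infections plus partial cross-protection
            immune = 0.0
            for other in strains:
                cumulative = sum(incidence[other][:t])
                immune += cumulative if other == s else cross_protection * cumulative
            susceptible_fraction = max(0.0, 1.0 - immune / population)
            force = sum(
                incidence[s][tau] * gen_interval[t - tau]
                for tau in range(t)
                if t - tau < len(gen_interval)
            )
            incidence[s][t] = susceptible_fraction * r_by_strain[s] * force
    return incidence


g = [0.0, 0.2, 0.5, 0.3]                       # assumed generation interval
seeds = {"wild": (0, 10), "gamma": (3, 5)}     # assumed seeding
r_by_strain = {"wild": 2.0, "gamma": 3.0}      # assumed constant R per strain
result = two_strain_renewal(30, seeds, r_by_strain, g,
                            population=1000, cross_protection=0.5)
```

Setting `cross_protection` closer to zero represents greater immune escape by the second strain.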

#### Advantages of latent state renewal models
Because these approaches work forward in time, 
it is generally easier to extend the mechanistic aspects of 
these types of models and to incorporate additional considerations 
(such as the complex population immunity profiles 
considered in the Manaus application).
For this reason, the approach implemented within our package
is most similar to these latent state models.