## Semi-mechanistic modelling introduction

### Rationale
Consider this equation from Faria, et al.:
$i_{s,t} = (1-\frac{n_{s,t}}{N})R_{s,t}\sum_{\tau<t} i_{s,\tau}g_{t-\tau}$

This is a standard "semi-mechanistic" or "renewal" modelling approach,
in that the population is not explicitly partitioned into categories or compartments.
By contrast, our standard compartmental models do partition the population in this way.
These include both the standard SEIR `summer` models
and Romain's semi-mechanistic models,
which are compartmental with an additional non-mechanistic random walk
flow adjustment.

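First, ignoring strains, we'll consider the equation below.
In the notation used in this notebook, $i_t$ is the incidence at time $t$,
$n_t$ is the number of people no longer susceptible by time $t$,
$N$ is the population size, $R_t$ is the reproduction number
and $g$ is the generation time distribution.

$i_t = (1-\frac{n_t}{N})R_t\sum_{\tau<t} i_{\tau}g_{t-\tau}$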

This is essentially the same as the equation provided by [Cori, et al.](https://academic.oup.com/aje/article/178/9/1505/89262?login=true):

$\mathbf{E}[I_t] = R_t\sum_{s=1}^t I_{t-s}w_s$
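
To make the correspondence explicit (a short sketch, assuming that the weights $w_s$ of Cori, et al. play the role of $g_s$ and that $\tau$ runs from $0$ to $t-1$), substitute $s = t - \tau$ in the sum:

$\sum_{\tau=0}^{t-1} i_{\tau}g_{t-\tau} = \sum_{s=1}^{t} i_{t-s}g_{s}$

The two expressions then differ only in the susceptible depletion factor $(1-\frac{n_t}{N})$ and in the left-hand side being written as an expectation.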

For now, we'll also ignore susceptible depletion and a varying reproduction number, and so consider:

$i_t = R_0\sum_{\tau<t} i_\tau g_{t-\tau}$

This notebook builds up this basic approach from first principles.

In [None]:
from scipy.stats import gamma
import numpy as np
import pandas as pd
pd.options.plotting.backend = "plotly"

from emu_renewal.distributions import GammaDens

### Parameters
Choose some arbitrary model parameters to get started.

In [None]:
n_times = 20
seed = 1.0
r0 = 2.0
incidence = np.zeros(n_times)
incidence[0] = seed

### Generation time
Get a distribution we can sensibly use for the generation time,
which could represent an acute immunising respiratory infection.

In [None]:
# Generation time summary statistics
gen_mean = 5.0
gen_sd = 1.5

# Calculate equivalent parameters
var = gen_sd ** 2.0
scale = var / gen_mean
a = gen_mean / scale
gamma_params = {"a": a, "scale": scale}

# Get the increment in the CDF
# (i.e. the integral over the increment by one in the distribution)
gen_time_densities = np.diff(gamma.cdf(range(n_times + 1), **gamma_params))

pd.Series(gen_time_densities, index=range(n_times)).plot(labels={"index": "time", "value": "density"}).update_layout(showlegend=False)
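
As a quick sanity check on the moment matching above, the sketch below rebuilds the gamma distribution from the same mean and standard deviation and confirms that it recovers them. The names `dist` and `interval_probs` are introduced here for illustration only. The interval probabilities should sum to just under one, because the window stops at `n_times`.

```python
import numpy as np
from scipy.stats import gamma

gen_mean, gen_sd = 5.0, 1.5
n_times = 20

# Moment matching for a gamma distribution: mean = a * scale, variance = a * scale ** 2
scale = gen_sd ** 2.0 / gen_mean
a = gen_mean / scale
dist = gamma(a=a, scale=scale)

# Probability mass falling in each unit interval [k, k + 1)
interval_probs = np.diff(dist.cdf(np.arange(n_times + 1)))

print(dist.mean(), dist.std())  # Should recover gen_mean and gen_sd up to floating point error
print(interval_probs.sum())  # Equal to the CDF at n_times, so slightly below one
```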

### Check calculations make sense from first principles
Loop in native Python with pre-calculated generation times
to be completely explicit (but slow).
Note that the delay is specified as `t - tau - 1` so that
it starts from zero for the most recent preceding day,
which indexes the first element of the generation time densities.
As calculated in the previous cell,
each element of `gen_time_densities` is the integral of the gamma probability
density over a one-unit interval.

In [None]:
for t in range(1, n_times):
    val = 0
    for tau in range(t):  # For each day preceding the day of interest
        delay = t - tau - 1  # The generation time index for each preceding day to the day of interest
        val += incidence[tau] * gen_time_densities[delay] * r0  # Calculate the incidence value
    incidence[t] = val
incidence
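
To check the indexing by hand, it can help to run the same double loop on a toy generation time distribution. The values below are hypothetical and chosen only for convenience: half of transmission happens one step after infection and half two steps after, with a reproduction number of two. Each infection then contributes one new case at each of the next two steps, so the incidence should follow the Fibonacci sequence.

```python
import numpy as np

# Hypothetical toy inputs, not the gamma-based values used in this notebook
toy_gen = np.array([0.5, 0.5, 0.0, 0.0, 0.0, 0.0])  # Mass at delay index 0 and 1 only
toy_r0 = 2.0
n_toy = len(toy_gen)

toy_inc = np.zeros(n_toy)
toy_inc[0] = 1.0  # Seed infection
for t in range(1, n_toy):
    for tau in range(t):
        delay = t - tau - 1  # Zero-based index into the generation time array
        toy_inc[t] += toy_inc[tau] * toy_gen[delay] * toy_r0

print(toy_inc)  # [1. 1. 2. 3. 5. 8.]
```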

Remove the inner loop by working with arrays for the incidence and generation time distribution
(and check that the results are unchanged).

In [None]:
for t in range(1, n_times):
    delays = [t - tau - 1 for tau in range(t)]
    gammas = gen_time_densities[delays]
    incidence[t] = (incidence[:t] * gammas).sum() * r0
incidence

We can get this down to a one-liner if preferred.
The epidemic is going to just keep going up exponentially, of course, 
because $R_{0} > 1$ and there is no susceptible depletion.

In [None]:
for t in range(1, n_times):
    incidence[t] = (incidence[:t] * gen_time_densities[:t][::-1]).sum() * r0
incidence
pd.Series(incidence).plot(labels={"index": "day", "value": "incidence"})
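
The reversed slice in the one-liner is a discrete convolution of the incidence history with the generation time densities. As a sketch of that equivalence (recomputing the inputs so the block runs on its own), the full convolution from `np.convolve`, shifted by one step and scaled by `r0`, should reproduce the incidence series:

```python
import numpy as np
from scipy.stats import gamma

n_times, r0 = 20, 2.0
gen_mean, gen_sd = 5.0, 1.5
scale = gen_sd ** 2.0 / gen_mean
gen_time_densities = np.diff(gamma.cdf(np.arange(n_times + 1), a=gen_mean / scale, scale=scale))

incidence = np.zeros(n_times)
incidence[0] = 1.0
for t in range(1, n_times):
    incidence[t] = (incidence[:t] * gen_time_densities[:t][::-1]).sum() * r0

# Full convolution: element m is the sum over k of incidence[k] * gen_time_densities[m - k]
conv = np.convolve(incidence, gen_time_densities)
matches = np.allclose(incidence[1:], r0 * conv[: n_times - 1])
print(matches)
```

The loop is still needed to generate the series, because each value depends on the earlier ones. The convolution is only a check once the series exists.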

Already some interesting phenomena are emerging,
in that the humps are the generations of cases from the first seeding infection,
which progressively smooth into one another with successive generations.

### Threshold behaviour
Next let's check that the threshold behaviour is approximately correct.
We would expect a declining epidemic with $R_{0} < 1$ (even without
susceptible depletion implemented yet).

In [None]:
r0 = 0.8
for t in range(1, n_times):
    incidence[t] = (incidence[:t] * gen_time_densities[:t][::-1]).sum() * r0
pd.Series(incidence).plot(labels={"index": "day", "value": "incidence"})
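
One way to check the sub-critical behaviour more quantitatively: without susceptible depletion, each generation is $R_0$ times the size of the previous one, so the total number of infections should approach $\frac{seed}{1-R_0}$. The sketch below assumes a much longer window than `n_times` (the value `n_long = 400` is arbitrary) so that the epidemic has time to die out and the generation time densities sum to almost exactly one.

```python
import numpy as np
from scipy.stats import gamma

# Assumed longer horizon than the notebook's n_times, so the epidemic can die out
n_long, r0_sub, seed_val = 400, 0.8, 1.0
gen_mean, gen_sd = 5.0, 1.5
scale = gen_sd ** 2.0 / gen_mean
gen_dens = np.diff(gamma.cdf(np.arange(n_long + 1), a=gen_mean / scale, scale=scale))

inc = np.zeros(n_long)
inc[0] = seed_val
for t in range(1, n_long):
    inc[t] = (inc[:t] * gen_dens[:t][::-1]).sum() * r0_sub

total = inc.sum()
expected_total = seed_val / (1.0 - r0_sub)  # Geometric series over generations: 1 + R0 + R0^2 + ...
print(total, expected_total)
```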

## Susceptible depletion
We'll now start to think about susceptible depletion.

Again, from this equation in Faria, et al:
$i_{s,t} = (1-\frac{n_{s,t}}{N})R_{s,t}\sum_{\tau<t} i_{s,\tau}g_{t-\tau}$

And again reducing the complexity of this by ignoring strains,
we'll now consider the equation with susceptible depletion included:
$i_t = (1-\frac{n_t}{N})R_t\sum_{\tau<t} i_{\tau}g_{t-\tau}$

### Parameters
Set model parameters, now including the population size.
Also get the generation times as described previously
(parameter calculation code now packaged away).
We'll need a higher reproduction number to deplete 
the susceptible population within the time window we have.

In [None]:
r0 = 6.0
pop = 100.0
incidence = np.zeros(n_times)
incidence[0] = seed

gen_time_densities = GammaDens().get_densities(n_times, gen_mean, gen_sd)
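
The internals of `GammaDens` are not shown in this notebook. As a rough guide only, and assuming it wraps the same moment-matching calculation used earlier, a stand-in might look like the following. The function name `get_gamma_densities` is hypothetical and is not part of `emu_renewal`.

```python
import numpy as np
from scipy.stats import gamma


def get_gamma_densities(window_len: int, mean: float, sd: float) -> np.ndarray:
    """Hypothetical stand-in: gamma probability mass in each unit interval [k, k + 1)."""
    scale = sd ** 2.0 / mean  # Moment matching: variance = a * scale ** 2
    a = mean / scale  # Moment matching: mean = a * scale
    return np.diff(gamma.cdf(np.arange(window_len + 1), a=a, scale=scale))


stand_in_densities = get_gamma_densities(20, 5.0, 1.5)
print(stand_in_densities.round(3))
```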

### Model run
Run the model with susceptible depletion,
decrementing the susceptible population by the incidence at each step.
We'll also zero out any negative values for the susceptibles
that could occur if the time step is too large.
For reasonable parameter values, these typically seem to be very small.

In [None]:
suscept = pop - seed
for t in range(1, n_times):
    suscept_prop = suscept / pop
    infect_contribution_by_day = incidence[:t] * gen_time_densities[:t][::-1] * r0
    this_inc = infect_contribution_by_day.sum() * suscept_prop
    incidence[t] = this_inc
    suscept = max(suscept - this_inc, 0.0)
pd.Series(incidence).plot(labels={"index": "day", "value": "incidence"})

Now with susceptible depletion, we have an epi-curve that goes up in the initial phase with $R_0 > 1$,
but comes back down as susceptibles are depleted and so $R_t$ falls below one.
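
To see when the threshold is crossed, the effective reproduction number can be tracked alongside the incidence. The sketch below assumes $R_t$ is simply `r0` multiplied by the proportion still susceptible at each step, and the array name `r_eff` is introduced here for illustration. It reruns the same loop and reports the first step at which $R_t$ drops below one, if that happens within the window.

```python
import numpy as np
from scipy.stats import gamma

n_times, r0, pop, seed = 20, 6.0, 100.0, 1.0
gen_mean, gen_sd = 5.0, 1.5
scale = gen_sd ** 2.0 / gen_mean
gen_time_densities = np.diff(gamma.cdf(np.arange(n_times + 1), a=gen_mean / scale, scale=scale))

incidence = np.zeros(n_times)
incidence[0] = seed
suscept = pop - seed
r_eff = np.zeros(n_times)
r_eff[0] = r0 * suscept / pop  # Value immediately after seeding
for t in range(1, n_times):
    suscept_prop = suscept / pop
    r_eff[t] = r0 * suscept_prop  # Effective reproduction number applied at this step
    incidence[t] = (incidence[:t] * gen_time_densities[:t][::-1]).sum() * r0 * suscept_prop
    suscept = max(suscept - incidence[t], 0.0)

below_one = np.flatnonzero(r_eff < 1.0)
print(below_one[0] if below_one.size else "R_t stays above one in this window")
```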

## Susceptible depletion and varying $R_{0}$
Building on the previous cells,
we'll now look at varying the reproduction number with time,
because inferring the variation in this quantity is what
I'd like to achieve from these models.

As previously, the equation we're considering will be:
$i_t = (1-\frac{n_t}{N})R_t\sum_{\tau<t} i_{\tau}g_{t-\tau}$
However, now the $R_{t}$ value is determined both
by the proportion of the population remaining susceptible
and an external "random" process.
For now, the process will just take arbitrary values,
but several functions could be used in its place
(including a random walk and an
autoregressive process).
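
For illustration, the two candidate processes mentioned above could be generated as follows. This is a minimal sketch, not the approach used later in this notebook: the helper `ar1`, the coefficient `phi`, the step standard deviation of 0.1 and the choice to treat the process as a log-scale multiplier are all assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(0)  # Fixed seed for reproducibility only
n_steps = 20
noise = rng.normal(loc=0.0, scale=0.1, size=n_steps)  # Assumed step standard deviation


def ar1(noise: np.ndarray, phi: float) -> np.ndarray:
    """AR(1) process x[t] = phi * x[t - 1] + noise[t], starting from a previous value of zero."""
    x = np.zeros(len(noise))
    prev = 0.0
    for t, eps in enumerate(noise):
        prev = phi * prev + eps
        x[t] = prev
    return x


random_walk = np.cumsum(noise)  # Random walk: the phi = 1 special case
autoregressive = ar1(noise, phi=0.8)  # phi < 1 pulls the process back towards zero

# Treat both as log-scale adjustments, so the multiplier on the reproduction number stays positive
rw_multiplier = np.exp(random_walk)
ar_multiplier = np.exp(autoregressive)
```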

### Parameters
Reset the incidence array and get the generation times as previously.
The population size and `r0` carry over from the previous section.

In [None]:
incidence = np.zeros(n_times)
incidence[0] = seed

gen_time_densities = GammaDens().get_densities(n_times, gen_mean, gen_sd)

### Model run
Run the model with susceptible depletion,
and a variable intrinsic reproduction number.
Now we can manipulate the shape of the epicurve a little more.

In [None]:
process_req = [2.0, 1.2, 2.4, 1.8]
process_times = np.linspace(0.0, n_times, len(process_req))
process_vals = np.interp(range(n_times), process_times, process_req)
suscept = pop - seed
for t in range(1, n_times):
    suscept_prop = suscept / pop
    infect_contribution_by_day = incidence[:t] * gen_time_densities[:t][::-1] * r0
    this_inc = infect_contribution_by_day.sum() * suscept_prop * process_vals[t]
    incidence[t] = this_inc
    suscept = max(suscept - this_inc, 0.0)
pd.Series(incidence).plot(labels={"index": "day", "value": "incidence"})
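
It is worth looking at where the interpolation places the requested values. The sketch below repeats the interpolation on its own. Because the knots are spread from `0` to `n_times`, the last requested value sits one step beyond the final simulated index, so the series approaches it without reaching it.

```python
import numpy as np

n_times = 20
process_req = [2.0, 1.2, 2.4, 1.8]

# Knots are evenly spaced from 0 to n_times inclusive
process_times = np.linspace(0.0, n_times, len(process_req))
process_vals = np.interp(np.arange(n_times), process_times, process_req)

print(process_times)
print(process_vals[0], process_vals[-1])  # First value is 2.0, last is about 1.89 rather than 1.8
```

If the last requested value should be reached exactly, one option would be to end the knots at `n_times - 1` instead.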

Alternatively, we may wish to work with the log of the process values
rather than the values on the linear scale,
exponentiating them inside the loop.
We get the same result back this way.

In [None]:
process_times = np.linspace(0.0, n_times, len(process_req))
log_process_vals = np.log(np.interp(range(n_times), process_times, process_req))  # Process values on the log scale
suscept = pop - seed
for t in range(1, n_times):
    suscept_prop = suscept / pop
    infect_contribution_by_day = incidence[:t] * gen_time_densities[:t][::-1] * r0
    this_inc = infect_contribution_by_day.sum() * suscept_prop * np.exp(log_process_vals[t])  # Back-transform to the linear scale
    incidence[t] = this_inc
    suscept = max(suscept - this_inc, 0.0)
pd.Series(incidence).plot(labels={"index": "day", "value": "incidence"})
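
One caveat when moving to the log scale: if the logged values are themselves interpolated and then exponentiated, the result agrees with linear interpolation only at the knots. In between, it follows a geometric rather than an arithmetic path. The sketch below uses hypothetical knots at integer times so the midpoints are easy to check by hand.

```python
import numpy as np

# Hypothetical knots at integer times, chosen so the midpoints are easy to check by hand
knot_times = np.array([0.0, 2.0, 4.0])
knot_vals = np.array([2.0, 1.2, 2.4])
times = np.arange(5)

linear_vals = np.interp(times, knot_times, knot_vals)  # Arithmetic mean at the midpoints
log_space_vals = np.exp(np.interp(times, knot_times, np.log(knot_vals)))  # Geometric mean at the midpoints

print(linear_vals)
print(log_space_vals)
```

At `t = 1`, for example, the linear value is 1.6 while the log-space value is $\sqrt{2.0 \times 1.2} \approx 1.55$.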