# Transmission assumptions
In this notebook, we'll work through the assumptions underpinning
frequency- and density-dependent transmission.

## Frequency dependence
In previous chapters, 
we have been using the frequency-dependent transmission assumption.
Under this assumption, the _per capita_ rate of transmission 
for an individual from the susceptible population is proportional
to the **_prevalence_** of infectious persons in the population.
We have been referring to the parameter used for this transition process as the `contact_rate`,
in which case we can consider it as the _per capita_ rate at which
a member of the population will come into effective contact 
with **_any_** other member of the population.
Note that an "effective" contact is defined as a contact that would result in
transmission were it to occur between a susceptible person and an infectious person.
So the calculation of the rate of infection can be thought of as the product of:
- The rate at which a person has effective contacts
- The prevalence of infection in the population,
i.e. the probability of an effective contact being with someone who is currently infected
- The number of susceptibles,
because we are working out the rate at which susceptibles are infected

Comparing against standard transition rates,
we can think of the first two steps as the process of calculating the force of infection,
which is equivalent to the transition rates we have been dealing with
for other flows (except that it varies with the model state).
By contrast, the last step is just multiplying through by the size of the origin compartment as usual.

We can also notate this frequency-dependent assumption mathematically as:

$$ \lambda (t) = \beta \frac{I(t)}{N(t)} $$

where $\lambda (t)$ represents the force of infection,
$\beta$ represents the rate of effective contacts per unit time,
$I(t)$ the number of infectious persons in the population,
$N(t)$ the total population size
and $t$ time.
The absolute rate of infection is therefore $\lambda (t) S(t)$.
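
As a quick check of the arithmetic, the calculation above can be sketched in plain Python.
This is only an illustration and does not use the `summer2` package;
the function name `force_of_infection_frequency` is our own invention,
and the values are the ones we will use for the first model run below.

```python
def force_of_infection_frequency(contact_rate, infectious, population):
    """Frequency dependence: lambda = beta * I / N."""
    return contact_rate * infectious / population


# Illustrative values matching the starting state of the first model below
contact_rate = 1.0  # beta, effective contacts per person per unit time
population = 1.0  # N, total population size
infectious = 0.01  # I, the infectious seed
susceptible = population - infectious  # S

# First two terms of the product: the force of infection
force = force_of_infection_frequency(contact_rate, infectious, population)

# Final term: multiply through by the origin compartment
infection_rate = force * susceptible

print(f"Force of infection: {force:.4f}")
print(f"Absolute rate of infection: {infection_rate:.4f}")
```

With a prevalence of 1%, the force of infection is 1% of the contact rate,
and the absolute rate of infection is that quantity multiplied by the susceptible population.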

Let's return to the really simple SIR model from [notebook 02](./02-basic-model-intro.ipynb).
We'll initially create a function that returns the model object without
any transmission flows, so we can add the infection process we want later.
Then we'll look at the outputs of our standard model as our starting point for comparisons
to other assumptions in this notebook.

In [None]:
try:
    # Only install the package if we're running in Google Colab
    import google.colab
    %pip install summerepi2==1.3.6
except ImportError:
    pass

In [None]:
import pandas as pd
pd.options.plotting.backend = "plotly"

from summer2 import CompartmentalModel
from summer2.parameters import Parameter

In [None]:
def get_sir_base_structure(
    config: dict,
) -> CompartmentalModel:
    """
    Generate a model that doesn't do much in itself, but has basic
    characteristics that we can then use to add our transmission assumptions to.
    
    Args:
        config: The fixed values used in creating the model structure
    Returns:
        The summer model object
    """
    
    compartments = (
        "susceptible",
        "infectious",
        "recovered",
    )
    analysis_times = (0.0, config["end_time"])
    model = CompartmentalModel(
        times=analysis_times,
        compartments=compartments,
        infectious_compartments=("infectious",),
    )
    model.set_initial_population(
        distribution={
            # Use the config passed in as an argument, not the global variable
            "susceptible": config["population"] - config["seed"],
            "infectious": config["seed"],
        },
    )
    
    model.add_transition_flow(
        name="recovery", 
        fractional_rate=Parameter("recovery"), 
        source="infectious", 
        dest="recovered",
    )
    
    return model

In [None]:
model_config = {
    "population": 1.0,
    "seed": 0.01,
    "end_time": 20.0,
}
freq_parameters = {
    "recovery": 0.333,
    "contact_rate": 1.0,
}

In [None]:
sir_freq_model = get_sir_base_structure(model_config)
sir_freq_model.add_infection_frequency_flow(
    name="infection", 
    contact_rate=Parameter("contact_rate"),
    source="susceptible", 
    dest="infectious",
)
sir_freq_model.run(parameters=freq_parameters)
sir_freq_values = sir_freq_model.get_outputs_df()
axis_labels = {"index": "time", "value": "proportion"}
sir_freq_values.plot(labels=axis_labels)

## How is density-dependent transmission different?
The conceptual difference is that the force of infection
is proportional to the number of infectious persons rather than the
prevalence of infectious persons.
That is, for density-dependent transmission:

$$ \lambda (t) = \beta I(t) $$

For either of these assumptions, 
there is also a model input parameter that will be multiplied through
by the size of the infectious population (number or prevalence).
For density-dependent transmission,
the parameter itself can be thought of as incorporating the division by the population size ($N(t)$).

We've used the same symbol ($\beta$) and variable name (`contact_rate`) for our parameter here, 
but it's important to remember that this quantity now represents something very different.
Unfortunately there are no consistent rules for what 
$\beta$ should represent in the general literature,
and so it is essential to examine how it is implemented in any given model.
We should arguably choose a different symbol for each of our assumptions,
but because we're generally avoiding notating our systems as equations,
we'll just stick with $\beta$ for now.
Of course, the name of the parameter is less important than what it represents.

As for frequency-dependent transmission, 
once we have calculated the force of infection, 
we can think of it in a similar way to 
how we think of a parameter to a standard inter-compartmental transition flow.
That is, it is next multiplied by the source compartment
(`susceptible`) when we come to calculating the actual number of people being infected.

Next, let's run a model with density-dependent transmission instead of frequency-dependent.

In [None]:
dens_parameters = {
    "recovery": 0.333,
    "contact_rate": 1.0,
}
sir_dens_model = get_sir_base_structure(model_config)
sir_dens_model.add_infection_density_flow(
    name="infection", 
    contact_rate=Parameter("contact_rate"),
    source="susceptible", 
    dest="infectious",
)
sir_dens_model.run(parameters=dens_parameters)
sir_dens_values = sir_dens_model.get_outputs_df()
sir_dens_values.plot(labels=axis_labels)

The model outputs are identical, 
which is unsurprising because this is a really trivial example.
Because the population size is fixed at a value of one,
the division by $N(t)$ has no effect on the force of infection.

However, next let's consider what happens if we scale the population size back up to 1000.
We can easily recover essentially the same dynamics as we saw previously
for the density-dependent model by multiplying `population` and `seed`
by the new, larger population size and dividing `contact_rate` by it.
The frequency-dependent model would need no change to `contact_rate`,
because the division by $N(t)$ is already built into the force of infection.
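
Before running the model, we can check this rescaling by hand.
The sketch below is plain Python rather than `summer2` code,
and the helper names `force_frequency` and `force_density` are ours.
It assumes prevalence is held at 1% while the population is scaled,
and that the density-dependent parameter is the frequency-dependent one divided by the population size.

```python
def force_frequency(contact_rate, infectious, population):
    """Frequency dependence: lambda = beta * I / N."""
    return contact_rate * infectious / population


def force_density(contact_rate, infectious):
    """Density dependence: lambda = beta * I."""
    return contact_rate * infectious


freq_contact_rate = 1.0
differences = []
for population in (1.0, 1000.0):
    infectious = 0.01 * population  # Keep prevalence at 1% in both cases
    # Assumption: rescale the density-dependent parameter by population size
    dens_contact_rate = freq_contact_rate / population
    freq_force = force_frequency(freq_contact_rate, infectious, population)
    dens_force = force_density(dens_contact_rate, infectious)
    differences.append(abs(freq_force - dens_force))
    print(population, freq_force, dens_force)
```

For both population sizes the two forces of infection agree,
provided the density-dependent parameter is rescaled in this way.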

In [None]:
# Scale all population-related quantities up by a factor of 1000
model_config.update(
    {
        "population": 1000.0,
        "seed": 10.0,
    }
)

# Scale the contact rate parameter down by a factor of 1000
dens_parameters.update({"contact_rate": 0.001})
large_pop_dens_model = get_sir_base_structure(model_config)
large_pop_dens_model.add_infection_density_flow(
    name="infection", 
    contact_rate=Parameter("contact_rate"),
    source="susceptible", 
    dest="infectious",
)
large_pop_dens_model.run(parameters=dens_parameters)
large_pop_dens_values = large_pop_dens_model.get_outputs_df()
large_pop_dens_values.plot(labels=axis_labels)

So here we have shown that if the population size is fixed over time
(or "closed" in modelling terminology),
we can easily recover the same dynamics for both transmission types.
The only difference is the interpretation of the `contact_rate` parameter
that we used to produce these two simulations.
Under the assumption of frequency dependence,
we can interpret this parameter as the _per capita_ rate at which a person
comes into effective contact with any other member of the population.

By contrast, under the assumption of density dependence,
the parameter (`contact_rate`) should be interpreted as the rate at which
two specific individuals come into effective contact.
Therefore, the parameter needs to be smaller by a factor of the size of 
the population under density dependence compared to frequency dependence.

## Changing population size
So far, this is all pretty trivial.
We have demonstrated identical dynamics under these assumptions.
Next, let's consider what would happen if the population size changes over time.
To do this, let's add deaths to all of the compartments so that the population
changes dramatically during the simulation time frame.
This is unrealistic for just about any population if we're thinking of the time unit as days,
but is an easy way to make sure the population size changes rapidly enough to affect model dynamics.

### Frequency dependence, declining population

In [None]:
sir_freq_deaths_model = get_sir_base_structure(model_config)
sir_freq_deaths_model.add_infection_frequency_flow(
    name="infection", 
    contact_rate=Parameter("contact_rate"),
    source="susceptible", 
    dest="infectious",
)
freq_parameters.update({"crude_death_rate": 0.05})
sir_freq_deaths_model.add_universal_death_flows(
    "non_infection_deaths",
    Parameter("crude_death_rate"),
)
sir_freq_deaths_model.run(parameters=freq_parameters)
sir_freq_deaths_values = sir_freq_deaths_model.get_outputs_df()
sir_freq_deaths_values.plot(labels=axis_labels)

### Frequency dependence, unchanged proportional epidemic dynamics
It looks like the model dynamics are different, and in a sense they are.
However, if we look at the proportional sizes of the compartments,
we see that actually the _per capita_ dynamics are essentially the same
as they were throughout our earlier examples.
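
One way to see why this happens is to rewrite the system in terms of proportions.
The following is a sketch under stated assumptions rather than a formal treatment:
we assume the same death rate $\mu$ (our symbol for `crude_death_rate`) applies to every compartment
and that there are no births,
and we introduce $s(t) = S(t)/N(t)$ and $i(t) = I(t)/N(t)$ as our own notation,
with $\gamma$ standing for the `recovery` rate.
The absolute sizes then change according to:

$$ \frac{dS}{dt} = -\beta \frac{I}{N} S - \mu S, \qquad \frac{dN}{dt} = -\mu N $$

Applying the quotient rule to $s = S/N$ gives:

$$ \frac{ds}{dt} = \frac{1}{N}\frac{dS}{dt} - \frac{S}{N^2}\frac{dN}{dt} = -\beta s i - \mu s + \mu s = -\beta s i $$

and, by the same argument for the infectious compartment:

$$ \frac{di}{dt} = \beta s i - \gamma i $$

The death terms cancel, so under these assumptions the proportions follow
the same equations as in a closed population with no deaths.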

In [None]:
sir_freq_deaths_props = sir_freq_deaths_values.div(sir_freq_deaths_values.sum(axis=1), axis=0)
sir_freq_deaths_props.plot(labels=axis_labels)

### Density dependence, declining population
Let's work through the same process for the density-dependent transmission model.
Again, if we look at the compartment sizes directly,
we can see that the model dynamics have changed.

In [None]:
sir_dens_deaths_model = get_sir_base_structure(model_config)
sir_dens_deaths_model.add_infection_density_flow(
    name="infection", 
    contact_rate=Parameter("contact_rate"),
    source="susceptible", 
    dest="infectious",
)
dens_parameters.update({"crude_death_rate": 0.05})
sir_dens_deaths_model.add_universal_death_flows(
    "non_infection_deaths",
    Parameter("crude_death_rate"),
)
sir_dens_deaths_model.run(parameters=dens_parameters)
sir_dens_deaths_values = sir_dens_deaths_model.get_outputs_df()
sir_dens_deaths_values.plot(labels=axis_labels)

### Density dependence, changing infection dynamics with changing population size
However, perhaps more importantly we can also see that the
dynamics are different if we look at the compartment proportions over time.
Specifically, because the population shrinks as the epidemic proceeds,
the rate of infection falls as the `susceptible` and `infectious` compartments
of the model drop, which is not offset by the shrinking denominator
of the population size (as it would be under frequency dependence).
This leads to a smaller epidemic final size relative to population,
and so a greater proportion of the population remaining susceptible throughout.
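
To make the comparison concrete without relying on the plots,
here is a minimal sketch that steps both systems forward with simple Euler updates.
This is not how `summer2` solves the model, and the function `simulate_sir` and its step size are our own assumptions;
the parameter values mirror those used in the cells above.

```python
def simulate_sir(density_dependent, contact_rate, population=1000.0, seed=10.0,
                 recovery=0.333, death_rate=0.05, end_time=20.0, dt=0.01):
    """Euler-step an SIR model with universal deaths, return final proportions."""
    s, i, r = population - seed, seed, 0.0
    for _ in range(int(round(end_time / dt))):
        n = s + i + r
        # Force of infection under the chosen transmission assumption
        if density_dependent:
            force = contact_rate * i
        else:
            force = contact_rate * i / n
        infection = force * s
        recovering = recovery * i
        # Update all compartments simultaneously from the current state
        s, i, r = (
            s + dt * (-infection - death_rate * s),
            i + dt * (infection - recovering - death_rate * i),
            r + dt * (recovering - death_rate * r),
        )
    n = s + i + r
    return s / n, i / n, r / n


freq_props = simulate_sir(density_dependent=False, contact_rate=1.0)
dens_props = simulate_sir(density_dependent=True, contact_rate=0.001)
print("Final susceptible proportion, frequency dependence:", round(freq_props[0], 3))
print("Final susceptible proportion, density dependence:", round(dens_props[0], 3))
```

In this sketch the density-dependent run should finish with a larger susceptible proportion,
because its effective contact rate (`contact_rate` multiplied by the current population size)
declines as the population shrinks.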

In [None]:
sir_dens_deaths_props = sir_dens_deaths_values.div(sir_dens_deaths_values.sum(axis=1), axis=0)
sir_dens_deaths_props.plot(labels=axis_labels)

## When to choose each assumption

Hopefully the last example provides some intuition around 
the differences between frequency- and density-dependent transmission.
Imagine that we have a fixed area,
perhaps a geographical region like a city,
whose population size is increasing over time.
That is, the population density is increasing
because the numerator (the number of people) is increasing
but the denominator (the area) is fixed.
We should ask ourselves whether we expect the rate of transmission
to increase under these conditions for the infectious disease
we're interested in.
If the answer is yes,
as it may be for some directly transmitted infections,
we may wish to represent this through density-dependent transmission.
If the answer is no,
as it may be for many sexually transmitted infections,
then we may wish to represent this through frequency-dependent transmission.

Remember these considerations are only relevant
if the population size is changing in some way.
If the population size is fixed,
the choice doesn't really matter that much
because we can easily recover the same dynamics from either assumption.
We would just need to adjust the parameter value we're using
as appropriate to the assumption we are using
and interpret the meaning of the parameter appropriately.

## Summary
||Frequency dependence |Density dependence|
|---|---|---|
|Quantity scaling the force of infection |Prevalence of infectious persons |Number of infectious persons|
|Equation for $\lambda(t)$ |$\beta \frac{I(t)}{N(t)}$ |$\beta I(t)$ |
|Interpretation of parameter |Rate at which a person comes into effective contact with any other person |Rate at which two specific individuals come into effective contact |
|Parameter value (if $N(t)$ represents number of persons in the population) |Larger |Smaller by a factor of approximately $N(t)$ |
|Dynamics change markedly with changes in population size that are not directly relevant to the infection of interest |No |Yes |