# Heterogeneous mixing and transmission assumptions
Heterogeneous mixing can be a very confusing area of infectious disease modelling,
even with relatively simple compartmental models.
Let's work through this slowly and carefully.
This may not be the most fun notebook,
but if we work through these issues methodically
we'll avoid confusion later.
Please read [notebook 09](./09-freq-dens-transmission.ipynb)
to ensure you understand frequency and density-dependent transmission before
coming to this notebook.

## Equivalence between approaches
We previously explored the differences between density-dependent
and frequency-dependent transmission.
Let's now explore this in the context of a heterogeneous mixing model.
To get started, let's do our standard preliminaries, set up a
generic shell of a model, and create a simple stratification object that
only stratifies the population into two categories and allows for 
a $2 \times 2$ mixing matrix that defines interactions between these categories.

In [None]:
import pandas as pd
import numpy as np
pd.options.plotting.backend = "plotly"
from jax import numpy as jnp

from summer2 import CompartmentalModel, Stratification, Multiply
from summer2.parameters import Parameter, DerivedOutput, Function

In [None]:
def build_sir_model(
    config: dict,
) -> CompartmentalModel:
    """
    Get the shell of a simple, unstratified model,
    with no infection process implemented yet - we'll define this later.
    
    Args:
        config: User requests to define model construction
    Returns:
        The model object, that won't do much yet  
    """
    
    compartments = config["compartments"]
    analysis_times = (0.0, config["end_time"])
    model = CompartmentalModel(
        times=analysis_times,
        compartments=compartments,
        infectious_compartments=("infectious",),
    )
    model.set_initial_population(
        distribution=
        {
            "susceptible": config["population"] - config["seed"], 
            "infectious": config["seed"],
        },
    )
    
    # Recovery is the only transition we'll define at this stage
    model.add_transition_flow(
        name="recovery", 
        fractional_rate=1.0 / Parameter("infectious_period"),
        source="infectious", 
        dest="recovered",
    )
    
    model.request_output_for_compartments(
        "prevalence",
        "infectious",
    )
    
    return model

In [None]:
def build_simple_strat(
    compartments: list,
    mixing_matrix: jnp.ndarray,
) -> Stratification:
    """
    Get a stratification that just divides the population into two groups
    (not intended to represent any specific population characteristic),
    splits the population between these groups 
    according to the user-requested parameter,
    and sets the mixing matrix.
    
    Args:
        compartments: The compartments to be stratified 
            (here all the model's compartments)
        mixing_matrix: The mixing matrix for this stratification
    Returns:
        The completed Stratification object    
    """
    
    mix_strat = Stratification(
        "groups",
        ["group1", "group2"],
        compartments,
    )
    
    prop1 = Parameter("prop1")
    prop2 = 1.0 - prop1
    mix_strat.set_population_split(
        {
            "group1": prop1,
            "group2": prop2,
        }
    )

    mix_strat.set_mixing_matrix(mixing_matrix)

    return mix_strat

Let's define some standard values for model configuration.
We'll set the total population to one,
which should make things easier later.
With a population of one,
we can think of the compartment sizes as representing
proportions of the total modelled population.

In [None]:
model_config = {
    "end_time": 40.0,
    "population": 1.0,
    "seed": 0.01,
    "compartments": (
        "susceptible", 
        "infectious", 
        "recovered",
    ),
}

In [None]:
parameters = {
    "risk_per_contact": 0.5,
    "infectious_period": 4.0,
    "prop1": np.random.uniform(),
    "mixing_value": np.random.uniform(),
}

Now we're ready to think through the epidemiology,
and we have four possible sets of assumptions:
with and without stratification, factorially combined with
density-dependent and frequency-dependent transmission.

## Density-dependent transmission, unstratified
Under this assumption (which we've covered previously), 
we have no mixing matrix
and so the parameter to the density-dependent transmission
flow governs the infection rate.
As mentioned before, this parameter represents the rate at which two specific
individuals in the population come into contact with one another.

If we keep the risk of transmission per contact set to one,
then we are saying that each infectious person in the population
comes into contact with each susceptible person once per time unit.

## Density-dependent transmission, stratified
If we apply the simple two-stratum stratification we defined above
and want to have the option to implement heterogeneous mixing later,
we'll need to define a $2 \times 2$ mixing matrix.
Although this is deliberately defeating the purpose of heterogeneous mixing,
let's try to keep the behaviour the same as for the unstratified model.
We've now got some proportion of the population assigned to `group1`
and the remainder assigned to `group2`, 
but other than their mixing behaviours,
these groups will have the same characteristics.
To calculate the force of infection for a group,
we take the corresponding row of the mixing matrix (e.g. the top row for `group1`),
multiply it element-wise by the number of infectious people 
in each of our two strata, and sum the results.
The values of the matrix therefore represent the rates
at which two individuals from specific sub-groups come into effective contact.
To keep the dynamics the same as for the unstratified version,
we need to set our matrix to a $2 \times 2$ matrix of ones,
as we do within the loop below.
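
As a quick check of this reasoning, here is a minimal sketch that does the force of infection arithmetic by hand with NumPy.
It is not the calculation `summer2` performs internally;
the value of `prop1` is arbitrary and the variable names are made up for illustration.

```python
import numpy as np

risk_per_contact = 0.5
prop1 = 0.3  # Arbitrary illustrative split, not taken from the model
total_infectious = 0.01

# Unstratified: the parameter multiplied by the number of infectious people
foi_unstratified = risk_per_contact * total_infectious

# Stratified: the infectious population is divided between the two groups
infectious_by_group = np.array([prop1, 1.0 - prop1]) * total_infectious
mixing_matrix = np.ones((2, 2))

# Each row of the matrix is multiplied by the infectious numbers and summed,
# giving one force of infection value per group
foi_stratified = risk_per_contact * mixing_matrix @ infectious_by_group

print(foi_unstratified, foi_stratified)
```

Because every entry of the matrix is one, each row simply adds the two groups' infectious numbers back together,
so both groups should experience the unstratified force of infection whatever value `prop1` takes.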

## Frequency-dependent transmission, unstratified
Now let's start from the same dynamics as we did in the density-dependent case,
but this time using the frequency-dependent transmission assumption.
Here, although the force of infection calculation involves dividing through
by the total population size,
we have set the total population to one,
so we get exactly the same dynamics without any other changes.
Remember that in this case,
the transmission parameter represents
the rate at which an individual is contacted by _any_
other individual within the population,
but when the population size is one,
the parameter values we need to represent this are the same.
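
The sketch below illustrates why a population of one makes the two assumptions coincide.
The helper functions `density_foi` and `frequency_foi` are hypothetical,
written here on the assumption that the force of infection takes the usual forms described above
(they are not part of `summer2`).

```python
def density_foi(contact_rate, n_infectious):
    """Density-dependent: scales with the number of infectious people."""
    return contact_rate * n_infectious


def frequency_foi(contact_rate, n_infectious, population):
    """Frequency-dependent: scales with the proportion of people infectious."""
    return contact_rate * n_infectious / population


# With a total population of one, the two assumptions coincide
foi_dens_unit = density_foi(0.5, 0.01)
foi_freq_unit = frequency_foi(0.5, 0.01, 1.0)

# With a larger population, the same parameter value gives different results
foi_dens_large = density_foi(0.5, 10.0)
foi_freq_large = frequency_foi(0.5, 10.0, 1000.0)

print(foi_dens_unit, foi_freq_unit)
print(foi_dens_large, foi_freq_large)
```

The second pair of values shows that the equivalence is a consequence of our choice of population size,
not a general property of the two assumptions.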

## Frequency-dependent transmission, stratified
This one is a bit harder to understand than the other three,
which is the point.
Here we are calculating the force of infection as the sum of the
values from a row of the mixing matrix multiplied element-wise
by the **_prevalence_** of infection in each of the infecting
strata we are simulating.
This time round,
if we kept the values of the mixing matrix the same
as in the density-dependent transmission assumption,
we would double the rate of transmission.
This is because the prevalence of infection is
the same across our two strata
(regardless of the proportion of the population assigned to each stratum,
because we have not changed anything else epidemiologically
about how our sub-populations behave).
To keep the modelled dynamics the same 
as for our other three possible assumptions above,
we need to ensure that the rows of our mixing matrix sum to one.
We can think of this as ensuring that a proportion of the contacts
come from each of the two infecting strata in the model.
The reason for this is that we are now thinking of the rate of contact
for an individual we are modelling as being the rate at which
that person receives contacts from _any_ other person in the population,
whichever stratum that other person belongs to.
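
To make the doubling concrete, here is a minimal hand calculation with NumPy.
It is only a sketch of the arithmetic described above, not of `summer2`'s internals,
and the values of `prop1` and the entries of `row_normalised` are arbitrary choices for illustration.

```python
import numpy as np

risk_per_contact = 0.5
prop1 = 0.3  # Arbitrary illustrative split, not taken from the model
prevalence = 0.01

group_sizes = np.array([prop1, 1.0 - prop1])
infectious_by_group = group_sizes * prevalence

# Prevalence within each group is the same as the overall prevalence
prevalence_by_group = infectious_by_group / group_sizes

# Unstratified reference value (total population of one)
foi_unstratified = risk_per_contact * prevalence

# A matrix of ones counts the overall prevalence once per infecting stratum
ones_matrix = np.ones((2, 2))
foi_ones = risk_per_contact * ones_matrix @ prevalence_by_group

# Rows that sum to one take a weighted average of the prevalences instead
row_normalised = np.array(
    [
        [0.8, 0.2],
        [0.4, 0.6],
    ]
)
foi_normalised = risk_per_contact * row_normalised @ prevalence_by_group

print(foi_unstratified, foi_ones, foi_normalised)
```

The matrix of ones should give twice the unstratified force of infection in both groups,
while any matrix whose rows sum to one should recover the unstratified value.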

There are lots of ways we could ensure that the totals
of the rows of our mixing matrix sum to one,
but one fairly simple way to do it is
to ensure that the proportion of each of the group's 
contacts that come from the same group is `mixing_value`,
while the remainder of the contacts (`1.0 - mixing_value`)
come from the other stratum.
There are lots of other possible assumptions,
and this assumption might not be appropriate in some situations,
but we'll implement this one in `build_frequency_mixing_matrix` just below.

In [None]:
def build_frequency_mixing_matrix(mixing_value):
    return jnp.array(
        [
            [mixing_value, 1.0 - mixing_value],
            [1.0 - mixing_value, mixing_value],
        ]
    )
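
As a sanity check on this construction, the sketch below confirms that the rows sum to one for a range of values.
To keep it self-contained it uses a NumPy stand-in, `frequency_mixing_matrix_np`,
which is a hypothetical copy of the function above rather than anything provided by `summer2`.

```python
import numpy as np


def frequency_mixing_matrix_np(mixing_value):
    """NumPy stand-in for build_frequency_mixing_matrix, for illustration only."""
    return np.array(
        [
            [mixing_value, 1.0 - mixing_value],
            [1.0 - mixing_value, mixing_value],
        ]
    )


# Row sums for several values of the within-group contact proportion
row_sums = {
    mixing_value: frequency_mixing_matrix_np(mixing_value).sum(axis=1)
    for mixing_value in (0.0, 0.25, 0.5, 0.9, 1.0)
}

for mixing_value, sums in row_sums.items():
    print(mixing_value, sums)
```

This is why we can draw `mixing_value` at random in the parameters dictionary
and still expect the same dynamics as for the other three assumptions.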

OK, now we're all set up,
let's use nested loops to run the model
under each of our four combinations of assumptions.

In [None]:
transmission_options = ("dens", "freq")
stratification_options = ("stratified", "unstratified")
outputs = pd.DataFrame()

for trans_opt in transmission_options:
    for stratify_opt in stratification_options:
        model = build_sir_model(model_config)
        
        # Density dependence
        if trans_opt == "dens":
            model.add_infection_density_flow(
                name="infection", 
                contact_rate=Parameter("risk_per_contact"),
                source="susceptible", 
                dest="infectious",
            )
            if stratify_opt == "stratified":
                mixing_matrix = jnp.array(
                    [
                        [1.0, 1.0],
                        [1.0, 1.0],
                    ]
                )
                mix_strat = build_simple_strat(model_config["compartments"], mixing_matrix)
                model.stratify_with(mix_strat)

        # Frequency dependence
        else:
            model.add_infection_frequency_flow(
                name="infection", 
                contact_rate=Parameter("risk_per_contact"),
                source="susceptible", 
                dest="infectious",
            )
            if stratify_opt == "stratified":
                mixing_matrix = Function(
                    build_frequency_mixing_matrix, 
                    (Parameter("mixing_value"),),
                )
                mix_strat = build_simple_strat(model_config["compartments"], mixing_matrix)
                model.stratify_with(mix_strat)

        # Run and collate the results
        model.run(parameters)
        outputs[f"{trans_opt}_{stratify_opt}"] = model.get_derived_outputs_df()["prevalence"]

## Checking the behaviour
Let's make sure that we do get the same results back with each of these
four assumptions.
(Toggle the `plotly` curves on and off by clicking on the legend
to demonstrate that the curves are all exactly overlying one another.)

In [None]:
differences = outputs.min(axis=1) - outputs.max(axis=1)
assert all(abs(differences) < 1e-9), "There's a discrepancy"
outputs.plot()

Phew, that was a little laborious, 
and we didn't see any interesting epidemiological dynamics.
However, hopefully we now have a starting point
to really understand heterogeneous mixing.