# Derived outputs

## Introduction
In the first notebook, we introduced our general approach to creating and running a simple compartmental model of the transmission of an acute immunising infection. This notebook builds on that model to consider "derived outputs": quantities we might want to examine beyond the estimated compartment sizes over time. In general, we may wish to estimate epidemiological quantities that are derived from some combination of:

- The model compartment sizes for each timestep
- The model flow rates at each timestep
- Model inputs

_summer_ offers a range of approaches to calculating model outputs beyond
just the absolute size of the various compartments modelled,
which are described in the _summer_ API.

In [None]:
# If running on Google Colab, run the following line of code to install the summer package
# %pip install summerepi2

In [None]:
import pandas as pd
pd.options.plotting.backend = "plotly"

from summer2 import CompartmentalModel
from summer2.parameters import Parameter

In [None]:
def get_sir_model(
    model_config: dict,
) -> CompartmentalModel:
    """
    This is the same model as introduced in notebook 01.
    """
    
    compartments = (
        "susceptible",
        "infectious",
        "recovered",
    )
    analysis_times = (
        model_config["start_time"], 
        model_config["end_time"],
    )
    model = CompartmentalModel(
        times=analysis_times,
        compartments=compartments,
        infectious_compartments=["infectious"],
    )
    model.set_initial_population(
        distribution=
        {
            "susceptible": model_config["population"] - model_config["seed"], 
            "infectious": model_config["seed"],
        }
    )
    model.add_infection_frequency_flow(
        name="infection", 
        contact_rate=Parameter("contact_rate"),
        source="susceptible", 
        dest="infectious",
    )
    model.add_transition_flow(
        name="recovery", 
        fractional_rate=Parameter("recovery"),
        source="infectious", 
        dest="recovered",
    )
    model.add_death_flow(
        name="infection_death", 
        death_rate=Parameter("infection_death"),
        source="infectious",
    )
    return model

In [None]:
config = {
    "population": 1000.,
    "seed": 10.,
    "start_time": 0.,
    "end_time": 20.,
}

parameters = {
    "recovery": 0.333,
    "infection_death": 0.05,
    "contact_rate": 1.,
}

sir_model = get_sir_model(config)
sir_model.run(parameters=parameters)
compartment_values = sir_model.get_outputs_df()
compartment_values.plot.area()

## Calculations based on compartment sizes
First let's think about the proportion of the population ever infected.
This might be a particularly important output, because it 
might be the best quantity emerging from our model to compare against data from a serosurvey.
Specifically, we want to know the proportion of the total population that is
in either the `infectious` or `recovered` compartments.
In our very simple model, this quantity is easy to derive from the
compartment size dataframe returned by the `get_outputs_df` method
after the model has run, as used in the previous cell.
However, these calculations can get more complicated,
so we'll demonstrate the syntax for asking the _summer_ model object to calculate this.

In [None]:
# Find the size of the compartments that have ever been infected
sir_model = get_sir_model(config)
sir_model.request_output_for_compartments(
    name="ever_infected", 
    compartments=["infectious", "recovered"]
)

# Find the total population
sir_model.request_output_for_compartments(
    name="total_population",
    compartments=sir_model.compartments,
)

# Get the proportion
sir_model.request_function_output(
    name="prop_ever_infected",
    func=lambda inf, tot: inf / tot,
    sources=["ever_infected", "total_population"],
)
sir_model.run(parameters=parameters)
derived_outputs = sir_model.get_derived_outputs_df()
derived_outputs["prop_ever_infected"].plot(title="Seropositive proportion")
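
For comparison, the same proportion can be worked out by hand from compartment sizes. The sketch below uses made-up compartment values (an assumption for illustration, not output from the model above) and plain Python, simply to show the arithmetic that the three requests above describe: sum the ever-infected compartments, sum all compartments, and divide.

```python
# Illustrative compartment sizes at three time points.
# These numbers are invented for the example, not model output.
compartment_sizes = {
    "susceptible": [990.0, 700.0, 300.0],
    "infectious": [10.0, 200.0, 250.0],
    "recovered": [0.0, 90.0, 400.0],
}
ever_infected_names = ("infectious", "recovered")

prop_ever_infected = []
n_times = len(compartment_sizes["susceptible"])
for i in range(n_times):
    # Numerator: people currently in a compartment reached only after infection
    ever_infected = sum(compartment_sizes[name][i] for name in ever_infected_names)
    # Denominator: everyone still alive in the model at this time point
    total_population = sum(values[i] for values in compartment_sizes.values())
    prop_ever_infected.append(ever_infected / total_population)

print(prop_ever_infected)
```

Note that the denominator shrinks over time in this example, because people who die from infection leave the modelled population.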

## Flow outputs

A **[flow output](/api/model.html#summer.model.CompartmentalModel.request_output_for_flow)** tracks a set of requested flow rates for each timestep. These requests can also select flows between particular strata in a stratified model (see later examples).
For example, we might want to ask the model to track the number of people who died from infection per timestep. Note that this is not represented by any of the explicitly modelled states.

In [None]:
# Build the model from the configuration dictionary (times, population, seed)
sir_model = get_sir_model(config)

# Request that the model calculate a derived output when it is run
sir_model.request_output_for_flow(
    name="deaths", 
    flow_name="infection_death"
)

Now when we run the model, we can obtain a pandas dataframe of the infection-related deaths at each timestep through the model's `get_derived_outputs_df` method, as follows.

In [None]:
# Run the model
sir_model.run(parameters=parameters)

# View the derived outputs that were calculated when the `run()` method was called
modelled_deaths = sir_model.get_derived_outputs_df()
modelled_deaths.plot()
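
To make the idea of a flow rate concrete, here is a minimal sketch that integrates the same SIR equations with a simple forward Euler scheme and records the death flow at each step as the per-capita death rate multiplied by the size of the source compartment. This is an illustration only: `simulate_sir_deaths` is a hypothetical helper, the step size `dt` is an assumption, and this is not how _summer_ solves the model internally.

```python
def simulate_sir_deaths(
    population=1000.0,
    seed=10.0,
    contact_rate=1.0,
    recovery=0.333,
    infection_death=0.05,
    end_time=20.0,
    dt=0.1,  # assumed step size for this sketch
):
    """Forward Euler SIR sketch that records the death flow rate over time."""
    s, i, r = population - seed, seed, 0.0
    times, death_rates = [], []
    n_steps = int(round(end_time / dt))
    for step in range(n_steps + 1):
        # Flow rates at the current time, computed before any state update
        total = s + i + r
        infection_flow = contact_rate * s * i / total  # frequency-dependent
        recovery_flow = recovery * i
        death_flow = infection_death * i

        times.append(step * dt)
        death_rates.append(death_flow)

        # Advance the compartments by one Euler step
        s -= dt * infection_flow
        i += dt * (infection_flow - recovery_flow - death_flow)
        r += dt * recovery_flow
    return times, death_rates


times, death_rates = simulate_sir_deaths()
print(f"Death rate at time zero: {death_rates[0]:.3f} per unit time")
```

The deaths never appear as a compartment in this sketch either, which is why they have to be tracked as a flow.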

### Distinguishing incidence from infection
Let's build a new model that incorporates an explicit delay between infection and later progression (or activation) to the infectious compartment.
Because only the compartment called "_infectious_" is actually infectious (contributes to the force of infection calculation),
there is now a delay between being infected by someone else and becoming infectious yourself.

In [None]:
def get_seir_model():
    """
    An adaptation of the SIR model introduced above, with a couple of
    small differences to turn it into an SEIR model.
    Generate an instance of an SEIR model with the analysis times and
    initial population fixed; rate parameters are supplied at run time.
    
    Returns:
        model: The SEIR compartmental model
    """
    compartments = (
        "susceptible",
        "exposed",
        "infectious",
        "recovered",
    )
    infectious_compartment = [
        "infectious",
    ]
    # Times and population sizes come from the notebook-level `config` dictionary
    analysis_times = (
        config["start_time"], 
        config["end_time"]
    )
    
    model = CompartmentalModel(
        times=analysis_times,
        compartments=compartments,
        infectious_compartments=infectious_compartment,
    )
    
    # Check and assign infectious seed
    pop = config["population"]
    seed = config["seed"]
    suscept_pop = pop - seed
    msg = "Seed larger than population"
    assert suscept_pop >= 0., msg
    
    model.set_initial_population(
        distribution={
            "susceptible": suscept_pop, 
            "infectious": seed}
    )
    
    # Add the flows to the model
    # Parameter names match the keys of the `parameters` dictionary passed to `run`
    model.add_infection_frequency_flow(
        name="infection", 
        contact_rate=Parameter("contact_rate"),
        source="susceptible", 
        dest="exposed",  # This is different from the SIR model
    )
    # This flow didn't exist in the SIR model
    model.add_transition_flow(
        name="progression",
        fractional_rate=Parameter("progression"),
        source="exposed",
        dest="infectious",
    )
    model.add_transition_flow(
        name="recovery", 
        fractional_rate=Parameter("recovery"), 
        source="infectious", 
        dest="recovered",
    )
    model.add_death_flow(
        name="infection_death", 
        death_rate=Parameter("infection_death"), 
        source="infectious",
    )
    return model

Let's track these transitions (infection and progression) explicitly.
The process of infection is intuitive, but let's refer to the process of progressing from the 
exposed to the infectious compartment as "incidence",
because it represents the rate at which new disease episodes occur.

In [None]:
seir_model = get_seir_model()

parameters.update(
    {"progression": 0.333}
)

seir_model.request_output_for_flow(
    name="infection",
    flow_name="infection",
)
seir_model.request_output_for_flow(
    name="incidence", 
    flow_name="progression",
)

When we run the model and plot these outputs in the next cell, the two quantities are similar but not identical.
This is consistent with what we would expect.
The delay between individuals being infected and progressing to active disease
shows up directly as a lag between the infection and incidence curves.

In [None]:
seir_model.run(parameters=parameters)
derived_outputs = seir_model.get_derived_outputs_df()
derived_outputs.plot()
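
The lag between infection and incidence can also be checked numerically. The sketch below integrates the SEIR equations with a forward Euler scheme and compares the times at which the two flows peak. It is an illustration under stated assumptions: `simulate_seir_flows` is a hypothetical helper, the step size is arbitrary, the run is extended to 60 time units (longer than the notebook's 20) so that both peaks are certain to fall inside the window, and none of this reflects how _summer_ solves the model internally.

```python
def simulate_seir_flows(
    population=1000.0,
    seed=10.0,
    contact_rate=1.0,
    progression=0.333,
    recovery=0.333,
    infection_death=0.05,
    end_time=60.0,  # assumed; longer than the notebook run to capture both peaks
    dt=0.1,  # assumed step size for this sketch
):
    """Forward Euler SEIR sketch recording infection and progression flows."""
    s, e, i, r = population - seed, 0.0, seed, 0.0
    times, infection, incidence = [], [], []
    n_steps = int(round(end_time / dt))
    for step in range(n_steps + 1):
        # Flow rates at the current time, computed before any state update
        total = s + e + i + r
        infection_flow = contact_rate * s * i / total
        progression_flow = progression * e
        recovery_flow = recovery * i
        death_flow = infection_death * i

        times.append(step * dt)
        infection.append(infection_flow)
        incidence.append(progression_flow)

        # Advance the compartments by one Euler step
        s -= dt * infection_flow
        e += dt * (infection_flow - progression_flow)
        i += dt * (progression_flow - recovery_flow - death_flow)
        r += dt * recovery_flow
    return times, infection, incidence


times, infection, incidence = simulate_seir_flows()
peak_infection_time = times[infection.index(max(infection))]
peak_incidence_time = times[incidence.index(max(incidence))]
print(f"Infection peaks at t={peak_infection_time:.1f}")
print(f"Incidence peaks at t={peak_incidence_time:.1f}")
```

In this sketch the incidence peak falls after the infection peak, because the exposed compartment keeps filling for as long as the infection flow exceeds the progression flow.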

## Cumulative outputs

You can use a **[cumulative output](/model.html#summer.model.CompartmentalModel.request_cumulative_output)** to request that the model track the cumulative sum of another derived output over time. For example, let's track total infection deaths:

In [None]:
model = get_sir_model(config)
model.request_output_for_flow(name="deaths", flow_name="infection_death")

# Request that the 'deaths' derived output is accumulated into 'deaths_cumulative'.
model.request_cumulative_output(name="deaths_cumulative", source="deaths")

# Run and plot the outputs
model.run(parameters=parameters)
derived_outputs = model.get_derived_outputs_df()
derived_outputs.plot()
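
Conceptually, a cumulative output is a running total of its source series. The sketch below shows that arithmetic on a short list of made-up per-timestep death counts (invented values, not model output); it is meant only to illustrate the idea and makes no claim about the library's internals.

```python
from itertools import accumulate

# Illustrative deaths per timestep (invented values)
deaths = [1.0, 2.0, 4.0, 3.0]

# Running total: each entry is the sum of all source values up to that timestep
deaths_cumulative = list(accumulate(deaths))
print(deaths_cumulative)  # [1.0, 3.0, 7.0, 10.0]
```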

## Aggregate outputs

You can use an **[aggregate output](/model.html#summer.model.CompartmentalModel.request_aggregate_output)** to request an aggregate of other derived outputs.

In [None]:
model = get_sir_model(config)

# Track some flows
model.request_output_for_flow(name="deaths", flow_name="infection_death")
model.request_output_for_flow(name="recoveries", flow_name="recovery")

# Accumulate the flows
model.request_cumulative_output(name="deaths_cumulative", source="deaths")
model.request_cumulative_output(name="recoveries_cumulative", source="recoveries")

# Aggregate 'deaths_cumulative' and 'recoveries_cumulative' into a single output
model.request_aggregate_output(
    name="cum_dead_or_recovered",
    sources=["deaths_cumulative", "recoveries_cumulative"]
)

model.run(parameters=parameters)
derived_outputs = model.get_derived_outputs_df()

In [None]:
derived_outputs[["deaths_cumulative", "recoveries_cumulative"]].plot.area()
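
As a rough illustration of what aggregating means here, the sketch below adds two source series element by element, so that each timestep of the aggregate is the sum of the sources at that timestep. The values are invented for the example, and treating the aggregate as a simple sum is the assumption being illustrated.

```python
# Illustrative cumulative series at three timesteps (invented values)
deaths_cumulative = [1.0, 3.0, 7.0]
recoveries_cumulative = [5.0, 12.0, 20.0]

# Element-wise sum of the source series, one value per timestep
cum_dead_or_recovered = [
    sum(values) for values in zip(deaths_cumulative, recoveries_cumulative)
]
print(cum_dead_or_recovered)  # [6.0, 15.0, 27.0]
```

With the model above, the stacked area plot of the two cumulative series should have the same upper envelope as the `cum_dead_or_recovered` column of `derived_outputs`.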