# Bringing it all together

In this notebook we combine the stratification, mixing and vaccination concepts from the previous notebooks into a single model. So far we have used "toy" code, with simplified stratifications and functionality that were only loosely representative of Covid transmission. The goal here is to extend that code into a model that we can be satisfied reasonably replicates the most important dynamics of Covid transmission in Malaysia.

A little like notebook number 04, this is an opportunity to take stock and apply the epidemiological principles we have worked through in the preceding notebooks.

## Standard preliminaries
Before we get into building the model, let's start off with some of our standard (or "boilerplate") code to get everything set up.

In [None]:
# pip install the required packages if running in Colab
try:
  import google.colab
  IN_COLAB = True
  %pip install summerepi
except ImportError:
  IN_COLAB = False

In [None]:
# Standard imports, plotting option and constant definition
from datetime import datetime, timedelta
from typing import List, Union
import pandas as pd
import plotly.express as px
import numpy as np
import pickle

from summer.utils import ref_times_to_dti

from summer import CompartmentalModel, Stratification, StrainStratification, Overwrite

pd.options.plotting.backend = "plotly"

COVID_BASE_DATE = datetime(2019, 12, 31)

In [None]:
# The data import module lives in a file on AuTuMN github - download it for colab use
if IN_COLAB:
    !wget https://raw.githubusercontent.com/monash-emu/AuTuMN/master/notebooks/capacity_building/philippines/import_phl_data.py
    !wget https://raw.githubusercontent.com/monash-emu/AuTuMN/master/notebooks/capacity_building/philippines/PHL_matrices.pkl
    !wget https://raw.githubusercontent.com/monash-emu/AuTuMN/master/notebooks/capacity_building/philippines/NCR_age_pops.pkl
    !wget https://raw.githubusercontent.com/monash-emu/AuTuMN/master/notebooks/capacity_building/philippines/NCR_vac_coverage.pkl

import import_phl_data
from import_phl_data import get_population_and_epi_data, get_timeseries_data

mixing_matrix = pd.read_pickle("PHL_matrices.pkl", compression='infer')


In [None]:
# Shareable google drive links
PHL_DOH_LINK = "1ULjAmO7dE9YEEI8j7MWeSujRSRPhlvE1"  # sheet 05 daily report
PHL_FASSSTER_LINK = "1Cg_jsjhXsOtsqcMVUSHK6F7y9Ky8VxZL"  # Fassster google drive zip file
# initial_population, df = get_population_and_epi_data(PHL_DOH_LINK, PHL_FASSSTER_LINK) 
initial_population, df = get_timeseries_data() 

# We define a day zero for the analysis
COVID_BASE_DATE = datetime(2019, 12, 31)

In [None]:
# Define a target set of observations to compare against our modelled outputs later
notifications_target = df["cases"]

In [None]:
age_groups = range(0, 80, 5)

In [None]:
with open("NCR_age_pops.pkl", "rb") as f:
    age_pops = pickle.load(f)
age_pops.index = age_pops.index.map(str)

# Model

## Define a model

In [None]:
unstratified_compartments = ["S", "E", "I", "R", "S2"]

In [None]:
def build_unstratified_model(parameters: dict) -> CompartmentalModel:
    """
    Create a compartmental model, with the minimal compartmental structure needed to run and produce some sort of 
    meaningful outputs.
    
    Args:
        parameters: Flow parameters
    Returns:
        A compartmental model currently without stratification applied
    """

    model = CompartmentalModel(
        times=(parameters["start_time"], parameters["end_time"]),
        compartments=unstratified_compartments,
        infectious_compartments=["I"],
        ref_date=COVID_BASE_DATE
    )

    infectious_seed = parameters["infectious_seed"]

    model.set_initial_population(
        distribution=
        {
            "S": initial_population - infectious_seed, 
            "I": infectious_seed
        }
    )
    
    # Susceptible people can get infected
    model.add_infection_frequency_flow(
        name="infection", 
        contact_rate=parameters["contact_rate"], 
        source="S", 
        dest="E",
    )
    
    # People whose natural immunity has waned (S2) can be reinfected
    model.add_infection_frequency_flow(
        name="reinfection", 
        contact_rate=parameters["contact_rate"], 
        source="S2", 
        dest="E",
    )
    
    # Exposed people progress to become infectious
    model.add_transition_flow(
        name="progression",
        fractional_rate=parameters["progression_rate"],
        source="E",
        dest="I",
    )

    # Infectious people recover
    model.add_transition_flow(
        name="recovery",
        fractional_rate=parameters["recovery_rate"],
        source="I",
        dest="R",
    )
    
    # Recovered people lose their natural immunity over time
    model.add_transition_flow(
        name="waning_natural_immunity",
        fractional_rate=parameters["waning_immunity_rate"],
        source="R",
        dest="S2",
    )

    # Add an infection-specific death flow to the I compartment
    model.add_death_flow(name="infection_death", death_rate=parameters["death_rate"], source="I")
    
    model.request_output_for_flow(
        "progressions",
        "progression",
    )
    
    def prop_detected(progressions):
        return progressions * parameters["cdr"]
    
    model.request_function_output(
        "notifications",
        func=prop_detected,
        sources=["progressions"],
    )

    return model
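
To make the flow structure above concrete, here is a minimal pure-Python sketch of the same S, E, I, R, S2 compartments, integrated with a simple Euler scheme. It is not how summer solves the model internally, and the function name `simulate_seirs`, the step size and the toy parameter values are assumptions made for illustration only. It assumes a frequency-dependent force of infection, `contact_rate * I / N`, with no stratification or mixing matrix.

```python
def simulate_seirs(params, population, n_days, dt=0.1):
    """Euler integration of the S-E-I-R-S2 flow structure (illustrative only)."""
    state = {
        "S": population - params["infectious_seed"],
        "E": 0.0,
        "I": params["infectious_seed"],
        "R": 0.0,
        "S2": 0.0,
    }
    deaths = 0.0
    for _ in range(int(round(n_days / dt))):
        total = sum(state.values())
        # Frequency-dependent force of infection (assumed, unstratified)
        force = params["contact_rate"] * state["I"] / total

        # Number of people moving along each flow during this time step
        infection = force * state["S"] * dt
        reinfection = force * state["S2"] * dt
        progression = params["progression_rate"] * state["E"] * dt
        recovery = params["recovery_rate"] * state["I"] * dt
        waning = params["waning_immunity_rate"] * state["R"] * dt
        dying = params["death_rate"] * state["I"] * dt

        state["S"] -= infection
        state["S2"] += waning - reinfection
        state["E"] += infection + reinfection - progression
        state["I"] += progression - recovery - dying
        state["R"] += recovery - waning
        deaths += dying
    return state, deaths


# Toy values chosen for illustration, not calibrated to any data
toy_params = {
    "contact_rate": 0.5,
    "progression_rate": 0.3,
    "recovery_rate": 0.2,
    "death_rate": 0.001,
    "waning_immunity_rate": 1.0 / 180.0,
    "infectious_seed": 200.0,
}
final_state, total_deaths = simulate_seirs(toy_params, population=1_000_000.0, n_days=100)
print(final_state, total_deaths)
```

Every person leaving one compartment arrives in another or is counted as a death, so the compartment sizes plus cumulative deaths should always add up to the starting population.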

In [None]:
def get_age_stratification(
    compartments_to_stratify: List[str],
    strata: List[str],
    matrix: Union[np.ndarray, callable],
) -> Stratification:
    """
    Create a summer stratification object that stratifies all of the compartments into
    strata, which are intended to represent age bands according to the user inputs.
    This is essentially adapting the model's age stratification approach to the format
    of the mixing matrix, which is a reasonable approach.
    
    Args:
        compartments_to_stratify: List of the compartments to stratify, which should be all the compartments
        strata: The strata to be implemented in the age stratification
        matrix: The mixing matrix we are applying for the age structure
    Returns:
        A summer stratification object to represent age stratification (not yet applied)
    """
    
    if isinstance(matrix, np.ndarray):
        msg = "Mixing matrix is not 2-dimensional"
        assert matrix.ndim == 2, msg

        msg = f"Dimensions of the mixing matrix incorrect: {matrix.shape[0]}, {matrix.shape[1]}, {len(strata)}"
        assert matrix.shape[0] == matrix.shape[1] == len(strata), msg
    
    # Create the stratification, just naming the age groups by their starting value
    strat = Stratification(name="age", strata=strata, compartments=compartments_to_stratify)
    
    age_split_props = age_pops / age_pops.sum()
    strat.set_population_split(age_split_props.to_dict())
    
    # Add the mixing matrix to the stratification
    strat.set_mixing_matrix(matrix)
    
    return strat

In [None]:
def get_strain_stratification(
    compartments_to_stratify: List[str], 
    voc_params: dict
) -> Stratification:
    """
    Create a summer stratification object that stratifies compartments into
    strata, which are intended to represent infectious disease strains.
    
    Args:
        compartments_to_stratify: List of the compartments to stratify
        voc_params: A dictionary which specifies the infectiousness and severity of strains
    Returns:
        A summer stratification object to represent strain stratification (not yet applied)
    """
    strata = [
        "delta", 
        "omicron"
    ]
    strat = StrainStratification(name="strain", strata=strata, compartments=compartments_to_stratify)

    # At the start of the simulation, all infected people have the Delta strain; Omicron is seeded later.
    strat.set_population_split(
        {
            "delta": 1.,
            "omicron": 0.,
        }
    )

    for infection_flow in ["infection", "reinfection"]:
        strat.set_flow_adjustments(
            infection_flow,
            {
                "delta": None,
                "omicron": voc_params["omicron_rel_transmissibility"],
            },
        )

    return strat

In [None]:
with open("NCR_vac_coverage.pkl", "rb") as f:
    full_dose_coverage = pickle.load(f)
full_dose_coverage.plot.area(title="two-dose vaccination coverage")

In [None]:
# To save on calculations a little, let's thin out the data
thinning_interval = 7
thinned_full_coverage = full_dose_coverage[::thinning_interval]

def get_prop_of_remaining_covered(old_prop, new_prop):
    return (new_prop - old_prop) / (1. - old_prop)

interval_prop_unvacc_vaccinated = [
    get_prop_of_remaining_covered(
        thinned_full_coverage.iloc[i],
        thinned_full_coverage.iloc[i + 1],
    ) 
    for i in range(len(thinned_full_coverage) - 1)
]

coverage_times = thinned_full_coverage.index

pd.Series(interval_prop_unvacc_vaccinated, index=coverage_times[1:]).plot(
    title="proportion of remaining unvaccinated vaccinated during each interval"
)
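
As a small worked example of the calculation above, the sketch below applies the same formula to a made-up coverage series (the numbers in `toy_coverage` are hypothetical, not the NCR data). It then checks that applying each interval proportion to the people who are still unvaccinated rebuilds the original coverage values.

```python
def get_prop_of_remaining_covered(old_prop, new_prop):
    # Share of the still-unvaccinated population that is vaccinated in the interval
    return (new_prop - old_prop) / (1.0 - old_prop)


# Hypothetical cumulative coverage at four successive time points
toy_coverage = [0.0, 0.2, 0.4, 0.7]

toy_props = [
    get_prop_of_remaining_covered(old, new)
    for old, new in zip(toy_coverage[:-1], toy_coverage[1:])
]

# Rebuild coverage by vaccinating that share of whoever remains unvaccinated
rebuilt = [toy_coverage[0]]
for prop in toy_props:
    rebuilt.append(rebuilt[-1] + prop * (1.0 - rebuilt[-1]))

print(toy_props)
print(rebuilt)
```

For instance, moving from 20% to 40% coverage means vaccinating 0.2 / 0.8 = 25% of the people who were unvaccinated at the start of that interval.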

In [None]:
def get_rate_from_coverage_and_duration(coverage_increment: float, duration: float) -> float:
    assert duration > 0.0, f"Duration request is not positive: {duration}"
    assert 0.0 <= coverage_increment <= 1.0, f"Coverage increment not in [0, 1]: {coverage_increment}"
    return -np.log(1.0 - coverage_increment) / duration


interval_lengths = [
    coverage_times[i + 1] - coverage_times[i] 
    for i in range(len(coverage_times) - 1)
]

vaccination_rates = [
    get_rate_from_coverage_and_duration(i, j) for 
    i, j in zip(interval_prop_unvacc_vaccinated, interval_lengths)
]
pd.Series(vaccination_rates, index=coverage_times[1:]).plot(kind="scatter")
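
The conversion above rests on the following reasoning: if unvaccinated people are vaccinated at a constant per-capita rate `r`, the fraction vaccinated after a duration `d` is `1 - exp(-r * d)`, and solving for `r` gives `-ln(1 - c) / d` for a coverage increment `c`. The sketch below checks this round trip using only the standard library. The 25% increment over 7 days is an illustrative value, not taken from the data.

```python
import math


def get_rate_from_coverage_and_duration(coverage_increment, duration):
    # Invert c = 1 - exp(-r * d) to find the constant per-capita rate r
    return -math.log(1.0 - coverage_increment) / duration


toy_increment = 0.25  # hypothetical: 25% of the unvaccinated are vaccinated...
toy_duration = 7.0  # ...over a 7-day interval

rate = get_rate_from_coverage_and_duration(toy_increment, toy_duration)

# Applying the rate over the same duration should return the original increment
recovered_increment = 1.0 - math.exp(-rate * toy_duration)

print(rate)  # approximately 0.041 per day
print(recovered_increment)
```

Note that the rate is slightly larger than the naive 0.25 / 7, because the pool of unvaccinated people shrinks as the interval proceeds.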

In [None]:
len(coverage_times)

In [None]:
def get_vacc_rate_func(end_times, vaccination_rates):
    def get_vaccination_rate(time, derived_outputs):

        # Identify the index of the first list element greater than the time of interest
        # If there is such an index, return the corresponding vaccination rate
        for end_i, end_t in enumerate(end_times):
            if end_t > time:
                return vaccination_rates[end_i]

        # Return zero if the time is after the last end time
        return 0.0
    return get_vaccination_rate

vacc_rate_func = get_vacc_rate_func(coverage_times[1:], vaccination_rates)
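
To see how this step function behaves, the sketch below repeats the same closure and evaluates it with made-up end times and rates (the values passed to `toy_rate_func` are hypothetical). The second argument is unused by the lookup, so `None` is passed in its place here.

```python
def get_vacc_rate_func(end_times, vaccination_rates):
    def get_vaccination_rate(time, derived_outputs):
        # Return the rate for the first interval whose end time is after `time`
        for end_i, end_t in enumerate(end_times):
            if end_t > time:
                return vaccination_rates[end_i]
        # No interval ends after `time`, so vaccination has stopped
        return 0.0
    return get_vaccination_rate


# Hypothetical intervals ending at times 10, 20 and 30
toy_rate_func = get_vacc_rate_func([10, 20, 30], [0.01, 0.02, 0.03])

query_times = [5, 10, 25, 30, 40]
toy_rates = [toy_rate_func(t, None) for t in query_times]
print(toy_rates)  # [0.01, 0.02, 0.03, 0.0, 0.0]
```

Two behaviours are worth noting: a time exactly equal to an end time falls into the following interval, and any time before the first end time receives the first rate, including times before the coverage data begin.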

In [None]:
def get_vaccine_stratification(
    compartments_to_stratify: List[str], 
    vaccine_params: dict
) -> Stratification:
    """
    Create a summer stratification object that stratifies compartments into
    strata, which are intended to represent vaccine stratifications.
    
    Args:
        compartments_to_stratify: List of the compartments to stratify
        vaccine_params: A dictionary which specifies the vaccination-related parameters to implement
    Returns:
        A summer stratification object to represent vaccination stratification (not yet applied)
    """
    strata = ["vaccinated", "unvaccinated"]
    
    # Create the stratification
    vaccine_strat = Stratification(name="vaccination", strata=strata, compartments=compartments_to_stratify)

    # Create our population split dictionary, whose keys match the strata; everyone starts unvaccinated
    pop_split = {
        "vaccinated": 0., 
        "unvaccinated": 1.,
    }

    # Set a population distribution
    vaccine_strat.set_population_split(pop_split)

    # Adjusting the death risk associated with vaccination
    vaccine_strat.set_flow_adjustments(
        "infection_death",
        {
            "unvaccinated": None,
            "vaccinated": 1. - vaccine_params["ve_death"],
        }
    )
    
    # Susceptibility
    for infection_flow in ["infection", "reinfection"]:
        vaccine_strat.set_flow_adjustments(
            infection_flow,
            {
                "unvaccinated": None,
                "vaccinated": 1. - vaccine_params["ve_infection"],
            }
        )

    return vaccine_strat

In [None]:
start_date = datetime(2021, 3, 10)
end_date = start_date + timedelta(days=500)
start_date_int = (start_date - COVID_BASE_DATE).days
end_date_int = (end_date - COVID_BASE_DATE).days

parameters = {
    "contact_rate": 0.05,
    "progression_rate": 0.3,
    "recovery_rate": 0.2,
    "death_rate": 0.001,
    "start_time": start_date_int,
    "end_time": end_date_int,
    "infectious_seed": 200.,
    "cdr": 0.1,
    "waning_immunity_rate": 1. / 180.,
}

# Get an unstratified model object
model = build_unstratified_model(parameters)

# Get and apply the age stratification
age_strat = get_age_stratification(
    model.compartments, 
    age_groups, 
    mixing_matrix["all_locations"],
)
model.stratify_with(age_strat)

# Get and apply vaccination stratification
vacc_params = {
    "ve_death": 0.9,
    "ve_infection": 0.3,
}

vacc_strat = get_vaccine_stratification(
    model.compartments,
    vacc_params,
)
model.stratify_with(vacc_strat)

# Get and apply the strain stratification
strain_strat = get_strain_stratification(
    ["E", "I"],
    {"omicron_rel_transmissibility": 2.},
)
model.stratify_with(strain_strat)


def omicron_seed_func(time, computed_values):
    """
    A simple step function to allow seeding of the new strain
    (Omicron) after the start of the analysis period.
    """    
    if 590 < time < 600:
        return 1.
    return 0.


model.add_importation_flow(
    "omicron_seeding",
    omicron_seed_func,
    "E",
    split_imports=True,
    dest_strata={"strain": "omicron"},
)

for comp in unstratified_compartments:
    model.add_transition_flow(
        name="vaccination",
        fractional_rate=vacc_rate_func,
        source=comp,
        dest=comp,
        source_strata={"vaccination": "unvaccinated"},
        dest_strata={"vaccination": "vaccinated"},
    )

model.request_output_for_compartments(
    "vaccinated",
    unstratified_compartments,
    strata={"vaccination": "vaccinated"},
)
model.request_output_for_compartments(
    "unvaccinated",
    unstratified_compartments,
    strata={"vaccination": "unvaccinated"},
)
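
Model times in the cell above are integers counting days since `COVID_BASE_DATE`, which makes values such as the `590 < time < 600` seeding window hard to read at a glance. The sketch below converts in both directions using only the standard library. The helper names `date_to_model_time` and `model_time_to_date` are hypothetical and not part of summer.

```python
from datetime import datetime, timedelta

COVID_BASE_DATE = datetime(2019, 12, 31)


def date_to_model_time(date):
    # Whole days elapsed since the reference date
    return (date - COVID_BASE_DATE).days


def model_time_to_date(model_time):
    # Calendar date corresponding to an integer model time
    return COVID_BASE_DATE + timedelta(days=model_time)


start_time = date_to_model_time(datetime(2021, 3, 10))
end_time = start_time + 500

# Calendar dates bounding the Omicron seeding window used in omicron_seed_func
seed_window = (model_time_to_date(590), model_time_to_date(600))

print(start_time, end_time)  # 435 935
print(seed_window[0].date(), seed_window[1].date())  # 2021-08-12 2021-08-22
```

Converting like this is a quick way to check that a time window written as integers falls inside the simulation period and matches the calendar dates intended.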


In [None]:

model.run()

comparison_df = pd.DataFrame({
    "modelled": model.get_derived_outputs_df()["notifications"],
    "reported": notifications_target,
})
comparison_df.plot()

In [None]:
derived_df = model.get_derived_outputs_df()
derived_df[["vaccinated", "unvaccinated"]].plot.area()

## Thinking about the pieces

### What do we have in the model that reflects COVID-19 epidemiology and biology?
##### Age-stratified heterogeneous mixing; 
For COVID-19, and for respiratory viruses in general, social mixing plays an important role in transmission dynamics.
##### Variants/strains; 
Viral evolution leading to different variants/strains with different underlying characteristics has been and continues to be a key driver in COVID-19 epidemiology.
We are capturing some of these properties by allowing the Omicron variant to have greater relative transmissibility than that of the Delta variant.
##### Waning immunity from infection; 
Immunity from natural infection wanes over time.
##### Vaccines; 
Vaccination is arguably the most important public health intervention against COVID-19.
We are capturing the effect of vaccination on reducing the risk of death.
We are capturing the effect of vaccination on reducing susceptibility to infection, i.e. reducing the risk of infection in vaccinated individuals.

## What might we be missing?
##### Differences in disease severity; 
We are not accurately capturing the spectrum of disease.
For example, we do not distinguish asymptomatic from symptomatic disease, or represent hospitalisations and ICU admissions. We could include a stratification to capture asymptomatic and symptomatic infection states. This might be important for thinking about case ascertainment and thus the case detection rate parameter employed in the model.
##### Variants/strains; 
We are not capturing all strain/variant properties (or all strains).
We are only simulating the Delta and Omicron variants as these led to the most impactful waves of transmission in Malaysia, but we could simulate more, such as the Beta variant and wild-type.
We are not entirely capturing how immunity varies with variants (i.e. immune escape). In this model, infection with the Delta variant provides complete immunity (sometimes called sterilising immunity) against infection with the Omicron variant until that immunity wanes. Whilst this is true for some diseases, it is not true for COVID-19.
Successive variants have shown significant capacity to escape immunity derived from infection with prior variants; this has been most pronounced for the Omicron variant.
Other viral properties, such as differences in the generation/latent period, are not captured. There is evidence to suggest that, for Omicron in particular, shorter latency (time from exposure to active infection) played a role in its fitness advantage over Delta.
We are also not capturing differences in variant/strain virulence, i.e. Delta being more virulent than Omicron.
##### Vaccination; 
We are not capturing all effects of vaccination, differences between doses, or waning vaccine immunity.
The impact of vaccination on reducing infectiousness or onward transmission is not captured, but is an important effect. We are also not capturing effects on severity other than death.
