# LEAP Model Validation

This notebook presents validation results for the LEAP (Lifetime Exposures and Asthma outcomes Projection) model. The objective is to compare key simulated outputs from the current Python version of the model to observed targets. The targets include demographic trends, asthma prevalence and incidence, risk factor distributions, and asthma-related health outcomes.

A large sample run of LEAP was used to generate all of the data files used in this notebook.


## Notebook setup

### Environment setup

In [10]:
# Environment setup
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path

### Running Simulation to Obtain Data

*Simulations large enough to give meaningful results currently take too long to run inside the notebook, so calling `simulation.run()` here and analysing the resulting `outcome_matrix` object is infeasible. Instead, run the simulation separately from the command line and use the CSV files it writes.*

To run the simulation, open a terminal:

```sh
leap \
    --run-simulation \
    --path-output PATH/TO/SAVE/OUTPUT/ \
    --province PROVINCE \
    --max-age MAX_AGE \
    --min-year STARTING_YEAR \
    --time-horizon SIMULATION_LENGTH \
    --population-growth-type GROWTH_TYPE \
    --num-births-initial N_BIRTHS \
    --ignore-pollution
```

**NOTE**: The default simulation folder naming scheme is:

`./leap_output/PROVINCE-MAX_AGE-STARTING_YEAR-SIMULATION_LENGTH-GROWTH_TYPE-N_BIRTHS`

For instance, if the model was run with the following parameters:

```json
"parameters": {
        "province": "CA",
        "max_age": 100,
        "min_year": 2001,
        "time_horizon": 15,
        "population_growth_type": "M3",
        "num_births_initial": 5000,
        "pollution ignored": true,
        "max_year": 2015
    }
```

Then the default folder name would be:

``./leap_output/CA-100-2001-15-M3-5000``
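The naming scheme above can be reproduced in Python. This is a minimal sketch: the helper `default_run_folder_name` is hypothetical (it is not part of the LEAP package) and simply joins the parameters with hyphens in the order shown.

```python
def default_run_folder_name(
    province: str,
    max_age: int,
    min_year: int,
    time_horizon: int,
    population_growth_type: str,
    num_births_initial: int,
) -> str:
    """Hypothetical helper: join the run parameters in the documented order."""
    parts = [
        province,
        max_age,
        min_year,
        time_horizon,
        population_growth_type,
        num_births_initial,
    ]
    return "-".join(str(part) for part in parts)


# Parameters taken from the example above
folder_name = default_run_folder_name("CA", 100, 2001, 15, "M3", 5000)
print(folder_name)
```

With the example parameters this yields the same name as the folder shown above.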

### Constants and Parameters

In [11]:
# Constants for this notebook
notebook_path = Path(os.getcwd())
LEAP_ROOT = notebook_path.parent.parent
DATA_FOLDER = LEAP_ROOT / "leap" / "processed_data"

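# Simulation parameters: fill these in so that they match RUN_BUNDLE_NAME.
# PROVINCE is used below to filter the target data; an empty value matches no rows.
RUN_BUNDLE_NAME = "CA-110-2001-30-M3-5"  # REPLACE WITH YOUR DESIRED SIMULATION RUN
PROVINCE = ""
MAX_AGE = 0
STARTING_YEAR = 0
SIMULATION_LENGTH = 0
GROWTH_TYPE = ""
N_BIRTHS = 0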

OUTPUT_FOLDER = LEAP_ROOT / "output" # REPLACE WITH PATH TO THE FOLDER STORING YOUR LEAP OUTPUTS
RUN_BUNDLE_FOLDER = OUTPUT_FOLDER / RUN_BUNDLE_NAME
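If the run folder follows the default naming scheme, the individual parameters could also be derived from the folder name rather than typed by hand. The sketch below assumes that no field contains a hyphen; `parse_run_bundle_name` is a hypothetical helper, not a LEAP function.

```python
def parse_run_bundle_name(name: str) -> dict:
    """Hypothetical helper: split a default-style run folder name into its fields.

    Assumes the format PROVINCE-MAX_AGE-STARTING_YEAR-SIMULATION_LENGTH-GROWTH_TYPE-N_BIRTHS
    and that no individual field contains a hyphen.
    """
    fields = name.split("-")
    if len(fields) != 6:
        raise ValueError(f"Expected 6 hyphen-separated fields, got {len(fields)}: {name!r}")
    province, max_age, min_year, time_horizon, growth_type, n_births = fields
    return {
        "province": province,
        "max_age": int(max_age),
        "min_year": int(min_year),
        "time_horizon": int(time_horizon),
        "population_growth_type": growth_type,
        "num_births_initial": int(n_births),
    }


run_params = parse_run_bundle_name("CA-110-2001-30-M3-5")
print(run_params)
```

The resulting values could then be assigned to `PROVINCE`, `MAX_AGE`, and the other constants so that they cannot drift out of sync with `RUN_BUNDLE_NAME`.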

# Figures

## Mortality

### Load Data

In [None]:
# Read observed Statistics Canada life table (mortality) data
target_mortality_raw_df = pd.read_csv(DATA_FOLDER / "life_table.csv")

# Read mortality data from simulation
model_mortality_raw_df = pd.read_csv(RUN_BUNDLE_FOLDER / "outcome_matrix_death.csv")

# Read population data from simulation
model_alive_raw_df = pd.read_csv(RUN_BUNDLE_FOLDER / "outcome_matrix_alive.csv")

### Process Data

In [None]:
# Keep ages 80 and under for the figure
target_mortality_df = target_mortality_raw_df[target_mortality_raw_df["age"] <= 80]
# Filter by chosen province
target_mortality_df = target_mortality_df[target_mortality_df["province"] == PROVINCE]

# Keep ages 80 and under for the figure
model_mortality_df = model_mortality_raw_df[model_mortality_raw_df["age"] <= 80]
# Merge model_mortality_df with model_alive_raw_df to get n_alive for each row
model_mortality_df = model_mortality_df.merge(model_alive_raw_df, on=["year", "age", "sex"])
# Calculate mortality rate as n_deaths / n_alive
model_mortality_df["prob_death"] = model_mortality_df["n_deaths"] / model_mortality_df["n_alive"]
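The merge-and-divide step can be illustrated on toy data. The sketch below uses made-up numbers and assumes the column names used above (`year`, `age`, `sex`, `n_deaths`, `n_alive`). It also adds two safeguards that are not in the cell above: `validate="one_to_one"` to catch duplicated keys, and replacing a zero `n_alive` with `NaN` so that empty strata do not produce a division by zero.

```python
import numpy as np
import pandas as pd

# Toy data (made-up values), one row per year/age/sex stratum
deaths = pd.DataFrame(
    {
        "year": [2001, 2001, 2001],
        "age": [0, 1, 2],
        "sex": ["F", "F", "F"],
        "n_deaths": [1, 0, 0],
    }
)
alive = pd.DataFrame(
    {
        "year": [2001, 2001, 2001],
        "age": [0, 1, 2],
        "sex": ["F", "F", "F"],
        "n_alive": [100, 50, 0],
    }
)

# Join on the stratum keys; raise if either side has duplicated keys
merged = deaths.merge(alive, on=["year", "age", "sex"], validate="one_to_one")

# Mortality rate as n_deaths / n_alive; strata with nobody alive become NaN
merged["prob_death"] = merged["n_deaths"] / merged["n_alive"].replace(0, np.nan)
print(merged)
```

Whether empty strata should be dropped or kept as missing values depends on how the figure is meant to treat them; this sketch keeps them as `NaN`.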

### Visualize Data