# Example usage with Python 3
This notebook demonstrates how to use `petab_select` to perform forward selection in a Python 3 script.

## Problem setup with initial model

Dependencies are imported, the model selection problem is loaded from its specification files, and some helper functions are defined.

In [1]:
import petab_select
from petab_select import Model
from petab_select.constants import (
    CANDIDATE_SPACE,
    MODELS,
    UNCALIBRATED_MODELS,
)

BOLD_TEXT = "\033[1m"
NORMAL_TEXT = "\033[0m"

# Load the PEtab Select problem.
select_problem = petab_select.Problem.from_yaml(
    "model_selection/petab_select_problem.yaml"
)
# Fake criterion values as a surrogate for a model calibration tool.
fake_criterion = {
    "M1_0": 200,
    "M1_1": 150,
    "M1_2": 140,
    "M1_3": 130,
    "M1_4": -40,
    "M1_5": -70,
    "M1_6": -110,
    "M1_7": 50,
}


def print_model(model: Model) -> None:
    """Helper method to view model attributes."""
    print(
        f"""\
Model subspace ID: {model.model_subspace_id}
PEtab YAML location: {model.petab_yaml}
Custom model parameters: {model.parameters}
Model hash: {model.get_hash()}
Model ID: {model.model_id}
{select_problem.criterion}: {model.get_criterion(select_problem.criterion, compute=False)}
"""
    )


def calibrate(model: Model, fake_criterion=fake_criterion) -> None:
    """Set model criterion values to fake values that could be the output of a calibration tool.

    Each model subspace in this problem contains only one model, so a model-specific criterion can
    be indexed by the model subspace ID.
    """
    model.set_criterion(
        select_problem.criterion, fake_criterion[model.model_subspace_id]
    )


print("Information about the model selection problem:")
print(select_problem)

Information about the model selection problem:
YAML: model_selection/petab_select_problem.yaml
Method: forward
Criterion: Criterion.AIC
Version: beta_1



## First iteration

Neighbors of the initial predecessor model in the model space are identified for testing. Here, no initial predecessor model is specified. If the algorithm requires an initial predecessor model, PEtab Select can automatically use the `VIRTUAL_INITIAL_MODEL`. For the forward method, the virtual initial model defaults to a model with no parameters estimated; for the backward method, it defaults to a model with all parameters estimated.

The model space is then used to find neighbors of the initial model. A candidate space is used to calculate distances between models, and to determine whether a candidate model represents a valid move in model space.

The built-in `ForwardCandidateSpace` uses the following properties to identify candidate models:

- previously estimated parameters must remain estimated;
- the number of estimated parameters must increase; and
- this increase must be minimal.
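
The three rules above can be sketched in plain Python, independently of `petab_select`. This is a minimal illustration under stated assumptions, not the library's implementation: the helpers `estimated` and `forward_candidates` are hypothetical names, and the parameter dictionaries are copied from the model outputs printed in this notebook.

```python
ESTIMATE = "estimate"

# Model parameterizations as printed in this notebook (a subset of the model space).
models = {
    "M1_3": {"k1": ESTIMATE, "k2": 0.1, "k3": 0},
    "M1_4": {"k1": 0.2, "k2": ESTIMATE, "k3": ESTIMATE},
    "M1_5": {"k1": ESTIMATE, "k2": 0.1, "k3": ESTIMATE},
    "M1_6": {"k1": ESTIMATE, "k2": ESTIMATE, "k3": 0},
    "M1_7": {"k1": ESTIMATE, "k2": ESTIMATE, "k3": ESTIMATE},
}


def estimated(parameters: dict) -> frozenset:
    """Return the IDs of the parameters that are estimated."""
    return frozenset(p for p, v in parameters.items() if v == ESTIMATE)


def forward_candidates(predecessor_id: str, models: dict) -> list:
    """Hypothetical helper: apply the three forward rules to a dict of models."""
    base = estimated(models[predecessor_id])
    # Rules 1 and 2: a strict superset keeps every previously estimated
    # parameter and estimates at least one more.
    supersets = {
        model_id: estimated(parameters)
        for model_id, parameters in models.items()
        if base < estimated(parameters)
    }
    if not supersets:
        return []
    # Rule 3: keep only the candidates with the smallest increase.
    smallest = min(len(e) for e in supersets.values())
    return [m for m, e in supersets.items() if len(e) == smallest]


print(forward_candidates("M1_3", models))  # ['M1_5', 'M1_6']
print(forward_candidates("M1_7", models))  # []
```

`M1_4` is not a candidate from `M1_3` because it fixes `k1`, which `M1_3` estimates; `M1_7` is excluded because a smaller valid step exists.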

The model space keeps a history of calibrated models, such that subsequent calls ignore previously identified neighbors. To disable this, use `petab_select.ModelSpace.search(..., exclude=False)`; to forget all history, use `petab_select.ModelSpace.reset()`.

In [2]:
iteration = petab_select.ui.start_iteration(problem=select_problem)

Model IDs default to the model hash, which is generated from the model subspace ID, the model parameterization, and the "PEtab hash". The PEtab hash is generated from the location of the PEtab problem YAML file, and the nominal values and list of estimated parameters from the model's PEtab parameter table.

Here, the model identified is a model with all possible parameters fixed, because it matches the virtual initial model. If the initial model were from the "real" model subspace, then candidate models would be true forward steps in the subspace (e.g. an increase in the number of estimated parameters).

Each of the candidate models includes information that should be sufficient for model calibration with any suitable tool that supports PEtab.

NB: `petab_yaml` points to the original PEtab problem, which must be customized with the values in `parameters` to obtain the actual candidate model.
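
As a rough sketch of what this customization involves, the snippet below applies a model's `parameters` to a simplified stand-in for a PEtab parameter table. It is an illustration only: the helper `apply_model_parameters` is hypothetical, the table is modeled as a plain dictionary rather than a real PEtab file, and the values in `original_table` are assumed. Only the column names `nominalValue` and `estimate` are taken from the PEtab format.

```python
import copy

ESTIMATE = "estimate"


def apply_model_parameters(parameter_table: dict, parameters: dict) -> dict:
    """Hypothetical helper: return a customized copy of a parameter table.

    The table is modeled as {parameterId: {"nominalValue": ..., "estimate": ...}}.
    """
    customized = copy.deepcopy(parameter_table)
    for parameter_id, value in parameters.items():
        row = customized[parameter_id]
        if value == ESTIMATE:
            # The calibration tool should estimate this parameter.
            row["estimate"] = 1
        else:
            # The parameter is fixed to the model-specific value.
            row["estimate"] = 0
            row["nominalValue"] = value
    return customized


# Assumed contents of the original parameter table (illustrative values only).
original_table = {
    "k1": {"nominalValue": 0.2, "estimate": 1},
    "k2": {"nominalValue": 0.1, "estimate": 1},
    "k3": {"nominalValue": 0.0, "estimate": 1},
}

# `parameters` of the M1_0 candidate model printed in this notebook.
candidate_table = apply_model_parameters(
    original_table, {"k1": 0, "k2": 0, "k3": 0}
)
print(candidate_table["k1"])  # {'nominalValue': 0, 'estimate': 0}
```

The original table is left unchanged, so the same PEtab problem can be reused for every candidate model.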

In [3]:
for candidate_model in iteration[UNCALIBRATED_MODELS]:
    print_model(candidate_model)

Model subspace ID: M1_0
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0, 'k2': 0, 'k3': 0}
Model hash: M1_0-000
Model ID: M1_0-000
Criterion.AIC: None



At this point, a model calibration tool is used to find the best of the test models, according to some criterion. PEtab Select can select the best model from a collection of models that provide a value for this criterion, or a specific model can be supplied. Here, PEtab Select will be used to select the best model from multiple models. At the end of the following iterations, a specific model will be provided.
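
For a criterion such as AIC, where lower values indicate a better model, selecting the best model amounts to taking a minimum over the calibrated models. The following is a minimal sketch of that idea, not the logic of `petab_select.ui.get_best`; the helper `best_by_criterion` is a hypothetical name, and the values are the fake AIC values used in this notebook.

```python
from typing import Optional


def best_by_criterion(criterion_values: dict) -> Optional[str]:
    """Hypothetical helper: return the ID with the lowest criterion value.

    Models without a value (None) are treated as uncalibrated and skipped.
    """
    calibrated = {m: v for m, v in criterion_values.items() if v is not None}
    if not calibrated:
        return None
    return min(calibrated, key=calibrated.get)


# Fake AIC values from this notebook; M1_4 is left uncalibrated for illustration.
aic = {"M1_1": 150, "M1_2": 140, "M1_3": 130, "M1_4": None}
print(best_by_criterion(aic))  # M1_3
```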

In [4]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in iteration[UNCALIBRATED_MODELS]:
    calibrate(candidate_model)

iteration_results = petab_select.ui.end_iteration(
    candidate_space=iteration[CANDIDATE_SPACE],
    calibrated_models=iteration[UNCALIBRATED_MODELS],
)

In [5]:
local_best_model = petab_select.ui.get_best(
    problem=select_problem, models=iteration_results[MODELS].values()
)
print_model(local_best_model)

Model subspace ID: M1_0
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0, 'k2': 0, 'k3': 0}
Model hash: M1_0-000
Model ID: M1_0-000
Criterion.AIC: 200



## Second iteration
The process then repeats.

The chosen model is used as the predecessor model, such that neighboring models are identified with respect to the chosen model. Here, we define a dummy calibration tool that performs all parts of the model selection iteration.

In [6]:
def dummy_calibration_tool(
    problem: petab_select.Problem,
    candidate_space: petab_select.CandidateSpace = None,
):
    """Run one model selection iteration, using fake calibration results."""
    # Initialize iteration
    iteration = petab_select.ui.start_iteration(
        problem=problem,
        candidate_space=candidate_space,
    )

    # "Calibrate": set fake criterion values that might be the output of a model calibration tool.
    for candidate_model in iteration[UNCALIBRATED_MODELS]:
        calibrate(candidate_model)

    # Finalize iteration
    iteration_results = petab_select.ui.end_iteration(
        candidate_space=iteration[CANDIDATE_SPACE],
        calibrated_models=iteration[UNCALIBRATED_MODELS],
    )

    return iteration_results

In [7]:
iteration_results = dummy_calibration_tool(
    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]
)
local_best_model = petab_select.ui.get_best(
    problem=select_problem, models=iteration_results[MODELS].values()
)

for candidate_model in iteration_results[MODELS].values():
    if candidate_model.get_hash() == local_best_model.get_hash():
        print(BOLD_TEXT + "BEST MODEL OF CURRENT ITERATION" + NORMAL_TEXT)
    print_model(candidate_model)

Model subspace ID: M1_1
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0.2, 'k2': 0.1, 'k3': 'estimate'}
Model hash: M1_1-000
Model ID: M1_1-000
Criterion.AIC: 150

Model subspace ID: M1_2
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0.2, 'k2': 'estimate', 'k3': 0}
Model hash: M1_2-000
Model ID: M1_2-000
Criterion.AIC: 140

BEST MODEL OF CURRENT ITERATION
Model subspace ID: M1_3
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 0.1, 'k3': 0}
Model hash: M1_3-000
Model ID: M1_3-000
Criterion.AIC: 130



## Third iteration

In [8]:
iteration_results = dummy_calibration_tool(
    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]
)
local_best_model = petab_select.ui.get_best(
    problem=select_problem, models=iteration_results[MODELS].values()
)

for candidate_model in iteration_results[MODELS].values():
    if candidate_model.get_hash() == local_best_model.get_hash():
        print(BOLD_TEXT + "BEST MODEL OF CURRENT ITERATION" + NORMAL_TEXT)
    print_model(candidate_model)

Model subspace ID: M1_5
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 0.1, 'k3': 'estimate'}
Model hash: M1_5-000
Model ID: M1_5-000
Criterion.AIC: -70

BEST MODEL OF CURRENT ITERATION
Model subspace ID: M1_6
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 0}
Model hash: M1_6-000
Model ID: M1_6-000
Criterion.AIC: -110



## Fourth iteration

In [9]:
iteration_results = dummy_calibration_tool(
    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]
)
local_best_model = petab_select.ui.get_best(
    problem=select_problem, models=iteration_results[MODELS].values()
)

for candidate_model in iteration_results[MODELS].values():
    if candidate_model.get_hash() == local_best_model.get_hash():
        print(BOLD_TEXT + "BEST MODEL OF CURRENT ITERATION" + NORMAL_TEXT)
    print_model(candidate_model)

BEST MODEL OF CURRENT ITERATION
Model subspace ID: M1_7
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 'estimate'}
Model hash: M1_7-000
Model ID: M1_7-000
Criterion.AIC: 50



## Fifth iteration

In [10]:
iteration_results = dummy_calibration_tool(
    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]
)

The `M1_7` model is the most complex model in the model space (all parameters in the space are estimated), so no valid neighbors are identified for the forward selection method.

In [11]:
print(f"Number of candidate models: {len(iteration_results[MODELS])}.")

Number of candidate models: 0.


At this point, the results of the model calibration tool for the different models can be used to select the best model. You can collect all calibrated models from `iteration_results`. Alternatively, you can access the `CandidateSpace.calibrated_models` attribute.
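
The whole forward run can be reproduced with a small pure-Python simulation, which may help to see why the globally best model (`M1_6`) differs from the last model visited (`M1_7`). This is a sketch under assumptions, not the `petab_select` algorithm: the helper names are hypothetical, the run starts directly from `M1_0` instead of a virtual initial model, and each iteration moves to its best candidate, mirroring the walkthrough above. The estimated-parameter sets and fake AIC values are taken from this notebook.

```python
# Estimated parameters per model, read off the printed outputs in this notebook.
ESTIMATED = {
    "M1_0": frozenset(),
    "M1_1": frozenset({"k3"}),
    "M1_2": frozenset({"k2"}),
    "M1_3": frozenset({"k1"}),
    "M1_4": frozenset({"k2", "k3"}),
    "M1_5": frozenset({"k1", "k3"}),
    "M1_6": frozenset({"k1", "k2"}),
    "M1_7": frozenset({"k1", "k2", "k3"}),
}
FAKE_AIC = {
    "M1_0": 200, "M1_1": 150, "M1_2": 140, "M1_3": 130,
    "M1_4": -40, "M1_5": -70, "M1_6": -110, "M1_7": 50,
}


def forward_neighbors(predecessor: str, calibrated: dict) -> list:
    """Uncalibrated strict supersets of the predecessor with a minimal increase."""
    supersets = [
        m for m, est in ESTIMATED.items()
        if m not in calibrated and ESTIMATED[predecessor] < est
    ]
    if not supersets:
        return []
    smallest = min(len(ESTIMATED[m]) for m in supersets)
    return [m for m in supersets if len(ESTIMATED[m]) == smallest]


def forward_selection(initial: str = "M1_0"):
    """Iterate until no candidates remain; return the path and all results."""
    calibrated = {initial: FAKE_AIC[initial]}
    trajectory = [initial]
    predecessor = initial
    while True:
        candidates = forward_neighbors(predecessor, calibrated)
        if not candidates:
            break
        for m in candidates:
            calibrated[m] = FAKE_AIC[m]  # stand-in for a real calibration
        predecessor = min(candidates, key=FAKE_AIC.get)
        trajectory.append(predecessor)
    return trajectory, calibrated


trajectory, calibrated = forward_selection()
print(trajectory)  # ['M1_0', 'M1_3', 'M1_6', 'M1_7']
print(min(calibrated, key=calibrated.get))  # M1_6
```

The best model is taken over all calibrated models rather than the final predecessor, which is why the history of calibrated models is kept.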

In [12]:
best_model = petab_select.ui.get_best(
    problem=select_problem,
    models=iteration_results[CANDIDATE_SPACE].calibrated_models.values(),
)
print_model(best_model)

Model subspace ID: M1_6
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 0}
Model hash: M1_6-000
Model ID: M1_6-000
Criterion.AIC: -110



## Sixth iteration
Note that uncalibrated models can remain in the model space after a single run of the forward algorithm terminates. These remaining models can be identified with the brute-force method.
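
To see which models a forward run can leave behind, the set of calibrated models can be compared against the full model space. The sketch below uses plain Python sets rather than `petab_select`; the variable names are illustrative, and the model IDs and estimated parameters are read off the outputs in this notebook.

```python
# Estimated parameters of every model in the model space.
model_space = {
    "M1_0": frozenset(),
    "M1_1": frozenset({"k3"}),
    "M1_2": frozenset({"k2"}),
    "M1_3": frozenset({"k1"}),
    "M1_4": frozenset({"k2", "k3"}),
    "M1_5": frozenset({"k1", "k3"}),
    "M1_6": frozenset({"k1", "k2"}),
    "M1_7": frozenset({"k1", "k2", "k3"}),
}

# Models calibrated during the forward run shown in the previous iterations.
calibrated_ids = {"M1_0", "M1_1", "M1_2", "M1_3", "M1_5", "M1_6", "M1_7"}

remaining = sorted(set(model_space) - calibrated_ids)
print(remaining)  # ['M1_4']

# M1_4 was skipped because it is not a forward step from M1_3, the
# predecessor in use when two-parameter models were proposed.
print(model_space["M1_3"] < model_space["M1_4"])  # False
```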

In [13]:
candidate_space = petab_select.BruteForceCandidateSpace(
    criterion=select_problem.criterion
)
# candidate_space.calibrated_models = iteration_results[CANDIDATE_SPACE].calibrated_models
petab_select.ui.start_iteration(
    problem=select_problem,
    candidate_space=candidate_space,
);

In [14]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)

Model subspace ID: M1_4
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0.2, 'k2': 'estimate', 'k3': 'estimate'}
Model hash: M1_4-000
Model ID: M1_4-000
Criterion.AIC: None

