# Example usage with Python 3
This notebook demonstrates usage of `petab_select` to perform forward selection in a Python 3 script.

## Problem setup with initial model

Dependencies are imported, a model selection problem is loaded from its specification files, and some helper functions are defined.

In [1]:
import petab_select
from petab_select import ForwardCandidateSpace, Model

# Load the PEtab Select problem.
select_problem = petab_select.Problem.from_yaml(
    'model_selection/petab_select_problem.yaml'
)
# Fake criterion values as a surrogate for a model calibration tool.
fake_criterion = {
    'M1_0': 200,
    'M1_1': 150,
    'M1_2': 140,
    'M1_3': 130,
    'M1_4': -40,
    'M1_5': -70,
    'M1_6': -110,
    'M1_7': 50,
}


def print_model(model: Model) -> None:
    """Helper function to view model attributes."""
    print(
        f"""\
Model subspace ID: {model.model_subspace_id}
PEtab YAML location: {model.petab_yaml}
Custom model parameters: {model.parameters}
Model hash: {model.get_hash()}
Model ID: {model.model_id}
{select_problem.criterion}: {model.get_criterion(select_problem.criterion, compute=False)}
"""
    )


def calibrate(model: Model, fake_criterion=fake_criterion) -> None:
    """Set model criterion values to fake values that could be the output of a calibration tool.

    Each model subspace in this problem contains only one model, so a model-specific criterion can
    be indexed by the model subspace ID.
    """
    model.set_criterion(
        select_problem.criterion, fake_criterion[model.model_subspace_id]
    )


print(
    f"""Information about the model selection problem.

YAML path: {select_problem.yaml_path}
Method: {select_problem.method}
Criterion: {select_problem.criterion}
"""
)

Information about the model selection problem.

YAML path: model_selection/petab_select_problem.yaml
Method: forward
Criterion: AIC



## First iteration

Neighbors of the initial model in the model space are identified for testing. Here, no initial model is specified. If the algorithm requires an initial model, PEtab Select can automatically use a virtual initial model, if such a model is defined. For the forward method, the virtual initial model defaults to a model with no parameters estimated. For the backward method, it defaults to a model with all parameters estimated.

The model candidate space is set up with the initial model. The model space is then used to find neighbors of the initial model. The candidate space is used to calculate distances between models, and to determine whether a candidate model represents a valid move in model space.

The built-in `ForwardCandidateSpace` uses the following properties to identify candidate models:
- previously estimated parameters must not be fixed;
- the number of estimated parameters must increase; and
- the increase in the number of estimated parameters must be minimal.
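
The three rules above can be sketched in plain Python. This is only an illustration and not how `ForwardCandidateSpace` is implemented: the dictionary layout, the model IDs `A` to `E`, the parameter IDs `k1` to `k3`, and the `forward_neighbors` helper are all assumptions made for this example.

```python
ESTIMATE = 'estimate'  # Assumed marker for an estimated parameter.


def estimated(parameterization: dict) -> set:
    """Return the IDs of the parameters that are estimated."""
    return {p for p, v in parameterization.items() if v == ESTIMATE}


def forward_neighbors(predecessor: dict, model_space: dict) -> list:
    """Return IDs of models that are minimal forward steps from the predecessor."""
    base = estimated(predecessor)
    # Rules 1 and 2: previously estimated parameters stay estimated, and the
    # estimated set grows, i.e. it is a strict superset of the predecessor's.
    valid = {
        model_id: len(estimated(params) - base)
        for model_id, params in model_space.items()
        if estimated(params) > base
    }
    if not valid:
        return []
    # Rule 3: keep only the models with the smallest increase.
    smallest = min(valid.values())
    return sorted(m for m, step in valid.items() if step == smallest)


# Hypothetical three-parameter model space.
model_space = {
    'A': {'k1': 0, 'k2': 0, 'k3': 0},
    'B': {'k1': ESTIMATE, 'k2': 0, 'k3': 0},
    'C': {'k1': 0, 'k2': ESTIMATE, 'k3': 0},
    'D': {'k1': ESTIMATE, 'k2': ESTIMATE, 'k3': 0},
    'E': {'k1': ESTIMATE, 'k2': ESTIMATE, 'k3': ESTIMATE},
}
print(forward_neighbors(model_space['A'], model_space))  # ['B', 'C']
print(forward_neighbors(model_space['B'], model_space))  # ['D']
print(forward_neighbors(model_space['E'], model_space))  # []
```

Model `C` is not a neighbor of `B` in this sketch, because moving from `B` to `C` would fix the previously estimated `k1`.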

The model space keeps a history of identified neighbors, such that subsequent calls ignore previously identified neighbors. This can be disabled by changing usage to `petab_select.ModelSpace.search(..., exclude=False)`, or reset to forget all history with `petab_select.ModelSpace.reset()`.

In [2]:
candidate_space = petab_select.ui.candidates(problem=select_problem)

Model IDs default to the model hash, which is generated from hashing the model subspace ID and model parameterization.
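
The idea of deriving an ID from the subspace ID and the parameterization can be sketched as follows. The hashing scheme here (SHA-256 over a sorted JSON payload) and the `sketch_model_hash` helper are assumptions for illustration. The actual hash produced by `Model.get_hash()` may be computed differently.

```python
import hashlib
import json


def sketch_model_hash(model_subspace_id: str, parameters: dict) -> str:
    """Hash a subspace ID together with a parameterization (illustrative only)."""
    # Sort keys so that the same parameterization always serializes identically.
    payload = json.dumps(
        {'model_subspace_id': model_subspace_id, 'parameters': parameters},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()


# Hypothetical parameter IDs `k1` and `k2`.
h1 = sketch_model_hash('M1_0', {'k1': 0, 'k2': 0})
h2 = sketch_model_hash('M1_0', {'k2': 0, 'k1': 0})
h3 = sketch_model_hash('M1_0', {'k1': 'estimate', 'k2': 0})
print(h1 == h2)  # True: key order does not matter.
print(h1 == h3)  # False: a different parameterization gives a different hash.
```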

Here, the model identified is a model with all possible parameters fixed. This is because the default virtual initial model has this parameterization, and the closest model in the "real" model subspace has the same parameterization. If the initial model were from the "real" model subspace, then candidate models would be true forward steps in the subspace (e.g. an increase in the number of estimated parameters).

Each of the candidate models includes information that should be sufficient for model calibration with any suitable tool that supports PEtab.

NB: the `petab_yaml` is for the original PEtab problem, and would need to be customized by `parameters` to be the actual candidate model.
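
A minimal sketch of what "customized by `parameters`" could mean for a PEtab parameter table is shown below. It assumes that a value of `'estimate'` marks a parameter as estimated and that any other value fixes the parameter to that value. The rows, the parameter IDs, and the `customize_parameter_rows` helper are hypothetical, and only the `parameterId`, `nominalValue`, and `estimate` columns are represented.

```python
def customize_parameter_rows(rows: list, parameters: dict) -> list:
    """Return copies of parameter-table rows with the model's overrides applied."""
    customized = []
    for row in rows:
        row = dict(row)  # Copy, so the original table is left untouched.
        override = parameters.get(row['parameterId'])
        if override == 'estimate':
            # Assumed convention: the parameter is estimated.
            row['estimate'] = 1
        elif override is not None:
            # Assumed convention: the parameter is fixed to the given value.
            row['estimate'] = 0
            row['nominalValue'] = override
        customized.append(row)
    return customized


# Hypothetical rows of the original parameter table.
parameter_rows = [
    {'parameterId': 'k1', 'nominalValue': 0.2, 'estimate': 1},
    {'parameterId': 'k2', 'nominalValue': 0.1, 'estimate': 0},
    {'parameterId': 'k3', 'nominalValue': 0.5, 'estimate': 1},
]
# Hypothetical content of `model.parameters`: fix k1 to 0 and estimate k2.
overrides = {'k1': 0, 'k2': 'estimate'}

for row in customize_parameter_rows(parameter_rows, overrides):
    print(row)
```

Parameters that are not mentioned in the overrides, such as `k3` here, keep their original rows.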

In [3]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)

AttributeError: 'tuple' object has no attribute 'models'

At this point, a model calibration tool is used to find the best of the candidate models, according to some criterion. PEtab Select can select the best model from a collection of models that provide a value for this criterion, or a specific model can be supplied. Here, PEtab Select will be used to select the best model from multiple models. At the end of the following iterations, a specific model will be provided.
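
For a criterion such as AIC, where lower values are better, selecting the best model amounts to taking a minimum over the models that have a criterion value. The sketch below shows this in plain Python. The `best_by_criterion` helper, the grouping of these models into one candidate set, and the `M_uncalibrated` entry are assumptions for illustration, not part of the PEtab Select API.

```python
def best_by_criterion(criterion_values: dict) -> str:
    """Return the ID of the model with the lowest criterion value."""
    # Ignore models that have no criterion value yet.
    calibrated = {m: v for m, v in criterion_values.items() if v is not None}
    if not calibrated:
        raise ValueError('No model has a criterion value yet.')
    return min(calibrated, key=calibrated.get)


# Criterion values taken from `fake_criterion`, plus a hypothetical
# model that has not been calibrated.
candidates = {'M1_1': 150, 'M1_2': 140, 'M1_3': 130, 'M_uncalibrated': None}
print(best_by_criterion(candidates))  # M1_3
```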

In [None]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)
select_problem.add_calibrated_models(candidate_space.models)

In [None]:
local_best_model = select_problem.get_best(candidate_space.models)
print_model(local_best_model)

## Second iteration
The process then repeats.

The chosen model is used as the predecessor model, such that neighboring models are identified with respect to the chosen model.
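
The repeated "find neighbors, calibrate, choose predecessor" cycle can be sketched as a loop over a toy model space. Everything in this sketch is hypothetical (the model IDs, the sets of estimated parameters, the criterion values, and the `minimal_supersets` helper), and it starts from an already calibrated initial model rather than a virtual one. As in this notebook, the loop continues until no forward neighbors remain, and the overall best model is chosen afterwards.

```python
def minimal_supersets(predecessor: frozenset, space: dict) -> list:
    """Return IDs of models that estimate a minimal strict superset of parameters."""
    larger = {m: len(e - predecessor) for m, e in space.items() if e > predecessor}
    if not larger:
        return []
    step = min(larger.values())
    return sorted(m for m, s in larger.items() if s == step)


# Toy model space: model ID -> set of estimated parameters.
space = {
    'M_a': frozenset(),
    'M_b': frozenset({'k1'}),
    'M_c': frozenset({'k2'}),
    'M_d': frozenset({'k1', 'k2'}),
}
# Toy criterion values, standing in for a calibration tool.
toy_criterion = {'M_a': 200, 'M_b': 150, 'M_c': 160, 'M_d': 170}

calibrated = {'M_a': toy_criterion['M_a']}
predecessor_id = 'M_a'
path = [predecessor_id]
while True:
    candidate_ids = minimal_supersets(space[predecessor_id], space)
    if not candidate_ids:
        break  # No valid forward neighbors remain.
    for model_id in candidate_ids:
        calibrated[model_id] = toy_criterion[model_id]
    # The best candidate of this iteration becomes the next predecessor.
    predecessor_id = min(candidate_ids, key=calibrated.get)
    path.append(predecessor_id)

best_id = min(calibrated, key=calibrated.get)
print(path)     # ['M_a', 'M_b', 'M_d']
print(best_id)  # M_b
```

The last predecessor (`M_d`) is not the overall best model here, which is why the final selection is made over all calibrated models.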

In [None]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    predecessor_model=select_problem.get_best(candidate_space.models),
);

In [None]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)

In [None]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)
select_problem.add_calibrated_models(candidate_space.models)

In [None]:
local_best_model = select_problem.get_best(candidate_space.models)
print_model(local_best_model)

## Third iteration

In [None]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    predecessor_model=select_problem.get_best(candidate_space.models),
);

In [None]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)

In [None]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)
select_problem.add_calibrated_models(candidate_space.models)

In [None]:
local_best_model = select_problem.get_best(candidate_space.models)
print_model(local_best_model)

## Fourth iteration

In [None]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    predecessor_model=select_problem.get_best(candidate_space.models),
);

In [None]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)

In [None]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)
select_problem.add_calibrated_models(candidate_space.models)

In [None]:
local_best_model = select_problem.get_best(candidate_space.models)
print_model(local_best_model)

## Fifth iteration

In [None]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    predecessor_model=select_problem.get_best(candidate_space.models),
);

The `M1_7` model is the most complex model in the model space (all parameters in the space are estimated), so no valid neighbors are identified for the forward selection method.

In [None]:
print(f'Number of candidate models: {len(candidate_space.models)}.')

At this point, the results of the model calibration tool for the different models can be used to select the best model.

In [None]:
best_model = select_problem.get_best(select_problem.calibrated_models)
print_model(best_model)

## Sixth iteration
Note that additional, uncalibrated models can remain in the model space after a single run of the forward algorithm terminates. These additional models can be identified with the brute-force method.
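
The idea behind a brute-force search that skips calibrated models can be sketched by enumerating every parameterization and filtering out the ones already seen. This is a plain-Python illustration under assumed names (`parameter_ids`, `options`, and the example parameterizations are hypothetical), not the implementation of `BruteForceCandidateSpace`.

```python
from itertools import product

# Hypothetical parameters, each either fixed to 0 or estimated.
parameter_ids = ['k1', 'k2']
options = [0, 'estimate']

# Every combination of options across all parameters.
all_models = [
    dict(zip(parameter_ids, values))
    for values in product(options, repeat=len(parameter_ids))
]

# Parameterizations assumed to have been calibrated by the forward search.
already_calibrated = [
    {'k1': 0, 'k2': 0},
    {'k1': 'estimate', 'k2': 0},
    {'k1': 'estimate', 'k2': 'estimate'},
]

remaining = [m for m in all_models if m not in already_calibrated]
print(remaining)  # [{'k1': 0, 'k2': 'estimate'}]
```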

In [None]:
candidate_space = petab_select.BruteForceCandidateSpace()
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    excluded_models=select_problem.calibrated_models,
);

In [None]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)