In [1]:
# Ax wrappers for BoTorch components
from ax.models.torch.botorch_modular.model import BoTorchModel
from ax.models.torch.botorch_modular.surrogate import Surrogate
from ax.models.torch.botorch_modular.acquisition import Acquisition
from ax.models.torch.botorch_modular.kg import KnowledgeGradient, MultiFidelityKnowledgeGradient

# Ax data transformation layer
from ax.modelbridge.torch import TorchModelBridge
from ax.modelbridge.registry import Cont_X_trans, Y_trans, Models

# Test Ax objects
from ax.utils.testing.core_stubs import get_branin_experiment, get_branin_data

# BoTorch components
from botorch.models.model import Model
from botorch.models.gp_regression import FixedNoiseGP, SingleTaskGP
from botorch.acquisition.monte_carlo import qExpectedImprovement, qNoisyExpectedImprovement
from botorch.models.gp_regression_fidelity import SingleTaskMultiFidelityGP
from gpytorch.mlls.exact_marginal_log_likelihood import ExactMarginalLogLikelihood

In [2]:
# Default options will not be needed when the modular `BoTorchModel` functionality is completed.
# Associating these default options with corresponding model components (so there is no need to 
# specify them manually) is in the works.
DEFAULT_ACQUISITION_OPTIONS = {
    "num_fantasies": 16, "num_mv_samples": 10, "num_y_samples": 128, "candidate_size": 1000, "best_f": 0.0,
}
DEFAULT_OPTIMIZER_OPTIONS = {"num_restarts": 40, "raw_samples": 1024}

# Setup and Usage of BoTorch Models in Ax

Ax provides a set of flexible wrapper abstractions to mix and match BoTorch components like `Model` and `AcquisitionFunction` and combine them into a single `Model` object in Ax. The wrapper abstractions (`Surrogate`, `Acquisition`, and `BoTorchModel`) are located in the `ax/models/torch/botorch_modular` directory and aim to encapsulate boilerplate code that interfaces between Ax and BoTorch. **This functionality is in alpha-release and under active development.**

Here is a quick example of a multi-fidelity Knowledge Gradient setup (GPKG) and subsequent candidate generation:

In [3]:
experiment = get_branin_experiment(with_fidelity_parameter=True, with_trial=True)
data = get_branin_data(trials=[experiment.trials[0]])

In [4]:
# `Models` automatically selects a model + model bridge combination. 
# For `BOTORCH_MODULAR`, it will select `BoTorchModel` and `TorchModelBridge`.
model_bridge_with_GPKG = Models.BOTORCH_MODULAR(
    experiment=experiment,
    data=data,
    surrogate=Surrogate(SingleTaskMultiFidelityGP),  # Optional, will use default if unspecified
    acquisition_class=MultiFidelityKnowledgeGradient,  # Optional, will use default if unspecified
    acquisition_options=DEFAULT_ACQUISITION_OPTIONS,  # Optional
)

[INFO 12-29 21:07:39] ax.modelbridge.transforms.standardize_y: Outcome branin is constant, within tolerance.


In [5]:
model_bridge_with_GPKG.gen(n=1, model_gen_options={"optimizer_kwargs": DEFAULT_OPTIMIZER_OPTIONS})

GeneratorRun(1 arms, total weight 1.0)

-----

# `BoTorchModel` Deep dive

This tutorial walks through setting up a custom combination of BoTorch components in Ax in the following steps:

1. `BoTorchModel` = `Surrogate` + `Acquisition` (overview)
2. Specifying `BoTorchModel` subcomponents
  1. Surrogate model
  2. Acquisition function
  3. Leveraging default subcomponents
  4. Subcomponents Q&A
3. Leveraging Ax storage stack via `Models.BOTORCH_MODULAR`
4. Utilizing `BoTorchModel` in generation strategies

Before you read the rest of this tutorial: 
- Note that the concept of ‘model’ in Ax is somewhat of a misnomer; we use ['model'](https://ax.dev/docs/glossary.html#model) to refer to an optimization setup capable of producing candidate points for optimization (and often capable of being fit to data, with the exception of quasi-random generators). See the [Models documentation page](https://ax.dev/docs/models.html) for more information.
- Learn about `ModelBridge` in Ax, as users should rarely be interacting with a `Model` object directly (more about ModelBridge, a data transformation layer in Ax, [here](https://ax.dev/docs/models.html#deeper-dive-organization-of-the-modeling-stack)).

Finally, some advanced modeling setups will require subclassing `Surrogate` and/or `Acquisition` to construct, for example, some inputs specific to a given `AcquisitionFunction`. For details on custom `BoTorchModel` subcomponents, refer to the [Customizing a `BoTorchModel` tutorial]().

----

## 1. `BoTorchModel` = `Surrogate` + `Acquisition`

A `BoTorchModel` in Ax consists of two main subcomponents: a surrogate model and an acquisition function. A surrogate model is represented as an instance of Ax’s [`Surrogate` class](https://github.com/facebook/Ax/blob/main/ax/models/torch/botorch_modular/surrogate.py), which is a wrapper around BoTorch's [`Model` class](https://github.com/pytorch/botorch/blob/main/botorch/models/model.py). The acquisition function is represented as an instance of Ax’s [`Acquisition` class](https://github.com/facebook/Ax/blob/main/ax/models/torch/botorch_modular/acquisition.py), a wrapper around BoTorch's [`AcquisitionFunction` class](https://github.com/pytorch/botorch/blob/main/botorch/acquisition/acquisition.py). 

In the simple case, a `BoTorchModel` is instantiated like so (see Appendix 1 for the methods `BoTorchModel` provides):

In [6]:
model = BoTorchModel(
    surrogate=Surrogate(FixedNoiseGP),
    botorch_acqf_class=qExpectedImprovement,
)

It can then be used with `TorchModelBridge` (learn more about `ModelBridge`, a data transformation layer in Ax, [here](https://ax.dev/docs/models.html#deeper-dive-organization-of-the-modeling-stack)).

In [7]:
experiment = get_branin_experiment(with_trial=True)  # Example experiment with simple search space

model_bridge = TorchModelBridge(
    experiment=experiment, 
    search_space=experiment.search_space,
    data=get_branin_data(trials=[experiment.trials[0]]),   # Example synthetic data
    model=model,
    # Transforms to apply to the data, standard continuous search-space transforms used here as example
    transforms=Cont_X_trans + Y_trans,
)

[INFO 12-29 21:10:42] ax.modelbridge.transforms.standardize_y: Outcome branin is constant, within tolerance.


----

## 2. Specifying `BoTorchModel` subcomponents

### A. Surrogate model

To specify a given surrogate model, construct an Ax `Surrogate`:

In [8]:
GP_surrogate = Surrogate(
    botorch_model_class=FixedNoiseGP,  # BoTorch `Model` class to construct in this `Surrogate`
    mll_class=ExactMarginalLogLikelihood,  # Optional, MLL class with which to optimize model parameters
    model_options={},  # Optional, dictionary of keyword arguments to underlying BoTorch `Model`
)

Alternatively, for BoTorch `Model`-s that require complex instantiation procedures, leverage the `from_BoTorch` instantiation method of `Surrogate`:

In [9]:
surrogate_from_botorch_model = Surrogate.from_BoTorch(
    model=...,   # BoTorch `Model` instance, with training data already set
    mll_class=ExactMarginalLogLikelihood,  # Optional, MLL class with which to optimize model parameters
)

### B. Acquisition function

To specify an acquisition function, provide one of:
- the `botorch_acqf_class` argument, a BoTorch `AcquisitionFunction` class (if it can work with the base `Acquisition` wrapper), or
- the `acquisition_class` argument, an Ax `Acquisition` class (for cases where the base `Acquisition` is not sufficient to support a given BoTorch `AcquisitionFunction`, for instance, if it requires custom inputs that are not covered by the base `Acquisition`).

In [10]:
# `qExpectedImprovement` is supported by the default `Acquisition`, 
# so there is no need to explicitly specify `acquisition_class`.
GPEI_model = BoTorchModel(
    surrogate=GP_surrogate,
    botorch_acqf_class=qExpectedImprovement,
    # Optional dict of keyword arguments, passed to the BoTorch `AcquisitionFunction`
    acquisition_options=DEFAULT_ACQUISITION_OPTIONS,
)

# `qKnowledgeGradient` requires a custom `Acquisition` subclass, `KnowledgeGradient`.
KG_model = BoTorchModel(
    surrogate=GP_surrogate,
    acquisition_class=KnowledgeGradient,
)

### C. Leveraging default subcomponents

`BoTorchModel` does not always require surrogate and acquisition specification. If it is instantiated without one or both components specified, defaults are selected based on properties of the experiment and data (see Appendix 2 for the auto-selection logic).

In [11]:
# The surrogate is not specified, so it will be auto-selected during `model.fit`.
GPEI_model = BoTorchModel(botorch_acqf_class=qExpectedImprovement)

# The acquisition class is not specified, so it will be auto-selected during `model.gen`.
GPEI_model = BoTorchModel(surrogate=GP_surrogate)

# Both the surrogate and acquisition class will be auto-selected.
GPEI_model = BoTorchModel()

### D. Subcomponents Q&A

**Why is `surrogate` expected to be an instance, but `acquisition_class` (or `botorch_acqf_class`) a class?**
Because a BoTorch `AcquisitionFunction` object (and therefore its Ax wrapper, `Acquisition`) is *ephemeral*: it is constructed, immediately used, and destroyed during `BoTorchModel.gen`, so there is no reason to keep around an `Acquisition` instance. A `Surrogate`, on the other hand, is kept in memory as long as its parent `BoTorchModel` is.
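
The lifecycle difference can be illustrated with a minimal, library-free sketch. The classes `ToySurrogate`, `ToyAcquisition`, and `ToyModel` below are hypothetical stand-ins, not Ax or BoTorch classes. They only mimic the pattern described above: the surrogate instance persists across calls, while a new acquisition object is built inside every `gen` call.

```python
# Toy stand-ins only; these are NOT the Ax or BoTorch classes.
class ToySurrogate:
    """Holds fitted state, so it must outlive individual `gen` calls."""

    def __init__(self):
        self.train_y = None

    def fit(self, train_y):
        self.train_y = list(train_y)

    def best_observed(self):
        return max(self.train_y)


class ToyAcquisition:
    """Built from the current surrogate state and discarded after one use."""

    instances_created = 0

    def __init__(self, surrogate):
        type(self).instances_created += 1
        self.best_f = surrogate.best_observed()

    def evaluate(self, y):
        # Improvement over the best value seen so far, floored at zero.
        return max(y - self.best_f, 0.0)


class ToyModel:
    def __init__(self, surrogate, acquisition_class):
        self.surrogate = surrogate  # instance: kept for the model's lifetime
        self.acquisition_class = acquisition_class  # class: instantiated per `gen`

    def fit(self, train_y):
        self.surrogate.fit(train_y)

    def gen(self, candidates):
        acqf = self.acquisition_class(self.surrogate)  # fresh object on every call
        return max(candidates, key=acqf.evaluate)


model = ToyModel(surrogate=ToySurrogate(), acquisition_class=ToyAcquisition)
model.fit([0.1, 0.7, 0.4])
first = model.gen([0.5, 0.9, 0.8])
model.fit([0.1, 0.7, 0.4, 0.95])
second = model.gen([0.5, 0.9, 1.2])
```

Because the acquisition depends on the surrogate's state at the time of the call, keeping an old instance around would risk using a stale `best_f`; passing the class avoids that by construction.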

**How do I know when to specify `acquisition_class` (and thereby a non-default `Acquisition` type) instead of just passing in `botorch_acqf_class`?**
The [Customizing a `BoTorchModel`]() tutorial covers that question. In short, custom `Acquisition` subclasses are needed when a given `AcquisitionFunction` in BoTorch needs non-standard inputs or constructs some of its subcomponents (like an `AcquisitionObjective`) in a non-standard way.

**Why do I not need to specify `botorch_acqf_class` argument if I do specify `acquisition_class`? Does each type of `Acquisition` have an associated BoTorch `AcquisitionFunction`?** All non-base `Acquisition` subclasses should have the `Acquisition.default_botorch_acqf_class` attribute specified, so they have an associated BoTorch counterpart. For example, `KnowledgeGradient` [sets it to `qKnowledgeGradient`](https://github.com/facebook/Ax/blob/main/ax/models/torch/botorch_modular/kg.py#L57-L58). 
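
As a rough illustration of how a wrapper subclass can carry its own default, here is a library-free sketch. All class names ending in `Toy` and the helper `resolve_acqf_class` are hypothetical; only the attribute name `default_botorch_acqf_class` comes from the text above. Returning `None` to mean "auto-select later" is an assumption of this sketch, not a statement about Ax internals.

```python
# Toy stand-ins; names mirror the text, but this is not Ax source code.
class qExpectedImprovementToy:
    pass


class qKnowledgeGradientToy:
    pass


class AcquisitionToy:
    # The base wrapper has no associated acquisition function by default.
    default_botorch_acqf_class = None


class KnowledgeGradientToy(AcquisitionToy):
    # A non-base subclass declares its BoTorch counterpart as a class attribute.
    default_botorch_acqf_class = qKnowledgeGradientToy


def resolve_acqf_class(acquisition_class=AcquisitionToy, botorch_acqf_class=None):
    """An explicit class wins; otherwise fall back to the wrapper's default.

    Returns None when neither is available, which in this sketch signals
    that the caller should auto-select a default later.
    """
    if botorch_acqf_class is not None:
        return botorch_acqf_class
    return acquisition_class.default_botorch_acqf_class


from_subclass = resolve_acqf_class(acquisition_class=KnowledgeGradientToy)
from_explicit = resolve_acqf_class(botorch_acqf_class=qExpectedImprovementToy)
unspecified = resolve_acqf_class()
```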

**Please post any other questions** you have to our issues page: https://github.com/facebook/Ax/issues.

----

## 3. Leveraging Ax storage stack via `Models.BOTORCH_MODULAR`

To simplify the instantiation of an Ax `ModelBridge` and its underlying `Model`, Ax provides a [`Models` registry enum](https://github.com/facebook/Ax/blob/main/ax/modelbridge/registry.py#L201).

Here we use `Models.BOTORCH_MODULAR` to set up a multi-fidelity Knowledge Gradient (GPKG) model. We specify both the surrogate and the acquisition to customize both components:

In [12]:
experiment = get_branin_experiment(with_fidelity_parameter=True, with_trial=True)
data = get_branin_data(trials=[experiment.trials[0]])

In [13]:
model_bridge_with_GPKG = Models.BOTORCH_MODULAR(  # Will automatically select `BoTorchModel` and `TorchModelBridge`
    experiment=experiment,
    data=data,
    surrogate=Surrogate(SingleTaskMultiFidelityGP),  # Optional, will use default if unspecified
    acquisition_class=MultiFidelityKnowledgeGradient,  # Optional, will use default if unspecified
    acquisition_options=DEFAULT_ACQUISITION_OPTIONS,  # Optional
)

[INFO 12-29 21:10:47] ax.modelbridge.transforms.standardize_y: Outcome branin is constant, within tolerance.


We can now use the model bridge to generate candidates:

In [14]:
generator_run = model_bridge_with_GPKG.gen(n=1, model_gen_options={"optimizer_kwargs": DEFAULT_OPTIMIZER_OPTIONS})
generator_run.arms

[Arm(parameters={'x1': 9.931203095212535, 'x2': 6.123563213741994, 'fidelity': 0.0})]

A generator run also records all arguments to the model bridge and model, which makes it possible to restore the model state from a generator run it produced. The generator run can therefore be serialized and stored, then recreated:

In [15]:
from ax.storage.json_store.encoder import object_to_json
from ax.storage.json_store.decoder import object_from_json
from ax.modelbridge.registry import get_model_from_generator_run

In [16]:
generator_run_restored = object_from_json(object_to_json(generator_run))

In [17]:
model_bridge_with_GPKG_restored = get_model_from_generator_run(generator_run_restored, experiment, data)
model_bridge_with_GPKG_restored

[INFO 12-29 21:20:21] ax.modelbridge.transforms.standardize_y: Outcome branin is constant, within tolerance.


<ax.modelbridge.torch.TorchModelBridge at 0x7fc8a9cff7d0>

**Note that not all arguments to BoTorch `Model` or `AcquisitionFunction` are serializable by default!** For example, a BoTorch `Prior` object, which could be among `Surrogate.model_options`, does not currently have associated serialization logic in Ax. See Appendix 3 for how to address errors that stem from some objects among options lacking serialization logic.

----

## 4. Utilizing `BoTorchModel` in generation strategies

Generation strategy is a key concept in Ax, enabling use of the Service API (a.k.a. `AxClient`) and many other higher-level abstractions. A [`GenerationStrategy`](https://ax.dev/api/modelbridge.html#ax.modelbridge.generation_strategy.GenerationStrategy) allows chaining multiple models in Ax and thereby automating candidate generation.

An example generation strategy with the modular `BoTorchModel` would look like this:

In [31]:
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from botorch.acquisition import UpperConfidenceBound
from ax.modelbridge.modelbridge_utils import get_pending_observation_features

gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,  # Which model to use for this step
            num_trials=5,  # How many generator runs (which are then made into trials) to produce with this step
            min_trials_observed=5,  # How many trials generated from this step must be `COMPLETED` before the next one
        ),
        GenerationStep(
            model=Models.BOTORCH_MODULAR,
            num_trials=-1,  # No limit on how many generator runs will be produced
            model_kwargs={  # Kwargs to pass to `BoTorchModel.__init__`
                "surrogate": Surrogate(SingleTaskGP),
                "botorch_acqf_class": qNoisyExpectedImprovement,
                "acquisition_options": DEFAULT_ACQUISITION_OPTIONS,
            },
            model_gen_kwargs={"model_gen_options": {  # Kwargs to pass to `BoTorchModel.gen`
                "optimizer_kwargs": DEFAULT_OPTIMIZER_OPTIONS},
            },
        )
    ]
)

Set up an experiment and generate 10 trials in it, adding synthetic data to the experiment after each one:

In [32]:
experiment = get_branin_experiment(minimize=True)

assert len(experiment.trials) == 0
experiment.search_space

SearchSpace(parameters=[RangeParameter(name='x1', parameter_type=FLOAT, range=[-5.0, 10.0]), RangeParameter(name='x2', parameter_type=FLOAT, range=[0.0, 15.0])], parameter_constraints=[])

In [33]:
for _ in range(10):
    # Produce a new generator run and attach it to experiment as a trial
    generator_run = gs.gen(
        experiment=experiment, 
        n=1, 
    )
    trial = experiment.new_trial(generator_run)
    
    # Mark the trial as 'RUNNING' so we can mark it 'COMPLETED' later
    trial.mark_running(no_runner_required=True)
    
    # Attach data for the new trial and mark it 'COMPLETED'
    experiment.attach_data(get_branin_data(trials=[trial]))
    trial.mark_completed()
    
    print(f"Completed trial #{trial.index}, suggested by {generator_run._model_key}.")

Completed trial #0, suggested by Sobol.
Completed trial #1, suggested by Sobol.
Completed trial #2, suggested by Sobol.
Completed trial #3, suggested by Sobol.
Completed trial #4, suggested by Sobol.
Completed trial #5, suggested by BoTorch.
Completed trial #6, suggested by BoTorch.
Completed trial #7, suggested by BoTorch.
Completed trial #8, suggested by BoTorch.
Completed trial #9, suggested by BoTorch.
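
The hand-off from Sobol to BoTorch after five completed trials, visible in the output above, can be mimicked with a small library-free sketch. `ToyStep` and `current_step_index` are hypothetical names, and the transition rule is a simplification assumed from the comments on `num_trials` and `min_trials_observed` in the generation strategy definition; the real `GenerationStrategy` handles more cases.

```python
from dataclasses import dataclass


@dataclass
class ToyStep:
    model_name: str
    num_trials: int  # -1 means no limit
    min_trials_observed: int = 0


def current_step_index(steps, generated_per_step, completed_per_step):
    """Return the index of the step that should produce the next trial."""
    for i, step in enumerate(steps):
        unlimited = step.num_trials == -1
        if unlimited or generated_per_step[i] < step.num_trials:
            return i
        # This step has produced all its trials; only move on once enough
        # of them have completed.
        if completed_per_step[i] < step.min_trials_observed:
            raise RuntimeError(f"Step {i} needs more completed trials before moving on.")
    raise RuntimeError("All steps are exhausted.")


steps = [ToyStep("Sobol", 5, 5), ToyStep("BoTorch", -1)]
generated = [0, 0]
completed = [0, 0]
history = []
for _ in range(10):
    i = current_step_index(steps, generated, completed)
    generated[i] += 1
    completed[i] += 1  # each trial is completed immediately, as in the loop above
    history.append(steps[i].model_name)
```

With these toy settings, `history` holds five `"Sobol"` entries followed by five `"BoTorch"` entries, matching the order of the printed trials.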


Inspect trials that were generated:

In [34]:
gs.trials_as_df

[INFO 12-29 21:27:42] ax.modelbridge.generation_strategy: Note that parameter values in dataframe are rounded to 2 decimal points; the values in the dataframe are thus not the exact ones suggested by Ax in trials.


| | Generation Step | Generation Model | Trial Index | Trial Status | Arm Parameterizations |
|---|---|---|---|---|---|
| 0 | 0 | Sobol | 0 | COMPLETED | {'0_0': {'x1': 0.6, 'x2': 0.31}} |
| 1 | 0 | Sobol | 1 | COMPLETED | {'1_0': {'x1': 4.02, 'x2': 0.48}} |
| 2 | 0 | Sobol | 2 | COMPLETED | {'2_0': {'x1': 5.63, 'x2': 4.15}} |
| 3 | 0 | Sobol | 3 | COMPLETED | {'3_0': {'x1': -3.57, 'x2': 2.06}} |
| 4 | 0 | Sobol | 4 | COMPLETED | {'4_0': {'x1': -1.02, 'x2': 10.5}} |
| 5 | 1 | BoTorch | 5 | COMPLETED | {'5_0': {'x1': 2.65, 'x2': 15.0}} |
| 6 | 1 | BoTorch | 6 | COMPLETED | {'6_0': {'x1': 2.64, 'x2': 15.0}} |
| 7 | 1 | BoTorch | 7 | COMPLETED | {'7_0': {'x1': 2.63, 'x2': 15.0}} |
| 8 | 1 | BoTorch | 8 | COMPLETED | {'8_0': {'x1': 2.57, 'x2': 15.0}} |
| 9 | 1 | BoTorch | 9 | COMPLETED | {'9_0': {'x1': 2.63, 'x2': 15.0}} |


----

## Appendix 1: Methods available on `BoTorchModel`

**Core methods on `BoTorchModel`:**

* `fit` selects a surrogate if needed and fits the surrogate model to data via `Surrogate.fit`,
* `predict` estimates metric values at a given point via `Surrogate.predict`,
* `gen` instantiates an acquisition function via `Acquisition.__init__` and optimizes it to generate candidates.

**Other methods on `BoTorchModel`:**

* `update` updates the surrogate model with training data and optionally reoptimizes model parameters via `Surrogate.update`,
* `cross_validate` re-fits the surrogate model to a subset of the training data and makes predictions for test data,
* `evaluate_acquisition_function` instantiates an acquisition function and evaluates it for a given point.

----

## Appendix 2: Default surrogate models and acquisition functions

By default, the chosen surrogate model will be:

* if fidelity parameters are present in search space: `FixedNoiseMultiFidelityGP` (if [SEM](https://ax.dev/docs/glossary.html#sem)s are known on observations) and `SingleTaskMultiFidelityGP` (if variance unknown and needs to be inferred),
* if task parameters are present: a set of `FixedNoiseMultiTaskGP` (if known variance) or `MultiTaskGP` (if unknown variance), wrapped in a `ModelListGP` and each modeling one task,
* `FixedNoiseGP` (known variance) and `SingleTaskGP` (unknown variance) otherwise.

The chosen acquisition function will be:

* for multi-fidelity settings: `qMultiFidelityKnowledgeGradient`,
* `qExpectedImprovement` (known variance) and `qNoisyExpectedImprovement` (unknown variance) otherwise.
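
The rules above can be restated as a small decision function. This is only a paraphrase of the two lists in plain Python, not the selection code Ax uses. The function names and boolean arguments are hypothetical, class names are returned as strings, and checking fidelity parameters before task parameters is an assumption taken from the order of the bullets.

```python
def default_surrogate_name(has_fidelity_params, has_task_params, noise_known):
    """Name of the default surrogate model class, following the rules listed above."""
    if has_fidelity_params:
        return "FixedNoiseMultiFidelityGP" if noise_known else "SingleTaskMultiFidelityGP"
    if has_task_params:
        inner = "FixedNoiseMultiTaskGP" if noise_known else "MultiTaskGP"
        # One inner model per task, wrapped in a `ModelListGP`.
        return f"ModelListGP[{inner}]"
    return "FixedNoiseGP" if noise_known else "SingleTaskGP"


def default_acqf_name(has_fidelity_params, noise_known):
    """Name of the default acquisition function class, following the rules listed above."""
    if has_fidelity_params:
        return "qMultiFidelityKnowledgeGradient"
    return "qExpectedImprovement" if noise_known else "qNoisyExpectedImprovement"


# A plain search space with unknown observation noise:
plain_surrogate = default_surrogate_name(False, False, noise_known=False)
plain_acqf = default_acqf_name(False, noise_known=False)

# A search space with a fidelity parameter and known SEMs:
mf_surrogate = default_surrogate_name(True, False, noise_known=True)
mf_acqf = default_acqf_name(True, noise_known=True)
```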

----

## Appendix 3: Handling storage errors that arise from objects that don't have serialization logic in Ax

Attempting to store a generator run produced via a `Models.BOTORCH_MODULAR` instance that included options without serialization logic will produce an error like: ``"Object <SomeAcquisitionOption object> passed to `object_to_json` (of type <class 'SomeAcquisitionOption'>) is not registered with a corresponding encoder in ENCODER_REGISTRY."``

The two options for handling this error are:
1. disabling storage of `BoTorchModel`'s options by passing `no_model_options_storage=True` to the `Models.BOTORCH_MODULAR(...)` call; this prevents model options from being stored on the generator run, so the generator run can be saved but cannot be used to restore the model that produced it,
2. specifying serialization logic for a given object that needs to occur among the `Model` or `AcquisitionFunction` options. A tutorial for this is in the works, but in the meantime you can [post an issue on the Ax GitHub](https://github.com/facebook/Ax/issues) to get help with this.
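
To make the trade-off between the two options concrete, here is a library-free sketch of the same failure mode and both workarounds. Everything in it is hypothetical: `ENCODERS` is a toy stand-in for an encoder registry, `ToyPrior` is a made-up option object, and the `store_options` flag only imitates the effect described for `no_model_options_storage`. It does not use or reproduce Ax's storage code.

```python
import json

ENCODERS = {}  # toy registry mapping a type to a function that returns JSON-safe data


class ToyPrior:
    """Hypothetical option object with no serialization logic by default."""

    def __init__(self, scale):
        self.scale = scale


def to_json(obj):
    """Convert `obj` to JSON-safe data, failing for unregistered types."""
    if isinstance(obj, (str, int, float, bool, type(None))):
        return obj
    if isinstance(obj, dict):
        return {key: to_json(value) for key, value in obj.items()}
    encoder = ENCODERS.get(type(obj))
    if encoder is None:
        raise TypeError(f"{type(obj).__name__} is not registered with an encoder.")
    return encoder(obj)


def store_generator_run(model_options, store_options=True):
    """Serialize a toy generator run, optionally including the model options."""
    record = {"arms": [{"x1": 0.5}]}
    if store_options:
        record["model_kwargs"] = to_json(model_options)
    return json.dumps(record)


options = {"prior": ToyPrior(scale=2.0)}

# Without an encoder for `ToyPrior`, storing the options fails.
try:
    store_generator_run(options)
    first_attempt_failed = False
except TypeError:
    first_attempt_failed = True

# Option 1: skip the options. The run is stored, but the model cannot be rebuilt from it.
without_options = json.loads(store_generator_run(options, store_options=False))

# Option 2: register serialization logic for the offending type.
ENCODERS[ToyPrior] = lambda prior: {"__type": "ToyPrior", "scale": prior.scale}
with_encoder = json.loads(store_generator_run(options))
```

In this sketch, only the second option keeps enough information on the stored record to reconstruct the options later, which mirrors why the first option gives up the ability to restore the model.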