# Mixed-variable Bayesian Optimization Demo

This notebook serves as a general implementation guide for mixed-variable Bayesian optimization. The first section walks through a simple optimization loop. The next section shows how to define a new objective function. The last section shows how to incorporate PFNs into a Bayesian optimization method.

First, we import packages. Pay attention to the package versions: the mcbo package can throw errors with newer versions of pytorch, numpy, and pandas. The versions used here are noted in the comments next to each import.

In [None]:
import torch # 2.0.1
import numpy as np # 2.0.2
import matplotlib.pyplot as plt # 3.9.3
import pandas as pd # 1.3.5
from mcbo import task_factory # 0.0.1
from mcbo.optimizers.bo_builder import BoBuilder, BO_ALGOS
from mcbo.tasks.task_base import TaskBase
from typing import Optional, List, Callable, Dict, Any
from pfns4mvbo import MVPFNOptimizer

In [None]:
%matplotlib notebook
    
%matplotlib inline

# Example Bayesian optimization run

Begin by defining an objective function, or "task". The MCBO package includes a number of tasks for benchmarking. Specify the number of numerical dimensions, the number of categorical dimensions, and the number of categories per categorical dimension.

Then, initialize the "optimizer". Use the name of any Bayesian optimization method listed below, or choose your own combination of surrogate model, acquisition function, and acquisition optimizer. Casmopolitan tends to work well as a default. Keep in mind that the methods target different settings: some are designed for problems such as 3 numeric and 3 categorical variables, while others are intended for problems with around 50 binary variables.

After that, draw some random observations to initialize the optimizer, and begin the optimization loop.
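
For orientation, the `ackley` task used in this notebook is named after the standard Ackley benchmark function. The sketch below is a plain-Python version of that textbook formula for numeric inputs only. It is not MCBO's implementation, and how MCBO feeds categorical values into the function is not covered here.

```python
import math

def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Textbook Ackley function for a sequence of floats.

    The global minimum is 0 at the origin; a, b, c are the usual defaults.
    """
    d = len(x)
    mean_sq = sum(v * v for v in x) / d
    mean_cos = sum(math.cos(c * v) for v in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e

# The value grows as we move away from the origin.
print(ackley([0.0, 0.0]))
print(ackley([1.0, 1.0]))
```

The many local minima created by the cosine term are what make this function a common stress test for optimizers.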




In [None]:
# These are the different BO methods included in the MCBO package.
# BO_ALGOS was already imported at the top of this notebook;
# it is reproduced here for easy viewing.

BO_ALGOS: Dict[str, BoBuilder] = dict(
    Casmopolitan=BoBuilder(model_id="gp_to", acq_opt_id="is", acq_func_id="ei", tr_id="basic"),
    BOiLS=BoBuilder(model_id="gp_ssk", acq_opt_id="is", acq_func_id="ei", tr_id="basic"),
    COMBO=BoBuilder(model_id="gp_diff", acq_opt_id="ls", acq_func_id="ei", tr_id=None),
    BODi=BoBuilder(model_id="gp_hed", acq_opt_id="is", acq_func_id="ei", tr_id=None),
    BOCS=BoBuilder(model_id="lr_sparse_hs", acq_opt_id="sa", acq_func_id="ts", tr_id=None),
    BOSS=BoBuilder(model_id="gp_ssk", acq_opt_id="ga", acq_func_id="ei", tr_id=None),
    CoCaBO=BoBuilder(model_id="gp_o", acq_opt_id="mab", acq_func_id="ei", tr_id=None),
    RDUCB=BoBuilder(model_id="gp_rd", acq_opt_id="mp", acq_func_id="addlcb", tr_id=None)
)

In [None]:
task_kws = dict(variable_type=['num'] + ['nominal']*2,
                num_dims=[2] + [1]*2,
                num_categories=[1, 2, 4])

task = task_factory(task_name='ackley', **task_kws)
search_space = task.get_search_space()


# define the optimizer
optimizer = BO_ALGOS['BODi'].build_bo(search_space=search_space, n_init=1, device=torch.device("cpu"))

if False:
    # choose individual components for a BO method
    bo_builder = BoBuilder(
        model_id='gp_to', acq_opt_id='is', acq_func_id='ei', tr_id='basic'
    )
    optimizer = bo_builder.build_bo(search_space=search_space, n_init=20, device=torch.device("cpu"))


# draw a set of initial observations
x_init = search_space.sample(10)
y_init = task(x_init)
optimizer.observe(x_init, y_init)


# Main BO loop
for i in range(20):  # `i` avoids shadowing the built-in `iter`
    # print(i)
    
    # Suggest a point
    x_next = optimizer.suggest(1)

    # Compute the Black-box value
    y_next = task(x_next)

    # Observe the new point
    optimizer.observe(x_next, y_next)
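
A common way to inspect a run like the one above is to track the best value found so far after each evaluation. The sketch below assumes the objective is minimized and that the observed values have been collected into a plain list of floats (the list `observed` is made-up data, not output from this notebook).

```python
from itertools import accumulate

def best_so_far(y_values):
    """Running minimum of a sequence of objective values (minimization)."""
    return list(accumulate(y_values, min))

# Hypothetical objective values, in the order they were observed.
observed = [3.2, 2.7, 2.9, 1.4, 1.8, 0.9]
trace = best_so_far(observed)
print(trace)
```

Passing `trace` to `plt.plot` gives the usual convergence curve for comparing methods.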


#### Small note

All of the previous observations are stored in the optimizer, but the inputs are converted to torch tensors with values between 0 and 1.

To see the optimizer's current argmin in its original format, pass it through `search_space.inverse_transform()`.

In [None]:
print(optimizer.best_y)
print(search_space.inverse_transform(optimizer._best_x))
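
To make the idea of a transformed search space concrete, here is a conceptual sketch of a forward and inverse mapping. It is not MCBO's implementation: the helper names `transform` and `inverse_transform` are written from scratch, numeric variables are min-max scaled to [0, 1], and storing a nominal variable as the integer index of its category is an assumption made for illustration.

```python
def transform(point, params):
    """Encode a dict of raw values as a list of floats."""
    encoded = []
    for p in params:
        value = point[p['name']]
        if p['type'] == 'num':
            # Min-max scale numeric values to [0, 1].
            encoded.append((value - p['lb']) / (p['ub'] - p['lb']))
        else:
            # Assumption: a nominal value is stored as its category index.
            encoded.append(float(p['categories'].index(value)))
    return encoded

def inverse_transform(encoded, params):
    """Decode a list of floats back into a dict of raw values."""
    point = {}
    for value, p in zip(encoded, params):
        if p['type'] == 'num':
            point[p['name']] = p['lb'] + value * (p['ub'] - p['lb'])
        else:
            point[p['name']] = p['categories'][int(round(value))]
    return point

params = [
    {'name': 'x1', 'type': 'num', 'lb': -5.0, 'ub': 5.0},
    {'name': 'x3', 'type': 'nominal', 'categories': ['a', 'b', 'c']},
]
raw = {'x1': 2.5, 'x3': 'c'}
z = transform(raw, params)
print(z, inverse_transform(z, params))
```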

## Defining a new objective function

The MCBO package uses the term "task" for what is usually called an "objective function". A variety of tasks are included in the package for benchmarking different BO methods, but for real-world applications you must implement your own objective function. To do this, define a new class that specifies the input dimensions (the search space parameters) and an `evaluate` method, which evaluates the objective function for a given set of points.

In [None]:
class NewTask(TaskBase):
    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self._n_bb_evals = 0

    @property
    def name(self) -> str:
        return "new_task"

    def search_space_params(self) -> List[Dict[str, Any]]:
        # Define the search space as a list of dicts, each dict is one variable
        # 'type': 'num' for numerical variables
        # 'type': 'nominal' for categorical variables
        return [
            {'name': 'x1', 'type': 'num', 'lb': 0.0, 'ub': 1.0},
            {'name': 'x2', 'type': 'num', 'lb': 0.0, 'ub': 1.0},
            {'name': 'x3', 'type': 'nominal', 'categories': ['a','b','c']}
        ]

    def evaluate(self, x: pd.DataFrame) -> np.ndarray:
        # Placeholder objective: returns uniform random values of shape (n_points, 1).
        # Replace this with a real evaluation of the objective for each row of x.
        return np.random.uniform(0, 1, [x.shape[0], 1])

task = NewTask()
task.get_search_space().sample(5)
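
The `evaluate` method above returns random numbers as a placeholder. The sketch below shows what a deterministic objective over the same three variables might look like. The function `evaluate_rows`, the quadratic form, and the per-category offsets are all made up for illustration. It works on a list of row dicts so that it runs without pandas.

```python
def evaluate_rows(rows):
    """Evaluate a hypothetical objective on records with keys 'x1', 'x2', 'x3'.

    Returns one single-element list per row, i.e. a nested list of shape (n, 1).
    """
    offsets = {'a': 0.0, 'b': 0.5, 'c': 1.0}  # made-up penalty per category
    results = []
    for row in rows:
        y = (row['x1'] - 0.3) ** 2 + (row['x2'] - 0.7) ** 2 + offsets[row['x3']]
        results.append([y])
    return results

rows = [
    {'x1': 0.3, 'x2': 0.7, 'x3': 'a'},
    {'x1': 0.0, 'x2': 0.0, 'x3': 'c'},
]
ys = evaluate_rows(rows)
print(ys)
```

Inside `NewTask.evaluate`, the rows could be obtained with `x.to_dict('records')` and the result wrapped as `np.array(evaluate_rows(...))` to get an array of shape `(n, 1)`.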

# Incorporating PFNs

To use a PFN as the surrogate model, we create the optimizer with the `MVPFNOptimizer` class and pass the PFN-specific settings as keyword arguments. A later cell collects these settings in an `optimizer_kwargs` dict.







In [None]:
task_kws = dict(variable_type=['num'] + ['nominal']*2,
                num_dims=[2] + [1]*2,
                num_categories=[1, 2, 4])
task = task_factory(task_name='ackley', **task_kws)
search_space = task.get_search_space()




# Define the MVPFN optimizer
optimizer = MVPFNOptimizer(search_space=search_space,
                           input_constraints=task.input_constraints,
                           pfn='casmo', # uses the PFN trained on the surrogate model from Casmopolitan
                           acq_func='ei',
                           acq_optim_name='is')





# draw a set of initial observations
x_init = search_space.sample(10)
y_init = task(x_init)
optimizer.observe(x_init, y_init)

# Main loop
for i in range(20):  # `i` avoids shadowing the built-in `iter`
    # print(i)
    
    # Suggest a point
    x_next = optimizer.suggest(1)

    # Compute the Black-box value
    y_next = task(x_next)

    # Observe the new point
    optimizer.observe(x_next, y_next)

print(optimizer.best_y)
print(search_space.inverse_transform(optimizer._best_x))

## Some notes on extra settings

A number of settings can be selected and tweaked. They are outlined below.

### Settings
#### pfn
Three PFNs are included in the pfns4mvbo package: 'cocabo', 'casmo', and 'bodi'. Alternatively, pass a filepath to a saved PFN.

#### acq_func
The acquisition function: 'ei' (expected improvement), 'pi' (probability of improvement), 'ucb' (upper confidence bound), etc.
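
For reference, these acquisition functions have simple closed forms when the surrogate's prediction is Gaussian. The sketch below uses the standard textbook formulas, written for minimization (an assumption, chosen because this notebook reports an argmin). Under minimization the confidence-bound criterion becomes a lower confidence bound. The function names and the default `beta` are illustrative and are not taken from MCBO or pfns4mvbo.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best_y):
    """E[max(best_y - Y, 0)] for Y ~ N(mu, sigma^2)."""
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm_cdf(z) + sigma * norm_pdf(z)

def probability_of_improvement(mu, sigma, best_y):
    """P(Y < best_y) for Y ~ N(mu, sigma^2)."""
    return norm_cdf((best_y - mu) / sigma)

def lower_confidence_bound(mu, sigma, beta=2.0):
    """Optimistic estimate for minimization; smaller is more promising."""
    return mu - beta * sigma

print(expected_improvement(0.0, 1.0, 0.0))
print(probability_of_improvement(0.0, 1.0, 0.0))
print(lower_confidence_bound(1.0, 0.5))
```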

#### use_pfn_acq_func
Gaussian processes output normal distributions, so the MCBO package calculates the acquisition function using only the mean and variance of the surrogate model's output. PFN outputs, however, take the form of a Riemann distribution, which is much more flexible. If set to `True`, the acquisition function takes the whole Riemann distribution as input. If set to `False`, only the Riemann distribution's mean and variance are used. Expect slight differences in performance between the two settings.
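
The difference between the two settings can be illustrated with a small sketch. It assumes a Riemann distribution is a piecewise-constant density over fixed buckets and ignores any special handling of the tails. The function names are hypothetical and this is not the pfns4mvbo implementation. Expected improvement is written for minimization.

```python
def riemann_mean_var(edges, probs):
    """Mean and variance of a piecewise-constant density.

    edges: bucket boundaries (length K + 1); probs: bucket masses (length K).
    """
    mean = 0.0
    second_moment = 0.0
    for lo, hi, p in zip(edges[:-1], edges[1:], probs):
        mid = 0.5 * (lo + hi)
        mean += p * mid
        # A uniform bucket contributes mid^2 + width^2 / 12 to E[Y^2].
        second_moment += p * (mid * mid + (hi - lo) ** 2 / 12.0)
    return mean, second_moment - mean * mean

def riemann_ei(edges, probs, best_y):
    """E[max(best_y - Y, 0)] integrated exactly over the buckets."""
    ei = 0.0
    for lo, hi, p in zip(edges[:-1], edges[1:], probs):
        upper = min(hi, best_y)
        if upper <= lo:
            continue  # this bucket lies entirely above best_y
        density = p / (hi - lo)
        ei += density * (upper - lo) * (best_y - 0.5 * (upper + lo))
    return ei

# A bimodal predictive distribution that a single Gaussian cannot represent.
edges = [0.0, 1.0, 2.0, 3.0, 4.0]
probs = [0.45, 0.05, 0.05, 0.45]
mean, var = riemann_mean_var(edges, probs)
print(mean, var, riemann_ei(edges, probs, best_y=1.0))
```

Plugging `mean` and `var` into a Gaussian expected-improvement formula generally gives a different number than `riemann_ei`, which is the kind of discrepancy this setting controls.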

#### acq_optim_name
Choose the optimizer for the acquisition function. There are a variety of options implemented as part of the MCBO package, such as 'mab', 'is', 'ga', etc.

If set to 'pfn', Adam will be used to optimize over the input space as if all dimensions were numeric. Upon completion, the values for the nominal dimensions are rounded to the nearest integer. This can work particularly well depending on the application.

#### pfn_acq_optim_kwargs
Only relevant if acq_optim_name is set to 'pfn'. These are settings for the acquisition optimizer: n_restarts specifies how many times the optimizer restarts from a new random location, and n_iter specifies how many optimization steps are taken per restart. There is a tradeoff between speed and performance. The separate restarts can be run in parallel, which can save a lot of time.
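
The 'pfn' acquisition optimizer described above can be pictured as a multi-start gradient search followed by rounding. The sketch below is a from-scratch illustration in plain Python, not the pfns4mvbo code: `adam_minimize` and `multi_start` are hypothetical helpers, the loss is a toy quadratic standing in for the negated acquisition function, and the restarts run sequentially rather than in parallel.

```python
import math
import random

def adam_minimize(grad, x0, n_iter, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Plain Adam updates on a list of floats."""
    x = list(x0)
    m = [0.0] * len(x)
    v = [0.0] * len(x)
    for t in range(1, n_iter + 1):
        g = grad(x)
        for i in range(len(x)):
            m[i] = b1 * m[i] + (1 - b1) * g[i]
            v[i] = b2 * v[i] + (1 - b2) * g[i] * g[i]
            m_hat = m[i] / (1 - b1 ** t)
            v_hat = v[i] / (1 - b2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def multi_start(loss, grad, bounds, nominal_dims, n_restarts, n_iter, seed=0):
    """Run Adam from several random starts; round nominal dims at the end."""
    rng = random.Random(seed)
    best_x, best_val = None, float('inf')
    for _ in range(n_restarts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x = adam_minimize(grad, x0, n_iter)
        # Clip to the bounds, then snap nominal dimensions to integers.
        x = [min(max(val, lo), hi) for val, (lo, hi) in zip(x, bounds)]
        for i in nominal_dims:
            x[i] = float(round(x[i]))
        val = loss(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy loss: dimension 0 is numeric, dimension 1 is a category index in 0..3.
def loss(x):
    return (x[0] - 0.3) ** 2 + (x[1] - 2.2) ** 2

def grad(x):
    return [2.0 * (x[0] - 0.3), 2.0 * (x[1] - 2.2)]

best_x, best_val = multi_start(loss, grad, bounds=[(0.0, 1.0), (0.0, 3.0)],
                               nominal_dims=[1], n_restarts=10, n_iter=200)
print(best_x, best_val)
```

More restarts and more iterations per restart make it more likely that a good optimum is found, at the cost of more evaluations of the surrogate.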

#### fast
Only relevant for the MCBO acquisition optimizer implementations ('mab', 'is', 'ga', etc.). If set to `True`, the acquisition optimizer settings are changed to run much faster. Set to `False` for slower but better performance.


#### tr_id
Set to 'basic' to use a trust region, or set to `None`.
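
As a rough picture of what a trust region does in a mixed search space, the sketch below samples a candidate close to the current best point: numeric values stay within a box around the center, and at most a fixed number of categorical values are changed (a Hamming-distance limit). This is a generic illustration under those assumptions, with hypothetical names, and does not describe MCBO's 'basic' trust region in detail.

```python
import random

def sample_in_trust_region(center_num, center_cat, n_categories,
                           radius, max_flips, rng):
    """Sample one candidate near (center_num, center_cat).

    center_num: numeric values scaled to [0, 1]
    center_cat: category indices, one per nominal variable
    n_categories: number of categories per nominal variable (each >= 2)
    """
    # Numeric part: uniform perturbation inside the box, clipped to [0, 1].
    num = [min(1.0, max(0.0, c + rng.uniform(-radius, radius)))
           for c in center_num]
    # Categorical part: change at most max_flips variables.
    cat = list(center_cat)
    n_flips = rng.randint(0, max_flips)
    for i in rng.sample(range(len(cat)), n_flips):
        choices = [k for k in range(n_categories[i]) if k != cat[i]]
        cat[i] = rng.choice(choices)
    return num, cat

rng = random.Random(0)
center_num, center_cat = [0.5, 0.9], [0, 3]
candidate = sample_in_trust_region(center_num, center_cat,
                                   n_categories=[2, 4],
                                   radius=0.1, max_flips=1, rng=rng)
print(candidate)
```

Trust-region methods typically grow or shrink the region depending on whether recent suggestions improved on the best value.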

In [None]:
optimizer_kwargs = {
    'pfn': 'casmo',
    'acq_func': 'ei',
    'use_pfn_acq_func': True,
    'acq_optim_name': 'pfn',
    'pfn_acq_optim_kwargs': {
        'parallelize_restarts': True,
        'n_restarts': 10,
        'n_iter': 10
    },
    'tr_id': None,
    'n_init': 10,
    'device': 'cpu',
    'fast': False
}

optimizer = MVPFNOptimizer(search_space=search_space,
                           input_constraints=task.input_constraints,
                           **optimizer_kwargs)

In [None]:
# optimization loop again
x_init = search_space.sample(10)
y_init = task(x_init)
optimizer.observe(x_init, y_init)

# Main loop
for i in range(20):  # `i` avoids shadowing the built-in `iter`
    # print(i)
    
    # Suggest a point
    x_next = optimizer.suggest(1)

    # Compute the Black-box value
    y_next = task(x_next)

    # Observe the new point
    optimizer.observe(x_next, y_next)

print(optimizer.best_y)
print(search_space.inverse_transform(optimizer._best_x))