# Bayesian Optimization with Trees in OMLT

This notebook introduces the gradient-boosted tree (GBT) functionality of `OMLT` and shows how to embed such models in a Bayesian optimization loop. For a more comprehensive framework that uses GBT models for Bayesian optimization, see [ENTMOOT](https://github.com/cog-imperial/entmoot), another project from our group.

## Define Black-Box Function and Initial Dataset
We first define a simple benchmark function `f(X)`, the 2D Rosenbrock function, with each input bounded by `(-2.048, 2.048)`. The function `generate_samples` draws inputs uniformly at random within the given bounds and evaluates `f` on them to produce the initial dataset that trains our surrogate model.

In [1]:
import numpy as np
import random

def f(X):
    # Rosenbrock benchmark function
    X = np.asarray_chkfinite(X)
    X0 = X[:-1]
    X1 = X[1:]
    add1 = sum((1.0 - X0) ** 2.0)
    add2 = 100.0 * sum((X1 - X0 ** 2.0) ** 2.0)
    return add1 + add2

f_bnds = [(-2.048,2.048) for _ in range(2)]

def generate_samples(num_samples, bb_bnds):
    data = {'X': [], 'y': []}

    for _ in range(num_samples):
        # draw one value per dimension, uniformly within that dimension's bounds
        sample = [random.uniform(lb, ub) for lb, ub in bb_bnds]

        data['X'].append(sample)
        data['y'].append(f(sample))
    return data
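
For reproducible experiments, the sampling step can be seeded. The following is a minimal sketch, assuming NumPy's `default_rng`; the names `rosenbrock` and `generate_samples_seeded` and the `seed` parameter are illustrative and not part of the notebook code above.

```python
import numpy as np

def rosenbrock(x):
    # same benchmark as f(X) above, written with NumPy reductions
    x = np.asarray(x, dtype=float)
    return float(np.sum((1.0 - x[:-1]) ** 2)
                 + 100.0 * np.sum((x[1:] - x[:-1] ** 2) ** 2))

def generate_samples_seeded(num_samples, bounds, seed=0):
    # hypothetical seeded variant of generate_samples
    rng = np.random.default_rng(seed)
    lower = np.array([b[0] for b in bounds])
    upper = np.array([b[1] for b in bounds])
    # one row per sample, one column per input dimension
    X = rng.uniform(lower, upper, size=(num_samples, len(bounds)))
    return {'X': X.tolist(), 'y': [rosenbrock(row) for row in X]}
```

Calling `generate_samples_seeded(20, f_bnds, seed=42)` twice would return the same dataset, which makes runs of the optimization loop easier to compare.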

## Training the Tree Ensemble
Next, we define a function to train a tree ensemble as the surrogate model to learn the black-box function behavior. Here, we choose `lightgbm` as the training algorithm, but `OMLT` is compatible with all tree ensemble regression libraries supported by `ONNX`.

In [2]:
import lightgbm as lgb

def train_tree(data):
    PARAMS = {'objective': 'regression',
              'metric': 'rmse',
              'boosting': 'gbdt',
              'num_trees': 50,
              'max_depth': 3,
              'min_data_in_leaf': 2,
              'random_state': 100,
              'verbose': -1}

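    # pass NumPy arrays, which `lgb.Dataset` accepts directly
    train_data = lgb.Dataset(np.asarray(data['X']),
                             label=np.asarray(data['y']),
                             params={'verbose': -1})

    # verbosity is set through 'verbose' in PARAMS; the `verbose_eval`
    # argument of `lgb.train` was removed in lightgbm 4.0
    model = lgb.train(PARAMS, train_data)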
    return model

## Handling Trees with ONNX
Using the `get_onnx_model` function, we convert the `lightgbm` model into the `ONNX` format, which `OMLT` uses to encode tree models as optimization problems. Here, all features are declared as continuous inputs; for categorical variables, we recommend one-hot encoding.

In [3]:
from onnxmltools.convert.lightgbm.convert import convert
from skl2onnx.common.data_types import FloatTensorType

def get_onnx_model(lgb_model):
    # export onnx model
    float_tensor_type = FloatTensorType([None, lgb_model.num_feature()])
    initial_types = [('float_input', float_tensor_type)]
    onnx_model = convert(lgb_model, 
                         initial_types=initial_types, 
                         target_opset=8)
    return onnx_model
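
The one-hot encoding recommended above for categorical features can be sketched as follows. This is only an illustration with NumPy; the helper `one_hot` and its `categories` argument are hypothetical names, not part of `OMLT` or `onnxmltools`.

```python
import numpy as np

def one_hot(values, categories):
    # map each category label to a column index
    index = {label: col for col, label in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)), dtype=np.float32)
    for row, value in enumerate(values):
        # exactly one column per row is set to 1
        encoded[row, index[value]] = 1.0
    return encoded
```

Each encoded column takes values in `{0, 1}`, so the resulting columns can be appended to the continuous features before training the tree ensemble.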

You can use tools like [Netron](https://netron.app/) to inspect the model. Use the `write_onnx_to_file` function and provide a path and file name to export the `ONNX` model.

In [4]:
def write_onnx_to_file(onnx_model, path, file_name="output.onnx"):
    from pathlib import Path
    with open(Path(path) / file_name, "wb") as onnx_file:
        onnx_file.write(onnx_model.SerializeToString())
        print(f'Onnx model written to {onnx_file.name}')

## Build the Pyomo Model
We define `opt_model` as a `ConcreteModel()` from `Pyomo`. After initializing an `OmltBlock()`, we add a formulation to it with the `build_formulation` function. For tree ensembles, we use the `GradientBoostedTreeModel` object and a `BigMFormulation`, both imported from `omlt.gbt`, and provide the `onnx_model` together with the input bounds of the black-box function. The `add_tree_model` function wraps these steps and adds a tree model block to an existing `Pyomo` model.

In [6]:
import sys
sys.path.append('src/omlt')

import pyomo.environ as pe
from omlt.block import OmltBlock
from omlt.gbt import BigMFormulation, GradientBoostedTreeModel

def add_tree_model(opt_model, onnx_model, input_bounds):
    # init omlt block and gbt model based on the onnx format
    opt_model.gbt = OmltBlock()
    gbt_model = GradientBoostedTreeModel(onnx_model, 
                                         input_bounds=input_bounds)
    
    # omlt uses a big-m formulation to encode the tree models
    formulation = BigMFormulation(gbt_model)
    opt_model.gbt.build_formulation(formulation)


We build the `Pyomo` model. Uncomment `opt_model.pprint()` to print the formulation and check that everything works correctly.

In [16]:
data = generate_samples(20, f_bnds)
lgb_model = train_tree(data)
onnx_model = get_onnx_model(lgb_model)


opt_model = pe.ConcreteModel()
add_tree_model(opt_model, onnx_model, f_bnds)
# opt_model.pprint()


We import the general `OMLT` block and the `GradientBoostedTreeModel` module. `OMLT` uses `BigMFormulation` to encode the tree ensemble as a mixed-integer optimization problem. This formulation was adapted from Mišić (2020).

In [8]:
from omlt.block import OmltBlock
from omlt.gbt import BigMFormulation, GradientBoostedTreeModel
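
To give some intuition for big-M encodings in general, the sketch below checks a single split `x <= threshold` by brute force: a binary variable `y` selects the branch, and the big-M constants are derived from the input bounds. This is a generic illustration under those assumptions, not the exact constraint set that `OMLT` builds, and `feasible_branches` is a hypothetical helper.

```python
def feasible_branches(x, threshold, lb, ub):
    """Return the values of binary y allowed by big-M constraints for x <= threshold."""
    big_m_left = ub - threshold   # smallest M that relaxes the left-branch constraint
    big_m_right = threshold - lb  # smallest M that relaxes the right-branch constraint
    feasible = []
    for y in (0, 1):
        left_ok = x <= threshold + big_m_left * (1 - y)   # binding when y = 1
        right_ok = x >= threshold - big_m_right * y       # binding when y = 0
        if left_ok and right_ok:
            feasible.append(y)
    return feasible
```

For an input strictly below the threshold only `y = 1` is feasible, and for an input strictly above it only `y = 0`; this is why tight input bounds matter, since they determine the size of the big-M constants.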

To include the tree model as a `Pyomo` block, we import a few objects from `OMLT` and add everything to our optimization model.

In [9]:
data = generate_samples(20, f_bnds)
lgb_model = train_tree(data)

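Next, we add an uncertainty metric to the model. The variable `alpha` is constrained to be at most the squared Euclidean distance, in standardized input space, between the model inputs and every training point, and it is capped at half the variance of the observed outputs. It is therefore large only far away from existing data.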

In [10]:
import numpy as np
def add_unc_metric(opt_model, data):
    # standardize the training inputs (zero mean, unit variance per dimension)
    data_x = np.asarray(data['X'])
    std = np.std(data_x, axis=0)
    mean = np.mean(data_x, axis=0)
    data_x = np.divide(data_x - mean, std)
    
    # alpha is capped at half the variance of the observed outputs
    alpha_bound = abs(0.5*np.var(data['y']))
    opt_model.alpha = pe.Var(within=pe.NonNegativeReals, bounds=(0,alpha_bound))
    opt_model.unc_constr = pe.ConstraintList()
    
    x_var = opt_model.gbt.inputs
    for x in data_x:
        # standardized difference between training point x and the model inputs
        diff = [x[idx] - (x_var[idx] - mean[idx])/std[idx]
                for idx in range(len(x_var))]
        # alpha may not exceed the squared distance to any training point
        opt_model.unc_constr.add(opt_model.alpha <= sum(d*d for d in diff))
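
To see what value `alpha` can reach at a given candidate point, the constraints can be evaluated directly with NumPy. The sketch below assumes the same standardization and cap as `add_unc_metric`; the function `alpha_at` is a hypothetical helper for illustration and does not involve `Pyomo`.

```python
import numpy as np

def alpha_at(x_candidate, data_X, data_y):
    data_X = np.asarray(data_X, dtype=float)
    mean = data_X.mean(axis=0)
    std = data_X.std(axis=0)
    # standardize training points and the candidate with the same statistics
    z_data = (data_X - mean) / std
    z_cand = (np.asarray(x_candidate, dtype=float) - mean) / std
    # squared distance from the candidate to every training point
    sq_dist = np.sum((z_data - z_cand) ** 2, axis=1)
    # largest alpha satisfying all distance constraints and the variance cap
    alpha_bound = abs(0.5 * np.var(data_y))
    return float(min(sq_dist.min(), alpha_bound))
```

At a training point the metric is zero, and far away from the data it saturates at the cap, so the reward for exploring remote regions stays bounded.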

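The function `get_next_x` minimizes the surrogate prediction minus `1.96` times the uncertainty metric and returns the minimizer as the next point to evaluate.

In [11]:
import gurobipy

def get_next_x(opt_model):
    # objective: predicted value minus a multiple of the uncertainty metric
    opt_model.obj = pe.Objective(expr=opt_model.gbt.outputs[0] - 1.96*opt_model.alpha)
    solver = pe.SolverFactory('gurobi')
    # the distance constraints on alpha are non-convex quadratic constraints
    solver.options['NonConvex'] = 2
    solver.solve(opt_model, tee=False)
    return [opt_model.gbt.inputs[idx].value for idx in range(len(opt_model.gbt.inputs))]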
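
The objective in `get_next_x` trades off the predicted value against the uncertainty metric. A small numeric sketch of this trade-off, with made-up candidate values and a hypothetical helper `acquisition`, could look like this:

```python
def acquisition(mean, alpha, kappa=1.96):
    # lower values are better: low predicted mean and/or high uncertainty
    return mean - kappa * alpha

# (predicted mean, uncertainty metric) for two illustrative candidates
candidates = {
    "near data": (1.0, 0.0),
    "far from data": (1.5, 2.0),
}
best = min(candidates, key=lambda name: acquisition(*candidates[name]))
```

Here the candidate far from the data wins despite its worse predicted mean, because the uncertainty term outweighs the difference in the prediction.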

As a sanity check, we fix the inputs of the optimization model to a test point and verify that the output of the `OMLT` block matches the `lightgbm` prediction.

In [12]:
# test point inside the bounds of the 2D benchmark
x_test = np.zeros(len(f_bnds)) + 0.5
y_test = lgb_model.predict(np.atleast_2d(x_test))
print(f'x = {x_test}')
print(f'y = {y_test}')

# convert the current lightgbm model and build the optimization model
onnx_model = get_onnx_model(lgb_model)
opt_model = pe.ConcreteModel()
add_tree_model(opt_model, onnx_model, f_bnds)

# fix every input of the tree block to the test point
for i, val in enumerate(x_test):
    opt_model.gbt.inputs[i].fix(val)

opt_model.gbt.inputs.pprint()
opt_model.obj = pe.Objective(expr=opt_model.gbt.outputs[0])

solver = pe.SolverFactory('cbc')
solver.solve(opt_model, tee=False)
opt_model.gbt.outputs.pprint()


In [13]:
# run Bayesian optimization loop

for itr in range(50):
    # train the model
    lgb_model = train_tree(data)
    onnx_model = get_onnx_model(lgb_model)
    
    # build optimization model
    opt_model = pe.ConcreteModel()
    add_tree_model(opt_model, onnx_model, f_bnds)
    add_unc_metric(opt_model, data)
    
    # get next point to evaluate
    x_next = get_next_x(opt_model)
    y_next = f(x_next)
    data['X'].append(x_next)
    data['y'].append(y_next)
    
    print(f"alpha: {opt_model.alpha.value}")
    print(f"y_next: {y_next}")
    print(x_next)
    
    # sanity check: the block output and the lightgbm prediction should be equal
    print(f"gbt: {opt_model.gbt.outputs[0].value}")
    print(f"lgbm_next: {lgb_model.predict(np.atleast_2d(x_next))[0]}")
    print("")
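
The structure of the loop (fit surrogate, minimize acquisition, evaluate, append) can also be tried without a MIP solver. The sketch below is a stand-in under loud assumptions: it replaces the tree ensemble with a nearest-neighbor predictor and the `Pyomo` solve with random candidate search, so it only illustrates the loop and not the `OMLT` formulation. All function names are hypothetical.

```python
import numpy as np

def rosenbrock(x):
    x = np.asarray(x, dtype=float)
    return float(np.sum((1.0 - x[:-1]) ** 2)
                 + 100.0 * np.sum((x[1:] - x[:-1] ** 2) ** 2))

def fit_nearest_neighbor(X, y):
    # stand-in surrogate: predict the value of the closest observed point
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    def predict(x):
        sq_dist = np.sum((X - x) ** 2, axis=1)
        nearest = int(np.argmin(sq_dist))
        # return the prediction and the squared distance as uncertainty proxy
        return y[nearest], sq_dist[nearest]

    return predict

def run_loop(bounds, n_init=5, n_iter=10, n_candidates=100, kappa=1.96, seed=0):
    rng = np.random.default_rng(seed)
    lower = np.array([b[0] for b in bounds])
    upper = np.array([b[1] for b in bounds])
    X = rng.uniform(lower, upper, size=(n_init, len(bounds))).tolist()
    y = [rosenbrock(x) for x in X]
    for _ in range(n_iter):
        # 1. fit the surrogate on all data collected so far
        predict = fit_nearest_neighbor(X, y)
        # 2. score random candidates (stand-in for the MIP solve)
        candidates = rng.uniform(lower, upper, size=(n_candidates, len(bounds)))
        scores = [mu - kappa * dist for mu, dist in map(predict, candidates)]
        # 3. evaluate the black box at the best candidate and store the result
        x_next = candidates[int(np.argmin(scores))].tolist()
        X.append(x_next)
        y.append(rosenbrock(x_next))
    return X, y
```

Calling `run_loop([(-2.048, 2.048)] * 2)` returns the initial samples followed by one new evaluation per iteration.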





