# Bayesian Optimization with Trees in OMLT

This notebook introduces the gradient-boosted trees (GBT) functionality of `OMLT` and shows how such models can be incorporated into Bayesian optimization loops. For a more comprehensive framework that uses GBT models for Bayesian optimization, please check out another project of our group: [ENTMOOT](https://github.com/cog-imperial/entmoot).

## List of Python Imports
We start by importing a list of dependencies to implement the example. `OMLT` is compatible with all tree ensemble training libraries that support ONNX outputs. In this tutorial we use the `lightgbm` package.

In [15]:
import random
import tempfile
import numpy as np
import lightgbm as lgb
import pyomo.environ as pe
from onnxmltools.convert.lightgbm.convert import convert
from skl2onnx.common.data_types import FloatTensorType
from omlt.block import OmltBlock
from omlt.gbt import BigMFormulation, GradientBoostedTreeModel

from helpers import generate_gbt_data

random.seed(100)

## Define Dataset
We first define a simple dataset by sampling 50 random points from the 10D Rastrigin function. Every input feature of the Rastrigin function is bounded by `(-5.12, 5.12)`.

In [16]:
def f(X):
    # Rastrigin benchmark function
    x = np.asarray_chkfinite(X)
    n = len(x)
    res = 10*n + sum( x**2 - 10 * np.cos( 2 * np.pi * x ))
    return res

f_bnds = [(-5.12,5.12) for _ in range(10)]

# generate dataset
data = {'X': [], 'y': []}

for _ in range(50):
    sample = [random.uniform(*bnd) for bnd in f_bnds]
    
    data['X'].append(sample)
    data['y'].append(f(sample))
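
As a quick sanity check on the benchmark, the sketch below re-implements the Rastrigin function with the standard library only and evaluates it at two points whose values are easy to verify by hand. The name `rastrigin` is ours and is not part of the notebook; it is meant to mirror `f` above, not to replace it.

```python
import math

def rastrigin(x):
    """Rastrigin function for a sequence of floats (same formula as f above)."""
    n = len(x)
    return 10 * n + sum(xi**2 - 10 * math.cos(2 * math.pi * xi) for xi in x)

# The global minimum of the Rastrigin function is 0 at the origin.
at_origin = rastrigin([0.0] * 10)

# At x = (1, ..., 1) every cosine term equals 1, so the value is 100 + 10 * (1 - 10) = 10.
at_ones = rastrigin([1.0] * 10)

print(at_origin, at_ones)
```

Because the global minimum is known, the values printed during the optimization loop can be compared against 0 to judge progress.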

## Train the Tree Ensemble
Next we define our training function to train the tree ensemble based on the data we generated.

In [17]:
def train_tree(data):
    FIXED_PARAMS = {'objective': 'regression',
                    'metric': 'rmse',
                    'boosting': 'gbdt',
                    'num_trees': 200,
                    'max_depth': 3,
                    'min_data_in_leaf': 2,
                    'random_state': 100,
                    'verbose': -1}

    train_data = lgb.Dataset(data['X'], 
                             label=data['y'],
                             params={'verbose': -1})

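    # logging is already silenced via 'verbose': -1 in FIXED_PARAMS;
    # the `verbose_eval` argument is not accepted by recent LightGBM releases
    model = lgb.train(FIXED_PARAMS, train_data)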
    return model

## Handling Trees with ONNX
ONNX needs to know the number of features and their type. Currently, ONNX doesn't support categorical features, so we can only train models with continuous features in `lightgbm`. To handle categorical features, we recommend applying a one-hot encoding first.

In [18]:
def get_onnx_model(lgb_model):
    # export onnx model
    float_tensor_type = FloatTensorType([None, lgb_model.num_feature()])
    initial_type = [('float_input', float_tensor_type)]
    onnx_model = convert(lgb_model, 
                         initial_types=initial_type, 
                         target_opset=8)
    return onnx_model
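
The one-hot encoding recommended above for categorical features can be sketched as follows. This is a minimal illustration using only the standard library; the helper `one_hot` and the `colors` feature are hypothetical and do not appear elsewhere in this notebook.

```python
def one_hot(values, categories=None):
    """Map a list of categorical values to rows of 0/1 indicator features."""
    if categories is None:
        # fix a deterministic column order
        categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0.0] * len(categories)
        row[index[v]] = 1.0
        rows.append(row)
    return rows, categories

colors = ["red", "green", "red", "blue"]  # hypothetical categorical feature
encoded, cats = one_hot(colors)

for value, row in zip(colors, encoded):
    print(value, row)
```

Each category becomes one continuous-typed column that can be passed to `lightgbm` alongside the other features. In the optimization model, such indicator inputs would presumably also need to be restricted to binary values that sum to one; that step is not shown here.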

You can write the ONNX model to a file so that it can be inspected using a tool like [Netron](https://netron.app/).

In [19]:
# build lightgbm model and export to onnx
lgb_model = train_tree(data)
onnx_model = get_onnx_model(lgb_model)

with tempfile.NamedTemporaryFile(suffix='.onnx', delete=False) as onnx_file:
    onnx_file.write(onnx_model.SerializeToString())
    print(f'Onnx model written to {onnx_file.name}')

Onnx model written to /run/user/1000/tmpuqpz8iip.onnx


## Build the Pyomo Model
We build the `Pyomo` model by first defining the input bounds and input domain.

In [20]:
# define problem specifications
input_bounds = f_bnds
input_domain = [pe.Reals for _ in range(len(input_bounds))]

def get_opt_model_core(input_domain, input_bounds):
    # init optimization model
    opt_model = pe.ConcreteModel()
    return opt_model

opt_model = get_opt_model_core(input_domain, input_bounds)

We can print the model to check if everything worked correctly.

In [21]:
opt_model.pprint()

0 Declarations: 


We import the general `OMLT` block and the `GradientBoostedTreeModel` class. `OMLT` uses `BigMFormulation` to encode the tree ensembles. This optimization model formulation was adapted from Mišić (2020).

In [22]:
from omlt.block import OmltBlock
from omlt.gbt import BigMFormulation, GradientBoostedTreeModel
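
To make concrete what the formulation has to encode: a gradient-boosted ensemble predicts by summing, over all trees, the value of the single leaf that the input reaches. The toy sketch below evaluates such an ensemble in plain Python. The nested-dict tree layout and the names `predict_tree`, `predict_ensemble`, and `toy_trees` are assumptions made for illustration; they are not how `OMLT` or ONNX store trees.

```python
def predict_tree(node, x):
    """Follow the splits of one tree until a leaf is reached; return its value."""
    while "leaf" not in node:
        # assumed convention: go left when the feature value is <= the threshold
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]

def predict_ensemble(trees, x):
    """Ensemble prediction = sum of the active leaf values of all trees."""
    return sum(predict_tree(tree, x) for tree in trees)

# two hand-made trees over a 2D input (illustrative values only)
toy_trees = [
    {"feature": 0, "threshold": 0.0,
     "left": {"leaf": 1.0},
     "right": {"leaf": 3.0}},
    {"feature": 1, "threshold": 2.0,
     "left": {"leaf": -0.5},
     "right": {"feature": 0, "threshold": 1.0,
               "left": {"leaf": 0.25},
               "right": {"leaf": 2.0}}},
]

pred = predict_ensemble(toy_trees, [0.5, 3.0])
print(pred)
```

The resulting function is piecewise constant in the inputs. An optimization formulation has to reproduce this leaf selection with variables and constraints instead of `if` statements.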

To include the tree model as a `Pyomo` block, we import a few objects from `OMLT` and add everything to our optimization model.

In [23]:
def add_tree_model(opt_model, onnx_model, input_bounds):
    # init omlt block and gbt model based on the onnx format
    opt_model.gbt = OmltBlock()
    gbt_model = GradientBoostedTreeModel(onnx_model, 
                                         input_bounds=input_bounds)
    
    # omlt uses a big-m formulation to encode the tree models
    formulation = BigMFormulation(gbt_model)
    opt_model.gbt.build_formulation(formulation)
    
add_tree_model(opt_model, onnx_model, input_bounds)


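Next we add an uncertainty metric to the model. The variable `alpha` is bounded above by the squared Euclidean distance, in standardized input space, between the candidate point and every training point, and it is capped at half the variance of the observed outputs.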

In [24]:
import numpy as np
def add_unc_metric(opt_model, data):
    data_x = np.asarray(data['X'])
    std = np.std(data_x, axis=0)
    mean = np.mean(data_x, axis=0)
    data_x = np.divide(data_x - mean, std)
    
    alpha_bound = abs(0.5*np.var(data['y']))
    opt_model.alpha = pe.Var(within=pe.NonNegativeReals, bounds=(0,alpha_bound))
    opt_model.unc_constr = pe.ConstraintList()
    
    for x in data_x:
        x_var = opt_model.gbt.inputs
        opt_model.unc_constr.add(
            opt_model.alpha <= sum( (x[idx]-(x_var[idx]-mean[idx])/std[idx])*(x[idx]-(x_var[idx]-mean[idx])/std[idx]) 
                                   for idx in range(len(x_var)) )
        )
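
To build intuition for these constraints, the sketch below computes the largest value `alpha` can take for a fixed candidate point: the smaller of the minimum squared standardized distance to the data and the cap of half the output variance. It uses only the standard library, and `alpha_value` and the toy data are hypothetical names and values chosen for illustration.

```python
import statistics

def standardize(point, mean, std):
    """Shift and scale a point feature by feature."""
    return [(p - m) / s for p, m, s in zip(point, mean, std)]

def alpha_value(x_candidate, X, y):
    """Largest alpha allowed by the distance constraints and the variable bound."""
    cols = list(zip(*X))
    mean = [statistics.fmean(c) for c in cols]
    std = [statistics.pstdev(c) for c in cols]  # population std, like np.std
    z = standardize(x_candidate, mean, std)
    min_sq_dist = min(
        sum((a - b) ** 2 for a, b in zip(standardize(row, mean, std), z))
        for row in X
    )
    cap = 0.5 * statistics.pvariance(y)  # population variance, like np.var
    return min(min_sq_dist, cap)

X_toy = [[0.0, 0.0], [2.0, 2.0]]  # made-up training inputs
y_toy = [1.0, 5.0]                # made-up training outputs

print(alpha_value([0.0, 0.0], X_toy, y_toy))  # candidate on a data point
print(alpha_value([1.0, 0.0], X_toy, y_toy))  # candidate away from the data
```

A candidate that coincides with a training point gets `alpha = 0`, while candidates further from the data get larger values until the cap is reached.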

In [25]:
import gurobipy

def get_next_x(opt_model):
    opt_model.obj = pe.Objective(expr=opt_model.gbt.outputs[0] - 1.96*opt_model.alpha)
    solver = pe.SolverFactory('gurobi')
    solver.options['NonConvex'] = 2
    solution = solver.solve(opt_model, tee=False)
    return [opt_model.gbt.inputs[idx].value for idx in range(len(opt_model.gbt.inputs))]
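
The objective in `get_next_x` trades off the tree prediction against the uncertainty term: a point with a worse prediction can still be preferred if it lies far from the data. The sketch below illustrates this with two made-up candidates. The function `acquisition` and all numbers are hypothetical; only the weight 1.96 is taken from the cell above.

```python
KAPPA = 1.96  # weight on the uncertainty term, as in get_next_x

def acquisition(prediction, alpha, kappa=KAPPA):
    """Value that is minimized: model prediction minus weighted uncertainty."""
    return prediction - kappa * alpha

# (prediction, alpha) pairs for two made-up candidate points
candidates = {
    "near_data": (10.0, 0.1),
    "far_from_data": (12.0, 2.0),
}

scores = {name: acquisition(mu, a) for name, (mu, a) in candidates.items()}
best = min(scores, key=scores.get)

print(scores)
print(best)
```

Here the candidate far from the data wins despite its higher predicted value, which is how the uncertainty term encourages exploration.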

In [26]:

x = np.zeros(10) + 0.5
y = lgb_model.predict(np.atleast_2d(x))
print(f'x = {x}')
print(f'y = {y}')

opt_model = get_opt_model_core(input_domain, input_bounds)
add_tree_model(opt_model, onnx_model, input_bounds)

# use a separate loop variable so that the array `x` is not overwritten
for i, x_i in enumerate(x):
    opt_model.gbt.inputs[i].fix(x_i)

opt_model.gbt.inputs.pprint()
opt_model.obj = pe.Objective(expr=opt_model.gbt.outputs[0])

solver = pe.SolverFactory('cbc')
# solver.options['NonConvex'] = 2
solver.solve(opt_model, tee=False)
opt_model.gbt.outputs.pprint()

x = [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]
y = [130.17848612]

In [27]:
# run Bayesian optimization loop

for itr in range(50):
    # train the model
    lgb_model = train_tree(data)
    onnx_model = get_onnx_model(lgb_model)
    
    # build optimization model
    opt_model = get_opt_model_core(input_domain, input_bounds)
    add_tree_model(opt_model, onnx_model, input_bounds)
    add_unc_metric(opt_model, data)
    
    # get next point to evaluate
    x_next = get_next_x(opt_model)
    y_next = f(x_next)
    data['X'].append(x_next)
    data['y'].append(y_next)
    
    print(f"alpha: {opt_model.alpha.value}")
    print(f"y_next: {y_next}")
    print(x_next)
    
    # sanity check: the OMLT block output and the LightGBM prediction
    # are expected to agree at x_next (up to solver tolerance)
    print(f"gbt: {opt_model.gbt.outputs[0].value}")
    print(f"lgbm_next: {lgb_model.predict(np.atleast_2d(x_next))[0]}")
    print("")
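
The loop prints each new observation but does not track the best value found so far. A minimal way to do that, assuming minimization as in `get_next_x`, is a running minimum over the observed values. The helper `running_best` and the numbers in `observed` are made up for illustration; in the notebook one would pass `data['y']` instead.

```python
from itertools import accumulate

def running_best(values):
    """Best (smallest) objective value seen after each evaluation."""
    return list(accumulate(values, min))

observed = [42.0, 57.5, 39.1, 40.0, 35.2]  # made-up objective values
history = running_best(observed)

print(history)
```

Plotting such a history against the iteration count is a common way to judge whether the loop is still making progress.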
