# Bayesian Optimization with Trees in OMLT

This notebook introduces the gradient-boosted trees (GBT) functionality of `OMLT` and how such models can be incorporated in Bayesian optimization loops. For a more comprehensive framework using GBT models for Bayesian optimization please check out another project of our group: [ENTMOOT](https://github.com/cog-imperial/entmoot).

## List of Python Imports
We start by importing a list of dependencies to implement the example. `OMLT` is compatible with all tree ensemble training libraries that support ONNX outputs. In this tutorial we use the `lightgbm` package.

In [214]:
import random
import tempfile
import numpy as np
import lightgbm as lgb
import pyomo.environ as pe
from onnxmltools.convert.lightgbm.convert import convert
from skl2onnx.common.data_types import FloatTensorType
from omlt.block import OmltBlock
from omlt.gbt import BigMFormulation, GradientBoostedTreeModel

from helpers import generate_gbt_data

random.seed(100)

## Define Dataset
We first define a simple dataset by sampling 50 random points from the 10D Rastrigin function. Every input feature of the Rastrigin function is bounded by `(-5.12, 5.12)`.

In [226]:
def f(X):
    # Rastrigin benchmark function
    x = np.asarray_chkfinite(X)
    n = len(x)
    res = 10*n + sum( x**2 - 10 * np.cos( 2 * np.pi * x ))
    return res

f_bnds = [(-5.12,5.12) for _ in range(10)]

# generate dataset
data = {'X': [], 'y': []}

for _ in range(50):
    sample = [random.uniform(*bnd) for bnd in f_bnds]
    
    data['X'].append(sample)
    data['y'].append(f(sample))
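
As a quick sanity check on the benchmark, the following sketch re-implements the Rastrigin function in pure Python (the name `rastrigin` is ours and is not part of the notebook) and evaluates it at two points whose values are easy to verify by hand. At the origin, and at any point with integer coordinates, every cosine term equals 1, so the function reduces to the sum of squares.

```python
import math


def rastrigin(x):
    """Pure-Python Rastrigin function, equivalent to f(X) above."""
    n = len(x)
    return 10 * n + sum(xi**2 - 10 * math.cos(2 * math.pi * xi) for xi in x)


# Global minimum: all coordinates zero, so the function value is zero.
origin_value = rastrigin([0.0] * 10)

# Integer coordinates: the cosine terms cancel the 10*n offset,
# leaving the sum of squares (10 * 1**2 = 10 here).
ones_value = rastrigin([1.0] * 10)

print(origin_value, ones_value)
```

Since the 50 initial samples all have function values far above zero in practice, there is a lot of room for the optimization loop to improve on them.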

## Train the Tree Ensemble
Next we define our training function to train the tree ensemble based on the data we generated.

In [216]:
def train_tree(data):
    FIXED_PARAMS = {'objective': 'regression',
                    'metric': 'rmse',
                    'boosting': 'gbdt',
                    'num_trees': 200,
                    'max_depth': 3,
                    'min_data_in_leaf': 2,
                    'random_state': 100,
                    'verbose': -1}

    train_data = lgb.Dataset(data['X'], 
                             label=data['y'],
                             params={'verbose': -1})

    model = lgb.train(FIXED_PARAMS, 
                      train_data,
                      verbose_eval=False)
    return model

## Handling Trees with ONNX
ONNX needs to know the number of features and their type. Currently, ONNX doesn't support categorical features, so we can only train models with continuous features in `lightgbm`. To handle categorical features, we recommend applying a one-hot encoding first.

In [217]:
def get_onnx_model(lgb_model):
    # export onnx model
    float_tensor_type = FloatTensorType([None, lgb_model.num_feature()])
    initial_type = [('float_input', float_tensor_type)]
    onnx_model = convert(lgb_model, 
                         initial_types=initial_type, 
                         target_opset=8)
    return onnx_model
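
The one-hot encoding recommended above can be sketched as follows. This is a minimal pure-Python illustration, not part of `OMLT` or `lightgbm`; the helper name `one_hot_encode` and the example column `colors` are hypothetical. Each categorical value becomes a vector of floats with a single 1.0, which can then be concatenated with the continuous features before training.

```python
def one_hot_encode(values, categories=None):
    """Map each categorical value to a 0/1 float vector.

    If no category order is given, the sorted unique values are used.
    Returns the encoded rows and the category order that was applied.
    """
    if categories is None:
        categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}

    rows = []
    for value in values:
        row = [0.0] * len(categories)
        row[index[value]] = 1.0
        rows.append(row)
    return rows, categories


# Hypothetical categorical column with two levels.
colors = ["red", "blue", "red"]
encoded, categories = one_hot_encode(colors)
print(categories)
print(encoded)
```

Keep the category order fixed between training and optimization so that each encoded column always refers to the same category.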

You can write the ONNX model to a file so that it can be inspected using a tool like [Netron](https://netron.app/).

In [218]:
# build lightgbm model and export it to onnx
lgb_model = train_tree(data)
onnx_model = get_onnx_model(lgb_model)

# use a handle name other than `f` so the Rastrigin function f(X) is not overwritten
with tempfile.NamedTemporaryFile(suffix='.onnx', delete=False) as tmp_file:
    tmp_file.write(onnx_model.SerializeToString())
    print(f'Onnx model written to {tmp_file.name}')

Onnx model written to /var/folders/05/s3jw9hyd1l95qtpg8jjbkv340000gn/T/tmpjul262xp.onnx


## Build the Pyomo Model
We build the `Pyomo` model by first defining the input bounds and input domain.

In [219]:
# define problem specifications
input_bounds = f_bnds
input_domain = [pe.Reals for _ in range(len(input_bounds))]

def get_opt_model_core(input_domain, input_bounds):
    # init optimization model
    opt_model = pe.ConcreteModel()
    return opt_model

opt_model = get_opt_model_core(input_domain, input_bounds)

We can print the model to check if everything worked correctly.

In [220]:
opt_model.pprint()

0 Declarations: 


We import the general `OMLT` block and the `GradientBoostedTreeModel` class. `OMLT` uses `BigMFormulation` to encode the tree ensemble as mixed-integer constraints. This optimization model formulation was adapted from Mišić (2020).

In [221]:
from omlt.block import OmltBlock
from omlt.gbt import BigMFormulation, GradientBoostedTreeModel
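
For orientation, the core of a tree-ensemble encoding in the style of Mišić (2020) can be sketched as below. The notation is our own and the exact constraints built by `BigMFormulation` may differ in detail. Here $z_{t,l}$ indicates that leaf $l$ of tree $t$ is active, $F_{t,l}$ is the value of that leaf, and $y_{i,j} = 1$ means that feature $x_i$ lies at or below its $j$-th sorted split threshold. The additional constraints that link $y_{i,j}$ to the continuous inputs $x_i$ are omitted.

$$
\begin{aligned}
\hat{\mu} &= \sum_{t} \sum_{l \in \mathrm{leaves}(t)} F_{t,l}\, z_{t,l}, \\
\sum_{l \in \mathrm{leaves}(t)} z_{t,l} &= 1 && \forall t, \\
\sum_{l \in \mathrm{left}(t,s)} z_{t,l} &\le y_{i(s),j(s)} && \forall t,\ \forall s \in \mathrm{splits}(t), \\
\sum_{l \in \mathrm{right}(t,s)} z_{t,l} &\le 1 - y_{i(s),j(s)} && \forall t,\ \forall s \in \mathrm{splits}(t), \\
y_{i,j} &\le y_{i,j+1} && \forall i,\ \forall j.
\end{aligned}
$$

The second line selects exactly one leaf per tree, the third and fourth lines make the selected leaf consistent with the outcome of every split on its path, and the last line orders the thresholds of each feature.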

To include the tree model as a `Pyomo` block, we wrap the ONNX model in a `GradientBoostedTreeModel`, build a `BigMFormulation` from it, and attach the result to our optimization model through an `OmltBlock`.

In [222]:
def add_tree_model(opt_model, onnx_model, input_bounds):
    # init omlt block and gbt model based on the onnx format
    opt_model.gbt = OmltBlock()
    gbt_model = GradientBoostedTreeModel(onnx_model, 
                                         input_bounds=input_bounds)
    
    # omlt uses a big-m formulation to encode the tree models
    formulation = BigMFormulation(gbt_model)
    opt_model.gbt.build_formulation(formulation)
    
add_tree_model(opt_model, onnx_model, input_bounds)

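Next we add an uncertainty metric to the model. The variable `alpha` is bounded above by the squared Euclidean distance, measured in standardized input space, between the candidate point and every training point. It is also capped at half the variance of the observed outputs.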

In [223]:
import numpy as np
def add_unc_metric(opt_model, data):
    data_x = np.asarray(data['X'])
    std = np.std(data_x, axis=0)
    mean = np.mean(data_x, axis=0)
    data_x = np.divide(data_x - mean, std)
    
    alpha_bound = abs(0.5*np.var(data['y']))
    opt_model.alpha = pe.Var(within=pe.NonNegativeReals, bounds=(0,alpha_bound))
    opt_model.unc_constr = pe.ConstraintList()
    
    for x in data_x:
        x_var = opt_model.gbt.inputs
        opt_model.unc_constr.add(
            opt_model.alpha <= sum( (x[idx]-(x_var[idx]-mean[idx])/std[idx])*(x[idx]-(x_var[idx]-mean[idx])/std[idx]) 
                                   for idx in range(len(x_var)) )
        )
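
To make the constraints in `add_unc_metric` concrete, the sketch below computes the largest value `alpha` can take for a fixed candidate point, outside of any optimization model. It is a pure-Python illustration with hypothetical helper names (`standardized_sq_distance`, `alpha_at`) and a tiny made-up dataset. It uses the population standard deviation and variance, which matches the defaults of `np.std` and `np.var`.

```python
import statistics


def standardized_sq_distance(x, point, mean, std):
    """Squared Euclidean distance between x and point after standardization."""
    return sum(
        ((pi - m) / s - (xi - m) / s) ** 2
        for xi, pi, m, s in zip(x, point, mean, std)
    )


def alpha_at(x, X, y):
    """Largest alpha that satisfies all uncertainty constraints for a fixed x."""
    columns = list(zip(*X))
    mean = [statistics.fmean(col) for col in columns]
    std = [statistics.pstdev(col) for col in columns]

    # Upper bound on alpha, as in add_unc_metric.
    alpha_bound = 0.5 * statistics.pvariance(y)

    # One constraint per data point: alpha <= squared distance to that point.
    nearest = min(standardized_sq_distance(x, p, mean, std) for p in X)
    return min(alpha_bound, nearest)


# Tiny made-up dataset with two points in 2D.
X_demo = [[0.0, 0.0], [2.0, 4.0]]
y_demo = [1.0, 5.0]

print(alpha_at([0.0, 0.0], X_demo, y_demo))    # on a data point
print(alpha_at([1.0, 0.0], X_demo, y_demo))    # between the points
print(alpha_at([10.0, 10.0], X_demo, y_demo))  # far away, hits the cap
```

`alpha` is zero at every training point and grows with the distance to the nearest one, so it acts as a simple distance-based proxy for model uncertainty.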

In [229]:
import gurobipy

def get_next_x(opt_model):
    opt_model.obj = pe.Objective(expr=opt_model.gbt.outputs[0] - 1.96*opt_model.alpha)
    solver = pe.SolverFactory('gurobi')
    solver.options['NonConvex'] = 2
    solution = solver.solve(opt_model, tee=False)
    return [opt_model.gbt.inputs[idx].value 
            for idx in range(len(opt_model.gbt.inputs))]
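
Putting the pieces together, `get_next_x` solves a problem of the following form. The notation is ours: $\hat{\mu}(x)$ is the tree ensemble output `opt_model.gbt.outputs[0]`, $\tilde{x}_d$ is the standardized training point $d$ in the dataset $\mathcal{D}$, $m_i$ and $s_i$ are the mean and standard deviation of feature $i$, and $x_i^L$, $x_i^U$ are the input bounds.

$$
\begin{aligned}
\min_{x,\,\alpha} \quad & \hat{\mu}(x) - 1.96\,\alpha \\
\text{s.t.} \quad & \alpha \le \sum_{i=1}^{n} \left( \tilde{x}_{d,i} - \frac{x_i - m_i}{s_i} \right)^{2} && \forall d \in \mathcal{D}, \\
& 0 \le \alpha \le 0.5\,\mathrm{Var}(y), \\
& x_i^{L} \le x_i \le x_i^{U} && i = 1, \dots, n.
\end{aligned}
$$

The objective has the shape of a lower confidence bound: it trades off a low predicted value against a large distance from existing data. Because `alpha` is bounded above by a convex quadratic, the feasible set is non-convex, which is why the solver option `NonConvex` is set to 2.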

In [231]:
# run Bayesian optimization loop

for itr in range(50):
    # train the model
    lgb_model = train_tree(data)
    onnx_model = get_onnx_model(lgb_model)
    
    # build optimization model
    opt_model = get_opt_model_core(input_domain, input_bounds)
    add_tree_model(opt_model, onnx_model, input_bounds)
    add_unc_metric(opt_model, data)
    
    # get next point to evaluate
    x_next = get_next_x(opt_model)
    y_next = f(x_next)
    data['X'].append(x_next)
    data['y'].append(y_next)
    
    print(f"alpha: {opt_model.alpha.value}")
    print(f"y_next: {y_next}")
    
    # If I understand the model formulation correctly, the next two values should be equal
    print(f"gbt: {opt_model.gbt.outputs[0].value}")
    print(f"lgbm_next: {lgb_model.predict(np.atleast_2d(x_next))[0]}")
    print("")

y_next: 320.0996013245152
gbt: 103.59557610563934
lgbm_next: 280.21503361606045

y_next: 354.8219945519526
gbt: 101.08319323509932
lgbm_next: 281.31142802038175

y_next: 310.86202596574253
gbt: 99.13616622705013
lgbm_next: 321.8859683992703

y_next: 330.8472345143324
gbt: 107.6737381964922
lgbm_next: 325.09017200093757

y_next: 335.92289001432397
gbt: 103.14691489003599
lgbm_next: 323.3649222167264

y_next: 317.1814134770142
gbt: 102.25235510803759
lgbm_next: 325.0586607026989

y_next: 297.9911704525108
gbt: 103.50438061915338
lgbm_next: 292.26282959102326

y_next: 324.9194777672926
gbt: 102.03288578987122
lgbm_next: 322.73052495885264

y_next: 335.9237069063596
gbt: 102.25173620507121
lgbm_next: 331.59762739605856

y_next: 324.4226441972911
gbt: 102.76865433342755
lgbm_next: 336.70953254169814

y_next: 329.0913136181446
gbt: 98.72145158424973
lgbm_next: 326.80882947346447

y_next: 324.659603401865
gbt: 100.41079062782228
lgbm_next: 310.6554567693622

y_next: 335.4437472534424
gbt: 101

KeyboardInterrupt: 
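
The loop prints each new evaluation but not the best value found so far. Since the objective is minimized, a running minimum is a convenient way to follow progress. The helper below is a hypothetical addition, not part of the original notebook, and the example list contains the first four `y_next` values from the output above, rounded to one decimal.

```python
def best_so_far(y_values):
    """Return the running minimum of a sequence of objective values."""
    incumbent = float("inf")
    trace = []
    for y in y_values:
        incumbent = min(incumbent, y)
        trace.append(incumbent)
    return trace


# First four y_next values from the run above, rounded.
observed = [320.1, 354.8, 310.9, 330.8]
trace = best_so_far(observed)
print(trace)
```

Inside the loop, the same information is available as `min(data['y'])` after each new point has been appended.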