Importing Libraries

In [1]:
%load_ext autoreload
%autoreload 2
import sys
sys.path.append('../')

In [2]:
from bayesian_opt import BayesianOptimization
import numpy as np
import pandas as pd

This notebook gives examples of how to use the BayesianOptimization class.
It has built-in functionality for the following:
1. Suggesting more than 1 candidate ($q>1$) per iteration.
2. 3 different types of GP models: Single Task GP, Mixed Single Task GP, and SAASBO (Model List GP for $d_\text{out}>1$).
3. Optional one-hot encoding when not using Mixed Single Task GP.
4. Multi-Objective and Single-Objective optimization.
5. Weights for prioritizing targets.
6. Ingredient minimization with weights.
7. 3 different types of acquisition functions for Single-Objective optimization (UCB, EI, PI).
8. Expected Hypervolume Improvement for Multi-Objective optimization.
9. 2 different $q$-sampling strategies (Believer Update and Monte-Carlo). 

Example usage of Bayesian optimization for continuous data, where $d_\text{out}=1$.

In [3]:
# 10 random samples of 4 variables, each drawn uniformly from [0, 10).
train_X = np.random.uniform(low=0, high=10, size=(10, 4))
df = pd.DataFrame(train_X, columns=['pH', 'c1', 'c2', 'c3'])
df

Unnamed: 0,pH,c1,c2,c3
0,4.740967,9.2757,4.740485,0.042907
1,5.666479,4.095971,6.419318,1.198192
2,2.455349,3.184182,0.224891,8.432362
3,7.519052,6.645685,2.814339,6.904177
4,6.857393,0.792851,6.146077,9.039888
5,5.271332,8.981241,8.984419,2.559757
6,5.780322,5.490746,2.429319,6.286241
7,5.766865,7.761652,8.308286,8.490824
8,2.463431,2.893586,1.377069,3.850435
9,1.842634,1.770776,1.206562,2.361756


In [4]:
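# Fit on df with 'c3' as the target column.
bo_model = BayesianOptimization().fit(df, ['c3'], model_type='Single-Task GP')

# Request one candidate as a NumPy array; the second return value is discarded.
a, _ = bo_model.candidates(1, export_df=False)
a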




array([[6.91, 1.05, 6.08]])

Example usage of Bayesian optimization for mixed continuous and categorical data, where $d_\text{out}=1$.

In [5]:
# Copy first so that rounding pH does not modify train_X (and df) in place.
train_X_cat = train_X.copy()
train_X_cat[:, 0] = np.round(train_X_cat[:, 0])
df_cat = pd.DataFrame(train_X_cat, columns=['pH', 'c1', 'c2', 'c3'])
df_cat


Unnamed: 0,pH,c1,c2,c3
0,5.0,9.2757,4.740485,0.042907
1,6.0,4.095971,6.419318,1.198192
2,2.0,3.184182,0.224891,8.432362
3,8.0,6.645685,2.814339,6.904177
4,7.0,0.792851,6.146077,9.039888
5,5.0,8.981241,8.984419,2.559757
6,6.0,5.490746,2.429319,6.286241
7,6.0,7.761652,8.308286,8.490824
8,2.0,2.893586,1.377069,3.850435
9,2.0,1.770776,1.206562,2.361756


In [7]:
bo_model_mst = BayesianOptimization().fit(df_cat,['c3'],cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True)
cand_1



Unnamed: 0,pH,c1,c2,c3
0,2.0,0.79,6.15,6.944141
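
How the Mixed Single-Task GP treats the categorical pH column is not shown in this notebook. As a rough illustration only, the sketch below combines a continuous kernel with a Hamming-distance categorical kernel in a sum-plus-product form. It assumes the class follows the general idea of mixed continuous/categorical GP kernels; the function names, the RBF stand-in for the continuous part, and the fixed lengthscales are all hypothetical.

```python
import numpy as np

def categorical_kernel(c1, c2, lengthscale=1.0):
    """exp(-mean Hamming distance / lengthscale) between two category vectors."""
    hamming = np.mean(np.asarray(c1) != np.asarray(c2))
    return float(np.exp(-hamming / lengthscale))

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel between two continuous vectors."""
    sq_dist = np.sum((np.asarray(x1) - np.asarray(x2)) ** 2)
    return float(np.exp(-0.5 * sq_dist / lengthscale ** 2))

def mixed_kernel(x1, c1, x2, c2):
    """Sum-plus-product combination of the continuous and categorical parts."""
    k_x = rbf_kernel(x1, x2)
    k_c = categorical_kernel(c1, c2)
    return k_x + k_c + k_x * k_c

# Two points with identical continuous inputs (c1, c2) but different pH levels.
same_ph = mixed_kernel([0.79, 6.15], [2.0], [0.79, 6.15], [2.0])
diff_ph = mixed_kernel([0.79, 6.15], [2.0], [0.79, 6.15], [6.0])
print(same_ph, diff_ph)
```

Points that share a pH level are more strongly correlated than points that differ only in pH, which is the behavior a categorical kernel is meant to capture without imposing an ordering on the levels.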


In [8]:
bo_model_st = BayesianOptimization().fit(df_cat,['c3'],cat_dims=['pH'], model_type='Single-Task GP')

cand_2 = bo_model_st.candidates(1, export_df=True)
cand_2



Unnamed: 0,pH,c1,c2,c3
0,6.0,7.3,8.09,7.19532
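
Item 3 of the feature list mentions optional one-hot encoding when the model is not a Mixed Single Task GP. A minimal sketch of such an encoding for the rounded pH column, independent of the BayesianOptimization class (its internal encoding is not shown here, and the helper name `one_hot` is hypothetical):

```python
import numpy as np

def one_hot(column):
    """Return (categories, encoded) for a 1-D array of categorical values."""
    column = np.asarray(column).ravel()
    categories, inverse = np.unique(column, return_inverse=True)
    encoded = np.zeros((len(column), len(categories)))
    # Set a single 1 per row, in the column of that row's category.
    encoded[np.arange(len(column)), inverse] = 1.0
    return categories, encoded

# Rounded pH values from df_cat above, treated as categories.
ph = np.array([5.0, 6.0, 2.0, 8.0, 7.0, 5.0, 6.0, 6.0, 2.0, 2.0])
categories, ph_encoded = one_hot(ph)
print(categories)        # unique pH levels, sorted
print(ph_encoded.shape)  # one row per sample, one column per level
```

Each pH level becomes its own 0/1 input dimension, so a purely continuous GP can use it without assuming that pH 6 lies "between" pH 5 and pH 7.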


Choosing a different GP model type.

In [None]:
# cat_dims=['pH'] expects the rounded (categorical) pH values, so fit on df_cat.
bo_model = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='SAASBO')

a = bo_model.candidates(2, export_df=True)
a



Unnamed: 0,pH,c1,c2,c3
0,7.0,1.36513,5.283355,8.523887
1,2.0,3.196111,2.427817,5.991574


Different $q$-sampling strategy.

In [9]:
bo_model = BayesianOptimization().fit(df,['c3'])

a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True)
a



Unnamed: 0,pH,c1,c2,c3
0,5.86,7.49,8.18,7.211009
1,6.27,7.75,8.09,7.19692
2,6.79,0.79,6.33,8.024812
3,7.71,6.16,2.72,5.883033
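
The general idea behind a believer update can be sketched in a few lines: pick one candidate, pretend the model's mean prediction at that point is a real measurement, refit, and repeat until $q$ candidates are collected. The toy below uses a hand-written 1-D GP on a grid. It is an assumption-laden illustration, not the code behind `q_sampling_method="Believer"`; all function names, the kernel, and the `kappa` parameter are hypothetical.

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf(x_train, x_test)
    mean = K_s.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def believer_batch(x_train, y_train, grid, q, kappa=2.0):
    """Pick q grid points one at a time, 'believing' the GP mean at each pick."""
    x, y = x_train.copy(), y_train.copy()
    available = np.ones(len(grid), dtype=bool)
    batch = []
    for _ in range(q):
        mean, std = gp_posterior(x, y, grid)
        # Upper-confidence-bound score; already-picked points are excluded.
        acq = np.where(available, mean + kappa * std, -np.inf)
        best = int(np.argmax(acq))
        available[best] = False
        batch.append(grid[best])
        # Treat the posterior mean as if it were a real observation.
        x = np.append(x, grid[best])
        y = np.append(y, mean[best])
    return np.array(batch)

x_train = np.array([0.1, 0.4, 0.9])
y_train = np.sin(6.0 * x_train)
grid = np.linspace(0.0, 1.0, 101)
batch = believer_batch(x_train, y_train, grid, q=4)
print(batch)
```

Because each believed observation shrinks the predictive uncertainty around the chosen point, later picks tend to move elsewhere, which gives the batch some diversity without any real experiments in between.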


Choosing a different acquisition function.

In [12]:
a = bo_model.candidates(4, export_df=True, acq_func_name="UCB")
a

Unnamed: 0,pH,c1,c2,c3
0,4.96,4.38,3.35,4.983592
1,5.29,2.98,3.36,4.983587
2,6.26,3.97,3.95,4.983587
3,7.02,0.85,6.19,8.924135
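
For reference, the three single-objective acquisition functions named in the feature list (UCB, EI, PI) have simple closed forms for a single point with Gaussian predictive mean and standard deviation. The sketch below writes them out for maximization. The function names and the `beta` convention (exploration term $\sqrt{\beta}\,\sigma$) are choices made for this illustration and may differ from what `acq_func_name` selects internally.

```python
import math

def _normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ucb(mean, std, beta=2.0):
    """Upper confidence bound: mean plus a scaled uncertainty bonus."""
    return mean + math.sqrt(beta) * std

def expected_improvement(mean, std, best):
    """Expected amount by which the prediction exceeds the best observed value."""
    if std == 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * _normal_cdf(z) + std * _normal_pdf(z)

def probability_of_improvement(mean, std, best):
    """Probability that the prediction exceeds the best observed value."""
    if std == 0.0:
        return 1.0 if mean > best else 0.0
    return _normal_cdf((mean - best) / std)

# A point predicted at 8.5 +/- 0.5 when the best observed target is 9.0.
print(ucb(8.5, 0.5))
print(expected_improvement(8.5, 0.5, best=9.0))
print(probability_of_improvement(8.5, 0.5, best=9.0))
```

UCB rewards uncertainty directly, EI weighs how large an improvement might be, and PI only asks whether any improvement is likely, so PI tends to be the most exploitative of the three.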


Minimizing Ingredients

In [11]:
a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True, input_weights={0:1.1, 1:2.5, 2:2.0})
a



Unnamed: 0,pH,c1,c2,c3
0,2.0,0.79,0.22,5.506214
1,2.55,0.79,0.22,5.130247
2,3.17,0.79,0.22,4.983635
3,2.0,0.79,0.72,5.15406
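
The notebook does not show how `input_weights` enters the optimization. One plausible mechanism, sketched here purely as an assumption, is to subtract a weighted sum of the (scaled) ingredient amounts from the acquisition value, so that candidates using less of a heavily weighted ingredient score higher. The function `penalized_score`, the `strength` parameter, and the reading of the dictionary keys as input column indices are all hypothetical.

```python
import numpy as np

def penalized_score(acq_values, candidates, input_weights, bounds, strength=0.1):
    """Acquisition value minus a weighted penalty on scaled input amounts."""
    lower, upper = bounds
    scaled = (candidates - lower) / (upper - lower)  # map inputs to [0, 1]
    weights = np.zeros(candidates.shape[1])
    for column, weight in input_weights.items():
        weights[column] = weight
    return acq_values - strength * (scaled @ weights)

# Two candidates (pH, c1, c2) with the same raw acquisition value.
candidates = np.array([[2.0, 0.79, 0.22],
                       [7.02, 0.85, 6.19]])
acq_values = np.array([1.0, 1.0])
input_weights = {0: 1.1, 1: 2.5, 2: 2.0}
bounds = (np.zeros(3), np.full(3, 10.0))

scores = penalized_score(acq_values, candidates, input_weights, bounds)
print(scores)
```

Under this penalty the first candidate wins because it uses smaller amounts of every input, which matches the qualitative pattern in the table above (low c1 and c2), although the actual implementation may differ.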


Multi-Objective optimization.

In [13]:
bo_model = BayesianOptimization().fit(df,['c2', 'c3'])

a = bo_model.candidates(4, export_df=True)
a

Warning: replace qExpectedHypervolumeImprovement --> qLogExpectedHypervolumeImprovement, which fixes the issues and has the same API. See https://arxiv.org/abs/2310.20708 for details.


Unnamed: 0,pH,c1,c2,c3
0,4.44,6.41,4.265077,4.916654
1,5.95,7.72,8.181105,8.378397
2,6.05,7.8,8.187282,8.383857
3,4.83,0.94,4.265077,4.916654
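
Item 8 of the feature list names Expected Hypervolume Improvement as the multi-objective acquisition function. The quantity underneath it is the hypervolume: the region dominated by the Pareto front and bounded by a reference point. A small sketch for two maximized objectives follows; the helper names and the reference point are chosen for this illustration, and EHVI itself is the expectation of this improvement under the model's predictive distribution, which is not computed here.

```python
def pareto_front(points):
    """Non-dominated points when both objectives are maximized."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front

def hypervolume_2d(points, ref):
    """Area dominated by the Pareto front and bounded below by ref."""
    front = sorted(pareto_front(points), key=lambda p: p[0], reverse=True)
    volume, prev_y = 0.0, ref[1]
    for x, y in front:
        if x > ref[0] and y > prev_y:
            volume += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return volume

def hypervolume_improvement(points, new_point, ref):
    """Gain in hypervolume from adding one new observation."""
    return hypervolume_2d(points + [new_point], ref) - hypervolume_2d(points, ref)

observed = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
ref = (0.0, 0.0)
print(hypervolume_2d(observed, ref))
print(hypervolume_improvement(observed, (2.5, 2.5), ref))
print(hypervolume_improvement(observed, (1.0, 1.0), ref))
```

A new point only contributes if it extends the front; a dominated point such as (1.0, 1.0) yields zero improvement, which is why the acquisition function favors candidates that trade off the two targets in new ways.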
