Importing Libraries

In [375]:
%load_ext autoreload
%autoreload 2
import sys
sys.path.append('../')



In [376]:
from bayesian_opt import BayesianOptimization
import numpy as np
import pandas as pd

This notebook gives examples of how to use the BayesianOptimization class.
It has built-in functionality for the following:
1. Suggesting more than one candidate ($q>1$) per iteration.
2. Four types of GP models: Single-Task GP, Mixed Single-Task GP, SAASBO (Model List GP for $d_\text{out}>1$), and HED.
3. Optional one-hot encoding when not using Mixed Single-Task GP.
4. Multi-objective and single-objective optimization.
5. Weights for prioritizing targets.
6. Ingredient minimization with weights.
7. Three acquisition functions for single-objective optimization: UCB, EI, and PI.
8. Expected Hypervolume Improvement for multi-objective optimization.
9. Two $q$-sampling strategies: Believer Update and Monte Carlo.
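
The one-hot encoding mentioned in the list can be illustrated with a small standalone sketch. This is not the class's own implementation; the helper `one_hot` and the example values are hypothetical and only show what the encoding does to a categorical column.

```python
def one_hot(values):
    """Return (categories, rows) with one 0/1 indicator column per distinct value."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # mark the column that matches this value
        rows.append(row)
    return categories, rows


# Hypothetical categorical pH levels (illustrative values only).
ph_levels = [1.0, 10.0, 1.0, 6.0]
categories, encoded = one_hot(ph_levels)
print(categories)  # [1.0, 6.0, 10.0]
print(encoded)     # [[1, 0, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0]]
```

Each row contains exactly one 1, so the model sees the levels as unordered categories rather than as points on a numeric scale.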

Example usage of Bayesian optimization for continuous data with a single target ($d_\text{out}=1$).

In [377]:
train_X = np.random.uniform(low=0, high=10, size=(10, 4))
df = pd.DataFrame(train_X, columns=['pH', 'c1', 'c2', 'c3'])
df

,pH,c1,c2,c3
0,1.080811,9.343806,9.628655,4.548331
1,9.814713,6.763961,6.853429,6.482908
2,1.156634,4.057388,3.077288,8.975484
3,5.674499,3.861676,6.54848,8.249505
4,7.640964,3.293891,5.504801,3.902971
5,1.277275,0.911508,4.477831,1.505447
6,2.727482,4.136301,9.582331,5.067284
7,6.093842,8.343886,1.261594,6.870813
8,2.320844,2.063343,4.493307,4.546313
9,9.743524,4.5632,5.250376,8.672554


In [378]:
bo_model = BayesianOptimization().fit(df, ['c3'], model_type='Single-Task GP')

a = bo_model.candidates(1, export_df=True)
a


,pH,c1,c2,c3
0,1.2,3.81,3.16,8.209315


Example usage of Bayesian optimization for mixed continuous and categorical data with a single target ($d_\text{out}=1$).

In [379]:
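# Note: this assignment creates an alias, not a copy, so the rounding below also modifies train_X in place.
train_X_cat = train_X
train_X_cat[:, 0] = np.round(train_X_cat[:, 0])  # round pH so it can be treated as categorical
df_cat = pd.DataFrame(train_X_cat, columns=['pH', 'c1', 'c2', 'c3'])
df_cat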


,pH,c1,c2,c3
0,1.0,9.343806,9.628655,4.548331
1,10.0,6.763961,6.853429,6.482908
2,1.0,4.057388,3.077288,8.975484
3,6.0,3.861676,6.54848,8.249505
4,8.0,3.293891,5.504801,3.902971
5,1.0,0.911508,4.477831,1.505447
6,3.0,4.136301,9.582331,5.067284
7,6.0,8.343886,1.261594,6.870813
8,2.0,2.063343,4.493307,4.546313
9,10.0,4.5632,5.250376,8.672554


In [380]:
bo_model_mst = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True)
cand_1

,pH,c1,c2,c3
0,10.0,6.08,6.96,5.968801


Choosing the sequential greedy optimizer (BoTorch `optimize_acqf_mixed()`).

In [381]:
bo_model_mst = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True, optim_method="Sequential Greedy")
cand_1

,pH,c1,c2,c3
0,1.0,4.26,3.25,8.20047
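
As a rough illustration of the sequential greedy idea on a mixed space: fix each categorical value in turn, optimize the continuous input for that value, keep the best result, and repeat once per requested candidate. The sketch below is a toy under stated assumptions, not BoTorch's code; `toy_acq`, the pH levels, and the `c1` grid are all hypothetical, and a grid search stands in for a real continuous optimizer.

```python
def toy_acq(ph, c1, chosen):
    """Hypothetical acquisition: peaks at pH=6, c1=4, and penalizes points
    close to candidates that were already chosen."""
    value = -abs(ph - 6.0) - (c1 - 4.0) ** 2
    for p, c in chosen:
        if p == ph and abs(c - c1) < 1.0:
            value -= 10.0
    return value


def best_for_level(ph, c1_grid, chosen):
    """Optimize the continuous input while the categorical input is fixed."""
    c1 = max(c1_grid, key=lambda c: toy_acq(ph, c, chosen))
    return toy_acq(ph, c1, chosen), (ph, c1)


def sequential_greedy(q, ph_levels, c1_grid):
    """Pick q candidates one at a time, each conditioned on the earlier picks."""
    chosen = []
    for _ in range(q):
        results = [best_for_level(ph, c1_grid, chosen) for ph in ph_levels]
        _, best = max(results, key=lambda r: r[0])  # best over all fixed levels
        chosen.append(best)
    return chosen


ph_levels = [1.0, 3.0, 6.0, 8.0, 10.0]
c1_grid = [i * 0.5 for i in range(21)]  # 0.0, 0.5, ..., 10.0
batch = sequential_greedy(2, ph_levels, c1_grid)
print(batch)  # [(6.0, 4.0), (6.0, 3.0)]
```

The second candidate moves away from the first because the toy acquisition is re-evaluated with the first pick already in `chosen`.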


Choosing a different GP model type.

In [382]:
bo_model_st = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Single-Task GP')

cand_1 = bo_model_st.candidates(1, export_df=True)
cand_1

,pH,c1,c2,c3
0,1.0,4.15,3.43,8.207541


In [383]:
bodi_model = BayesianOptimization().fit(X=df_cat, y=['c3'], cat_dims=['pH'], model_type='HED')

cand_2 = bodi_model.candidates(1, export_df=True)
cand_2

,pH,c1,c2,c3
0,1.0,5.34,3.51,9.448324


In [384]:
bo_model = BayesianOptimization().fit(df, ['c3'], cat_dims=['pH'], model_type='SAASBO')

cand_3 = bo_model.candidates(2, export_df=True)
cand_3

,pH,c1,c2,c3
0,1.0,5.25,3.99,9.287114
1,6.0,4.72,1.56,8.717928


Choosing a different $q$-sampling strategy.

In [385]:
bo_model = BayesianOptimization().fit(df, ['c3'])

a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True)
a

,pH,c1,c2,c3
0,1.14,4.27,3.15,8.207762
1,7.99,6.06,6.34,6.093579
2,6.19,3.65,6.81,7.276712
3,1.2,3.88,3.13,8.25223
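
A believer-style batch strategy can be sketched as follows, assuming "Believer" refers to the usual heuristic of treating the model's own prediction at a chosen point as if it had been observed before picking the next point. How the class implements it is not shown here; the nearest-neighbour surrogate, `predict`, and `believer_batch` are hypothetical stand-ins for a GP and its acquisition function.

```python
def predict(x, xs, ys):
    """Toy surrogate: mean is the nearest observation's value,
    uncertainty is the distance to that observation."""
    dist, y = min((abs(x - xi), yi) for xi, yi in zip(xs, ys))
    return y, dist


def believer_batch(q, xs, ys, grid, beta=1.0):
    """Select q points; after each pick, add the predicted mean as a fake observation."""
    xs, ys = list(xs), list(ys)  # work on copies of the training data
    batch = []
    for _ in range(q):
        def acq(x):
            mean, unc = predict(x, xs, ys)
            return mean + beta * unc  # optimistic score: mean plus uncertainty

        x_new = max(grid, key=acq)
        believed, _ = predict(x_new, xs, ys)
        xs.append(x_new)       # "believe" the prediction ...
        ys.append(believed)    # ... and treat it as an observed value
        batch.append(x_new)
    return batch


grid = [float(i) for i in range(11)]
batch = believer_batch(3, xs=[0.0, 10.0], ys=[1.0, 2.0], grid=grid)
print(batch)  # [5.0, 8.0, 2.0]
```

Because each believed value removes the uncertainty around the point just chosen, later picks spread out instead of repeating the same location.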


Choosing a different acquisition function.

In [386]:
a = bo_model.candidates(4, export_df=True, acq_func_name="UCB")
a

,pH,c1,c2,c3
0,2.31,5.62,5.8,6.184589
1,5.61,9.17,8.99,6.184589
2,2.32,3.91,5.22,6.184589
3,1.0,4.01,3.14,8.894149
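
For reference, the three single-objective acquisition functions have simple closed forms for a single point with posterior mean `mu` and standard deviation `sigma`. The sketch below assumes maximization and `sigma > 0`; the function names are illustrative, and the UCB form `mu + sqrt(beta) * sigma` is one common convention that may differ from what the class uses internally.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ucb(mu, sigma, beta=2.0):
    """Upper Confidence Bound: mean plus a scaled uncertainty bonus."""
    return mu + math.sqrt(beta) * sigma


def expected_improvement(mu, sigma, best):
    """Expected amount by which the point exceeds the best observed value."""
    z = (mu - best) / sigma
    return (mu - best) * _cdf(z) + sigma * _pdf(z)


def probability_of_improvement(mu, sigma, best):
    """Probability that the point exceeds the best observed value."""
    return _cdf((mu - best) / sigma)


print(ucb(1.0, 2.0, beta=4.0))                    # 5.0
print(probability_of_improvement(0.0, 1.0, 0.0))  # 0.5
```

UCB rewards uncertainty directly, while EI and PI compare the prediction against the best value seen so far.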


Minimizing ingredients with `input_weights`.

In [387]:
a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True, input_weights={0:1.1, 1:2.5, 2:2.0})
a

,pH,c1,c2,c3
0,1.0,0.91,1.26,6.184589
1,1.57,0.91,1.26,5.984739
2,2.19,0.91,1.26,5.915817
3,1.0,0.91,1.78,6.007024
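
How the class applies `input_weights` internally is not shown in this notebook. One plausible reading, offered purely as an assumption, is a weighted penalty on ingredient amounts that is subtracted from the acquisition value, so that candidates using less of the heavily weighted inputs score higher. The helper `penalized_score` below is hypothetical.

```python
def penalized_score(acq_value, x, input_weights):
    """Acquisition value minus a weighted sum of the selected inputs (assumed form)."""
    penalty = sum(weight * x[i] for i, weight in input_weights.items())
    return acq_value - penalty


# Keys are column positions (0: pH, 1: c1, 2: c2), as in the call above.
weights = {0: 1.1, 1: 2.5, 2: 2.0}
lean = [1.0, 0.91, 1.26]   # small ingredient amounts
rich = [1.0, 5.00, 6.00]   # larger ingredient amounts

# With equal acquisition values, the leaner recipe gets the better score.
print(penalized_score(10.0, lean, weights) > penalized_score(10.0, rich, weights))  # True
```

Under this reading, a larger weight pushes the corresponding input toward its lower bound more strongly.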


Multi-Objective optimization.

In [388]:
bo_model = BayesianOptimization().fit(df, ['c2', 'c3'])

a = bo_model.candidates(4, export_df=True)
a


Warning: qExpectedHypervolumeImprovement has known numerical issues; use qLogExpectedHypervolumeImprovement instead, which fixes the issues and has the same API. See https://arxiv.org/abs/2310.20708 for details.


,pH,c1,c2,c3
0,3.0,4.14,9.555815,5.072804
1,10.0,4.56,5.253195,8.653709
2,1.45,5.64,5.667809,5.882161
3,3.79,3.17,5.668169,5.882086
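
The quantity behind hypervolume-based acquisition is the area (for two objectives) dominated by the Pareto front relative to a reference point. A minimal sketch for two maximized objectives follows; the helper names and example points are illustrative, every point is assumed to lie above the reference point, and the real acquisition function additionally takes an expectation over the model posterior.

```python
def pareto_front(points):
    """Keep the points that no other point dominates (both objectives maximized)."""
    front = []
    for p in points:
        dominated = any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return front


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by the reference point `ref`."""
    front = sorted(pareto_front(points), key=lambda p: p[0], reverse=True)
    area, prev_y = 0.0, ref[1]
    for x, y in front:
        area += (x - ref[0]) * (y - prev_y)  # add the strip above the previous point
        prev_y = y
    return area


observed = [(1.0, 2.0), (2.0, 1.0), (1.0, 1.0)]
ref = (0.0, 0.0)
base = hypervolume_2d(observed, ref)
improved = hypervolume_2d(observed + [(2.0, 2.0)], ref)
print(base, improved - base)  # 3.0 1.0
```

The difference `improved - base` is the hypervolume improvement that a new point at (2.0, 2.0) would contribute.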
