Importing Libraries

In [1]:
%load_ext autoreload
%autoreload 2
import sys
sys.path.append('../')

In [2]:
from bayesian_opt import BayesianOptimization
import numpy as np
import pandas as pd



This notebook gives examples of how to use the BayesianOptimization class.
It has built-in functionality for the following:
1. Suggesting more than one candidate ($q>1$) per iteration.
2. Four types of GP model: Single-Task GP, Mixed Single-Task GP, SAASBO (a Model List GP when $d_\text{out}>1$), and HED.
3. Optional one-hot encoding when not using the Mixed Single-Task GP.
4. Multi-objective and single-objective optimization.
5. Weights for prioritizing targets.
6. Ingredient minimization with weights.
7. Three acquisition functions for single-objective optimization: UCB, EI, and PI.
8. Expected Hypervolume Improvement for multi-objective optimization.
9. Two $q$-sampling strategies: Believer Update and Monte Carlo.
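
Item 3 mentions one-hot encoding of categorical inputs. The notebook does not show how the class performs this internally, so the following is only a minimal NumPy sketch of the general idea; `one_hot` is a hypothetical helper name, not part of `bayesian_opt`.

```python
import numpy as np

def one_hot(column):
    """Return (levels, encoded): the sorted unique levels and a 0/1 matrix
    with one row per sample and one column per level."""
    levels, inverse = np.unique(column, return_inverse=True)
    inverse = np.ravel(inverse)
    encoded = np.zeros((len(column), len(levels)))
    encoded[np.arange(len(column)), inverse] = 1.0
    return levels, encoded

# Example: a rounded pH column with three distinct levels.
ph = np.array([2.0, 1.0, 8.0, 2.0])
levels, encoded = one_hot(ph)
```

Each row of `encoded` contains exactly one 1, in the column of that sample's level.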

Example: Bayesian optimization for continuous data with a single target ($d_\text{out}=1$).

In [3]:
# 10 random samples of 4 variables, each drawn uniformly from [0, 10)
train_X = np.random.uniform(low=0, high=10, size=(10, 4))
df = pd.DataFrame(train_X, columns=['pH', 'c1', 'c2', 'c3'])
df

,pH,c1,c2,c3
0,1.772034,0.655534,4.063714,7.812729
1,1.242205,6.020985,3.006953,6.845793
2,8.299478,5.869632,9.018573,0.597756
3,1.917869,2.852063,3.692795,0.301139
4,1.194939,3.924233,7.187629,2.518326
5,2.96587,9.34955,7.55979,5.314264
6,7.417729,2.067364,9.545755,2.36858
7,8.007675,5.214105,9.272726,3.842276
8,8.417203,4.696521,2.400571,4.094982
9,7.524067,5.001722,9.290553,9.955205


In [4]:
bo_model = BayesianOptimization().fit(df, ['c3'], model_type='Single-Task GP')

a = bo_model.candidates(1, export_df=True)
a


,pH,c1,c2,c3
0,1.68,6.09,3.25,5.257742
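
For readers unfamiliar with GP surrogates, the sketch below shows the kind of posterior mean and standard deviation a single-output GP provides. It is a simplified NumPy illustration under stated assumptions (zero prior mean, unit-variance RBF kernel, fixed lengthscale, toy target), not the model that `BayesianOptimization` fits; `rbf_kernel` and `gp_posterior` are hypothetical names.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Unit-variance RBF kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_posterior(X_train, y_train, X_test, lengthscale=1.0, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at X_test."""
    K = rbf_kernel(X_train, X_train, lengthscale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance is 1 for this kernel
    return mean, np.sqrt(np.maximum(var, 0.0))

# Toy data: 3 inputs (like pH, c1, c2) and one made-up target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(10, 3))
y = np.sin(X[:, 0]) + 0.1 * X[:, 1]
mean, std = gp_posterior(X, y, X, lengthscale=2.0)
```

At the training inputs the posterior mean reproduces the observations closely and the standard deviation is near zero; far from the data the prediction reverts to the prior.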


Example: Bayesian optimization for mixed continuous and categorical data with a single target ($d_\text{out}=1$).

In [5]:
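# Note: this binds a second name to the same array (no copy),
# so the rounding below also modifies train_X in place.
train_X_cat = train_X
train_X_cat[:, 0] = np.round(train_X_cat[:, 0])  # round pH to integer levels
df_cat = pd.DataFrame(train_X_cat, columns=['pH', 'c1', 'c2', 'c3'])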
df_cat


,pH,c1,c2,c3
0,2.0,0.655534,4.063714,7.812729
1,1.0,6.020985,3.006953,6.845793
2,8.0,5.869632,9.018573,0.597756
3,2.0,2.852063,3.692795,0.301139
4,1.0,3.924233,7.187629,2.518326
5,3.0,9.34955,7.55979,5.314264
6,7.0,2.067364,9.545755,2.36858
7,8.0,5.214105,9.272726,3.842276
8,8.0,4.696521,2.400571,4.094982
9,8.0,5.001722,9.290553,9.955205


In [6]:
bo_model_mst = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True)
cand_1

,pH,c1,c2,c3
0,8.0,4.79,9.31,12.281423


Choosing the Sequential Fixed Subspace optimizer (BoTorch `optimize_acqf_mixed()`).

In [7]:
bo_model_mst = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True, optim_method="Sequential Fixed Subspace")
cand_1

,pH,c1,c2,c3
0,8.0,4.79,9.31,12.281423
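
Conceptually, a fixed-subspace optimizer fixes the categorical inputs to each allowed combination in turn, optimizes the acquisition function over the remaining continuous inputs, and keeps the best result. The sketch below illustrates that loop with plain random search; `toy_acquisition` and `optimize_fixed_subspaces` are hypothetical stand-ins and do not reproduce BoTorch's gradient-based optimizer.

```python
import numpy as np

def toy_acquisition(x):
    """Hypothetical acquisition value for x = [category, c1, c2]."""
    cat, c1, c2 = x
    return -(c1 - cat) ** 2 - (c2 - 5.0) ** 2 + cat

def optimize_fixed_subspaces(acq, cat_levels, bounds, n_samples=2000, seed=0):
    """For each categorical level, search the continuous inputs; return the overall best."""
    rng = np.random.default_rng(seed)
    best_x, best_val = None, -np.inf
    for level in cat_levels:
        # Fix the categorical input and sample only the continuous ones.
        cont = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n_samples, len(bounds)))
        for row in cont:
            x = np.concatenate(([level], row))
            val = acq(x)
            if val > best_val:
                best_x, best_val = x, val
    return best_x, best_val

bounds = np.array([[0.0, 10.0], [0.0, 10.0]])  # bounds for c1 and c2
best_x, best_val = optimize_fixed_subspaces(toy_acquisition, [1.0, 2.0, 8.0], bounds)
```

The cost grows with the number of categorical combinations, which is why this approach suits a small number of discrete levels.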


Choosing a different GP model type.

In [8]:
bo_model_st = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Single-Task GP')

cand_1 = bo_model_st.candidates(1, export_df=True, optim_method="Sequential Fixed Subspace")
cand_1

,pH,c1,c2,c3
0,8.0,4.6,9.32,15.226151


In [9]:
bodi_model = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='HED')

cand_2 = bodi_model.candidates(1, export_df=True, optim_method="Sequential Fixed Subspace")
cand_2

,pH,c1,c2,c3
0,8.0,2.88,9.39,19.199846


In [10]:
bo_model = BayesianOptimization().fit(df, ['c3'], cat_dims=['pH'], model_type='SAASBO')

cand_3 = bo_model.candidates(2, export_df=True)
cand_3

,pH,c1,c2,c3
0,8.0,4.91,9.23,9.461869
1,8.0,4.76,9.36,8.085065


Choosing a different $q$-sampling strategy.

In [11]:
bo_model = BayesianOptimization().fit(df, ['c3'])

a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True)
a

,pH,c1,c2,c3
0,1.61,0.8,4.09,6.082505
1,2.44,8.93,7.76,4.6172
2,8.0,4.79,9.31,12.320413
3,3.07,7.06,3.39,5.642256
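
The notebook does not show how the class implements the Believer update, so the following is only a sketch of the general "Kriging believer" idea: pick one candidate, pretend its outcome equals the model's posterior mean, add that fantasy observation to the data, and repeat until $q$ candidates are chosen. The tiny NumPy GP, the UCB rule with `beta=2.0`, and the 1-D candidate pool are all assumptions made for illustration.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """Unit-variance RBF kernel between the rows of A and the rows of B."""
    d2 = (A[:, None, :] - B[None, :, :]) ** 2
    return np.exp(-0.5 * d2.sum(-1) / ls**2)

def posterior(X, y, Xs, ls=1.0, noise=1e-6):
    """Zero-mean GP posterior mean and standard deviation at Xs."""
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    Ks = rbf(X, Xs, ls)
    mean = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def believer_batch(X, y, pool, q, beta=2.0):
    """Select q points from pool, one at a time, believing the posterior mean."""
    X, y = X.copy(), y.copy()
    picks = []
    for _ in range(q):
        mean, std = posterior(X, y, pool)
        i = int(np.argmax(mean + beta * std))  # UCB-style score (assumed)
        picks.append(pool[i])
        X = np.vstack([X, pool[i]])
        y = np.append(y, mean[i])  # fantasy observation = predicted mean
    return np.array(picks)

X = np.array([[1.0], [5.0], [9.0]])
y = np.array([0.5, 1.0, 0.2])
pool = np.linspace(0.0, 10.0, 101).reshape(-1, 1)
batch = believer_batch(X, y, pool, q=4)
```

Because each fantasy observation collapses the predictive uncertainty around the chosen point, later picks are pushed toward other regions, which spreads the batch out.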


Choosing a different acquisition function.

In [12]:
a = bo_model.candidates(4, export_df=True, acq_func_name="UCB")
a

,pH,c1,c2,c3
0,6.55,8.71,3.97,5.642256
1,6.43,1.06,5.99,5.642256
2,1.3,5.78,2.92,6.286314
3,4.49,1.12,3.34,5.642256
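
For reference, the three single-objective acquisition functions named in this notebook (UCB, EI, PI) have simple closed forms for a Gaussian posterior when maximizing. The sketch below uses only the standard library; the function names and the `beta` convention (score = mean + beta * std) are assumptions for illustration, and libraries differ in how they parameterize `beta`.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def upper_confidence_bound(mean, std, beta=2.0):
    """UCB: optimistic estimate of the objective."""
    return mean + beta * std

def probability_of_improvement(mean, std, best):
    """PI: probability that the outcome exceeds the best observed value."""
    return normal_cdf((mean - best) / std)

def expected_improvement(mean, std, best):
    """EI: expected amount by which the outcome exceeds the best observed value."""
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)

# A candidate predicted to match the best value so far, with unit uncertainty.
ucb_value = upper_confidence_bound(1.0, 0.5)
pi_value = probability_of_improvement(0.0, 1.0, best=0.0)
ei_value = expected_improvement(0.0, 1.0, best=0.0)
```

All three reward a high predicted mean and a high predicted uncertainty; they differ in how they trade the two off.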


Minimizing ingredients with input weights.

In [13]:
a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True, input_weights={0:1.1, 1:2.5, 2:2.0})
a

,pH,c1,c2,c3
0,1.0,0.66,2.4,5.642254
1,1.55,0.66,2.4,4.827836
2,1.0,0.66,2.9,4.926754
3,1.0,1.15,2.4,4.962928
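
How `input_weights` enters the optimization is not shown in this notebook. One plausible reading, sketched below purely as an assumption, is that each key is an input column index and each value a cost per unit of that input, with the weighted total subtracted from the acquisition value. `penalized_score` and its `scale` parameter are hypothetical.

```python
import numpy as np

def penalized_score(acq_values, candidates, input_weights, scale=0.01):
    """Subtract a weighted sum of input amounts from each acquisition value."""
    penalty = np.zeros(len(candidates))
    for col, weight in input_weights.items():
        penalty += weight * candidates[:, col]
    return acq_values - scale * penalty

# Three made-up candidates (columns: pH, c1, c2) with made-up acquisition values.
candidates = np.array([[1.0, 0.5, 2.0],
                       [1.0, 5.0, 2.0],
                       [8.0, 5.0, 9.0]])
acq_values = np.array([1.0, 1.0, 1.2])
scores = penalized_score(acq_values, candidates, {0: 1.1, 1: 2.5, 2: 2.0}, scale=0.05)
best = int(np.argmax(scores))
```

Under this assumption, candidates that use less of the heavily weighted inputs win ties, which is consistent with the small `c1` and `c2` values in the output above.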


Multi-Objective optimization.

In [None]:
bo_model = BayesianOptimization().fit(df, ['c2', 'c3'])

a = bo_model.candidates(4, export_df=True)
a


Warning (BoTorch output, truncated in this export): replace qExpectedHypervolumeImprovement with qLogExpectedHypervolumeImprovement, which fixes the issues and has the same API. See https://arxiv.org/abs/2310.20708 for details.


,pH,c1,c2,c3
0,4.18,2.54,6.503906,4.365105
1,4.17,5.93,6.503906,4.365105
2,1.55,0.66,5.779142,5.389088
3,3.2,9.34,7.329766,5.10749
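
Hypervolume-based acquisition functions score a candidate by how much it would enlarge the region dominated by the current Pareto front. As a minimal sketch for two maximized objectives, the code below computes the exact hypervolume and the improvement from one new point; `pareto_mask` and `hypervolume_2d` are hypothetical helper names, the data are made up, and the code assumes every front point is at least as good as the reference point in both objectives.

```python
import numpy as np

def pareto_mask(Y):
    """True for rows of Y that no other row dominates (maximization)."""
    mask = np.ones(len(Y), dtype=bool)
    for i in range(len(Y)):
        dominated_by = np.all(Y >= Y[i], axis=1) & np.any(Y > Y[i], axis=1)
        mask[i] = not dominated_by.any()
    return mask

def hypervolume_2d(Y, ref):
    """Area dominated by the Pareto front of Y and bounded below by ref."""
    front = Y[pareto_mask(Y)]
    front = front[np.argsort(-front[:, 0])]  # descending in the first objective
    hv, prev_y2 = 0.0, ref[1]
    for y1, y2 in front:
        hv += (y1 - ref[0]) * (y2 - prev_y2)  # add one horizontal slab
        prev_y2 = y2
    return hv

Y = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0], [1.0, 1.0]])
ref = np.array([0.0, 0.0])
hv_before = hypervolume_2d(Y, ref)

# Hypervolume improvement from adding one new observation.
new_point = np.array([[2.5, 2.5]])
hv_after = hypervolume_2d(np.vstack([Y, new_point]), ref)
improvement = hv_after - hv_before
```

The expected version of this quantity averages the improvement over the model's posterior rather than evaluating it at a single known outcome.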
