Importing Libraries

In [21]:
%load_ext autoreload
%autoreload 2
import sys
sys.path.append('../')



In [22]:
from bayesian_opt import BayesianOptimization
import numpy as np
import pandas as pd

This notebook gives examples of how to use the `BayesianOptimization` class.
It has built-in functionality for the following:
1. Suggesting more than one candidate ($q>1$) per iteration.
2. Four types of GP models: Single-Task GP, Mixed Single-Task GP, SAASBO (Model List GP for $d_\text{out}>1$), and HED.
3. Optional one-hot encoding when not using Mixed Single-Task GP.
4. Multi-objective and single-objective optimization.
5. Weights for prioritizing targets.
6. Ingredient minimization with weights.
7. Three acquisition functions for single-objective optimization: UCB (Upper Confidence Bound), EI (Expected Improvement), and PI (Probability of Improvement).
8. Expected Hypervolume Improvement for multi-objective optimization.
9. Two $q$-sampling strategies: Believer Update and Monte Carlo.
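
To make the three single-objective acquisition functions in the list concrete, here is a minimal sketch of their textbook closed forms for a Gaussian posterior with mean `mean` and standard deviation `std` (maximization). The function names and the `beta` parameter are assumptions for illustration; the `BayesianOptimization` class may parameterize them differently.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_confidence_bound(mean, std, beta=2.0):
    # Optimistic estimate: posterior mean plus a scaled standard deviation.
    return mean + math.sqrt(beta) * std


def probability_of_improvement(mean, std, best):
    # Probability that the outcome exceeds the best value seen so far (std > 0).
    return normal_cdf((mean - best) / std)


def expected_improvement(mean, std, best):
    # Expected amount by which the outcome exceeds the best value (std > 0).
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)
```

UCB rewards uncertainty directly, PI only asks whether an improvement is likely, and EI also accounts for how large the improvement would be.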

Example usage of Bayesian optimization for continuous data, where $d_\text{out}=1$.

In [23]:
# 10 random samples of 4 variables, each drawn uniformly from [0, 10)
train_X = np.random.uniform(low=0, high=10, size=(10, 4))
df = pd.DataFrame(train_X, columns=['pH', 'c1', 'c2', 'c3'])
df

Unnamed: 0,pH,c1,c2,c3
0,7.78478,7.623963,2.008884,5.068326
1,6.066328,4.473058,9.995012,8.453721
2,7.766454,5.309829,2.391555,8.030691
3,0.489516,0.17217,4.120526,0.120201
4,2.974166,2.334923,6.666252,0.582767
5,3.099494,3.078451,5.042503,3.870578
6,7.646816,6.01886,6.645747,5.993826
7,5.155185,6.979993,1.028199,7.053934
8,8.04672,6.040086,8.700778,4.077072
9,8.819369,2.164096,3.1137,7.275193


In [24]:
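# Fit a Single-Task GP; 'c3' is the target column
bo_model = BayesianOptimization().fit(df, ['c3'], model_type='Single-Task GP')

# Request one candidate (q=1) and display it as a DataFrame
a = bo_model.candidates(1, export_df=True)
a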


Unnamed: 0,pH,c1,c2,c3
0,7.81,5.61,2.55,6.910012


Example usage of Bayesian optimization for mixed continuous and categorical data, where $d_\text{out}=1$.

In [25]:
# Copy first so that rounding pH does not modify train_X in place
train_X_cat = train_X.copy()
train_X_cat[:, 0] = np.round(train_X_cat[:, 0])
df_cat = pd.DataFrame(train_X_cat, columns=['pH', 'c1', 'c2', 'c3'])
df_cat


Unnamed: 0,pH,c1,c2,c3
0,8.0,7.623963,2.008884,5.068326
1,6.0,4.473058,9.995012,8.453721
2,8.0,5.309829,2.391555,8.030691
3,0.0,0.17217,4.120526,0.120201
4,3.0,2.334923,6.666252,0.582767
5,3.0,3.078451,5.042503,3.870578
6,8.0,6.01886,6.645747,5.993826
7,5.0,6.979993,1.028199,7.053934
8,8.0,6.040086,8.700778,4.077072
9,9.0,2.164096,3.1137,7.275193


In [26]:
bo_model_mst = BayesianOptimization().fit(df_cat,['c3'],cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True)
cand_1

Unnamed: 0,pH,c1,c2,c3
0,9.0,2.02,2.78,6.050861


Choosing the sequential greedy optimizer (BoTorch `optimize_acqf_mixed()`).

In [27]:
bo_model_mst = BayesianOptimization().fit(df_cat,['c3'],cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True, optim_method="Sequential Greedy")
cand_1

Unnamed: 0,pH,c1,c2,c3
0,8.0,4.99,2.28,6.92758
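
As a rough illustration of how an acquisition function can be optimized over a mixed search space, the sketch below fixes each categorical value in turn, optimizes the continuous variable for that value, and keeps the best result. This is only in the spirit of what a mixed optimizer does; `toy_acquisition`, the grid search, and all names here are hypothetical stand-ins, not BoTorch or `BayesianOptimization` code.

```python
def toy_acquisition(ph, c1):
    # Hypothetical acquisition surface: peaks at c1 = ph / 2 and
    # mildly prefers larger pH values.
    return -(c1 - ph / 2.0) ** 2 + 0.1 * ph


def optimize_mixed(acq, categories, grid):
    """Return (category, continuous value, acquisition value) of the best point."""
    best = None
    for cat in categories:
        # Optimize the continuous variable with the categorical one held fixed.
        c1 = max(grid, key=lambda x: acq(cat, x))
        value = acq(cat, c1)
        if best is None or value > best[2]:
            best = (cat, c1, value)
    return best


categories = [0.0, 3.0, 5.0, 6.0, 8.0, 9.0]   # distinct pH levels in df_cat
grid = [i * 0.5 for i in range(21)]            # c1 from 0 to 10 in steps of 0.5
best_cat, best_c1, best_value = optimize_mixed(toy_acquisition, categories, grid)
```

A real optimizer would replace the grid search with a gradient-based continuous optimizer and, for $q>1$, pick the candidates one at a time.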


Choosing a different GP model type.

In [28]:
bo_model_st = BayesianOptimization().fit(df_cat,['c3'],cat_dims=['pH'], model_type='Single-Task GP')

cand_1 = bo_model_st.candidates(1, export_df=True, optim_method="Sequential Greedy")
cand_1

Unnamed: 0,pH,c1,c2,c3
0,6.0,4.0,10.0,7.472862
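
The feature list mentions optional one-hot encoding for categorical inputs when the model is not a Mixed Single-Task GP. The sketch below shows what such an encoding looks like for the rounded pH column; it is an illustration of the general technique, and `one_hot_encode` is a hypothetical helper rather than the class's internal code.

```python
def one_hot_encode(values):
    """Map each value to a 0/1 indicator vector over the sorted distinct values."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0.0] * len(categories)
        row[index[v]] = 1.0  # exactly one active entry per row
        encoded.append(row)
    return categories, encoded


# Rounded pH column from df_cat
ph = [8.0, 6.0, 8.0, 0.0, 3.0, 3.0, 8.0, 5.0, 8.0, 9.0]
categories, encoded = one_hot_encode(ph)
```

Each distinct pH level becomes its own input dimension, so a purely continuous model no longer treats the levels as points on an ordered numeric scale.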


In [29]:
bodi_model = BayesianOptimization().fit(df_cat,['c3'],cat_dims=['pH'], model_type='HED')

cand_2 = bodi_model.candidates(1, export_df=True, optim_method="Sequential Greedy")
cand_2

Unnamed: 0,pH,c1,c2,c3
0,9.0,4.68,2.68,7.771627


In [30]:
# Use df_cat here: cat_dims=['pH'] expects the rounded (categorical) pH column
bo_model = BayesianOptimization().fit(df_cat,['c3'],cat_dims=['pH'], model_type='SAASBO')

cand_3 = bo_model.candidates(2, export_df=True)
cand_3

Unnamed: 0,pH,c1,c2,c3
0,9.0,4.07,1.65,8.270046
1,6.0,5.05,3.8,7.99863


Choosing a different $q$-sampling strategy.

In [31]:
bo_model = BayesianOptimization().fit(df,['c3'])

a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True)
a

Unnamed: 0,pH,c1,c2,c3
0,7.89,3.53,1.32,5.052631
1,7.72,6.84,1.15,5.052684
2,8.59,2.1,3.03,6.135617
3,6.22,4.48,9.85,7.636387
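
The idea behind a believer-style batch update can be sketched as follows: pick the best candidate, pretend the surrogate's own prediction at that point is a real observation, update the surrogate, and repeat until $q$ candidates are chosen. To stay self-contained the sketch uses a toy nearest-neighbour surrogate instead of a GP; the surrogate, the UCB-like score, and all names are assumptions for illustration, not the implementation behind `q_sampling_method="Believer"`.

```python
def nearest_index(x, xs):
    # Index of the observed point closest to x (first one wins on ties).
    return min(range(len(xs)), key=lambda i: abs(x - xs[i]))


def surrogate(x, xs, ys):
    """Toy surrogate: predicted value and a distance-based uncertainty."""
    i = nearest_index(x, xs)
    return ys[i], abs(x - xs[i])


def believer_batch(xs, ys, grid, q, beta=1.0):
    xs, ys = list(xs), list(ys)  # work on copies of the observed data
    batch = []
    for _ in range(q):
        def acquisition(x):
            mean, spread = surrogate(x, xs, ys)
            return mean + beta * spread  # UCB-like score

        x_next = max(grid, key=acquisition)
        # "Believe" the prediction: add it as if it had been observed.
        believed_y, _ = surrogate(x_next, xs, ys)
        xs.append(x_next)
        ys.append(believed_y)
        batch.append(x_next)
    return batch


grid = [float(i) for i in range(11)]
batch = believer_batch([0.0, 10.0], [1.0, 3.0], grid, q=3)
```

Because each believed point removes the uncertainty around it, later picks are pushed toward unexplored regions, which is what spreads the batch out.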


Choosing a different acquisition function.

In [32]:
a = bo_model.candidates(4, export_df=True, acq_func_name="UCB")
a

Unnamed: 0,pH,c1,c2,c3
0,3.64,7.48,1.12,5.13638
1,7.91,0.92,1.9,5.135941
2,5.92,4.49,10.0,8.344771
3,5.83,1.62,1.85,5.135942


Minimizing Ingredients

In [33]:
a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True, input_weights={0:1.1, 1:2.5, 2:2.0})
a

Unnamed: 0,pH,c1,c2,c3
0,0.0,0.17,1.03,5.135942
1,0.59,0.17,1.03,5.079066
2,0.0,0.17,1.55,5.086928
3,1.2,0.17,1.03,5.063313
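
How `input_weights` is applied internally is not shown in this notebook. One common way to encourage small ingredient amounts is to subtract a weighted sum of the inputs from the acquisition value, which the sketch below illustrates. The penalty form, the `penalized_score` helper, and the acquisition value of 5.0 are all assumptions made for illustration.

```python
def penalized_score(acq_value, x, input_weights):
    """Acquisition value minus a weighted penalty on the selected inputs."""
    # input_weights maps an input position to its weight (assumed meaning).
    penalty = sum(weight * x[i] for i, weight in input_weights.items())
    return acq_value - penalty


input_weights = {0: 1.1, 1: 2.5, 2: 2.0}

# Two hypothetical candidates with the same raw acquisition value
low_inputs = penalized_score(5.0, [0.0, 0.17, 1.03], input_weights)
high_inputs = penalized_score(5.0, [7.89, 3.53, 1.32], input_weights)
```

Under this kind of penalty, the candidate with smaller input values scores higher, and a larger weight makes the corresponding input more expensive to use.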


Multi-Objective optimization.

In [34]:
bo_model = BayesianOptimization().fit(df,['c2', 'c3'])

a = bo_model.candidates(4, export_df=True)
a


[BoTorch warning, beginning truncated] qExpectedHypervolumeImprovement --> qLogExpectedHypervolumeImprovement instead, which fixes the issues and has the same API. See https://arxiv.org/abs/2310.20708 for details.


Unnamed: 0,pH,c1,c2,c3
0,6.0,4.47,9.961111,8.430769
1,3.69,1.67,4.978378,5.034159
2,2.07,3.44,4.971345,5.049936
3,1.92,7.02,4.971316,5.052631
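
Expected Hypervolume Improvement builds on the hypervolume of the Pareto front: the region dominated by the non-dominated points and bounded by a reference point. The sketch below computes the Pareto front and the exact hypervolume for two maximized objectives, then the improvement from one new point. It is a deterministic illustration with made-up points; the helper names are hypothetical and no expectation over a GP posterior is taken.

```python
def pareto_front(points):
    """Keep the points that no other point matches or beats in both objectives."""
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return front


def hypervolume_2d(points, ref):
    """Area dominated by the front, measured from ref (all points must exceed ref)."""
    front = sorted(pareto_front(points), key=lambda p: p[0], reverse=True)
    volume, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:
        # Add the horizontal slab this point contributes above the previous one.
        volume += (f1 - ref[0]) * (f2 - prev_f2)
        prev_f2 = f2
    return volume


observed = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
ref = (0.0, 0.0)
current = hypervolume_2d(observed, ref)
improvement = hypervolume_2d(observed + [(2.5, 2.5)], ref) - current
```

An acquisition function based on this quantity favours candidates whose predicted outcomes are expected to enlarge the dominated region the most.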
