Importing Libraries

In [67]:
%load_ext autoreload
%autoreload 2
import sys
sys.path.append('../')



In [68]:
from bayesian_opt import BayesianOptimization
import numpy as np
import pandas as pd

This notebook gives examples of how to use the BayesianOptimization class.
It has built-in functionality for the following:
1. Suggesting more than one candidate ($q>1$) per iteration.
2. Four types of GP model: Single Task GP, Mixed Single Task GP, SAASBO (Model List GP for $d_\text{out}>1$), and HED.
3. Optional one-hot encoding when not using Mixed Single Task GP.
4. Multi-Objective and Single-Objective optimization.
5. Weights for prioritizing targets.
6. Ingredient minimization with weights.
7. Three acquisition functions for Single-Objective optimization (UCB, EI, PI).
8. Expected Hypervolume Improvement for Multi-Objective optimization.
9. Two $q$-sampling strategies (Believer Update and Monte-Carlo).

Example usage of Bayesian optimization for continuous data, where $d_\text{out}=1$.

In [28]:
# 10 random samples of 4 variables, each drawn uniformly from [0, 10)
train_X = np.random.uniform(low=0, high=10, size=(10, 4))
df = pd.DataFrame(train_X, columns=['pH', 'c1', 'c2', 'c3'])
df

Unnamed: 0,pH,c1,c2,c3
0,9.444472,8.30278,8.618957,1.25788
1,8.70866,2.960412,4.607408,0.764976
2,9.398775,3.881291,2.665745,9.361849
3,3.076326,7.008377,2.368283,1.218547
4,8.335379,3.197838,7.625564,2.429299
5,3.262656,0.31492,3.770806,4.554692
6,8.762331,7.020547,5.306718,0.593684
7,0.11437,6.210205,5.772938,9.279455
8,8.844842,1.045196,6.885809,8.012148
9,1.153022,4.874126,6.781816,2.533053


In [29]:
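# Fit the model on df, with 'c3' as the target column
bo_model = BayesianOptimization().fit(df, ['c3'], model_type='Single-Task GP')

# Request one candidate (q = 1), exported as a DataFrame
a = bo_model.candidates(1, export_df=True)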
a


Unnamed: 0,pH,c1,c2,c3
0,0.28,6.12,5.61,8.097055


Example usage of Bayesian optimization for mixed continuous and categorical data, where $d_\text{out}=1$.

In [30]:
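# Note: this is an alias, not a copy, so the rounding below also modifies train_X in place.
train_X_cat = train_X
# Round pH to whole numbers so it can be treated as a categorical input.
train_X_cat[:, 0] = np.round(train_X_cat[:, 0])
df_cat = pd.DataFrame(train_X_cat, columns=['pH', 'c1', 'c2', 'c3'])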
df_cat


Unnamed: 0,pH,c1,c2,c3
0,9.0,8.30278,8.618957,1.25788
1,9.0,2.960412,4.607408,0.764976
2,9.0,3.881291,2.665745,9.361849
3,3.0,7.008377,2.368283,1.218547
4,8.0,3.197838,7.625564,2.429299
5,3.0,0.31492,3.770806,4.554692
6,9.0,7.020547,5.306718,0.593684
7,0.0,6.210205,5.772938,9.279455
8,9.0,1.045196,6.885809,8.012148
9,1.0,4.874126,6.781816,2.533053


In [31]:
bo_model_mst = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Mixed Single-Task GP')

cand_1 = bo_model_mst.candidates(1, export_df=True)
cand_1

Unnamed: 0,pH,c1,c2,c3
0,3.0,0.96,3.9,4.04463


In [7]:
bo_model_st = BayesianOptimization().fit(df_cat, ['c3'], cat_dims=['pH'], model_type='Single-Task GP')

cand_2 = bo_model_st.candidates(1, export_df=True)
cand_2

Unnamed: 0,pH,c1,c2,c3
0,3.0,4.86,0.25,7.747525
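
The feature list above mentions optional one-hot encoding when a categorical column is used without the Mixed Single Task GP. As a rough illustration of what one-hot encoding does to a column such as pH, here is a minimal sketch in plain Python (standard library only). The helper name `one_hot_encode` and the `pH_<value>` column naming are hypothetical and chosen for this example; how the BayesianOptimization class encodes categories internally is not shown in this notebook.

```python
def one_hot_encode(rows, column):
    """Replace `column` in each row with one 0/1 indicator per category.

    Hypothetical helper for illustration only; it is not part of bayesian_opt.
    """
    # Collect the distinct category values in a stable (sorted) order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category.
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1.0 if row[column] == cat else 0.0
        encoded.append(new_row)
    return encoded, categories


# Three rows taken from the rounded table above (pH and c1 only).
rows = [
    {"pH": 9.0, "c1": 8.30278},
    {"pH": 3.0, "c1": 7.008377},
    {"pH": 9.0, "c1": 2.960412},
]
encoded, categories = one_hot_encode(rows, "pH")
for row in encoded:
    print(row)
```

Each pH value becomes its own indicator column, so the model no longer treats pH 9 as "three times" pH 3. In pandas, `pd.get_dummies` serves the same purpose.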


In [None]:
bodi_model = BayesianOptimization().fit(X=df_cat, y=['c3'], cat_dims=['pH'], model_type='HED')

cand_3 = bodi_model.candidates(1, export_df=True)
cand_3



Unnamed: 0,pH,c1,c2,c3
0,1.85,4.6,2.37,10.536436


Choosing a different GP model type.

In [8]:
bo_model = BayesianOptimization().fit(df, ['c3'], cat_dims=['pH'], model_type='SAASBO')

a = bo_model.candidates(2, export_df=True)
a

Unnamed: 0,pH,c1,c2,c3
0,0.0,3.32,4.44,7.988038
1,0.0,1.42,4.77,7.710234


Choosing a different $q$-sampling strategy.

In [9]:
bo_model = BayesianOptimization().fit(df, ['c3'])

a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True)
a

Unnamed: 0,pH,c1,c2,c3
0,7.14,2.3,4.0,4.504538
1,1.62,4.94,1.66,4.043994
2,3.18,6.14,1.73,5.235239
3,0.05,6.18,3.11,5.331708
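
The idea usually associated with a "Believer" update is sequential: pick one candidate, pretend the model's predicted mean at that point is a real observation, update the model, and repeat until $q$ candidates are collected. The sketch below illustrates that loop in plain Python on a one-dimensional grid. Everything in it is an assumption made for illustration: the surrogate is a toy inverse-distance-weighted predictor standing in for a GP, the acquisition is a simple mean-plus-uncertainty score, and the names `surrogate` and `believer_batch` are hypothetical. It does not reproduce the internals of the BayesianOptimization class.

```python
def surrogate(x, xs, ys):
    """Toy stand-in for a GP posterior at x (illustration only).

    Returns an inverse-distance-weighted mean and, as a crude uncertainty,
    the distance to the nearest observed point.
    """
    dists = [abs(x - xi) for xi in xs]
    nearest = min(dists)
    if nearest == 0.0:
        # x coincides with an observed point: return its value, no uncertainty.
        return ys[dists.index(0.0)], 0.0
    weights = [1.0 / d ** 2 for d in dists]
    mean = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return mean, nearest


def believer_batch(xs, ys, grid, q, beta=4.0):
    """Select q points sequentially, treating each predicted mean as observed."""
    xs, ys = list(xs), list(ys)  # work on copies; the caller's data is untouched
    batch = []
    for _ in range(q):
        def score(x):
            # Optimistic score: mean plus sqrt(beta) times the uncertainty.
            mean, unc = surrogate(x, xs, ys)
            return mean + beta ** 0.5 * unc

        best_x = max(grid, key=score)
        believed_y, _ = surrogate(best_x, xs, ys)
        batch.append(best_x)
        # "Believe" the prediction: add it as if it had been measured.
        xs.append(best_x)
        ys.append(believed_y)
    return batch


observed_x = [0.0, 10.0]
observed_y = [1.0, 3.0]
grid = [i * 0.5 for i in range(21)]  # candidate points 0.0, 0.5, ..., 10.0
batch = believer_batch(observed_x, observed_y, grid, q=3)
print(batch)
```

Because each believed point has zero uncertainty afterwards, later picks are pushed away from earlier ones, which is what spreads the batch across the input space.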


Choosing a different acquisition function.

In [10]:
a = bo_model.candidates(4, export_df=True, acq_func_name="UCB")
a

Unnamed: 0,pH,c1,c2,c3
0,0.5,0.57,2.66,4.135625
1,4.12,6.91,1.06,4.136613
2,0.79,7.88,1.73,4.135573
3,0.09,6.58,3.06,6.649267
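
For reference, the three single-objective acquisition functions named in the feature list (UCB, EI, PI) have simple closed forms when the model's prediction at a point is Gaussian with a given mean and standard deviation. The sketch below writes them out in plain Python for a maximization problem. It is a textbook-style illustration, not the code path used by `candidates`; the function names and the `beta` default are assumptions, and UCB uses the convention mean + sqrt(beta) * std (other texts write mean + kappa * std).

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


def ucb(mean, std, beta=2.0):
    """Upper Confidence Bound: mean + sqrt(beta) * std."""
    return mean + (beta ** 0.5) * std


def pi(mean, std, best):
    """Probability of Improvement over the best observed value."""
    if std == 0.0:
        return 1.0 if mean > best else 0.0
    return _STD_NORMAL.cdf((mean - best) / std)


def ei(mean, std, best):
    """Expected Improvement over the best observed value."""
    if std == 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * _STD_NORMAL.cdf(z) + std * _STD_NORMAL.pdf(z)


# Two hypothetical candidates, with the best observed value equal to 5.0:
# A is a safe bet near the incumbent, B is uncertain with a lower mean.
best = 5.0
cand_a = (5.0, 0.1)  # (mean, std)
cand_b = (4.5, 2.0)
for name, (m, s) in (("A", cand_a), ("B", cand_b)):
    print(name, ucb(m, s), pi(m, s, best), ei(m, s, best))
```

In this example PI prefers the safe candidate A, while EI and UCB prefer the uncertain candidate B, which shows how the choice of acquisition function shifts the balance between exploitation and exploration.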


Minimizing Ingredients

In [11]:
a = bo_model.candidates(4, q_sampling_method="Believer", export_df=True, input_weights={0:1.1, 1:2.5, 2:2.0})
a

Unnamed: 0,pH,c1,c2,c3
0,0.0,0.24,0.84,4.135625
1,0.63,0.24,0.84,4.036131
2,0.0,0.24,1.41,4.046192
3,1.29,0.24,0.84,4.01465


Multi-Objective optimization.

In [12]:
bo_model = BayesianOptimization().fit(df, ['c2', 'c3'])

a = bo_model.candidates(4, export_df=True)
a


Warning emitted by this cell (truncated in the export): it recommends replacing qExpectedHypervolumeImprovement --> qLogExpectedHypervolumeImprovement, which fixes the issues and has the same API. See https://arxiv.org/abs/2310.20708 for details.


Unnamed: 0,pH,c1,c2,c3
0,5.29,6.98,4.034778,3.997936
1,0.0,2.2,4.773868,8.836711
2,4.21,2.81,4.034778,3.997936
3,0.74,2.1,6.272942,2.235691
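
Expected Hypervolume Improvement builds on a deterministic quantity: how much the area dominated by the Pareto front grows when a new point is added. The sketch below computes that quantity for two objectives that are both maximized, in plain Python. It is only an illustration under stated assumptions: the function names are hypothetical, the points and reference point are made up, and the expectation over the model's posterior (the "Expected" part) is left out entirely.

```python
def pareto_front(points):
    """Return the non-dominated points when both objectives are maximized."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded from below by `ref`."""
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    area, prev_y = 0.0, ref[1]
    # Sweep from the largest first objective downwards; along a Pareto front
    # the second objective then increases, so each point adds one strip.
    for x, y in sorted(front, reverse=True):
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


def hypervolume_improvement(candidate, points, ref):
    """Increase in dominated area obtained by adding `candidate`."""
    return hypervolume_2d(points + [candidate], ref) - hypervolume_2d(points, ref)


observed = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]  # made-up objective values
ref = (0.0, 0.0)                                  # made-up reference point
print(hypervolume_2d(observed, ref))
print(hypervolume_improvement((2.5, 2.5), observed, ref))
print(hypervolume_improvement((1.0, 1.0), observed, ref))
```

A candidate that is dominated by the current front, such as (1.0, 1.0) here, contributes no improvement, while one that pushes the front outward contributes the newly covered area.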
