# Classification Surrogate Tests

We want to test whether a classification surrogate can learn unknown constraints that are expressed as categorical criteria. This covers scenarios where specialists look at a set of experiments and label the outcomes as 'acceptable', 'unacceptable', 'ideal', etc.

This involves new models that produce `CategoricalOutput`s rather than continuous outputs. Mathematically, let $g_{\theta}:\mathbb{R}^d\to[0,1]^c$ be a function with learnable parameters $\theta$ that outputs a probability vector over $c$ potential classes, i.e. for any input $x\in\mathbb{R}^d$, $g_{\theta}(x)^\top\mathbf{1}=1$, where $\mathbf{1}$ is the vector of all ones. Given acceptability criteria for the corresponding classes, $a\in\{0,1\}^c$, we can compute the scalar $g_{\theta}(x)^\top a\in[0,1]$. This is the expected value of acceptance, and it is passed to the optimizer as the value of a constraint function.
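
As a minimal sketch of the scalar described above, the following plain-Python snippet computes $g_{\theta}(x)^\top a$ for a single probability vector. The probabilities are made-up illustrative values, not model output, and `expected_acceptance` is a hypothetical helper name, not part of BoFire.

```python
def expected_acceptance(probs, acceptable):
    """Return sum_k probs[k] * acceptable[k], the probability mass on acceptable classes."""
    if len(probs) != len(acceptable):
        raise ValueError("probs and acceptable must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    return sum(p * a for p, a in zip(probs, acceptable))


# Class order: unacceptable, acceptable, ideal
probs = [0.2, 0.5, 0.3]  # hypothetical g_theta(x)
a = [0, 1, 1]            # 'acceptable' and 'ideal' both count as acceptable
print(expected_acceptance(probs, a))
```

With these values the result is the combined probability of the two acceptable classes, 0.5 + 0.3.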

In this script, we look at a modified and constrained version of the optimization problem associated with the [Levy function](https://www.sfu.ca/~ssurjano/levy.html), which has a global minimum at $x^*=\mathbf{1}$. We define three classes for the constraint, 'acceptable', 'unacceptable', and 'ideal', based on how close a point is to the optimal decision variable. This optimum is of course unknown in a real-world setting, but it serves as a reasonable example.
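
For reference, here is a sketch of the standard (unmodified) Levy function from the linked page, with $w_i = 1 + (x_i - 1)/4$. It uses only the standard library, and `levy` is a name chosen here for illustration. The data-generation cell later in this notebook uses a modified variant, so its `f_0` values will not match this function exactly.

```python
import math


def levy(x):
    """Standard Levy function for a sequence of floats x of length d >= 1."""
    w = [1 + (xi - 1) / 4 for xi in x]
    first = math.sin(math.pi * w[0]) ** 2
    # Sum over the first d - 1 dimensions
    middle = sum(
        (wi - 1) ** 2 * (1 + 10 * math.sin(math.pi * wi + 1) ** 2)
        for wi in w[:-1]
    )
    # The last dimension gets its own weight
    last = (w[-1] - 1) ** 2 * (1 + math.sin(2 * math.pi * w[-1]) ** 2)
    return first + middle + last


print(levy([1.0] * 5))      # zero up to floating-point error (global minimum)
print(levy([0.0] * 5) > 0)  # any other point gives a positive value
```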

The script first shows an example of training only the classification surrogate on the generated data.

In [1]:
# Import packages
import bofire.strategies.api as strategies
from bofire.data_models.api import Domain, Outputs, Inputs
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput, CategoricalOutput, CategoricalInput
from bofire.data_models.objectives.api import MinimizeObjective, MinimizeSigmoidObjective, ConstrainedCategoricalObjective
import numpy as np
import pandas as pd



## Manual setup of the optimization domain

The following cells show how to manually set up the optimization problem in BoFire for didactic purposes.

In [2]:
# Write a function which scales the inputs according to the Levy function - i.e. computes $w_i$
def scale_inputs(x: pd.Series) -> pd.Series:
    return 1 + (x - 1) / 4

In [3]:
# Set up the inputs and outputs; the categorical input is included just as an example
input_features = Inputs(
    features=[ContinuousInput(key=f"x_{i}", bounds=(-2, 2)) for i in range(5)]
    + [CategoricalInput(key="x_5", categories=["0", "1"], allowed=[True, True])]
)

# The minimize objective is used here; to maximize, use the maximize objective instead.
output_features = Outputs(features=[
        ContinuousOutput(key="f_0", objective=MinimizeObjective(w=1.)),
        # f_1 is the categorical output whose classes the classification surrogate learns
        CategoricalOutput(key="f_1", categories=["unacceptable", "acceptable", "ideal"], objective=ConstrainedCategoricalObjective(categories=["unacceptable", "acceptable", "ideal"], desirability=[False, True, True])),
        ContinuousOutput(key="f_2", objective=MinimizeSigmoidObjective(w=1., tp=0.0, steepness=0.5)),
    ]
)

# Create domain
domain1 = Domain(inputs=input_features, outputs=output_features)

# Sample random points
sample_df = domain1.inputs.sample(50)

# Keys of the continuous inputs (x_0, ..., x_4); the categorical x_5 does not enter the outputs
cont_keys = input_features.get_keys(includes=ContinuousInput, excludes=CategoricalInput)

# f_0: modified Levy function in which every continuous input gets the 1 + 10 * sin^2(pi * w + 1) weight
w = sample_df[cont_keys].apply(scale_inputs)
sample_df["f_0"] = np.sin(np.pi * w["x_0"]) ** 2 + ((w - 1) ** 2 * (1 + 10 * np.sin(np.pi * w + 1) ** 2)).sum(axis=1)

# f_1: class label determined by the L1 norm of the continuous inputs
abs_sum = sample_df[cont_keys].abs().sum(axis=1)
sample_df["f_1"] = "unacceptable"
sample_df.loc[(abs_sum >= 3.5) & (abs_sum <= 6.5), "f_1"] = "acceptable"
sample_df.loc[(abs_sum >= 4.5) & (abs_sum <= 5.5), "f_1"] = "ideal"

# f_2: x_0 plus a small amount of uniform noise
sample_df["f_2"] = sample_df["x_0"] + 1e-2 * np.random.uniform(size=(len(sample_df),))

sample_df.head(20)

| | x_0 | x_1 | x_2 | x_3 | x_4 | x_5 | f_0 | f_1 | f_2 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.296884 | 1.339136 | -0.699016 | -1.48366 | -1.619869 | 0 | 7.435109 | ideal | 0.30202 |
| 1 | -1.198177 | -1.871994 | -0.096251 | 1.753299 | -1.822007 | 0 | 13.140989 | unacceptable | -1.188586 |
| 2 | -0.197863 | 0.134761 | -1.559977 | -1.984249 | 0.769223 | 0 | 10.049967 | ideal | -0.189195 |
| 3 | 0.423355 | -0.333732 | -0.992391 | 0.021709 | 1.187608 | 0 | 1.452937 | unacceptable | 0.42498 |
| 4 | -1.827201 | -1.653218 | -1.266475 | 1.432659 | -1.1589 | 0 | 12.936642 | unacceptable | -1.826852 |
| 5 | 0.073122 | -1.789563 | 1.670582 | 1.86755 | 1.759474 | 0 | 6.432953 | unacceptable | 0.080055 |
| 6 | 0.847212 | -1.236811 | -0.533595 | -1.84211 | 1.425404 | 1 | 7.133379 | acceptable | 0.854993 |
| 7 | -0.728056 | 1.11475 | 1.837209 | 0.768175 | 1.179493 | 1 | 1.895683 | acceptable | -0.72477 |
| 8 | -0.787473 | -1.280905 | 1.964343 | 1.051243 | -1.969924 | 1 | 9.828818 | unacceptable | -0.778318 |
| 9 | -1.805895 | 1.54421 | -0.926871 | 1.780826 | 1.477554 | 1 | 6.990552 | unacceptable | -1.797201 |
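
The labeling rule in the cell above depends only on the L1 norm of the five continuous inputs. The sketch below restates it for a single point in plain Python and checks it against three rows of the sample shown above. `label_from_l1_norm` is a hypothetical helper name introduced here for illustration.

```python
def label_from_l1_norm(x_cont):
    """Classify a point from the sum of absolute values of its continuous inputs."""
    s = sum(abs(v) for v in x_cont)
    if 4.5 <= s <= 5.5:
        return "ideal"
    if 3.5 <= s <= 6.5:
        return "acceptable"
    return "unacceptable"


# Continuous inputs x_0 ... x_4 of rows 0, 3 and 6 in the sample above
print(label_from_l1_norm([0.296884, 1.339136, -0.699016, -1.48366, -1.619869]))  # ideal
print(label_from_l1_norm([0.423355, -0.333732, -0.992391, 0.021709, 1.187608]))  # unacceptable
print(label_from_l1_norm([0.847212, -1.236811, -0.533595, -1.84211, 1.425404]))  # acceptable
```

Note that the optimum $x^*=\mathbf{1}$ has an L1 norm of 5 in five dimensions, which falls in the middle of the 'ideal' band.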


## Evaluate the classification model performance (outside of the optimization procedure)

In [4]:
# Import packages
import bofire.surrogates.api as surrogates
from bofire.data_models.surrogates.api import ClassificationMLPEnsemble
from bofire.surrogates.diagnostics import ClassificationMetricsEnum

# Instantiate the surrogate model 
model = ClassificationMLPEnsemble(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_1")]), lr=0.01, n_epochs=100, hidden_layer_sizes=(20,10,))
surrogate = surrogates.map(model)

# Fit the model to the classification data
cv_df = sample_df.drop(["f_0", "f_2"], axis=1)
cv_df["valid_f_1"] = 1
cv = surrogate.cross_validate(cv_df, folds=5)




In [5]:
# Print results
cv[0].get_metrics(metrics=ClassificationMetricsEnum, combine_folds=True) # print training set performance

| | ACCURACY | F1 |
|---|---|---|
| 0 | 0.795 | 0.795 |


In [6]:
cv[1].get_metrics(metrics=ClassificationMetricsEnum, combine_folds=True) # print test set performance

| | ACCURACY | F1 |
|---|---|---|
| 0 | 0.52 | 0.52 |
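
ACCURACY and F1 coincide in both result tables. One possible explanation, stated here as an assumption about how the metric is averaged and not a confirmed detail of BoFire, is micro-averaging: for single-label multiclass predictions, micro-averaged F1 always equals accuracy. The sketch below checks this on hypothetical labels using only the standard library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives over all classes."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            tp += p == c and t == c
            fp += p == c and t != c
            fn += p != c and t == c
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical labels, not taken from the notebook's data
y_true = ["ideal", "acceptable", "unacceptable", "unacceptable", "acceptable"]
y_pred = ["ideal", "unacceptable", "unacceptable", "acceptable", "acceptable"]
print(accuracy(y_true, y_pred), micro_f1(y_true, y_pred))
```

Each misclassified sample counts once as a false positive (for the predicted class) and once as a false negative (for the true class), which is why the two numbers agree.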


## Setup strategy and ask for candidates



In [7]:
from bofire.data_models.acquisition_functions.api import qEI
from bofire.data_models.strategies.api import SoboStrategy
from bofire.data_models.surrogates.api import BotorchSurrogates, ClassificationMLPEnsemble, MixedSingleTaskGPSurrogate
from bofire.data_models.domain.api import Outputs

strategy_data = SoboStrategy(domain=domain1, 
                             acquisition_function=qEI(), 
                             surrogate_specs=BotorchSurrogates(surrogates=
                                    [
                                        ClassificationMLPEnsemble(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_1")]), lr=0.01, n_epochs=100, hidden_layer_sizes=(20,10,)),
                                        MixedSingleTaskGPSurrogate(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_2")]))
                                    ]
                                )
                            )

strategy = strategies.map(strategy_data)

strategy.tell(sample_df)

In [8]:
candidates = strategy.ask(10)
candidates



| | x_0 | x_1 | x_2 | x_3 | x_4 | x_5 | f_1_pred | f_1_sd | f_0_pred | f_2_pred | ... | f_1_acceptable_prob | f_1_ideal_prob | f_0_sd | f_2_sd | f_1_unacceptable_sd | f_1_acceptable_sd | f_1_ideal_sd | f_0_des | f_2_des | f_1_des |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.4814 | 0.586632 | 0.063636 | 0.087314 | 0.424632 | 1 | acceptable | 0.0 | -2.807002 | 0.487184 | ... | 0.974527 | 0.007553 | 0.667668 | 0.003238 | 0.017963 | 0.029889 | 0.014268 | 2.807002 | 0.439401 | 0.98208 |
| 1 | 0.205385 | 0.435466 | 0.386084 | 0.820992 | 0.56238 | 0 | unacceptable | 0.0 | -2.478325 | 0.210586 | ... | 0.408227 | 0.001115 | 0.736828 | 0.003263 | 0.539504 | 0.540456 | 0.002281 | 2.478325 | 0.473701 | 0.409342 |
| 2 | 0.578421 | 0.790867 | -0.054982 | 0.602806 | 0.624962 | 1 | acceptable | 0.0 | -2.495759 | 0.584149 | ... | 0.96572 | 0.007595 | 0.85127 | 0.003274 | 0.024524 | 0.033772 | 0.014316 | 2.495759 | 0.427496 | 0.973315 |
| 3 | 0.225033 | 0.648267 | 0.115931 | 0.15359 | 0.52778 | 1 | acceptable | 0.0 | -2.54625 | 0.23046 | ... | 0.860037 | 0.01214 | 0.717848 | 0.003234 | 0.170332 | 0.177282 | 0.021604 | 2.54625 | 0.471224 | 0.872177 |
| 4 | 0.34231 | 0.498711 | 0.04775 | 0.008932 | 0.668452 | 1 | acceptable | 0.0 | -2.549582 | 0.34779 | ... | 0.932932 | 0.008269 | 0.655287 | 0.00323 | 0.099837 | 0.1011 | 0.014011 | 2.549582 | 0.456635 | 0.941201 |
| 5 | 0.380575 | 1.040915 | 0.214337 | 0.069153 | 0.753324 | 1 | acceptable | 0.0 | -2.368879 | 0.38629 | ... | 0.983611 | 0.00572 | 0.795297 | 0.003277 | 0.013258 | 0.023035 | 0.012126 | 2.368879 | 0.451863 | 0.989332 |
| 6 | 0.080792 | 0.26032 | 0.331997 | 0.669731 | 0.425205 | 1 | unacceptable | 0.0 | -2.326501 | 0.085955 | ... | 0.210872 | 0.001047 | 0.763624 | 0.003236 | 0.440892 | 0.441507 | 0.001058 | 2.326501 | 0.489257 | 0.21192 |
| 7 | 0.243598 | 0.698221 | -0.014641 | 0.092352 | 0.708394 | 1 | acceptable | 0.0 | -2.445781 | 0.248967 | ... | 0.943763 | 0.006818 | 0.699401 | 0.003234 | 0.09064 | 0.091162 | 0.012202 | 2.445781 | 0.468919 | 0.950581 |
| 8 | 0.182756 | -0.921039 | -0.549821 | 0.689784 | -0.655921 | 1 | unacceptable | 0.0 | 1.978269 | 0.187628 | ... | 0.223373 | 0.004275 | 0.536589 | 0.003255 | 0.389221 | 0.391444 | 0.007985 | -1.978269 | 0.476564 | 0.227648 |
| 9 | -0.297385 | -0.422154 | 1.407541 | 0.605559 | 1.597342 | 1 | unacceptable | 0.0 | 1.351191 | -0.292334 | ... | 0.198005 | 0.01208 | 0.576409 | 0.003309 | 0.436782 | 0.442695 | 0.026803 | -1.351191 | 0.536477 | 0.210085 |
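
The `*_des` columns in the candidates output can be checked by hand. The sketch below reproduces row 0 under three assumptions that are inferred from the numbers shown, not taken from the BoFire source: `f_0_des` is the negated prediction, `f_2_des` is a decreasing logistic function with `tp=0.0` and `steepness=0.5`, and `f_1_des` is the summed probability of the desirable classes. The function name `sigmoid_min_des` is hypothetical.

```python
import math


def sigmoid_min_des(pred, tp=0.0, steepness=0.5):
    """Assumed decreasing logistic desirability: 1 / (1 + exp(steepness * (pred - tp)))."""
    return 1.0 / (1.0 + math.exp(steepness * (pred - tp)))


# Values copied from row 0 of the candidates output
f_0_pred, f_2_pred = -2.807002, 0.487184
p_acceptable, p_ideal = 0.974527, 0.007553

f_0_des = -f_0_pred                  # table: 2.807002
f_2_des = sigmoid_min_des(f_2_pred)  # table: 0.439401
f_1_des = p_acceptable + p_ideal     # table: 0.98208

print(f_0_des, round(f_2_des, 6), round(f_1_des, 6))
```

The same formulas also match the other rows, for example `f_2_pred = -0.292334` in row 9 gives a desirability of about 0.536477.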


## Check classification of proposed candidates

Use the labeling logic from above to compute the true class of each candidate and compare it with the predicted class.

In [9]:
# Append the true labels to the candidates, using the same L1-norm rule as for the training data
cont_keys = input_features.get_keys(includes=ContinuousInput, excludes=CategoricalInput)
abs_sum = candidates[cont_keys].abs().sum(axis=1)
candidates["f_1_true"] = "unacceptable"
candidates.loc[(abs_sum >= 3.5) & (abs_sum <= 6.5), "f_1_true"] = "acceptable"
candidates.loc[(abs_sum >= 4.5) & (abs_sum <= 5.5), "f_1_true"] = "ideal"

In [10]:
# Print results
candidates[["f_1_pred", "f_1_true"]]

| | f_1_pred | f_1_true |
|---|---|---|
| 0 | acceptable | unacceptable |
| 1 | unacceptable | unacceptable |
| 2 | acceptable | unacceptable |
| 3 | acceptable | unacceptable |
| 4 | acceptable | unacceptable |
| 5 | acceptable | unacceptable |
| 6 | unacceptable | unacceptable |
| 7 | acceptable | unacceptable |
| 8 | unacceptable | unacceptable |
| 9 | unacceptable | acceptable |
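
To summarize the comparison above in numbers, the following sketch counts how often the predicted and true labels agree and tallies the (true, predicted) pairs. The two lists are copied from the output above; only the standard library is used.

```python
from collections import Counter

# f_1_pred and f_1_true for candidates 0 ... 9, copied from the output above
pred = ["acceptable", "unacceptable", "acceptable", "acceptable", "acceptable",
        "acceptable", "unacceptable", "acceptable", "unacceptable", "unacceptable"]
true = ["unacceptable"] * 9 + ["acceptable"]

# Fraction of candidates whose predicted label matches the true label
agreement = sum(p == t for p, t in zip(pred, true)) / len(true)
print(agreement)  # 0.3

# Counts of (true, predicted) pairs, i.e. a flattened confusion matrix
confusion = Counter(zip(true, pred))
for (t, p), n in sorted(confusion.items()):
    print(f"true={t:<12} pred={p:<12} count={n}")
```

Six of the ten candidates are predicted 'acceptable' although their true label is 'unacceptable', which is in line with the modest test-set accuracy of the classifier reported earlier.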
