# Classification Surrogate Tests

We want to test whether a classification surrogate can learn unknown constraints that are defined by categorical criteria. This covers scenarios where specialists look at a set of experiments and label each outcome as 'acceptable', 'unacceptable', 'ideal', etc.

This involves new models that produce `CategoricalOutput`s rather than continuous outputs. Mathematically, let $g_{\theta}:\mathbb{R}^d\to[0,1]^c$ be the function, governed by learnable parameters $\theta$, which outputs a probability vector over $c$ potential classes (i.e. for input $x\in\mathbb{R}^d$, $g_{\theta}(x)^\top\mathbf{1}=1$, where $\mathbf{1}$ is the vector of all 1's). Given acceptability criteria for the corresponding classes, $a\in\{0,1\}^c$, we can compute the scalar output $g_{\theta}(x)^\top a\in[0,1]$, which represents the expected value of acceptance and is passed in as an objective value that acts as a constraint.
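As a minimal sketch of the scalar $g_{\theta}(x)^\top a$, assume a hypothetical probability vector (the numbers are illustrative, not model output) and the class ordering 'unacceptable', 'acceptable', 'ideal':

```python
import numpy as np

# Hypothetical class probabilities g_theta(x) for a single input x, ordered as
# ("unacceptable", "acceptable", "ideal"). Illustrative values only.
probs = np.array([0.10, 0.30, 0.60])

# Acceptability mask a in {0, 1}^c: "acceptable" and "ideal" count as acceptable.
acceptability = np.array([0, 1, 1])

# A valid probability vector sums to one: g_theta(x)^T 1 = 1.
assert np.isclose(probs.sum(), 1.0)

# Expected acceptance g_theta(x)^T a, a scalar in [0, 1].
expected_acceptance = float(probs @ acceptability)
print(expected_acceptance)
```

The result is simply the probability mass placed on the acceptable classes, so it is close to 1 when the model is confident that the outcome is acceptable.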

In this script, we look at a modified and constrained version of the optimization problem associated with the [Levy function](https://www.sfu.ca/~ssurjano/levy.html), which has a global minimum at $x^*=\mathbf{1}$. We define three classes, 'acceptable', 'unacceptable', and 'ideal', based on how close a point is to the optimal decision variable. This value is unknown in a real-world setting, but it serves as a reasonable example.

Initially, this script contains an example of training only the classification surrogate on the generated data.

In [1]:
# Import packages
import bofire.strategies.api as strategies
from bofire.data_models.api import Domain, Outputs, Inputs
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput, CategoricalOutput, CategoricalInput
from bofire.data_models.objectives.api import MinimizeObjective, MinimizeSigmoidObjective, ConstrainedCategoricalObjective
import numpy as np
import pandas as pd



## Manual setup of the optimization domain

The following cells show how to manually set up the optimization problem in BoFire for didactic purposes.

In [2]:
# Write a function which scales the inputs according to the Levy function - i.e. computes $w_i$
def scale_inputs(x: pd.Series) -> pd.Series:
    return 1 + (x - 1) / 4
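The scaling above is the $w_i = 1 + (x_i - 1)/4$ term of the Levy function. For reference, a minimal NumPy sketch of the standard (unmodified) Levy function is given here; the name `levy` is hypothetical, and since the `f_0` computed in this notebook is a modified variant, its values will not match this function.

```python
import numpy as np


def levy(x: np.ndarray) -> float:
    """Standard d-dimensional Levy function for a 1-D array x."""
    # Same scaling as scale_inputs: w_i = 1 + (x_i - 1) / 4
    w = 1 + (np.asarray(x, dtype=float) - 1) / 4
    first = np.sin(np.pi * w[0]) ** 2
    # Sum over i = 1, ..., d-1
    middle = np.sum((w[:-1] - 1) ** 2 * (1 + 10 * np.sin(np.pi * w[:-1] + 1) ** 2))
    # Final term uses only w_d
    last = (w[-1] - 1) ** 2 * (1 + np.sin(2 * np.pi * w[-1]) ** 2)
    return float(first + middle + last)


print(levy(np.ones(5)))   # numerically zero at the global minimum x* = 1
print(levy(np.zeros(5)))  # strictly positive away from the minimum
```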

In [3]:
# Set-up the inputs and outputs, use categorical domain just as an example
input_features = Inputs(features=[ContinuousInput(key=f"x_{i}", bounds=(-2, 2)) for i in range(5)] + [CategoricalInput(key=f"x_5", categories=["0", "1"], allowed=[True, True])])

# Here the minimize objective is used; to maximize, use the maximize objective instead.
output_features = Outputs(features=[
        ContinuousOutput(key=f"f_{0}", objective=MinimizeObjective(w=1.)),
        CategoricalOutput(key=f"f_{1}", categories=["unacceptable", "acceptable", "ideal"], objective=ConstrainedCategoricalObjective(categories=["unacceptable", "acceptable", "ideal"], desirability=[False, True, True])), # This function will be associated with learning the categories
        ContinuousOutput(key=f"f_{2}", objective=MinimizeSigmoidObjective(w=1., tp=0.0, steepness=0.5)),
    ]
)

# Create domain
domain1 = Domain(inputs=input_features, outputs=output_features)

# Sample random points
sample_df = domain1.inputs.sample(50)

# Write a function which outputs one continuous variable and another discrete based on some logic
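# Levy-style objective over the continuous inputs (object-dtype columns are skipped).
# Note: `ind < len(sample_df.columns)` holds for every column, so the `else` branch
# (the final Levy term) is never used; this is part of the modification.
sample_df["f_0"] = np.sin(np.pi * scale_inputs(sample_df["x_0"])) ** 2 + sum(
    [
        (scale_inputs(sample_df[col]) - 1) ** 2
        * (
            1 + 10 * np.sin(np.pi * scale_inputs(sample_df[col]) + 1) ** 2
            if ind < len(sample_df.columns)
            else 1 + np.sin(2 * np.pi * scale_inputs(sample_df[col])) ** 2
        )
        for ind, col in enumerate(sample_df.columns)
        if not sample_df[col].dtype == "O"
    ]
)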
# Label each sample by the sum of absolute values of its continuous inputs
cont_keys = input_features.get_keys(includes=ContinuousInput, excludes=CategoricalInput)
abs_sum = sample_df[cont_keys].abs().sum(axis=1)
sample_df["f_1"] = "unacceptable"
sample_df.loc[(abs_sum <= 6.5) & (abs_sum >= 3.5), "f_1"] = "acceptable"
sample_df.loc[(abs_sum <= 5.5) & (abs_sum >= 4.5), "f_1"] = "ideal"
sample_df["f_2"] = sample_df["x_0"] + 1e-2 * np.random.uniform(size=(len(sample_df),))

sample_df.head(20)

,x_0,x_1,x_2,x_3,x_4,x_5,f_0,f_1,f_2
0,-0.310698,0.392037,-1.80657,-1.981755,1.384988,0,11.628122,acceptable,-0.31007
1,-1.489428,-1.288171,-0.505947,-0.384544,0.425808,1,6.224438,acceptable,-1.485448
2,1.461134,0.337194,-0.08236,-1.730603,-1.588481,0,8.278724,ideal,1.467232
3,-1.781506,-1.436844,1.930972,1.783539,-0.900358,1,9.733454,unacceptable,-1.77989
4,0.665722,-1.744514,0.446853,0.700885,0.651237,1,4.665782,acceptable,0.66862
5,-1.499912,-0.747193,0.664655,-1.011433,-1.354635,1,7.671702,ideal,-1.491068
6,-1.543055,1.962503,1.798311,0.168173,-1.495112,1,8.231485,unacceptable,-1.542496
7,-0.708194,1.516163,-1.662763,0.717943,-0.50021,0,5.660023,ideal,-0.703393
8,-0.146424,-0.733847,-1.398632,-0.390983,-1.966037,1,9.502397,ideal,-0.142366
9,-1.435747,-0.197231,-0.085655,1.959668,-1.492488,0,7.361934,ideal,-1.429517
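The labeling rule used for `f_1` depends only on the sum of absolute values of the continuous inputs: 'ideal' for sums in [4.5, 5.5], 'acceptable' for the rest of [3.5, 6.5], and 'unacceptable' otherwise. A minimal NumPy sketch of that rule, using a hypothetical helper `label_by_abs_sum` and three rows copied from the table above:

```python
import numpy as np


def label_by_abs_sum(x: np.ndarray) -> np.ndarray:
    """Label each row of x (n_samples, n_continuous) from its sum of absolute values."""
    s = np.abs(np.asarray(x, dtype=float)).sum(axis=1)
    ideal = (s >= 4.5) & (s <= 5.5)
    acceptable = (s >= 3.5) & (s <= 6.5)
    # np.select takes the first matching condition, so "ideal" overrides "acceptable".
    return np.select([ideal, acceptable], ["ideal", "acceptable"], default="unacceptable")


# Continuous inputs (x_0 ... x_4) of rows 0, 2 and 3 from the table above.
rows = np.array(
    [
        [-0.310698, 0.392037, -1.80657, -1.981755, 1.384988],
        [1.461134, 0.337194, -0.08236, -1.730603, -1.588481],
        [-1.781506, -1.436844, 1.930972, 1.783539, -0.900358],
    ]
)
print(label_by_abs_sum(rows))
```

The three rows have absolute sums of roughly 5.88, 5.20 and 7.83, which gives 'acceptable', 'ideal' and 'unacceptable', in agreement with the `f_1` column.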


## Evaluate the classification model performance (outside of the optimization procedure)

In [4]:
# Import packages
import bofire.surrogates.api as surrogates
from bofire.data_models.surrogates.api import ClassificationMLPEnsemble
from bofire.surrogates.diagnostics import ClassificationMetricsEnum

# Instantiate the surrogate model 
model = ClassificationMLPEnsemble(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_1")]), lr=0.01, n_epochs=100, hidden_layer_sizes=(20,10,))
surrogate = surrogates.map(model)

# Cross-validate the model on the classification data (5 folds)
cv_df = sample_df.drop(["f_0", "f_2"], axis=1)
cv_df["valid_f_1"] = 1
cv = surrogate.cross_validate(cv_df, folds=5)




In [5]:
# Print results
cv[0].get_metrics(metrics=ClassificationMetricsEnum, combine_folds=True) # print training set performance

,ACCURACY,F1
0,0.76,0.76


In [6]:
cv[1].get_metrics(metrics=ClassificationMetricsEnum, combine_folds=True) # print test set performance

,ACCURACY,F1
0,0.18,0.18
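ACCURACY and F1 coincide in both tables. One possible explanation, stated here as an assumption rather than a confirmed detail of BoFire's diagnostics, is that F1 is micro-averaged, which for single-label multiclass problems always equals accuracy. A pure-Python sketch with hypothetical labels (the helper names are illustrative):

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of exactly matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes before computing F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    precision = tp_sum / (tp_sum + fp_sum)
    recall = tp_sum / (tp_sum + fn_sum)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, not taken from the notebook's data.
y_true = ["ideal", "acceptable", "unacceptable", "ideal", "acceptable"]
y_pred = ["ideal", "ideal", "unacceptable", "acceptable", "acceptable"]

print(accuracy(y_true, y_pred), micro_f1(y_true, y_pred))
```

Every misclassification adds exactly one false positive and one false negative, so pooled precision and recall both reduce to the accuracy.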


## Setup strategy and ask for candidates



In [7]:
from bofire.data_models.acquisition_functions.api import qEI
from bofire.data_models.strategies.api import SoboStrategy
from bofire.data_models.surrogates.api import BotorchSurrogates, ClassificationMLPEnsemble, MixedSingleTaskGPSurrogate
from bofire.data_models.domain.api import Outputs

strategy_data = SoboStrategy(domain=domain1, 
                             acquisition_function=qEI(), 
                             surrogate_specs=BotorchSurrogates(surrogates=
                                    [
                                        ClassificationMLPEnsemble(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_1")]), lr=0.01, n_epochs=100, hidden_layer_sizes=(20,10,)),
                                        MixedSingleTaskGPSurrogate(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_2")]))
                                    ]
                                )
                            )

strategy = strategies.map(strategy_data)

strategy.tell(sample_df)

In [8]:
candidates = strategy.ask(10)
candidates



,x_0,x_1,x_2,x_3,x_4,x_5,f_1_pred,f_1_sd,f_0_pred,f_2_pred,...,f_1_acceptable_prob,f_1_ideal_prob,f_0_sd,f_2_sd,f_1_unacceptable_sd,f_1_acceptable_sd,f_1_ideal_sd,f_0_des,f_2_des,f_1_des
0,0.01265,-0.082007,0.196167,-0.068576,2.0,0,ideal,0.585006,-0.178305,0.017555,...,0.302352,0.696724,0.486615,0.003456,0.00153,0.413975,0.413346,0.178305,0.497806,0.999076
1,0.00609,-0.110946,0.036496,-0.075202,1.376258,0,ideal,0.58115,-0.033201,0.011008,...,0.285243,0.714701,0.317448,0.003389,8.7e-05,0.410947,0.410923,0.033201,0.498624,0.999944
2,0.16122,-0.056338,-0.295114,-0.061665,0.096058,0,acceptable,0.745874,0.011562,0.166371,...,0.591001,0.408996,0.271178,0.003351,5e-06,0.527414,0.527411,-0.011562,0.479216,0.999998
3,-0.027612,0.391135,0.121035,1.256868,1.784862,0,ideal,0.599128,0.209979,-0.022735,...,0.29719,0.702338,0.46925,0.003467,0.000979,0.42385,0.423444,-0.209979,0.502842,0.999528
4,-0.031088,1.11886,0.087444,-0.012355,2.0,0,acceptable,0.683969,0.085227,-0.026003,...,0.502813,0.468748,0.500849,0.003525,0.05398,0.49833,0.465368,-0.085227,0.50325,0.971561
5,-0.015485,-0.106424,0.251205,-0.105387,1.62749,0,ideal,0.596716,-0.075283,-0.010639,...,0.242466,0.757293,0.352754,0.003411,0.000412,0.422019,0.421865,0.075283,0.50133,0.999759
6,-0.177911,-0.006491,0.00473,-0.100077,2.0,0,ideal,0.629016,0.01512,-0.17316,...,0.391062,0.608024,0.520686,0.003441,0.001167,0.444837,0.444724,-0.01512,0.521632,0.999086
7,0.01942,1.192449,0.083663,-0.01462,1.388673,0,ideal,0.73033,0.18632,0.024592,...,0.43055,0.567086,0.322597,0.003467,0.004309,0.517441,0.515383,-0.18632,0.496926,0.997636
8,0.73501,-0.009883,2.0,-0.052121,2.0,0,ideal,0.702506,1.088985,0.740256,...,0.145336,0.659978,1.06847,0.003609,0.43521,0.304475,0.459785,-1.088985,0.40851,0.805314
9,0.028036,2.0,-0.318588,2.0,0.310111,0,ideal,0.603002,1.676667,0.033345,...,0.47537,0.524512,1.17012,0.003564,0.000234,0.42638,0.426393,-1.676667,0.495832,0.999883
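The `*_des` columns can be reproduced from the prediction columns. The sketch below assumes simple functional forms for the three objectives (negation for `MinimizeObjective`, a decreasing logistic curve for `MinimizeSigmoidObjective` with `tp=0.0` and `steepness=0.5`, and the summed probability of the desirable classes for `ConstrainedCategoricalObjective`). These forms are inferred from the numbers in the table above, not taken from the BoFire source, and the helper names are hypothetical.

```python
import math

# (f_0_pred, f_2_pred, f_1_acceptable_prob, f_1_ideal_prob, f_0_des, f_2_des, f_1_des)
# copied from rows 0 and 8 of the table above.
rows = [
    (-0.178305, 0.017555, 0.302352, 0.696724, 0.178305, 0.497806, 0.999076),
    (1.088985, 0.740256, 0.145336, 0.659978, -1.088985, 0.40851, 0.805314),
]


def minimize_des(y, w=1.0):
    # Assumed form of a minimization objective: the negated prediction.
    return -w * y


def minimize_sigmoid_des(y, tp=0.0, steepness=0.5):
    # Assumed form: decreasing logistic curve centred at the turning point tp.
    return 1.0 / (1.0 + math.exp(steepness * (y - tp)))


def categorical_des(prob_acceptable, prob_ideal):
    # Probability mass on the classes flagged as desirable, i.e. g(x)^T a.
    return prob_acceptable + prob_ideal


for f0, f2, p_acc, p_ideal, f0_des, f2_des, f1_des in rows:
    print(minimize_des(f0), f0_des)
    print(minimize_sigmoid_des(f2), f2_des)
    print(categorical_des(p_acc, p_ideal), f1_des)
```

For both rows the recomputed values agree with the tabulated ones up to the rounding of the displayed numbers.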


## Check classification of proposed candidates

Use the labeling logic from above to compute the true class of each proposed candidate and compare it with the predicted class.

In [11]:
# Append the true labels to the candidates
cand_keys = input_features.get_keys(includes=ContinuousInput, excludes=CategoricalInput)
cand_abs_sum = candidates[cand_keys].abs().sum(axis=1)
candidates["f_1_true"] = "unacceptable"
candidates.loc[(cand_abs_sum <= 6.5) & (cand_abs_sum >= 3.5), "f_1_true"] = "acceptable"
candidates.loc[(cand_abs_sum <= 5.5) & (cand_abs_sum >= 4.5), "f_1_true"] = "ideal"

In [12]:
# Print results
candidates[["f_1_pred", "f_1_true"]]

,f_1_pred,f_1_true
0,ideal,unacceptable
1,ideal,unacceptable
2,acceptable,unacceptable
3,ideal,acceptable
4,acceptable,unacceptable
5,ideal,unacceptable
6,ideal,unacceptable
7,ideal,unacceptable
8,ideal,ideal
9,ideal,ideal
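To summarize the comparison, the following pure-Python sketch tallies the ten rows shown above. The labels are copied from the table; the variable names are illustrative.

```python
from collections import Counter

# Labels copied from the table above (candidates 0 to 9).
f_1_pred = ["ideal", "ideal", "acceptable", "ideal", "acceptable",
            "ideal", "ideal", "ideal", "ideal", "ideal"]
f_1_true = ["unacceptable", "unacceptable", "unacceptable", "acceptable", "unacceptable",
            "unacceptable", "unacceptable", "unacceptable", "ideal", "ideal"]

# Fraction of candidates whose predicted class equals the true class.
exact_match = sum(p == t for p, t in zip(f_1_pred, f_1_true)) / len(f_1_true)

# Fraction of candidates that truly fall into a desirable class.
desirable = {"acceptable", "ideal"}
truly_desirable = sum(t in desirable for t in f_1_true) / len(f_1_true)

# Confusion counts keyed by (true label, predicted label).
confusion = Counter(zip(f_1_true, f_1_pred))

print(exact_match, truly_desirable)
for (true_label, pred_label), count in sorted(confusion.items()):
    print(f"true={true_label:12s} pred={pred_label:10s} n={count}")
```

For these ten candidates, only two predictions match the true class, and three candidates are truly desirable even though all ten were predicted to be 'acceptable' or 'ideal'. This is in line with the low test-set accuracy reported by the cross-validation.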
