# Classification Surrogate Tests

We test whether a classification surrogate can learn unknown constraints from categorical labels. The motivating scenario is one where specialists look at a set of experiments and label each outcome as 'acceptable', 'unacceptable', 'ideal', etc.

This involves new models that produce `CategoricalOutput`s rather than continuous outputs. Mathematically, let $g_{\theta}:\mathbb{R}^d\to[0,1]^c$ be a function with learnable parameters $\theta$ that outputs a probability vector over $c$ potential classes (i.e. for input $x\in\mathbb{R}^d$, $g_{\theta}(x)^\top\mathbf{1}=1$, where $\mathbf{1}$ is the vector of all 1's), and let $a\in\{0,1\}^c$ encode the acceptability criteria for the corresponding classes. We can then compute the scalar $g_{\theta}(x)^\top a\in[0,1]$, the expected value of acceptance, and pass it to the optimizer as the value of a constrained objective.
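
As a minimal sketch of the scalar $g_{\theta}(x)^\top a$ described above, the snippet below evaluates it for a hand-picked probability vector. The probabilities are made-up illustrative values, not surrogate outputs, and `expected_acceptance` is a hypothetical helper name rather than part of BoFire.

```python
from typing import Sequence


def expected_acceptance(probs: Sequence[float], acceptable: Sequence[int]) -> float:
    """Return g(x)^T a for a class-probability vector and a 0/1 acceptability mask."""
    if len(probs) != len(acceptable):
        raise ValueError("probs and acceptable must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    return sum(p * a for p, a in zip(probs, acceptable))


# Classes ordered as ("unacceptable", "acceptable", "ideal"); illustrative values only
probs = [0.2, 0.5, 0.3]
a = [0, 1, 1]  # 'acceptable' and 'ideal' both count as acceptable
score = expected_acceptance(probs, a)
print(score)
```

Because $a$ is a 0/1 mask, the result is simply the total probability mass assigned to the acceptable classes.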

In this script, we look at a modified and constrained version of the optimization problem associated with the [Levy function](https://www.sfu.ca/~ssurjano/levy.html), which has a global minimum at $x^*=\mathbf{1}$. We label points with one of three classes, 'acceptable', 'unacceptable', and 'ideal', based on how close we are to the optimal decision variable; this value is unknown in a real-world setting, but it serves as a reasonable example.
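
For reference, here is a sketch of the standard (unmodified) Levy function as defined on the linked page, written with the standard library only. It is not the modified variant used later in this script, and the function name `levy` is chosen here for illustration.

```python
import math
from typing import Sequence


def levy(x: Sequence[float]) -> float:
    """Standard Levy function with w_i = 1 + (x_i - 1) / 4."""
    w = [1 + (xi - 1) / 4 for xi in x]
    head = math.sin(math.pi * w[0]) ** 2
    # Sum over all dimensions except the last
    middle = sum(
        (wi - 1) ** 2 * (1 + 10 * math.sin(math.pi * wi + 1) ** 2) for wi in w[:-1]
    )
    # The last dimension has its own term
    tail = (w[-1] - 1) ** 2 * (1 + math.sin(2 * math.pi * w[-1]) ** 2)
    return head + middle + tail


value_at_optimum = levy([1.0] * 5)  # numerically zero at x* = (1, ..., 1)
value_elsewhere = levy([0.0] * 5)   # strictly positive away from x*
print(value_at_optimum, value_elsewhere)
```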

The script first trains only the classification surrogate on the generated data, and then uses it inside an optimization strategy.

In [1]:
# Import packages
import bofire.strategies.api as strategies
from bofire.data_models.api import Domain, Outputs, Inputs
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput, CategoricalOutput, CategoricalInput
from bofire.data_models.objectives.api import MinimizeObjective, MinimizeSigmoidObjective, ConstrainedCategoricalObjective
import numpy as np
import pandas as pd


## Manual setup of the optimization domain

The following cells show how to manually set up the optimization problem in BoFire for didactic purposes.

In [2]:
# Scale the inputs as in the Levy function, i.e. compute w_i = 1 + (x_i - 1) / 4
def scale_inputs(x: pd.Series) -> pd.Series:
    return 1 + (x - 1) / 4

In [3]:
# Set up the inputs and outputs; the categorical input is included only as an example
input_features = Inputs(
    features=[ContinuousInput(key=f"x_{i}", bounds=(-2, 2)) for i in range(5)]
    + [CategoricalInput(key="x_5", categories=["0", "1"], allowed=[True, True])]
)

# MinimizeObjective is used here; to maximize instead, use MaximizeObjective
output_features = Outputs(features=[
        ContinuousOutput(key="f_0", objective=MinimizeObjective(w=1.)),
        # The classification surrogate learns this categorical output
        CategoricalOutput(key="f_1", categories=["unacceptable", "acceptable", "ideal"], objective=ConstrainedCategoricalObjective(categories=["unacceptable", "acceptable", "ideal"], desirability=[False, True, True])),
        ContinuousOutput(key="f_2", objective=MinimizeSigmoidObjective(w=1., tp=0.0, steepness=0.5)),
    ]
)

# Create domain
domain1 = Domain(inputs=input_features, outputs=output_features)

# Sample random points
sample_df = domain1.inputs.sample(50)

# Generate one continuous objective (f_0), one categorical label (f_1), and one noisy continuous output (f_2)
cont_keys = input_features.get_keys(includes=ContinuousInput, excludes=CategoricalInput)
w = sample_df[cont_keys].apply(scale_inputs)

# Modified Levy function: every continuous input uses the same summand
# (the standard Levy function treats the last dimension differently)
sample_df["f_0"] = np.sin(np.pi * w["x_0"]) ** 2 + ((w - 1) ** 2 * (1 + 10 * np.sin(np.pi * w + 1) ** 2)).sum(axis=1)

# Label each point by the sum of the absolute values of its continuous inputs
abs_sum = sample_df[cont_keys].abs().sum(axis=1)
sample_df["f_1"] = "unacceptable"
sample_df.loc[(abs_sum >= 3.5) & (abs_sum <= 6.5), "f_1"] = "acceptable"
sample_df.loc[(abs_sum >= 4.5) & (abs_sum <= 5.5), "f_1"] = "ideal"

# f_2 is x_0 plus a small amount of uniform noise
sample_df["f_2"] = sample_df["x_0"] + 1e-2 * np.random.uniform(size=(len(sample_df),))

sample_df.head(20)

|  | x_0 | x_1 | x_2 | x_3 | x_4 | x_5 | f_0 | f_1 | f_2 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | -1.14752 | 0.352595 | 1.233487 | 0.764528 | 0.728527 | 0 | 2.6 | acceptable | -1.140822 |
| 1 | -0.541992 | 1.030834 | 0.544503 | -0.451754 | -1.379001 | 0 | 3.72221 | acceptable | -0.533687 |
| 2 | 1.620988 | 1.56977 | 0.51021 | 0.478663 | 0.577311 | 1 | 0.891793 | ideal | 1.626114 |
| 3 | 0.603489 | 1.560799 | 1.14727 | 0.860127 | -0.984688 | 0 | 1.315285 | ideal | 0.606293 |
| 4 | 1.115889 | 0.675806 | 1.156867 | 0.23884 | -0.270817 | 0 | 0.259161 | unacceptable | 1.12147 |
| 5 | 0.947684 | -1.348468 | 1.595431 | -1.910428 | 0.906305 | 0 | 7.925056 | unacceptable | 0.956881 |
| 6 | -1.5608 | 1.897495 | -0.659937 | -0.922824 | 1.988474 | 1 | 6.471322 | unacceptable | -1.554441 |
| 7 | 1.765355 | 0.208224 | 0.475951 | -0.412166 | 1.013625 | 0 | 1.024378 | acceptable | 1.774402 |
| 8 | 0.817979 | -1.398751 | -1.613844 | -1.626274 | 1.718574 | 0 | 10.27123 | unacceptable | 0.821963 |
| 9 | 1.624639 | 0.59126 | 1.05666 | 1.964813 | -0.609677 | 1 | 1.434253 | acceptable | 1.633391 |
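
The labeling rule used for `f_1` can be sketched on its own, without pandas. The helper name `label_point` is hypothetical; the thresholds are the ones from the cell above, and the two test points are rows 2 and 4 of the sample table (continuous inputs `x_0` to `x_4` only).

```python
from typing import Sequence


def label_point(x_continuous: Sequence[float]) -> str:
    """Label a point by the sum of the absolute values of its continuous inputs."""
    s = sum(abs(v) for v in x_continuous)
    if 4.5 <= s <= 5.5:
        return "ideal"          # inner band overrides 'acceptable'
    if 3.5 <= s <= 6.5:
        return "acceptable"
    return "unacceptable"


# Values copied from rows 2 and 4 of the sample table
row_2 = [1.620988, 1.56977, 0.51021, 0.478663, 0.577311]
row_4 = [1.115889, 0.675806, 1.156867, 0.23884, -0.270817]
label_2 = label_point(row_2)
label_4 = label_point(row_4)
print(label_2, label_4)
```

Checking the 'ideal' band first reproduces the effect of the sequential `.loc` assignments, where the later assignment overwrites the earlier one.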


## Evaluate the classification model performance (outside of the optimization procedure)

In [4]:
# Import packages
import bofire.surrogates.api as surrogates
from bofire.data_models.surrogates.api import ClassificationMLPEnsemble
from bofire.surrogates.diagnostics import ClassificationMetricsEnum

# Instantiate the surrogate model 
model = ClassificationMLPEnsemble(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_1")]), lr=0.01, n_epochs=100, hidden_layer_sizes=(20,10,))
surrogate = surrogates.map(model)

# Cross-validate the model on the classification data only
cv_df = sample_df.drop(["f_0", "f_2"], axis=1)
cv_df["valid_f_1"] = 1  # mark every f_1 observation as valid
cv = surrogate.cross_validate(cv_df, folds=5)




In [5]:
# Print results
cv[0].get_metrics(metrics=ClassificationMetricsEnum, combine_folds=True) # print training set performance

|  | ACCURACY | F1 |
|---|---|---|
| 0 | 0.77 | 0.77 |


In [6]:
cv[1].get_metrics(metrics=ClassificationMetricsEnum, combine_folds=True) # print test set performance

|  | ACCURACY | F1 |
|---|---|---|
| 0 | 0.46 | 0.46 |
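
In both tables the ACCURACY and F1 values coincide. This is consistent with a micro-averaged F1 score, which equals accuracy for single-label multiclass problems; whether BoFire computes F1 this way is an assumption here, not something the output confirms. The sketch below illustrates the identity on made-up toy labels with hypothetical helper functions.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of exact label matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]) -> float:
    """Micro-averaged F1: pool TP, FP, and FN over all classes before computing F1."""
    tp = fp = fn = 0
    for c in classes:
        for t, p in zip(y_true, y_pred):
            if p == c and t == c:
                tp += 1
            elif p == c and t != c:
                fp += 1
            elif p != c and t == c:
                fn += 1
    return 2 * tp / (2 * tp + fp + fn)


# Toy labels for illustration only (not taken from the notebook's data)
classes = ["unacceptable", "acceptable", "ideal"]
y_true = ["ideal", "acceptable", "unacceptable", "acceptable", "ideal"]
y_pred = ["acceptable", "acceptable", "unacceptable", "ideal", "ideal"]
acc = accuracy(y_true, y_pred)
f1 = micro_f1(y_true, y_pred, classes)
print(acc, f1)
```

Every misclassified sample counts once as a false positive (for the predicted class) and once as a false negative (for the true class), which is why the pooled F1 reduces to the accuracy.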


## Setup strategy and ask for candidates



In [7]:
from bofire.data_models.acquisition_functions.api import qEI
from bofire.data_models.strategies.api import SoboStrategy
from bofire.data_models.surrogates.api import BotorchSurrogates, ClassificationMLPEnsemble, MixedSingleTaskGPSurrogate
from bofire.data_models.domain.api import Outputs

strategy_data = SoboStrategy(domain=domain1, 
                             acquisition_function=qEI(), 
                             surrogate_specs=BotorchSurrogates(surrogates=
                                    [
                                        ClassificationMLPEnsemble(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_1")]), lr=0.01, n_epochs=100, hidden_layer_sizes=(20,10,)),
                                        MixedSingleTaskGPSurrogate(inputs=domain1.inputs, outputs=Outputs(features=[domain1.outputs.get_by_key("f_2")]))
                                    ]
                                )
                            )

strategy = strategies.map(strategy_data)

strategy.tell(sample_df)

In [8]:
candidates = strategy.ask(10)
candidates



|  | x_0 | x_1 | x_2 | x_3 | x_4 | x_5 | f_1_pred | f_1_sd | f_0_pred | f_2_pred | ... | f_1_acceptable_prob | f_1_ideal_prob | f_0_sd | f_2_sd | f_1_unacceptable_sd | f_1_acceptable_sd | f_1_ideal_sd | f_0_des | f_2_des | f_1_des |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.098607 | 1.274078 | 2.0 | 0.74904 | -0.156863 | 0 | acceptable | 0.0 | -0.598172 | 0.102933 | ... | 0.999879 | 2.749825e-07 | 0.830174 | 0.003029 | 0.00027 | 0.000271 | 6.091118e-07 | 0.598172 | 0.487136 | 0.999879 |
| 1 | 0.64699 | 0.60415 | -0.341737 | 1.307168 | -0.151535 | 0 | acceptable | 0.0 | -0.465931 | 0.650499 | ... | 0.97947 | 0.01731224 | 0.241746 | 0.002957 | 0.007131 | 0.045842 | 0.03871093 | 0.465931 | 0.419397 | 0.996782 |
| 2 | -0.268094 | 2.0 | 2.0 | 0.417189 | -0.160303 | 0 | acceptable | 0.0 | -0.095701 | -0.263037 | ... | 0.999936 | 9.78671e-08 | 1.129313 | 0.00317 | 0.000139 | 0.000139 | 2.182845e-07 | 0.095701 | 0.532832 | 0.999937 |
| 3 | -0.407627 | 2.0 | 2.0 | 2.0 | -0.147082 | 1 | acceptable | 0.0 | 1.523601 | -0.40338 | ... | 0.808412 | 1.612794e-07 | 1.637777 | 0.003359 | 0.41636 | 0.41636 | 2.241354e-07 | -1.523601 | 0.550252 | 0.808412 |
| 4 | 0.928282 | 0.98689 | -0.608701 | 1.760348 | -0.144849 | 0 | acceptable | 0.0 | -0.305604 | 0.931805 | ... | 0.967507 | 0.02830622 | 0.478486 | 0.00304 | 0.009351 | 0.072646 | 0.06329445 | 0.305604 | 0.385587 | 0.995813 |
| 5 | -0.1715 | 1.702729 | 2.0 | 1.147333 | -0.205059 | 0 | acceptable | 0.0 | -0.176775 | -0.167058 | ... | 0.999999 | 8.229916e-09 | 1.045623 | 0.003128 | 3e-06 | 3e-06 | 1.748655e-08 | 0.176775 | 0.52087 | 0.999999 |
| 6 | 0.348453 | 0.74903 | 2.0 | 0.805095 | -0.119394 | 0 | acceptable | 0.0 | -0.598377 | 0.352705 | ... | 0.999376 | 1.184046e-06 | 0.710733 | 0.002993 | 0.001392 | 0.001395 | 2.614105e-06 | 0.598377 | 0.456026 | 0.999377 |
| 7 | -2.0 | 0.457609 | 1.675461 | 1.207754 | 0.243209 | 0 | acceptable | 0.0 | 4.752548 | -1.99421 | ... | 0.949335 | 0.0001139145 | 0.807708 | 0.005353 | 0.113034 | 0.113289 | 0.0002547206 | -4.752548 | 0.730489 | 0.949449 |
| 8 | 0.126459 | 1.752512 | 2.0 | 0.269535 | -0.025944 | 0 | acceptable | 0.0 | -0.373769 | 0.130896 | ... | 0.99839 | 1.199743e-06 | 1.007492 | 0.003081 | 0.003593 | 0.003596 | 2.680733e-06 | 0.373769 | 0.483644 | 0.998391 |
| 9 | 0.106551 | 1.362876 | 2.0 | 1.558184 | -0.132843 | 0 | acceptable | 0.0 | -0.100465 | 0.110338 | ... | 0.999999 | 6.51114e-09 | 1.03977 | 0.003075 | 1e-06 | 1e-06 | 1.175186e-08 | 0.100465 | 0.486211 | 0.999999 |
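
The `*_des` columns can be cross-checked against the predictions in the same row. The formulas below are inferred from the displayed numbers and the objective parameters set earlier (`w=1.`, `tp=0.0`, `steepness=0.5`); they are assumptions for illustration, not taken from BoFire's source. Values are copied from row 0 of the candidate table, so they carry display rounding.

```python
import math

# Row 0 of the candidate table (rounded as displayed)
row = {
    "f_0_pred": -0.598172,
    "f_2_pred": 0.102933,
    "f_1_acceptable_prob": 0.999879,
    "f_1_ideal_prob": 2.749825e-07,
    "f_0_des": 0.598172,
    "f_2_des": 0.487136,
    "f_1_des": 0.999879,
}

# Assumed: minimizing with weight 1 negates the prediction
f_0_des = -1.0 * row["f_0_pred"]

# Assumed: decreasing sigmoid with turning point tp=0.0 and steepness=0.5
f_2_des = 1.0 / (1.0 + math.exp(0.5 * (row["f_2_pred"] - 0.0)))

# g(x)^T a with a = (0, 1, 1): probability mass on 'acceptable' and 'ideal'
f_1_des = row["f_1_acceptable_prob"] + row["f_1_ideal_prob"]

print(f_0_des, f_2_des, f_1_des)
```

Under these assumptions, `f_1_des` is the scalar $g_{\theta}(x)^\top a$ from the introduction with the desirability mask `[False, True, True]`.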


## Check classification of proposed candidates

Apply the labeling logic from above to the proposed candidates and compare the resulting labels with the surrogate's predictions.

In [9]:
# Append the true labels to the candidates
cont_keys = input_features.get_keys(includes=ContinuousInput, excludes=CategoricalInput)
abs_sum = candidates[cont_keys].abs().sum(axis=1)
candidates["f_1_true"] = "unacceptable"
candidates.loc[(abs_sum >= 3.5) & (abs_sum <= 6.5), "f_1_true"] = "acceptable"
candidates.loc[(abs_sum >= 4.5) & (abs_sum <= 5.5), "f_1_true"] = "ideal"

In [10]:
# Print results
candidates[["f_1_pred", "f_1_true"]]

|  | f_1_pred | f_1_true |
|---|---|---|
| 0 | acceptable | acceptable |
| 1 | acceptable | unacceptable |
| 2 | acceptable | ideal |
| 3 | acceptable | unacceptable |
| 4 | acceptable | acceptable |
| 5 | acceptable | ideal |
| 6 | acceptable | acceptable |
| 7 | acceptable | acceptable |
| 8 | acceptable | acceptable |
| 9 | acceptable | ideal |
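
The comparison can be summarized with two agreement rates, sketched below on the labels copied from the table above. The variable names are chosen here for illustration. The first rate counts exact label matches; the second only asks whether prediction and truth agree on desirability, using the `[False, True, True]` mask from the objective.

```python
# Labels copied from the f_1_pred / f_1_true table
f_1_pred = ["acceptable"] * 10
f_1_true = [
    "acceptable", "unacceptable", "ideal", "unacceptable", "acceptable",
    "ideal", "acceptable", "acceptable", "acceptable", "ideal",
]

# Exact three-class agreement
exact_match = sum(p == t for p, t in zip(f_1_pred, f_1_true)) / len(f_1_true)

# Agreement on the binary desirability implied by the objective
desirable = {"acceptable", "ideal"}
constraint_match = sum(
    (p in desirable) == (t in desirable) for p, t in zip(f_1_pred, f_1_true)
) / len(f_1_true)

print(exact_match, constraint_match)
```

For the constrained objective only the desirable/undesirable split matters, so confusing 'acceptable' with 'ideal' does not change the constraint value.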
