# HyperSHAP: Creating an ExplanationTask

This example walks through several ways to create an explanation task for your specific hyperparameter optimization (HPO) problem.

## Set Up a Mock Environment

First, we set up a basic environment that is assumed to already exist when working with HyperSHAP: a configuration space and a black-box evaluation function.

In [3]:
import math

from ConfigSpace import Configuration, ConfigurationSpace

# Configuration space with a float, integer, and categorical hyperparameter
cs = ConfigurationSpace(
    name="myspace",
    space={
        "a": (0.1, 1.5),  # UniformFloat
        "b": (2, 10),  # UniformInt
        "c": ["X", "Y"],  # Categorical
    },
)


# Some evaluation function that assesses the performance of some configuration
def eval_fun(config: Configuration) -> float:
    if config["c"] == "X":
        return math.sin(config["a"]) + config["b"]
    elif config["c"] == "Y":
        return math.cos(config["a"] * config["b"]) + 1.5
    else:
        return 0.0  # fallback for an unknown category; keeps the return type a float


cs.seed(42)  # set some random seed for reproducibility
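
Before handing the function to HyperSHAP, it can help to confirm that both branches behave as expected. The sketch below is a standalone check that re-declares the evaluation function and, as an assumption made for illustration, passes plain dictionaries in place of `Configuration` objects, since the function only relies on key lookup.

```python
import math


def eval_fun(config) -> float:
    """Same logic as the evaluation function above; `config` only needs key lookup."""
    if config["c"] == "X":
        return math.sin(config["a"]) + config["b"]
    if config["c"] == "Y":
        return math.cos(config["a"] * config["b"]) + 1.5
    return 0.0


# Plain dicts stand in for Configuration objects (an assumption for this sketch).
score_x = eval_fun({"a": 0.5, "b": 3, "c": "X"})  # sin(0.5) + 3
score_y = eval_fun({"a": 0.5, "b": 4, "c": "Y"})  # cos(2.0) + 1.5

print(f"c=X: {score_x:.4f}")
print(f"c=Y: {score_y:.4f}")
```

Note that the two branches live on different scales: the `"X"` branch grows with `b`, while the `"Y"` branch stays within 0.5 and 2.5.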

## Create an ExplanationTask from a Black-Box Function

Perhaps the easiest way to create an explanation task is to let HyperSHAP take care of everything: we provide only the configuration space and the black-box function. Optionally, we can also specify how many times the black-box function may be sampled and which type of regressor to use.

In [4]:
from sklearn.ensemble import RandomForestRegressor

from hypershap.task import ExplanationTask

# creating the ExplanationTask
et = ExplanationTask.from_function(
    config_space=cs,
    function=eval_fun,
    n_samples=1000,  # optional
    base_model=RandomForestRegressor(),  # optional
)

## Create an ExplanationTask from Evaluation Data

If you already have evaluated configurations that should serve as the backbone for explaining the impact of hyperparameters on performance, you can provide that data directly to create an ExplanationTask. For this example, we first sample a number of random configurations and evaluate them. Then, we pass this data to HyperSHAP to create an ExplanationTask. Optionally, we can again choose the type of base model used to build the surrogate model.

In [5]:
from sklearn.ensemble import RandomForestRegressor

from hypershap.task import ExplanationTask

# mockup data
configuration_list = cs.sample_configuration(size=1_000)
performances = [eval_fun(configuration) for configuration in configuration_list]
data = list(zip(configuration_list, performances))

# creating the ExplanationTask
et = ExplanationTask.from_data(
    config_space=cs,
    data=data,
    base_model=RandomForestRegressor(),  # optional
)
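
Python's `zip` silently truncates to the shorter input, so a length mismatch between configurations and performances would go unnoticed when building the data list. The following is a minimal sketch of a guard, assuming the `(configuration, performance)` pair layout used above. `pair_data` is a hypothetical helper and not part of HyperSHAP, and plain dictionaries stand in for `Configuration` objects.

```python
import math
from typing import Any, Sequence


def pair_data(configs: Sequence[Any], performances: Sequence[float]) -> list[tuple[Any, float]]:
    """Hypothetical helper: pair configurations with scores and reject bad input."""
    if len(configs) != len(performances):
        raise ValueError("configs and performances must have the same length")
    for value in performances:
        if not math.isfinite(value):
            raise ValueError(f"non-finite performance value: {value!r}")
    return list(zip(configs, performances))


# Dicts stand in for Configuration objects (an assumption for this sketch).
configs = [{"a": 0.5, "b": 3, "c": "X"}, {"a": 1.0, "b": 2, "c": "Y"}]
checked_data = pair_data(configs, [3.48, 1.08])
print(len(checked_data))
```

Rejecting NaN or infinite scores early is a design choice of this sketch; whether the surrogate model can cope with such values depends on the regressor.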

## Create an ExplanationTask from an Existing Model

If you already have a model that predicts the performance of configurations, you can use it directly. Make sure the model was trained on the encoding that the ConfigSpace package provides via `get_array()`. This is crucial for the model to work properly with the rest of the HyperSHAP framework.

In [6]:
import numpy as np
from sklearn.ensemble import RandomForestRegressor

from hypershap.task import ExplanationTask

# mockup surrogate model
X = np.array([configuration.get_array() for configuration in configuration_list])
y = np.array(performances)
model = RandomForestRegressor()
model.fit(X, y)

# creating the ExplanationTask
et = ExplanationTask.from_base_model(config_space=cs, base_model=model)
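
To see why the encoding matters, consider how a vector encoding can differ from the raw hyperparameter values. The functions below are hypothetical and only approximate the idea (floats scaled to the unit interval, categoricals replaced by an index). They are an assumption for illustration, not ConfigSpace's actual implementation, and the real transformation may differ, for instance for integer hyperparameters.

```python
def encode_float(value: float, lower: float, upper: float) -> float:
    """Illustrative min-max scaling of a float hyperparameter to [0, 1]."""
    return (value - lower) / (upper - lower)


def encode_categorical(value: str, choices: list[str]) -> float:
    """Illustrative encoding of a categorical value as the index of its choice."""
    return float(choices.index(value))


# Raw values for hyperparameters "a" and "c" from the configuration space above.
raw = {"a": 0.8, "c": "Y"}
encoded = [
    encode_float(raw["a"], lower=0.1, upper=1.5),
    encode_categorical(raw["c"], choices=["X", "Y"]),
]
print(encoded)
```

A model trained on raw values such as `0.8` and `"Y"` would receive differently scaled inputs at prediction time, which is why features should always be taken from `get_array()` rather than re-implemented by hand.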