In [1]:
from rich.pretty import pprint

## The DACBO Benchmark
Let's take a look at DACBO. This is a benchmark for controlling hyperparameters of a Bayesian Optimization (BO) loop. First, let's make an instance of the benchmark:

In [None]:
from dacbench.benchmarks import DACBOBenchmark
bench = DACBOBenchmark()

Now let's take a look at the elements of the config in this benchmark:

In [3]:
pprint(list(bench.config.keys()))

The 'benchmark_info' tells us some things about this benchmark already:

In [4]:
pprint(bench.config["benchmark_info"])

The reward in this task has the following range:

In [5]:
pprint(bench.config["reward_range"])

The config also contains some standard keys like the seed, instance set, list of observation keys, action space class, HP search ranges, etc. By default, the agent controls the parameter $\alpha\in [0,1]$ in the Weighted Expected Improvement (WEI) acquisition function of the BO loop. For further configuration details, please refer to the dacboenv package.
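To make the role of $\alpha$ concrete, here is a minimal sketch of a weighted EI for minimization, using one common formulation: $\mathrm{WEI}(x) = \alpha\,(f_{\min} - \mu(x))\,\Phi(z) + (1-\alpha)\,\sigma(x)\,\phi(z)$ with $z = (f_{\min} - \mu(x))/\sigma(x)$. The function name and the choice of which term $\alpha$ weights are assumptions for illustration; the implementation used by dacboenv may follow a different convention.

```python
import math


def weighted_expected_improvement(mu, sigma, f_best, alpha):
    """Illustrative WEI for minimization (assumed convention, not the dacboenv code).

    alpha weights the improvement (exploitation) term,
    (1 - alpha) weights the uncertainty (exploration) term.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: only the improvement term can be non-zero.
        return alpha * max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))           # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)    # standard normal PDF
    return alpha * (f_best - mu) * cdf + (1.0 - alpha) * sigma * pdf


# Same surrogate prediction, scored with different weights.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, weighted_expected_improvement(mu=0.2, sigma=1.0, f_best=0.0, alpha=alpha))
```

Under this convention, $\alpha = 0.5$ gives standard EI scaled by one half, $\alpha = 0$ keeps only the uncertainty term, and $\alpha = 1$ keeps only the improvement term.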

## DACBO Instances
Now let's take a look at what a DACBO instance looks like. To do so, we first read the default instance set and look at its only element:

In [6]:
pprint(bench.config["instance_set_path"])
bench.read_instance_set()
pprint(bench.config.instance_set[0])

{'task_ids': ['bbob/2/1/0', 'bbob/2/20/0'], 'inner_seeds': [1, 2, 3]}


Because instance selection is handled internally by the DACBO environment, we only provide a list of target functions, as offered by CARP-S, and a list of inner seeds. A DACBO instance consists of a single inner seed and a target function. By default, the cross product of the selected inner seeds and target functions is evaluated in a round-robin manner.
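The cross product and round-robin order can be sketched in plain Python as follows. This only illustrates the idea on the instance set printed above; the pair layout `(inner_seed, task_id)` and the iteration order are assumptions, and the environment's internal ordering may differ.

```python
from itertools import cycle, islice, product

# The default instance set as printed above.
instance_set = {"task_ids": ["bbob/2/1/0", "bbob/2/20/0"], "inner_seeds": [1, 2, 3]}

# Each instance pairs one inner seed with one target function.
instances = list(product(instance_set["inner_seeds"], instance_set["task_ids"]))

# Round-robin: visit the pairs in turn and wrap around after the last one.
schedule = list(islice(cycle(instances), 8))

for instance in schedule:
    print(instance)
```

With two target functions and three inner seeds there are six distinct instances, so the seventh entry of the schedule repeats the first.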

## Running DACBO
Lastly, let's look at the DACBO benchmark in action. Because some observations rely on reference incumbent values, we first run SMAC to create a baseline. Additionally, the first BO run's initial design is evaluated upon resetting.

In [7]:
env = bench.get_environment()
pprint(env.reset())

[INFO][abstract_initial_design.py:139] Using 16 initial design configurations and 0 additional configurations.
[INFO][abstract_intensifier.py:307] Using only one seed for deterministic scenario.
[INFO][abstract_intensifier.py:517] Added config e9bc68 as new incumbent because there are no incumbents yet.
[INFO][abstract_intensifier.py:596] Added config 268bf5 and rejected config e9bc68 as incumbent because it is not better than the incumbents on 1 instances: 
[INFO][abstract_intensifier.py:596] Added config a4ff80 and rejected config 268bf5 as incumbent because it is not better than the incumbents on 1 instances: 
[INFO][abstract_intensifier.py:596] Added config 208457 and rejected config a4ff80 as incumbent because it is not better than the incumbents on 1 instances: 
[INFO][abstract_intensifier.py:596] Added config c81044 and rejected config 208457 as incumbent because it is not better than the incumbents on 1 instances: 
[INFO][abstract_intensifier.py:596] Added config 02dd32 and rej

If we take a step, we see the updated state:

In [8]:
action = env.action_space.sample()
state, reward, terminated, truncated, info = env.step(action)
pprint(state)

[INFO][dacboenv.py:359] Step: 1, instance: (1, 'bbob/2/20/0') Reward: -428.27284142743866, terminated: False, truncated: False, info: {}


The step also returns a reward and a truncation signal. The truncation signal is set to true once a single BO run has finished; the next instance is then selected internally.

In [9]:
pprint(f"Reward {reward}")
pprint(f"Truncated {truncated}")
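Putting the pieces together, a full BO run can be driven by stepping until the truncation signal fires. The sketch below assumes only the five-tuple `step` return shown above and an `action_space.sample()` method; `run_until_truncated` and `ToyEnv` are hypothetical names, and `ToyEnv` is a stand-in so the snippet runs without DACBO installed.

```python
import random


def run_until_truncated(env, max_steps=1000):
    """Take random actions until the env ends the current run; return the rewards."""
    rewards = []
    for _ in range(max_steps):
        action = env.action_space.sample()
        _state, reward, terminated, truncated, _info = env.step(action)
        rewards.append(reward)
        if terminated or truncated:
            break
    return rewards


class ToyEnv:
    """Hypothetical stand-in for the DACBO environment: truncates after `horizon` steps."""

    class _Space:
        def sample(self):
            # Mimics sampling alpha from [0, 1).
            return random.random()

    def __init__(self, horizon=5):
        self.action_space = self._Space()
        self.horizon = horizon
        self.t = 0

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon
        return {"step": self.t}, -float(self.t), False, truncated, {}


rewards = run_until_truncated(ToyEnv(horizon=5))
print(f"Steps taken: {len(rewards)}, last reward: {rewards[-1]}")
```

With the real benchmark, the same function could be called on the `env` from `bench.get_environment()` after `env.reset()`; the random policy is only a placeholder for an actual agent.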