### In this notebook, we will give a tutorial on how to use `exp_loop` for running experiments in `BoRisk`.

For this tutorial, we will use the 2D `Branin` function, which is available in
`BoRisk/test_functions/function_picker.py`. The `Branin` function will stand in for `F(x, w)`, with the first input dimension being `x` and the second being `w`.
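
To make the `F(x, w)` convention concrete, here is a minimal standalone sketch of the standard Branin function with both inputs rescaled from the unit square. The name `branin_unit_square` and the exact rescaling (`x` to `[-5, 10]`, `w` to `[0, 15]`) are assumptions made for illustration; the `Branin` class shipped with `BoRisk` may scale its inputs differently.

```python
import math


def branin_unit_square(x: float, w: float) -> float:
    """Standard Branin function with inputs rescaled from [0, 1]^2.

    Assumption: x maps to x1 in [-5, 10] and w maps to x2 in [0, 15].
    """
    x1 = 15.0 * x - 5.0
    x2 = 15.0 * w
    a, b, c = 1.0, 5.1 / (4.0 * math.pi ** 2), 5.0 / math.pi
    r, s, t = 6.0, 10.0, 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s


# One of the global minimizers of Branin is (x1, x2) = (pi, 2.275).
x_star = (math.pi + 5.0) / 15.0
w_star = 2.275 / 15.0
value_at_minimizer = branin_unit_square(x_star, w_star)
print(value_at_minimizer)  # approximately 0.3979, the global minimum value
```

In the risk-averse setting, `x` is the decision we control and `w` is the random environmental variable, so the quantity of interest is a risk measure of `F(x, W)` over the distribution of `W` rather than a single function value.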

To use custom functions, implement your function as a subclass of
`SyntheticTestFunction`, and import it into `function_picker.py`. See
`BoRisk/test_functions/` for examples.

`exp_loop` takes a number of keyword arguments, which specify the function as well as
the experiment settings. The full list of arguments can be found in the docstrings of
`exp_loop` and `Experiment`.

In [1]:
import torch
from BoRisk import exp_loop

args_dict = dict()

We will specify the details of our experiment by filling in `args_dict`.

For `function_name` we will use `branin`, which maps to the `Branin` function in
`function_picker.py`. The seed is useful for synchronizing the initial samples when
benchmarking different algorithms; any dummy value will do here.

The `filename` is the name of the file in which to save the experiment output.
`exp_loop` typically appends a suffix to it depending on the values of some of the
other arguments. This helps avoid user error when the same `filename` is passed for two
different experiments. If an output file with the same name already exists, `exp_loop`
will read the existing output, reconstruct the experiment state, and continue the
experiment for the remaining iterations. This is helpful when an experiment gets killed
due to a numerical or memory error, or when we want to add iterations. The output files
are placed in `exp_output/` by default.
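
The resume behavior described above can be illustrated with a small standalone sketch. This is not the `exp_loop` implementation: `run_with_resume`, the pickle file format, and the placeholder "iteration" are all hypothetical and only show the general pattern of loading an existing output, counting the completed iterations, and continuing from there.

```python
import os
import pickle
import tempfile


def run_with_resume(path: str, iterations: int) -> dict:
    """Run `iterations` steps in total, reusing any results saved at `path`."""
    output = {}
    if os.path.exists(path):
        with open(path, "rb") as f:
            output = pickle.load(f)
    # Integer keys mark the iterations that have already been completed.
    start = sum(1 for key in output if isinstance(key, int))
    for i in range(start, iterations):
        output[i] = {"result": i ** 2}  # placeholder for one BO iteration
        # This sketch saves after every iteration so a crash loses at most one step.
        with open(path, "wb") as f:
            pickle.dump(output, f)
    return output


with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint = os.path.join(tmp_dir, "demo_experiment.pkl")
    first = run_with_resume(checkpoint, iterations=5)    # runs iterations 0..4
    second = run_with_resume(checkpoint, iterations=20)  # runs only 5..19
    completed_first = len(first)
    completed_second = len(second)
```

Calling the function a second time with a larger `iterations` only executes the missing steps, which is the same usage pattern applied later in this tutorial when the experiment is extended from 5 to 20 iterations.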

For starters, we will set `iterations=5`, i.e., run only 5 BO iterations.

In [2]:
args_dict["function_name"] = "branin"
args_dict["seed"] = 0  # dummy
args_dict["filename"] = "tutorial_branin"
args_dict["iterations"] = 5

The `Experiment` class accepts many arguments that control the details of the
experiment. We will specify some of these to customize our experiment. The defaults can
be found in the docstring.

We will leave `dim_w = 1` as the default since we are using a 1D `W`. Note that `w` is
always assumed to be the last `dim_w` dimensions of the function input, i.e.,
`F(X) = F(x, w)` where `w = X[..., -dim_w:]`.

We can specify the observation noise level by setting `noise_std`.

We will reduce `num_fantasies` to `4` to speed up the tutorial. Similarly, we
will specify small values for the optimization options, including `num_restarts` and
`raw_multiplier`.

In [3]:
args_dict["dim_w"] = 1
args_dict["noise_std"] = 1.0
args_dict["num_fantasies"] = 4
args_dict["num_restarts"] = 10
args_dict["raw_multiplier"] = 10

We can specify the risk level `alpha`, which is set to `0.7` by default. Let's use
`0.5` for this example. To completely specify the risk measure, we can set the `CVaR`
argument, which is `False` by default. Let's set this to `True` so that our risk
measure is CVaR at risk level `alpha=0.5`.
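
As a worked example of what these two risk measures compute, the sketch below evaluates VaR and CVaR for a small discrete distribution. It assumes that larger function values are worse, takes VaR as the smallest value whose cumulative probability reaches `alpha`, and uses the Rockafellar-Uryasev form of CVaR. The helper `var_cvar` is hypothetical, and `BoRisk` may handle ties and probability atoms differently.

```python
def var_cvar(values, weights, alpha):
    """Return (VaR, CVaR) at level `alpha` (< 1) for a discrete distribution."""
    pairs = sorted(zip(values, weights))
    cumulative = 0.0
    var = pairs[-1][0]
    for value, weight in pairs:
        cumulative += weight
        # VaR is the smallest value whose cumulative probability reaches alpha.
        # The small tolerance guards against floating-point round-off.
        if cumulative >= alpha - 1e-12:
            var = value
            break
    # Rockafellar-Uryasev form: CVaR = VaR + E[(F - VaR)^+] / (1 - alpha).
    excess = sum(weight * max(value - var, 0.0) for value, weight in pairs)
    return var, var + excess / (1.0 - alpha)


values = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative function values F(x, w_i)
weights = [0.1, 0.1, 0.3, 0.3, 0.2]  # probability of each w_i
var_half, cvar_half = var_cvar(values, weights, alpha=0.5)
print(var_half, cvar_half)
```

Here the cumulative probabilities are 0.1, 0.2, 0.5, 0.8, 1.0, so VaR at `alpha=0.5` is 3.0, and CVaR is the average of the worst half of the distribution, (0.3 * 4 + 0.2 * 5) / 0.5 = 4.4.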

The random sampling method `rho-random` presented in the paper can be enabled by
specifying `random_sampling=True`; we will not use it here.

The `Experiment` class supports `dtype` and `device` arguments, which have the same
meaning as in `PyTorch`. For this experiment, we will use `cuda` if it is available.

In [4]:
args_dict["alpha"] = 0.5
args_dict["device"] = "cuda" if torch.cuda.is_available() else "cpu"

Several arguments specify which BoRisk acquisition function to use. The `apx` argument,
which is `True` by default, corresponds to the `rhoKGapx` acquisition function. This is
the recommended first choice and will be used here. Setting `apx=False` would instead
use the `rhoKG` acquisition function, which is significantly more expensive to optimize.

Other alternatives include `apx_cvar` and `tts_apx_cvar`, which are two versions of a
`CVaR`-specific approximation that is not presented in the paper. Additionally, we have
`one_shot`, which, paired with `apx=False`, uses the one-shot optimization approach to
optimize `rhoKG`. This speeds up the optimization but reduces the algorithm's
performance. One last argument is `tts_frequency`, which specifies the frequency of
two-time-scale optimization as explained in the paper.

The remaining arguments we will specify relate to the distribution of the random
variable `W`. By default, `W` is assumed to be continuous uniform in `[0, 1]^dim_w`. If
 using other continuous distributions, we recommend performing the inverse CDF
 transformation within the test function itself. The `Experiment` class does not have
 built-in support for non-uniform continuous distributions. When `W` is continuous,
  `fix_samples=True` is used to fix the samples `w_{1:L}` (see Section 4 of the paper)
  for the SAA approach. The `num_samples` argument is used to specify the number of
  samples `L` here.
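
The inverse CDF transformation mentioned above can be sketched as follows. The example assumes an exponentially distributed `W` and a made-up objective; `exponential_inverse_cdf`, `base_function`, and `wrapped_function` are hypothetical names. In practice the transformation would live inside the `SyntheticTestFunction` subclass.

```python
import math


def exponential_inverse_cdf(u: float, rate: float = 1.0) -> float:
    """Map u in [0, 1) to an Exponential(rate) value via the inverse CDF."""
    return -math.log(1.0 - u) / rate


def base_function(x: float, w: float) -> float:
    """Hypothetical objective defined in terms of the original variable w."""
    return (x - 0.5) ** 2 + x * w


def wrapped_function(x: float, u: float) -> float:
    """Objective as seen by the optimizer, where u is uniform on [0, 1)."""
    return base_function(x, exponential_inverse_cdf(u))


median_w = exponential_inverse_cdf(0.5)  # ln(2), the median of Exponential(1)
value = wrapped_function(0.5, 0.5)
```

With this wrapper, the experiment still treats the last input dimension as uniform on the unit interval, while the objective effectively sees an exponentially distributed `w`. Note that `u = 1` maps to infinity for unbounded distributions, so the upper end of the interval may need to be clipped.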

If using a discrete distribution, the domain of `W`, scaled down to `[0, 1]^dim_w`, can
be specified using the `w_samples` argument. If the distribution is non-uniform, the
probability mass of each sample of `W` can be specified using the `weights` argument.
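
As a small illustration of preparing these two arguments, the sketch below rescales a one-dimensional discrete domain to `[0, 1]` and turns observed counts into probabilities. The raw values, bounds, and counts are made up for the example, and both helper functions are hypothetical.

```python
def scale_to_unit_interval(raw_values, lower, upper):
    """Linearly map values from [lower, upper] to [0, 1]."""
    return [(v - lower) / (upper - lower) for v in raw_values]


def normalize(counts):
    """Turn non-negative counts into probabilities that sum to one."""
    total = sum(counts)
    return [c / total for c in counts]


raw_w = [10.0, 14.0, 20.0, 26.0, 30.0]  # original support of W (illustrative)
counts = [1, 1, 3, 3, 2]                # how often each value was observed
scaled_w = scale_to_unit_interval(raw_w, lower=10.0, upper=30.0)
probabilities = normalize(counts)
print(scaled_w)
print(probabilities)
```

The resulting lists can then be converted to tensors and passed as `w_samples` (reshaped to have `dim_w` columns) and `weights`.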

For this example, let's use `w_samples = [0.0, 0.2, 0.5, 0.8, 1.0]` with
`weights = [0.1, 0.1, 0.3, 0.3, 0.2]`.

With `w_samples` specified, we can also specify the samples to use for initializing the
 GP model.

In [5]:
args_dict["w_samples"] = torch.tensor([0.0, 0.2, 0.5, 0.8, 1.0]).reshape(-1, 1)
args_dict["weights"] = torch.tensor([0.1, 0.1, 0.3, 0.3, 0.2])
args_dict["init_samples"] = torch.tensor(
    [
        [0.3, 0.0],
        [0.5, 0.5],
        [0.75, 1.0],
        [0.15, 0.8],
        [0.5, 0.8],
        [0.4, 0.2],
    ]
)

We are now ready to run the experiment. The next cell will initialize the experiment
and perform 5 iterations of BO.

In [6]:
args_dict["verbose"] = True

%time exp = exp_loop(**args_dict)


Starting iteration 0
Current best solution, value:  tensor([0.1500]) tensor(3.4822)
Candidate:  tensor([[0., 0.]])  KG value:  tensor(-23.4025)
Iteration completed in 0.4641389846801758
iter time: 0.619044303894043 
Starting iteration 1
Current best solution, value:  tensor([0.3000]) tensor(20.5553)
Candidate:  tensor([[0.5865, 0.5000]])  KG value:  tensor(-7.7518)
Iteration completed in 0.7036099433898926
iter time: 0.8880550861358643 
Starting iteration 2
Current best solution, value:  tensor([0.3000]) tensor(15.2603)
Candidate:  tensor([[0.5605, 0.2000]])  KG value:  tensor(-14.4466)
Iteration completed in 1.3589019775390625
iter time: 1.4711759090423584 
Starting iteration 3
Current best solution, value:  tensor([0.1500]) tensor(10.7187)
Candidate:  tensor([[0.9169, 0.5000]])  KG value:  tensor(0.4796)
Iteration completed in 1.485734224319458
iter time: 1.6614959239959717 
Starting iteration 4
Current best solution, value:  tensor([0.1500]) tensor(14.0855)
Candidate:  tensor([[1.00

The output file for this experiment can be found at
`exp_output/tutorial_branin_a=0.5_cont_weights.pt`. It contains more information than
we should ever need. Let's load it and verify that we indeed have the output for each
of the 5 iterations.

In [7]:
output_file = "exp_output/tutorial_branin_a=0.5_cont_weights.pt"
output = torch.load(output_file)
print(output.keys())
print(output[0].keys())

dict_keys([0, 1, 2, 3, 4, 'final_solution', 'final_value'])
dict_keys(['state_dict', 'train_Y', 'train_X', 'current_best_sol', 'current_best_value', 'acqf_value', 'candidate', 'function', 'dim', 'dim_w', 'num_fantasies', 'num_restarts', 'raw_multiplier', 'alpha', 'q', 'num_repetitions', 'verbose', 'maxiter', 'CVaR', 'random_sampling', 'expectation', 'dtype', 'device', 'apx', 'apx_cvar', 'tts_apx_cvar', 'disc', 'tts_frequency', 'num_inner_restarts', 'inner_raw_multiplier', 'weights', 'fix_samples', 'one_shot', 'low_fantasies', 'random_w', 'noise_std', 'w_samples', 'init_samples', 'dim_x', 'num_samples', 'fixed_samples', 'passed', 'fit_count'])


We will now run the experiment for 15 more iterations. This is purely for demonstrating
 the warm-start / continue functionality built into `exp_loop`. This proved very useful
  when we were running rather expensive experiments on the cluster.

In [8]:
args_dict["iterations"] = 20
%time exp = exp_loop(**args_dict)

output_file = "exp_output/tutorial_branin_a=0.5_cont_weights.pt"
output = torch.load(output_file)
print(output.keys())

Starting iteration 5
Current best solution, value:  tensor([0.1500]) tensor(14.8123)
Candidate:  tensor([[0.2944, 0.5000]])  KG value:  tensor(-11.0697)
Iteration completed in 1.0468389987945557
iter time: 1.151777982711792 
Starting iteration 6
Current best solution, value:  tensor([0.1500]) tensor(12.3591)
Candidate:  tensor([[0.9743, 0.0000]])  KG value:  tensor(-11.3711)
Iteration completed in 1.722545862197876
iter time: 1.8498108386993408 
Starting iteration 7
Current best solution, value:  tensor([0.2944]) tensor(10.6478)
Candidate:  tensor([[0.8973, 1.0000]])  KG value:  tensor(-12.3192)
Iteration completed in 1.6161069869995117
iter time: 1.817112922668457 
Starting iteration 8
Current best solution, value:  tensor([0.1500]) tensor(7.6724)
Candidate:  tensor([[0.2543, 0.2000]])  KG value:  tensor(-16.1955)
Iteration completed in 2.8807051181793213
iter time: 3.071120023727417 
Starting iteration 9
Current best solution, value:  tensor([0.1500]) tensor(10.8443)
Candidate:  tens

Here, we chose to use the `verbose` argument to print out the best values found so far.
 The output contains the best solution the algorithm found at the beginning of each
 iteration, as well as at the end of the experiment. These values can easily be
 extracted using a for loop, and the solution can be evaluated with a chosen objective.
  A not-so-friendly example of how we did this for reading batches of experiment
  outputs can be found in `helper_fns/ex_output_read.py`.
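
A minimal sketch of such a loop is given below. It relies only on the key layout printed earlier (integer iteration keys plus `final_solution` and `final_value`) and uses a mock dictionary with made-up floats in place of the loaded file, which stores tensors. The helper `best_value_trace` is hypothetical.

```python
def best_value_trace(output: dict) -> list:
    """Collect `current_best_value` for each iteration, in iteration order."""
    iteration_keys = sorted(k for k in output if isinstance(k, int))
    return [output[k]["current_best_value"] for k in iteration_keys]


# Stand-in for the loaded output; the numbers are made up for illustration.
mock_output = {
    0: {"current_best_sol": 0.9, "current_best_value": 3.0},
    1: {"current_best_sol": 0.4, "current_best_value": 2.0},
    2: {"current_best_sol": 0.2, "current_best_value": 1.5},
    "final_solution": 0.2,
    "final_value": 1.2,
}
trace = best_value_trace(mock_output)
print(trace)
print(mock_output["final_solution"], mock_output["final_value"])
```

The same function can be applied to the dictionary returned by `torch.load`, after which each reported solution can be evaluated with the chosen objective.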