<a href="https://colab.research.google.com/github/HiskeOverweg/bo_intro/blob/master/bo_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Bayesian optimization

You can run a cell by clicking on it and pressing Shift+Enter. The cell below installs the required packages.

In [1]:
!pip install git+https://github.com/HiskeOverweg/bo_intro.git --upgrade
!pip install botorch


In [2]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import botorch
import gpytorch
import bo_intro.datasets
from bo_intro.run_bayesian_optimization import run_bo_experiment

## Finding the maximum of the sine function on the interval $[0, 2\pi]$

We can run Bayesian optimization with 1 random starting point and 20 iterations on the sine function as follows:

In [None]:
config = {'iterations': 20, 'initial_observations': 1, 'dataset': 'sine', 'acquisition_function': 'ei', 'noise': 0}
x, y = run_bo_experiment(config, print_progress=True, seed=0)

**Exercise 1** Plot the sine function together with the data points x, y queried by the Bayesian optimization algorithm.

Let's fit a Gaussian process to the complete dataset. We can then plot its posterior mean together with the confidence bound (two standard deviations on either side of the mean).

In [None]:
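def plot_gaussian_process(x, y):
  """Fit a Gaussian process to (x, y) and plot its posterior mean and confidence bound."""
  dataset = bo_intro.datasets.Sine()
  # Scale the inputs to the range the model is fitted on (x_test below spans [0, 1])
  x_scaled = dataset.scale(torch.from_numpy(x))

  # Fit the hyperparameters by maximizing the exact marginal log likelihood
  gaussian_process = botorch.models.SingleTaskGP(x_scaled, torch.from_numpy(y))
  mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood=gaussian_process.likelihood, model=gaussian_process)
  # Note: newer BoTorch releases replace fit_gpytorch_model with fit_gpytorch_mll
  botorch.fit.fit_gpytorch_model(mll)

  # Evaluate the posterior on a grid of 20 points in the scaled input space
  x_test = torch.linspace(0, 1, 20, dtype=torch.double).unsqueeze(dim=1)
  posterior = gaussian_process.posterior(x_test)
  # confidence_region() returns the mean minus and plus 2 standard deviations
  lower, upper = posterior.mvn.confidence_region()

  # Plot the mean, the observations and the confidence bound in the original units
  plt.plot(dataset.rescale(x_test), posterior.mean.detach())
  plt.plot(x, y, 'o')
  plt.fill_between(dataset.rescale(x_test).squeeze(), lower.detach(), upper.detach(), alpha=0.5)
  plt.xlim([0, 2*np.pi])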

plot_gaussian_process(x, y)

**Exercise 2** Do you understand the shape of the confidence bound?

**Exercise 3** Try adding noise to the observations by changing the 'noise' value in the config dictionary. This value is the standard deviation of the Gaussian-distributed noise. Plot the resulting x and y values. Is the position of the maximum close to the expected maximum at $\pi/2$?
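To make the meaning of the noise level concrete, here is a minimal sketch of noisy sine observations. It assumes the noise is zero-mean, additive and independent per observation; the helper `noisy_sine` and its `seed` argument are hypothetical and not part of `bo_intro`, whose internals may differ.

```python
import numpy as np

def noisy_sine(x, noise, seed=0):
    """Return sin(x) plus zero-mean Gaussian noise with standard deviation `noise`.

    Hypothetical helper for illustration only; additive, independent noise
    is an assumption about how the 'noise' setting is applied.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    return np.sin(x) + noise * rng.standard_normal(x.shape)

x_toy = np.linspace(0, 2 * np.pi, 5)
y_clean = noisy_sine(x_toy, noise=0.0)   # identical to np.sin(x_toy)
y_noisy = noisy_sine(x_toy, noise=0.1)   # scattered around the sine curve
```

With a noise level of 0.1, most observations fall within about 0.2 of the true sine value, so a single lucky draw can make a point away from $\pi/2$ look like the best one.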

## Regret

The regret at a given iteration is defined as the difference between the true maximum of the function and the best value found up to that iteration.

**Exercise 4** Plot the regret as a function of iteration number for a dataset *without* any added noise, using a logarithmic y-axis.

In [None]:
# Best value observed up to and including each iteration
running_max = np.maximum.accumulate(y)

Since Bayesian optimization is a stochastic algorithm, it is useful to evaluate the regret over several different initializations of the algorithm.
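The averaging step can be sketched as follows, using made-up observations rather than real optimization runs. The helper `average_regret` is hypothetical, and it assumes that every run has the same number of iterations and one observed value per iteration.

```python
import numpy as np

def average_regret(runs, true_max):
    """Mean regret per iteration over several runs of equal length.

    Hypothetical helper: `runs` holds one row of observed values per run.
    """
    runs = np.asarray(runs, dtype=float)               # shape (n_runs, n_iterations)
    running_max = np.maximum.accumulate(runs, axis=1)  # best value so far, per run
    return (true_max - running_max).mean(axis=0)       # average over the runs

# Made-up observations from two runs; the true maximum of the sine is 1.
toy_runs = [[0.1, 0.8, 0.95],
            [0.5, 0.4, 1.0]]
mean_regret = average_regret(toy_runs, true_max=1.0)   # approximately [0.7, 0.35, 0.025]
```

In the notebook, each row could hold the `y` values returned by `run_bo_experiment` for one seed, flattened to one value per iteration. Note that a regret of exactly zero cannot be drawn on a logarithmic axis.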

**Exercise 5** Run the algorithm 5 times with different random seeds and plot the average regret as a function of iteration number.

## Comparing acquisition functions

Let us now compare a few acquisition functions. You can specify the key 'acquisition_function' in the config dictionary to switch to 'random' or 'ucb' (Upper Confidence Bound).

**Exercise 6** Repeat Exercise 5 with a random acquisition function. Which acquisition function leads to the lowest regret?

## Exploring vs exploiting

The upper confidence bound acquisition function is defined as $\mu + \beta \sigma$, where $\mu$ and $\sigma$ are the posterior mean and standard deviation of the Gaussian process and $\beta$ is a constant. Increasing $\beta$ makes the search more exploratory, because points with high uncertainty score higher. The default value is $\beta = 3$, but you can change it by specifying, for instance, 'beta':500 in the config dictionary.
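The effect of $\beta$ can be illustrated with a small sketch that evaluates $\mu + \beta \sigma$ on a grid. The mean and standard deviation below are made-up stand-ins for a Gaussian process posterior (a sine-shaped mean and an uncertainty bump around $x = 5$), and `upper_confidence_bound` is a hypothetical helper, not the `bo_intro` implementation.

```python
import numpy as np

def upper_confidence_bound(mean, std, beta):
    """UCB as defined in the text: posterior mean plus beta times standard deviation."""
    return mean + beta * std

x_grid = np.linspace(0, 2 * np.pi, 200)
toy_mean = np.sin(x_grid)                            # made-up posterior mean
toy_std = 0.05 + 0.3 * np.exp(-(x_grid - 5.0) ** 2)  # made-up uncertainty, largest near x = 5

# Location of the next query (the maximizer of the UCB) for two values of beta
next_query = {}
for beta in (0.0, 500.0):
    ucb = upper_confidence_bound(toy_mean, toy_std, beta)
    next_query[beta] = x_grid[np.argmax(ucb)]
```

With $\beta = 0$ the next query lands at the maximum of the mean, near $\pi/2$ (pure exploitation). With $\beta = 500$ the uncertainty term dominates and the next query lands near $x = 5$, where the made-up standard deviation peaks (almost pure exploration).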

**Exercise 7** Plot the sine function together with the data points x, y queried by the Bayesian optimization algorithm with the 'ucb' acquisition function and 'beta':500.

## Optimizing a 2-dimensional function

**Exercise 8** Try optimizing the [negative Branin function](https://www.sfu.ca/~ssurjano/branin.html) by specifying 'dataset':'branin' in the config dictionary. Make a plot of regret vs iteration number.
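Computing the regret requires the true maximum of the function. The sketch below evaluates the negative Branin function with the standard parameters from the linked page, on the usual domain $x_1 \in [-5, 10]$, $x_2 \in [0, 15]$. It assumes the 'branin' dataset in `bo_intro` uses the same parameters, which is worth checking in `bo_intro.datasets`; the function `negative_branin` is written here for illustration only.

```python
import numpy as np

def negative_branin(x1, x2):
    """Negative Branin function with the standard parameter values (assumed)."""
    a = 1.0
    b = 5.1 / (4 * np.pi ** 2)
    c = 5 / np.pi
    r = 6.0
    s = 10.0
    t = 1 / (8 * np.pi)
    branin = a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1 - t) * np.cos(x1) + s
    return -branin

# (pi, 2.275) is one of the three global optima of the Branin function
true_max = negative_branin(np.pi, 2.275)   # approximately -0.397887
```

The regret is then `true_max` minus the running maximum of the observed values, exactly as for the sine function.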