## Session #3: Bayesian Optimization with Gaussian Processes

We will now use a Gaussian process (GP) model to perform Bayesian optimization (BO).

The goal is to find the maximum of an unknown function. The only information we have about the function comes from the noisy observations we obtain by sampling it. Because sampling is costly, we want a good estimate of the maximum from as few samples as possible.

To this end we combine a GP regression model with the BO framework. 
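
To make the setup concrete, here is a minimal sketch of a noisy objective. The names `true_fun` and `objective_fun`, the particular function, and the noise level are assumptions chosen for illustration; in a real problem the underlying function is not available and only the noisy evaluations are.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def true_fun(x):
    # hypothetical "unknown" function; hidden from the optimizer in practice
    return np.sin(3 * x) * x

def objective_fun(x, sigma_noise=0.1):
    # each evaluation returns the true value plus Gaussian observation noise
    x = np.asarray(x, dtype=float)
    return true_fun(x) + sigma_noise * rng.normal(size=x.shape)

# three evaluations at the same position give three different readings
xs = np.array([0.2, 0.2, 0.2])
print(objective_fun(xs))
```

Because repeated evaluations at the same position differ, a single sample cannot pin down the function value there, which is why the GP model carries an explicit noise term.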

## Your task: 

Your task is to 

a) define a Gaussian process to model the objective function

b) iteratively draw informative samples from the objective function and update the GP model

c) find a good exploration-exploitation trade-off in the acquisition function
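
The three steps above can be sketched end to end on a toy problem. This is a self-contained illustration under stated assumptions, not the reference solution: `toy_objective`, `rbf_kernel`, `gp_posterior`, and `run_bo` are hypothetical stand-ins, the kernel is assumed to be a squared-exponential with unit prior variance, and the acquisition function is an upper confidence bound (mean plus `kappa` times the standard deviation).

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_objective(x, sigma_noise=0.02):
    # hypothetical objective with its maximum at x = 0.7, observed with noise
    return -4.0 * (x - 0.7) ** 2 + sigma_noise * rng.normal()

def rbf_kernel(a, b, l=0.2):
    # squared-exponential kernel for 1-D inputs
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / l ** 2)

def gp_posterior(xtrain, ytrain, xtest, sigma_noise=0.02, l=0.2):
    # (a) GP model: posterior mean and standard deviation on the test grid
    k11 = rbf_kernel(xtrain, xtrain, l) + sigma_noise ** 2 * np.eye(len(xtrain))
    k21 = rbf_kernel(xtest, xtrain, l)
    mean = k21 @ np.linalg.solve(k11, ytrain)
    # diagonal of k22 - k21 k11^{-1} k12, with k22's diagonal equal to 1
    var = 1.0 - np.sum(k21 * np.linalg.solve(k11, k21.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def run_bo(kappa, n_iter=15):
    xgrid = np.linspace(0.0, 1.0, 101)
    xtrain = np.array([0.0, 1.0])
    ytrain = np.array([toy_objective(x) for x in xtrain])
    for _ in range(n_iter):
        mean, std = gp_posterior(xtrain, ytrain, xgrid)
        # (c) upper confidence bound: kappa weights exploration against exploitation
        x_new = xgrid[np.argmax(mean + kappa * std)]
        # (b) sample the objective at x_new and add the result to the training set
        xtrain = np.append(xtrain, x_new)
        ytrain = np.append(ytrain, toy_objective(x_new))
    mean, _ = gp_posterior(xtrain, ytrain, xgrid)
    return xgrid[np.argmax(mean)], xtrain

x_best, visited = run_bo(kappa=2.0)
print(f"estimated maximizer: {x_best:.2f}")
```

A larger `kappa` gives more weight to the predictive uncertainty, so the samples spread out over the domain; a smaller `kappa` concentrates samples near the current best estimate. Calling `run_bo` with a few different values and inspecting `visited` is one way to see the trade-off.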

In [1]:
import numpy as np
import matplotlib.pyplot as plt 
%matplotlib inline
plt.style.use('seaborn-deep')
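
The next cell calls `calculate_covariance_matrix`, which is not defined in this section. A minimal sketch of such a helper is given below, assuming a squared-exponential (RBF) kernel with length scale `l` and 1-D inputs; the kernel choice and the signature are assumptions inferred from how the function is called, not necessarily the definition used elsewhere in the course.

```python
import numpy as np

def calculate_covariance_matrix(x1, x2, l=0.1):
    # assumed kernel: k(a, b) = exp(-(a - b)^2 / (2 l^2)) for 1-D inputs
    x1 = np.asarray(x1, dtype=float).reshape(-1, 1)  # column vector
    x2 = np.asarray(x2, dtype=float).reshape(1, -1)  # row vector
    sq_dist = (x1 - x2) ** 2  # pairwise squared distances via broadcasting
    return np.exp(-0.5 * sq_dist / l ** 2)

K = calculate_covariance_matrix(np.array([0.0, 0.1]), np.array([0.0, 0.1, 0.2]))
print(K.shape)
```

The result has one row per point in `x1` and one column per point in `x2`, which matches how `k11`, `k12`, `k21`, and `k22` are combined in `gp_regression`.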

In [3]:
def gp_regression(xtrain, ytrain, xtest, sigma_noise=.1, l=.1): 

    # calculate the covariance matrix 
    k11 = calculate_covariance_matrix(xtrain, xtrain, l=l) + sigma_noise ** 2 * np.eye(xtrain.shape[0])
    k12 = calculate_covariance_matrix(xtrain, xtest, l=l)
    k22 = calculate_covariance_matrix(xtest, xtest, l=l)
    k21 = calculate_covariance_matrix(xtest, xtrain, l=l)
    
    # Use the formulas above to define the mean function and the covariance matrix of the predictive distribution
    # the mean function
    inverse_training_K = np.linalg.inv(k11)
    m = k21.dot(inverse_training_K).dot(ytrain)
    # the covariance matrix
    sigma = k22 - k21.dot(inverse_training_K).dot(k12)
    
    return m.squeeze(), sigma.squeeze()

def plot_gp_regression_results(m, sigma, xtrain, ytrain, xtest, ytest): 

    std = np.sqrt(np.diag(sigma))
    
    upper_std = np.squeeze(m) + std
    lower_std = np.squeeze(m) - std
    
    plt.figure(figsize=(15, 5))
    plt.fill_between(xtest, upper_std, lower_std, alpha=0.4)
    plt.plot(xtest, m, 'r', label='Prediction mean')
    plt.plot(xtrain, ytrain, 'go', label='data')
    plt.plot(xtest, ytest)
    plt.title('Predictive mean and standard deviation with training data points')
    plt.legend(loc=0);
    
def acquistion_fun(m, std, kappa=1.): 
    return m + kappa * std

def add_data_point(x_new, y_new, xtrain, ytrain): 
    # append the new observation and return the enlarged training set (assumes 1-D arrays)
    xtrain = np.append(xtrain, x_new)
    ytrain = np.append(ytrain, y_new)
    return xtrain, ytrain

In [7]:

# objective_fun, xtrain, ytrain, xtest and kappa are assumed to be defined in earlier cells
for t in range(10):
    
    # calculate the predictive distribution with the current evidence: 
    m, sigma = gp_regression(xtrain, ytrain, xtest)
    std = np.sqrt(np.diag(sigma))
    
    # find the next most effective sample by maximizing the acquisition function;
    # np.argmax returns an index into xtest, so look up the corresponding position
    x_new = xtest[np.argmax(acquistion_fun(m, std, kappa=kappa))]
    
    # sample the objective function at this position 
    y_new = objective_fun(x_new)
    
    # add the new data point to the training data set for the GP
    xtrain, ytrain = add_data_point(x_new, y_new, xtrain, ytrain)