Gaussian Process Regression
===========================

In [None]:
%matplotlib inline

import plotting
import utils

import matplotlib.pyplot as plt
import numpy as np
import numpy.random as rnd
from sklearn.gaussian_process import GaussianProcessRegressor as GPR
from sklearn.gaussian_process import kernels

**1D case**

In [None]:
# Define function
def f_1D(x):
    return np.sin(2*np.pi*x) * np.exp(3*x)

In [None]:
# Set up data points
xs = np.linspace(0, 1, num=7)
fs = f_1D(xs)
ys = fs  # noise-free observations

In [None]:
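# Set up GaussianProcessRegressor
kernel = kernels.RBF(length_scale=0.05)  # initial length scale
gpr = GPR(kernel=kernel,
          alpha=1e-4,  # value added to the diagonal of the kernel matrix
          #optimizer=None,  # uncomment to keep the hyperparameters fixed
          n_restarts_optimizer=30)  # additional optimizer runs from random starts
gpr.fit(xs[:, np.newaxis], ys)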

In [None]:
# Plot
plotting.plot_gpr_1d(gpr, with_kernel=True, with_lml=False, n_samples=10)
plt.tight_layout()

**Exercises:**
* Fit the model to the data points. Plot again.
* Optimize the hyperparameter (i.e., the length scale): make sure the `optimizer=None` line is commented out so that the optimizer runs. Plot again. How did the length scale change?
* Change the number of data points. What is the minimum number of data points that leads to a "reasonable" approximation?
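
One way to check how the length scale changed is to compare the fitted kernel of a fixed-hyperparameter model with that of an optimized one. The following is a minimal sketch, not a reference solution: the names `gpr_fixed` and `gpr_opt` and the choice `random_state=0` are illustrative and do not appear in the notebook, while `kernel_` and `log_marginal_likelihood_value_` are attributes that scikit-learn sets after `fit`.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor as GPR
from sklearn.gaussian_process import kernels


def f_1D(x):
    return np.sin(2*np.pi*x) * np.exp(3*x)


xs = np.linspace(0, 1, num=7)
ys = f_1D(xs)

kernel = kernels.RBF(length_scale=0.05)

# optimizer=None keeps the initial hyperparameters fixed during fit()
gpr_fixed = GPR(kernel=kernel, alpha=1e-4, optimizer=None)
gpr_fixed.fit(xs[:, np.newaxis], ys)

# The default optimizer maximizes the log marginal likelihood (LML);
# random_state is an illustrative choice to make the restarts repeatable
gpr_opt = GPR(kernel=kernel, alpha=1e-4,
              n_restarts_optimizer=30, random_state=0)
gpr_opt.fit(xs[:, np.newaxis], ys)

# kernel_ holds the kernel with the hyperparameters used for prediction
print("fixed length scale:    ", gpr_fixed.kernel_.length_scale)
print("optimized length scale:", gpr_opt.kernel_.length_scale)
print("LML (fixed):    ", gpr_fixed.log_marginal_likelihood_value_)
print("LML (optimized):", gpr_opt.log_marginal_likelihood_value_)
```

Comparing the two log marginal likelihood values shows whether the optimizer actually found hyperparameters that explain the data better than the initial guess.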

**2D case**

In [None]:
# Define function
def f_2D(x):
    x1 = x[..., 0]
    x2 = x[..., 1]

    return np.sin(2*np.pi*x1) * np.sin(2*np.pi*x2)

In [None]:
# Set up data points on a regular 10 x 10 grid
x1s = np.linspace(0, 1, num=10)
x2s = x1s
xs = utils.cartesian_product(x1s, x2s)
fs = f_2D(xs)
ys = fs  # noise-free observations

In [None]:
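# Set up GaussianProcessRegressor
kernel = kernels.RBF(length_scale=[1., 1.])  # one length scale per input dimension
gpr = GPR(kernel=kernel,
          alpha=1e-2,  # value added to the diagonal of the kernel matrix
          n_restarts_optimizer=100)  # additional optimizer runs from random starts
gpr.fit(xs, ys)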

In [None]:
# Plot
plotting.contour_gpr_2d(gpr)
plt.tight_layout()

plotting.surf_gpr_2d(gpr)
plt.tight_layout()

**Exercises:**
* What are the optimized length scales? Are they similar/equal? Why?
* Increase the frequency of the function along one direction only. Fit and plot again. Are the optimized length scales still similar? How do they relate to the frequencies?
* Decrease the number of data points. What is the minimum number of data points that leads to a "reasonable" approximation?
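
For the frequency exercise, a possible starting point is sketched below. It is an assumption-laden sketch rather than the intended solution: `f_2D_aniso`, `freq1` and `freq2` are hypothetical names, the `np.meshgrid` construction is a stand-in for `utils.cartesian_product` (assumed here to return an array of shape `(n*m, 2)`), and the number of restarts is reduced to 10 only to keep the run short.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor as GPR
from sklearn.gaussian_process import kernels


def f_2D_aniso(x, freq1=1.0, freq2=2.0):
    # Hypothetical variant of f_2D with a separate frequency per direction
    x1 = x[..., 0]
    x2 = x[..., 1]
    return np.sin(2*np.pi*freq1*x1) * np.sin(2*np.pi*freq2*x2)


x1s = np.linspace(0, 1, num=10)
x2s = x1s

# Stand-in for utils.cartesian_product: all (x1, x2) pairs as rows
g1, g2 = np.meshgrid(x1s, x2s, indexing="ij")
xs = np.stack([g1.ravel(), g2.ravel()], axis=-1)
ys = f_2D_aniso(xs)

# Anisotropic RBF kernel: one length scale per input dimension
kernel = kernels.RBF(length_scale=[1., 1.])
gpr = GPR(kernel=kernel, alpha=1e-2,
          n_restarts_optimizer=10, random_state=0)
gpr.fit(xs, ys)

# Optimized length scales along x1 and x2
ls1, ls2 = gpr.kernel_.length_scale
print("length scale along x1:", ls1)
print("length scale along x2:", ls2)
print("ratio ls1 / ls2:      ", ls1 / ls2)
```

Printing the ratio of the two length scales next to the ratio `freq2 / freq1` makes it easy to judge how closely the two are related for a given pair of frequencies.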