Gaussian Process Regression
===========================

In [None]:
%matplotlib inline

import plotting
import utils

import matplotlib.pyplot as plt
import numpy as np
import numpy.random as rnd
from sklearn.gaussian_process import GaussianProcessRegressor as GPR
from sklearn.gaussian_process import kernels

**1D case**

In [None]:
# Define function
def f_1D(x):
    return np.sin(2*np.pi*x) * np.exp(3*x)

# Plot function
xs_plt = np.linspace(0, 1, num=100)
plt.plot(xs_plt, f_1D(xs_plt))

In [None]:
# Set up data points
xs = np.linspace(0, 1, num=7)
fs = f_1D(xs)
ys = fs

In [None]:
# Set up GaussianProcessRegressor
kernel = kernels.RBF(length_scale=0.05)
gpr = GPR(kernel=kernel,
          alpha=1e-4,
          optimizer=None,
          n_restarts_optimizer=30)
#gpr.fit(xs[:, np.newaxis], ys)

In [None]:
# Plot
plotting.plot_gpr_1d(gpr, with_kernel=True, with_lml=False, n_samples=10)
plt.tight_layout()

**Exercises:**
1. Fit the GPR to the data points.
(Uncomment `gpr.fit(...)`.)
Plot again.
2. The current length scale hyperparameter is not suitable for the given data points.
Optimize the hyperparameter (i.e., the length scale).
(Comment out `optimizer=None`.)
Plot again.
How did the length scale change?
3. Change the number of data points.
What is the minimum number of data points that leads to a "reasonable" approximation?
4. Add noise to the data `ys` (Gaussian, zero mean, fixed standard deviation).
Let the GPR know about it by setting the `alpha` parameter in the constructor to the noise variance (the square of the noise standard deviation), since `alpha` is added to the diagonal of the kernel matrix.
Fit and plot again.
What has changed with the approximation?
5. Try different kernels from the `kernels` module.
See [https://scikit-learn.org/stable/modules/gaussian_process.html#kernels-for-gaussian-processes](https://scikit-learn.org/stable/modules/gaussian_process.html#kernels-for-gaussian-processes).
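
As a starting point for exercises 2 and 4, the following is a minimal sketch, not a reference solution. It adds Gaussian noise to the samples, passes the noise variance as `alpha`, and lets the default optimizer tune the length scale. The names `noise_std`, `rng`, `xs_noisy`, `ys_noisy`, and `gpr_noisy`, the seed, and the noise level of 0.5 are assumptions made for illustration and do not appear in the notebook.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor as GPR
from sklearn.gaussian_process import kernels

# Same test function as in the 1D case above
def f_1D(x):
    return np.sin(2*np.pi*x) * np.exp(3*x)

rng = np.random.default_rng(0)  # fixed seed (assumption) for reproducibility
noise_std = 0.5                 # assumed noise standard deviation

# Noisy observations at the same 7 points
xs_noisy = np.linspace(0, 1, num=7)
ys_noisy = f_1D(xs_noisy) + rng.normal(0.0, noise_std, size=xs_noisy.shape)

# alpha is added to the kernel diagonal, so it takes the noise variance.
# Leaving out optimizer=None enables the default optimizer.
gpr_noisy = GPR(kernel=kernels.RBF(length_scale=0.05),
                alpha=noise_std**2,
                n_restarts_optimizer=30,
                random_state=0)
gpr_noisy.fit(xs_noisy[:, np.newaxis], ys_noisy)

# kernel_ holds the optimized kernel; kernel keeps the initial one
print("initial kernel:  ", gpr_noisy.kernel)
print("optimized kernel:", gpr_noisy.kernel_)

# Posterior mean and standard deviation on a dense grid
xs_dense = np.linspace(0, 1, num=100)
mean, std = gpr_noisy.predict(xs_dense[:, np.newaxis], return_std=True)
```

Comparing `gpr_noisy.kernel` with `gpr_noisy.kernel_` shows how the length scale changed during fitting. With noisy data the posterior mean is no longer forced through the data points, and `std` no longer collapses to nearly zero at them.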

**2D case**

In [None]:
# Define function
def f_2D(x):
    x1 = x[..., 0]
    x2 = x[..., 1]

    return np.sin(2*np.pi*x1) * np.sin(2*np.pi*x2)

In [None]:
# Set up data points
x1s = np.linspace(0, 1, num=10)
x2s = x1s
xs = utils.cartesian_product(x1s, x2s)
fs = f_2D(xs)
ys = fs

In [None]:
# Set up GaussianProcessRegressor
kernel = kernels.RBF(length_scale=[1., 1.])
gpr = GPR(kernel=kernel,
          alpha=1e-2,
          n_restarts_optimizer=100)
gpr.fit(xs, ys)

In [None]:
# Plot
plotting.contour_gpr_2d(gpr)
plt.tight_layout()

plotting.surf_gpr_2d(gpr)
plt.tight_layout()

**Exercises:**
1. What are the optimized length scales?
Are they similar/equal?
Why?
2. Increase the frequency of the function only along one direction.
Fit and plot again.
Are the optimized length scales still similar?
How do they relate to the frequencies?
3. Decrease the number of data points.
What is the minimum number of data points that leads to a "reasonable" approximation?
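
For exercises 1 and 2, the optimized per-dimension length scales can be read from the fitted kernel. The sketch below is one possible way to do this, under stated assumptions: `f_2D_aniso` and its `freq1`/`freq2` parameters are hypothetical names introduced here, the grid is built with `np.meshgrid` rather than the notebook's `utils.cartesian_product`, and the number of optimizer restarts is reduced to keep the run short.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor as GPR
from sklearn.gaussian_process import kernels

# Hypothetical variant of f_2D with a separate frequency per direction
def f_2D_aniso(x, freq1=2.0, freq2=1.0):
    x1 = x[..., 0]
    x2 = x[..., 1]
    return np.sin(2*np.pi*freq1*x1) * np.sin(2*np.pi*freq2*x2)

# 10 x 10 grid of input points, flattened to shape (100, 2)
grid_1d = np.linspace(0, 1, num=10)
g1, g2 = np.meshgrid(grid_1d, grid_1d, indexing="ij")
xs_grid = np.stack([g1.ravel(), g2.ravel()], axis=-1)
ys_grid = f_2D_aniso(xs_grid)

# An RBF kernel with one length scale per input dimension
gpr_aniso = GPR(kernel=kernels.RBF(length_scale=[1., 1.]),
                alpha=1e-2,
                n_restarts_optimizer=10,  # fewer restarts than above, for speed
                random_state=0)
gpr_aniso.fit(xs_grid, ys_grid)

# After fitting, kernel_ carries the optimized hyperparameters
length_scales = gpr_aniso.kernel_.length_scale
print("optimized length scales:", length_scales)
print("log marginal likelihood:", gpr_aniso.log_marginal_likelihood_value_)
```

Setting `freq1` and `freq2` to the same value reproduces the symmetric setup of the original `f_2D` up to the frequency, which makes it easy to compare the two cases side by side.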