
<br>
=========================================================<br>
Gaussian Processes regression: basic introductory example<br>
=========================================================<br>
A simple one-dimensional regression example computed in two different ways:<br>
1. A noise-free case<br>
2. A noisy case with known noise-level per datapoint<br>
In both cases, the kernel's parameters are estimated using the maximum<br>
likelihood principle.<br>
The figures illustrate the interpolating property of the Gaussian Process<br>
model as well as its probabilistic nature in the form of a pointwise 95%<br>
confidence interval.<br>
Note that the parameter ``alpha`` is applied as a Tikhonov<br>
regularization of the assumed covariance between the training points: it is<br>
added to the diagonal of the training covariance matrix.
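
To make the role of ``alpha`` concrete, the following minimal sketch computes the posterior mean by hand, adding ``alpha`` to the diagonal of the training covariance matrix, and compares the result with ``GaussianProcessRegressor``. It is an illustration, not scikit-learn's internal code: the fixed `RBF` kernel, `optimizer=None`, the `alpha` values, and the names `X_train`, `X_test`, and `manual_mean` are all assumptions chosen for this example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Training data from the same target function, f(x) = x * sin(x)
X_train = np.atleast_2d([1., 3., 5., 6., 7., 8.]).T
y_train = (X_train * np.sin(X_train)).ravel()
X_test = np.atleast_2d(np.linspace(0, 10, 50)).T

kernel = RBF(length_scale=1.0)  # fixed kernel, chosen for illustration
# Hypothetical per-datapoint noise variances
alpha = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])

# Manual posterior mean: K_* (K + diag(alpha))^{-1} y
K = kernel(X_train) + np.diag(alpha)
K_star = kernel(X_test, X_train)
manual_mean = K_star @ np.linalg.solve(K, y_train)

# optimizer=None keeps the kernel hyperparameters fixed during fit
gp = GaussianProcessRegressor(kernel=kernel, alpha=alpha, optimizer=None)
sk_mean = gp.fit(X_train, y_train).predict(X_test)

max_diff = np.max(np.abs(manual_mean - sk_mean))
print("largest difference between the two means:", max_diff)
```

The two means should agree up to floating-point error, which is what "Tikhonov regularization of the covariance" amounts to in practice.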


In [None]:
print(__doc__)

Author: Vincent Dubourg <vincent.dubourg@gmail.com><br>
        Jake Vanderplas <vanderplas@astro.washington.edu><br>
        Jan Hendrik Metzen <jhm@informatik.uni-bremen.de><br>
License: BSD 3 clause

In [None]:
import numpy as np
from matplotlib import pyplot as plt

In [None]:
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

In [None]:
np.random.seed(1)

In [None]:
def f(x):
    """The function to predict."""
    return x * np.sin(x)

----------------------------------------------------------------------<br>
First the noiseless case

In [None]:
X = np.atleast_2d([1., 3., 5., 6., 7., 8.]).T

Observations

In [None]:
y = f(X).ravel()

Mesh the input space for evaluations of the real function, the prediction and<br>
its standard deviation

In [None]:
x = np.atleast_2d(np.linspace(0, 10, 1000)).T

Instantiate a Gaussian Process model

In [None]:
kernel = C(1.0, (1e-3, 1e3)) * RBF(10, (1e-2, 1e2))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9)

Fit to data using Maximum Likelihood Estimation of the parameters

In [None]:
gp.fit(X, y)
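
As a rough check of what the maximum likelihood fit does, the sketch here compares the log marginal likelihood at the initial hyperparameters with the value reached after optimization, and prints the fitted kernel. The data and kernel are repeated so the block runs on its own; `random_state=0` and the names `lml_initial` and `lml_fitted` are assumptions added for reproducibility and readability.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

X = np.atleast_2d([1., 3., 5., 6., 7., 8.]).T
y = (X * np.sin(X)).ravel()

kernel = C(1.0, (1e-3, 1e3)) * RBF(10, (1e-2, 1e2))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9,
                              random_state=0)
gp.fit(X, y)

# `kernel` keeps its initial hyperparameters; the fitted copy is gp.kernel_
lml_initial = gp.log_marginal_likelihood(kernel.theta)
lml_fitted = gp.log_marginal_likelihood_value_

print("initial kernel:", kernel)
print("fitted kernel: ", gp.kernel_)
print("log marginal likelihood, initial:", lml_initial)
print("log marginal likelihood, fitted: ", lml_fitted)
```

The fitted value should be at least as large as the initial one, since the optimizer starts from the initial hyperparameters and keeps the best result across restarts.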

Make the prediction on the meshed x-axis (ask for the standard deviation as well)

In [None]:
y_pred, sigma = gp.predict(x, return_std=True)
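
The interpolating property can also be checked numerically rather than read off the figure. The sketch here predicts at the training inputs themselves; in the noise-free case (default ``alpha``), the predicted mean should reproduce the observations almost exactly and the predicted standard deviation should be close to zero. `random_state=0` and the names `y_train_pred`, `sigma_train`, and `max_abs_error` are assumptions for this example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

X = np.atleast_2d([1., 3., 5., 6., 7., 8.]).T
y = (X * np.sin(X)).ravel()

kernel = C(1.0, (1e-3, 1e3)) * RBF(10, (1e-2, 1e2))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=9,
                              random_state=0)
gp.fit(X, y)

# Predict at the training inputs instead of the dense mesh
y_train_pred, sigma_train = gp.predict(X, return_std=True)

max_abs_error = np.max(np.abs(y_train_pred - y))
print("largest |prediction - observation|:", max_abs_error)
print("largest predicted std at training points:", sigma_train.max())
```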

Plot the function, the prediction and the 95% confidence interval based on<br>
the predicted standard deviation

In [None]:
plt.figure()
plt.plot(x, f(x), 'r:', label=r'$f(x) = x\,\sin(x)$')
plt.plot(X, y, 'r.', markersize=10, label='Observations')
plt.plot(x, y_pred, 'b-', label='Prediction')
plt.fill(np.concatenate([x, x[::-1]]),
         np.concatenate([y_pred - 1.9600 * sigma,
                        (y_pred + 1.9600 * sigma)[::-1]]),
         alpha=.5, fc='b', ec='None', label='95% confidence interval')
plt.xlabel('$x$')
plt.ylabel('$f(x)$')
plt.ylim(-10, 20)
plt.legend(loc='upper left')
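
The factor 1.9600 used for the shaded band is the 97.5th percentile of the standard normal distribution, so that mean ± 1.96 standard deviations covers 95% of a Gaussian. A small sketch of where it comes from, using `scipy.stats.norm`; the helper name `gaussian_interval` is hypothetical and not part of the original example.

```python
import numpy as np
from scipy.stats import norm


def gaussian_interval(mean, std, level=0.95):
    """Return (lower, upper) bounds of a central Gaussian interval."""
    # Two-sided interval: split the remaining probability over both tails
    z = norm.ppf(0.5 + level / 2.0)
    return mean - z * std, mean + z * std


z_95 = norm.ppf(0.975)
print("z for a 95% interval:", z_95)

# Example with made-up predictive means and standard deviations
lower, upper = gaussian_interval(np.array([0.0, 1.0]), np.array([1.0, 0.5]))
print(lower, upper)
```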

----------------------------------------------------------------------<br>
Now the noisy case

In [None]:
X = np.linspace(0.1, 9.9, 20)
X = np.atleast_2d(X).T

Observations and noise

In [None]:
y = f(X).ravel()
dy = 0.5 + 1.0 * np.random.random(y.shape)
noise = np.random.normal(0, dy)
y += noise
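
Here ``dy`` holds one noise standard deviation per datapoint, drawn uniformly from [0.5, 1.5), and ``np.random.normal(0, dy)`` broadcasts it so that each observation gets its own noise scale. The sketch here checks this empirically by drawing many noise vectors and comparing the per-point sample standard deviation with ``dy``. It uses a local `RandomState` instead of the global seed, and the names `rng`, `noise_draws`, and `sample_std` are assumptions for this example.

```python
import numpy as np

rng = np.random.RandomState(1)
n_points = 20

# One standard deviation per datapoint, uniform in [0.5, 1.5)
dy = 0.5 + 1.0 * rng.random_sample(n_points)

# The scale argument broadcasts: column j is drawn with std dy[j]
noise_draws = rng.normal(0, dy, size=(20000, n_points))
sample_std = noise_draws.std(axis=0)

print("requested std:", np.round(dy[:5], 3))
print("empirical std:", np.round(sample_std[:5], 3))
```

Because ``dy`` is a standard deviation while ``alpha`` is added to a covariance matrix, the model is later given ``dy ** 2``.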

Instantiate a Gaussian Process model

In [None]:
gp = GaussianProcessRegressor(kernel=kernel, alpha=dy ** 2,
                              n_restarts_optimizer=10)

Fit to data using Maximum Likelihood Estimation of the parameters

In [None]:
gp.fit(X, y)

Make the prediction on the meshed x-axis (ask for the standard deviation as well)

In [None]:
y_pred, sigma = gp.predict(x, return_std=True)
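
In contrast to the noise-free case, a model fitted with ``alpha=dy ** 2`` is no longer expected to pass exactly through the observations, and its predicted standard deviation stays above zero even at the training inputs. The sketch here illustrates this under a few assumptions: a local `RandomState(1)` replaces the global seed, `random_state=0` is set on the regressor, and the names `residuals` and `sigma_train` are introduced for the example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

rng = np.random.RandomState(1)

X = np.atleast_2d(np.linspace(0.1, 9.9, 20)).T
y_clean = (X * np.sin(X)).ravel()
dy = 0.5 + 1.0 * rng.random_sample(y_clean.shape)  # per-point noise std
y = y_clean + rng.normal(0, dy)

kernel = C(1.0, (1e-3, 1e3)) * RBF(10, (1e-2, 1e2))
gp = GaussianProcessRegressor(kernel=kernel, alpha=dy ** 2,
                              n_restarts_optimizer=10, random_state=0)
gp.fit(X, y)

# Predict at the noisy training inputs
y_train_pred, sigma_train = gp.predict(X, return_std=True)
residuals = y - y_train_pred

print("largest |residual| at training points:", np.max(np.abs(residuals)))
print("smallest predicted std at training points:", sigma_train.min())
```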

Plot the function, the prediction and the 95% confidence interval based on<br>
the predicted standard deviation

In [None]:
plt.figure()
plt.plot(x, f(x), 'r:', label=r'$f(x) = x\,\sin(x)$')
plt.errorbar(X.ravel(), y, dy, fmt='r.', markersize=10, label='Observations')
plt.plot(x, y_pred, 'b-', label='Prediction')
plt.fill(np.concatenate([x, x[::-1]]),
         np.concatenate([y_pred - 1.9600 * sigma,
                        (y_pred + 1.9600 * sigma)[::-1]]),
         alpha=.5, fc='b', ec='None', label='95% confidence interval')
plt.xlabel('$x$')
plt.ylabel('$f(x)$')
plt.ylim(-10, 20)
plt.legend(loc='upper left')

In [None]:
plt.show()