# **PART 2A**: The basics: Gaussian process regression in F3DASM

This step-by-step tutorial with exercises can be followed in order to gain an understanding of how Gaussian process regression works in F3DASM.

## 1. Import the necessary packages.

In [None]:
import f3dasm
import numpy as np
import matplotlib.pyplot as plt
import gpytorch

## 2. Define the hyperparameters

In [None]:
dimensionality = 1
numsamples = 15

kernel = gpytorch.kernels.ScaleKernel(gpytorch.kernels.CosineKernel())
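
To make the kernel choice more concrete, here is a minimal NumPy sketch of what a scaled cosine kernel computes, assuming the form $k(x, x') = s \cos(\pi \lVert x - x' \rVert / p)$ given in the GPyTorch documentation for `CosineKernel` wrapped in `ScaleKernel`. The names `scaled_cosine_kernel`, `period` and `outputscale` are illustrative and are not part of F3DASM or GPyTorch; in GPyTorch both hyperparameters are learned during training rather than fixed by hand.

```python
import numpy as np

def scaled_cosine_kernel(x1, x2, period=1.0, outputscale=1.0):
    """Covariance matrix of a scaled cosine kernel (illustrative helper).

    x1 has shape (n, d), x2 has shape (m, d); the result has shape (n, m).
    """
    # Pairwise Euclidean distances between the rows of x1 and the rows of x2
    diff = x1[:, None, :] - x2[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return outputscale * np.cos(np.pi * dist / period)

x = np.linspace(0.0, 1.0, 5)[:, None]
K = scaled_cosine_kernel(x, x, period=0.5, outputscale=2.0)
print(K.shape)
print(np.round(K[0], 3))
```

Because the kernel is periodic in the distance between points, a GP with this kernel tends to produce oscillating predictions, which is worth keeping in mind when comparing it with other kernels.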

## 3. Specify the problem

In [None]:
fun = f3dasm.functions.AlpineN2(
    dimensionality=dimensionality,
    scale_bounds=np.tile([0.0, 1.0], (dimensionality, 1)),
    )

Let's plot the function.

In [None]:
x_plot = np.linspace(0, 1, 500)[:, None]
y_plot = fun(x_plot)
plt.plot(x_plot, y_plot, 'b--', label='Exact')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()

Define the design space, create the sampler, and finally generate the training data.

In [None]:
parameter_DesignSpace: f3dasm.DesignSpace = f3dasm.make_nd_continuous_design(
    bounds=np.tile([0.0, 1.0], (dimensionality, 1)),
    dimensionality=dimensionality,
)

sampler = f3dasm.sampling.SobolSequence(design=parameter_DesignSpace)

train_data: f3dasm.Data = sampler.get_samples(numsamples=numsamples)
train_data.add_output(output=fun(train_data))
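
For intuition about why a low-discrepancy sampler is used here, the sketch below generates the base-2 van der Corput sequence in plain NumPy. In one dimension, an unscrambled Sobol sequence visits the same points as this sequence, possibly in a different order. This is not F3DASM's `SobolSequence` implementation, and whether F3DASM scrambles or reorders its points is not covered by this tutorial; the function name `van_der_corput` is illustrative.

```python
import numpy as np

def van_der_corput(n_points, base=2):
    """First n_points of the van der Corput sequence in [0, 1)."""
    samples = np.empty(n_points)
    for i in range(n_points):
        value, denom, k = 0.0, 1.0, i
        # Mirror the digits of i (written in the given base) around the radix point
        while k > 0:
            k, digit = divmod(k, base)
            denom *= base
            value += digit / denom
        samples[i] = value
    return samples

points = van_der_corput(15)
print(points[:4])
# Each new point falls into one of the largest remaining gaps,
# so the samples cover [0, 1) evenly for any number of points.
print(np.sort(points))
```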

Let's see what the training data looks like.

In [None]:
train_data.data

In [None]:
plt.plot(x_plot, y_plot, 'b--', label='Exact')
plt.scatter(train_data.data['input'], train_data.data['output'], c='b', label='Training data')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()

## 4. Regression and prediction

In [None]:
param = f3dasm.regression.gpr.Sogpr_Parameters(kernel=kernel)

regressor = f3dasm.regression.gpr.Sogpr(
    train_data=train_data, 
    design=train_data.design,
    parameter=param,
)

surrogate = regressor.train()
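
Training a GP usually means choosing the kernel hyperparameters that maximize the log marginal likelihood of the training data. What `regressor.train()` does internally is not shown in this tutorial, so the following is only a NumPy sketch of that general idea, assuming a zero-mean GP with an RBF kernel and a coarse grid search instead of a gradient-based optimizer. All names (`rbf_kernel`, `neg_log_marginal_likelihood`, `candidates`) are illustrative.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

def neg_log_marginal_likelihood(x, y, lengthscale, noise=1e-4):
    """Negative log marginal likelihood of a zero-mean GP."""
    n = len(x)
    K = rbf_kernel(x, x, lengthscale) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 * y^T K^-1 y  +  0.5 * log|K|  +  0.5 * n * log(2 * pi)
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * n * np.log(2 * np.pi)

# Toy data, unrelated to the AlpineN2 problem above
x = np.linspace(0.0, 1.0, 10)[:, None]
y = np.sin(2 * np.pi * x).ravel()

# Coarse grid search over the lengthscale (assumed candidates)
candidates = np.array([0.05, 0.1, 0.2, 0.4, 0.8])
scores = np.array([neg_log_marginal_likelihood(x, y, l) for l in candidates])
best_lengthscale = candidates[np.argmin(scores)]
print(best_lengthscale)
```

The data-fit term penalizes hyperparameters that explain the observations poorly, while the log-determinant term penalizes overly flexible models; the optimum balances the two.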

Let's evaluate the mean and the variance of the Gaussian process posterior.

In [None]:
x_plot_data = f3dasm.Data(design=train_data.design)
x_plot_data.add_numpy_arrays(input=x_plot, output=x_plot)
mean, var = surrogate.predict(test_input_data=x_plot_data)

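# Upper and lower bounds: two standard deviations around the posterior mean.
# np.abs guards against tiny negative variances caused by round-off.
std = np.sqrt(np.abs(var))
ucb, lcb = mean + 2 * std, mean - 2 * std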
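
To show where a posterior mean and variance come from, here is a self-contained NumPy sketch of the standard GP regression equations, assuming a zero-mean prior, an RBF kernel with a fixed lengthscale, and a small noise term for numerical stability. It is not the F3DASM or GPyTorch implementation, and the helper names `rbf_kernel` and `gp_posterior` are illustrative.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean and pointwise variance of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    # Cholesky factorization gives a numerically stable solve
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    # The prior variance k(x, x) equals 1 for this kernel
    var = 1.0 - (v ** 2).sum(axis=0)
    return mean, var

# Toy data, unrelated to the AlpineN2 problem above
x_train = np.linspace(0.0, 1.0, 8)[:, None]
y_train = np.sin(2 * np.pi * x_train).ravel()
x_test = np.linspace(0.0, 1.0, 50)[:, None]

mean, var = gp_posterior(x_train, y_train, x_test)
# Clip round-off negatives before taking the square root
std = np.sqrt(np.clip(var, 0.0, None))
lower, upper = mean - 2 * std, mean + 2 * std
```

The variance shrinks to nearly zero at the training inputs and grows between them, which is why the shaded band in a GP plot pinches at the data points.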

Let's see what the prediction looks like.

In [None]:
plt.plot(x_plot, y_plot, 'b--', label='Exact')
plt.scatter(train_data.data['input'], train_data.data['output'], c='b', label='Training data')
plt.plot(x_plot, mean, color='purple', label='Prediction')
plt.fill_between(x_plot.flatten(), lcb.flatten(), ucb.flatten(), color='purple', alpha=.25, label='Confidence')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()

## 5. Exercises
1. Replace the cosine GP kernel with an RBF kernel (`gpytorch.kernels.RBFKernel()`). What do you notice?
2. Change the function to regress from AlpineN2 to the Schwefel function (`f3dasm.functions.Schwefel`). What do you notice?
3. Increase the number of data points from `15` to a higher number $\le$ `150`. Does the GP regress well?