# Emulating a toy model
In this tutorial, we will be emulating the toy model below:
$$ f(x_0, x_1, x_2, x_3) = \begin{bmatrix}x_0x_1+x_2x_3 \\ (x_0+1)^{x_1}-(x_2+1)^{x_3}\end{bmatrix} \ . $$
This model takes four inputs and yields two outputs.

We need to import `skygp.GaussianEmulator`. First, we append the `skygp` library to `sys.path`:

In [None]:
import sys
sys.path.append('../skygp/')

**Skip this step, unless you are using Google Colab**<br>
If you are on Google Colab, you first need to `git clone` the repository.

In [None]:
# !rm -rf SkyrmeGaussianProcess/
# !git clone https://github.com/Fanurs/SkyrmeGaussianProcess.git
# sys.path.append('./SkyrmeGaussianProcess/skygp/')

Then, we import `skygp.GaussianEmulator` as well as other useful libraries.

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import GaussianEmulator as gmu

## (1) Create some training data for Gaussian emulator

Translate the toy model's analytic formula into a Python function:

In [None]:
n_inputs, n_outputs = 4, 2
def toy(x):
    y = []
    y.append(x[0]*x[1] + x[2]*x[3])
    y.append((x[0]+1)**x[1] - (x[2]+1)**x[3])
    return np.array(y)
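
As a quick sanity check, the function can be evaluated at a single point and compared with a hand calculation. The point $(1, 2, 3, 4)$ below is an arbitrary choice for illustration, not a value taken from the tutorial's data. By hand: $1\cdot 2 + 3\cdot 4 = 14$ and $(1+1)^2 - (3+1)^4 = 4 - 256 = -252$.

```python
import numpy as np

def toy(x):
    # same formula as the toy model above
    return np.array([x[0]*x[1] + x[2]*x[3],
                     (x[0]+1)**x[1] - (x[2]+1)**x[3]])

# arbitrary test point (illustrative only)
x_point = np.array([1.0, 2.0, 3.0, 4.0])
y_point = toy(x_point)
print(y_point)  # first component 14, second component -252
```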

Prepare some training data.

**Typical constraint:** In general, the more training data we have, the easier it is to train the Gaussian emulator well. However, the reason we use a Gaussian emulator, at least for this particular project, is that the training data are computationally expensive to obtain. A reasonable number of training points would be of the order of $10^2$ or fewer. We will use $30$ training points in this tutorial.

In [None]:
n_training = 30
np.random.seed(0) # fix seed for reproducibility
x_train = np.random.random(size=(n_training, n_inputs))
y_train = toy(x_train.T).T
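
The double transpose deserves a short explanation. `toy` indexes its argument as `x[0]`, ..., `x[3]`, so it expects the inputs along the first axis. Passing `x_train.T` (shape `(n_inputs, n_training)`) evaluates all points at once, and the final `.T` brings the result back to shape `(n_training, n_outputs)`. The sketch below uses a small batch of 5 points, an arbitrary size chosen for illustration, to check that this agrees with evaluating the model row by row.

```python
import numpy as np

n_inputs, n_outputs = 4, 2

def toy(x):
    # same formula as the toy model above
    return np.array([x[0]*x[1] + x[2]*x[3],
                     (x[0]+1)**x[1] - (x[2]+1)**x[3]])

rng = np.random.RandomState(0)
x_batch = rng.random_sample(size=(5, n_inputs))  # 5 points, 4 inputs each

# vectorized evaluation: inputs along the first axis, then transpose back
y_batch = toy(x_batch.T).T

# reference evaluation, one point at a time
y_rows = np.array([toy(row) for row in x_batch])

print(y_batch.shape)                  # (5, 2)
print(np.allclose(y_batch, y_rows))   # True
```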

Let's have a look at the typical values of the outputs:

In [None]:
pd.DataFrame(y_train, columns=[('y%d' % i) for i in range(n_outputs)]).describe()

The training data are typically not free from uncertainty. So here we introduce some noise to `y_train`:

In [None]:
y_train += np.random.normal(loc=0.0, scale=0.02, size=y_train.shape)

## (2) Train the Gaussian emulator

Construct a Gaussian emulator object:

In [None]:
emulator = gmu.GaussianEmulator()

Adjust the relevant parameters before training:

In [None]:
# number of training iterations
emulator.set_niterations(10)

Do the training/fitting:

In [None]:
emulator.fit(x_train, y_train)
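
To give a feel for what fitting a Gaussian-process emulator involves, here is a minimal NumPy sketch of Gaussian-process regression. It is not the `skygp` implementation. The names `rbf_kernel` and `gp_posterior_mean` are hypothetical, the squared-exponential kernel is an assumed choice, and the hyperparameters (`length_scale=0.5`, `variance=1.0`, `noise_std=0.02`) are fixed by hand rather than optimized during training.

```python
import numpy as np

def toy(x):
    # same formula as the toy model above
    return np.array([x[0]*x[1] + x[2]*x[3],
                     (x[0]+1)**x[1] - (x[2]+1)**x[3]])

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    # squared-exponential kernel between the rows of a and the rows of b
    # (hypothetical helper; hyperparameters are assumed, not fitted)
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior_mean(x_train, y_train, x_new, noise_std=0.02):
    # one GP per output column, all sharing the same kernel;
    # the training mean is subtracted so that a zero-mean prior is reasonable
    y_mean = y_train.mean(axis=0)
    K = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    L = np.linalg.cholesky(K)  # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    return rbf_kernel(x_new, x_train) @ alpha + y_mean

# noisy training data generated the same way as in the tutorial
rng = np.random.RandomState(0)
x_demo = rng.random_sample(size=(30, 4))
y_demo = toy(x_demo.T).T + rng.normal(scale=0.02, size=(30, 2))

# posterior mean evaluated back at the training inputs
y_fit = gp_posterior_mean(x_demo, y_demo, x_demo)
print(np.abs(y_fit - y_demo).max(axis=0))  # largest in-sample residual per output
```

Because of the noise term on the diagonal of `K`, the posterior mean does not pass exactly through the noisy training points; it smooths them instead.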

## (3) Inspect results

Check the emulated results against the training data. Here we choose to inspect the `ix=0` component of `x_train`, i.e. the input $x_0$.

In [None]:
fig, ax = plt.subplots(dpi=100, figsize=(4,3))
emulator.inspect_training_xslice(ix=0, ax=ax)
ax.legend()
ax.set_xlim(0,1)
plt.show()

Since we have the exact toy model, we can also compare the emulator's predictions with the true outputs at points outside `x_train`.

In [None]:
# scan x0 over [0, 1] while holding x1 = 0.1, x2 = 0.5, x3 = 0.5 fixed
x_check = np.concatenate(([np.linspace(0,1,100)],
                          [0.1*np.ones(100)],
                          [0.5*np.ones(100)],
                          [0.5*np.ones(100)]),
                         axis=0).T
y_pred = emulator.predict(x_check)
y_true = toy(x_check.T).T

In [None]:
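# solid black: emulator predictions; dashed red: exact toy model (one curve per output)
plt.plot(x_check[:,0], y_pred, color='black')
plt.plot(x_check[:,0], y_true, color='red', linestyle='dashed')
plt.xlabel('x0')
plt.show()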
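
Beyond the visual comparison, the agreement can be summarized by a single number per output, such as the root-mean-square error (RMSE). The helper `rmse_per_output` below is a hypothetical name, and the sketch assumes that the predictions and the true values are both arrays of shape `(n_points, n_outputs)`. The demo arrays are made up for illustration; in the notebook one would pass `y_pred` and `y_true` instead.

```python
import numpy as np

def rmse_per_output(y_pred, y_true):
    # root-mean-square error over the points, computed separately per output column
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return np.sqrt(np.mean((y_pred - y_true)**2, axis=0))

# made-up example: predictions offset by +0.1 and -0.2 in the two outputs
y_true_demo = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
y_pred_demo = y_true_demo + np.array([0.1, -0.2])

rmse_demo = rmse_per_output(y_pred_demo, y_true_demo)
print(rmse_demo)  # approximately 0.1 and 0.2
```

Comparing the RMSE with the injected noise level (`scale=0.02`) gives a rough sense of whether the remaining error comes mainly from the noise or from the emulator itself.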