In [None]:
%matplotlib inline

# Radial basis function (RBF) regression

An [RBFRegressor][gemseo.mlearning.regression.algos.rbf.RBFRegressor] is an RBF model
based on [SciPy](https://scipy.org).

!!! info "See also"
    You can find more information about RBF models on
    [this Wikipedia page](https://en.wikipedia.org/wiki/Radial_basis_function_interpolation).


In [None]:
from __future__ import annotations

from matplotlib import pyplot as plt
from numpy import array

from gemseo import create_design_space
from gemseo import create_discipline
from gemseo import sample_disciplines
from gemseo.mlearning import create_regression_model
from gemseo.mlearning.regression.algos.rbf_settings import RBF

## Problem

In this example,
we represent the function $f(x)=(6x-2)^2\sin(12x-4)$
by the [AnalyticDiscipline][gemseo.disciplines.analytic.AnalyticDiscipline].

!!! quote "References"
    Alexander I. J. Forrester, Andras Sobester, and Andy J. Keane.
    Engineering design via surrogate modelling: a practical guide. Wiley, 2008.



In [None]:
discipline = create_discipline(
    "AnalyticDiscipline",
    name="f",
    expressions={"y": "(6*x-2)**2*sin(12*x-4)"},
)

We seek to approximate it over the following input space:



In [None]:
input_space = create_design_space()
input_space.add_variable("x", lower_bound=0.0, upper_bound=1.0)

To do this,
we create a training dataset with 6 equispaced points:



In [None]:
training_dataset = sample_disciplines(
    [discipline], input_space, "y", algo_name="PYDOE_FULLFACT", n_samples=6
)
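
As a sanity check, the reference function and the training inputs can be reproduced in plain Python. This sketch assumes that the 6 full-factorial levels are evenly spaced and include both bounds of the input space; the names `f`, `x_train` and `y_train` are illustrative and not part of GEMSEO.

```python
from math import sin


def f(x):
    """Evaluate the reference function (6x - 2)^2 sin(12x - 4)."""
    return (6 * x - 2) ** 2 * sin(12 * x - 4)


n_samples = 6
lower_bound, upper_bound = 0.0, 1.0

# Assumption: the levels are evenly spaced and include both bounds.
step = (upper_bound - lower_bound) / (n_samples - 1)
x_train = [lower_bound + i * step for i in range(n_samples)]
y_train = [f(x) for x in x_train]

for x, y in zip(x_train, y_train):
    print(f"x = {x:.1f}, y = {y:+.4f}")
```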

## Basics

### Training

Then,
we train an RBF regression model from these samples:



In [None]:
model = create_regression_model("RBFRegressor", training_dataset)
model.learn()
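
In broad terms, an RBF model is a weighted sum of radial functions centered at the training points, with weights chosen so that the model matches the training outputs. The dependency-free sketch below illustrates this idea with the multiquadric function $\sqrt{(r/\epsilon)^2 + 1}$. It is not how `RBFRegressor` is implemented: it ignores any data scaling or other preprocessing the library may apply, and the helper names and the `epsilon` value are illustrative assumptions.

```python
from math import sin, sqrt


def multiquadric(r, epsilon):
    """Evaluate the multiquadric RBF sqrt((r / epsilon)^2 + 1)."""
    return sqrt((r / epsilon) ** 2 + 1)


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so that the inputs are left untouched.
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(rows[i][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for row in range(col + 1, n):
            factor = rows[row][col] / rows[col][col]
            for k in range(col, n + 1):
                rows[row][k] -= factor * rows[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        known = sum(rows[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (rows[row][n] - known) / rows[row][row]
    return solution


def fit(x_train, y_train, epsilon):
    """Compute the weights w solving sum_j w_j * phi(|x_i - x_j|) = y_i."""
    phi = [[multiquadric(abs(xi - xj), epsilon) for xj in x_train] for xi in x_train]
    return solve(phi, y_train)


def predict(x, x_train, weights, epsilon):
    """Evaluate the weighted sum of RBFs centered at the training points."""
    return sum(
        w * multiquadric(abs(x - xj), epsilon) for w, xj in zip(weights, x_train)
    )


# Six equispaced points in [0, 1], assuming both bounds are included.
x_train = [i / 5 for i in range(6)]
y_train = [(6 * x - 2) ** 2 * sin(12 * x - 4) for x in x_train]
epsilon = 0.2  # Illustrative value, not necessarily the library default.
weights = fit(x_train, y_train, epsilon)
print(predict(0.65, x_train, weights, epsilon))
```

By construction, this sketch reproduces the training outputs exactly (up to rounding) at the training inputs.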

### Prediction

Once the model is trained,
we can predict the output value of $f$ at a new input point:



In [None]:
input_value = {"x": array([0.65])}
output_value = model.predict(input_value)
output_value

as well as its Jacobian value:



In [None]:
jacobian_value = model.predict_jacobian(input_value)
jacobian_value
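
To judge the predicted Jacobian, it can be compared with the derivative of the reference function, $f'(x) = 12(6x-2)\sin(12x-4) + 12(6x-2)^2\cos(12x-4)$. The sketch below evaluates this derivative at the same input and cross-checks it against a central finite difference; the function names and the step size are illustrative choices.

```python
from math import cos, sin


def f(x):
    """Evaluate the reference function (6x - 2)^2 sin(12x - 4)."""
    return (6 * x - 2) ** 2 * sin(12 * x - 4)


def df(x):
    """Evaluate the analytical derivative of f (product and chain rules)."""
    u = 6 * x - 2
    return 12 * u * sin(12 * x - 4) + 12 * u**2 * cos(12 * x - 4)


def central_difference(func, x, step=1e-6):
    """Approximate the derivative of func at x by a central finite difference."""
    return (func(x + step) - func(x - step)) / (2 * step)


x = 0.65
print(df(x), central_difference(f, x))
```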

### Plotting

The plot below shows that the RBF model is fairly accurate on the right part of the interval but poor on the left:



In [None]:
test_dataset = sample_disciplines(
    [discipline], input_space, "y", algo_name="PYDOE_FULLFACT", n_samples=100
)
input_data = test_dataset.get_view(variable_names=model.input_names).to_numpy()
reference_output_data = test_dataset.get_view(variable_names="y").to_numpy().ravel()
predicted_output_data = model.predict(input_data).ravel()
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.grid()
plt.legend()
plt.show()

## Settings

The [RBFRegressor][gemseo.mlearning.regression.algos.rbf.RBFRegressor] has many options
defined in the [RBFRegressor_Settings][gemseo.mlearning.regression.algos.rbf_settings.RBFRegressor_Settings] Pydantic model.

### Function

The default RBF is the multiquadric function $\sqrt{(r/\epsilon)^2 + 1}$,
which depends on a radius $r$ representing the distance between two points
and on an adjustable constant $\epsilon$.
The RBF can be changed using the `function` option,
which can be either an [RBF][gemseo.mlearning.regression.algos.rbf_settings.RBF] value:



In [None]:
model = create_regression_model("RBFRegressor", training_dataset, function=RBF.GAUSSIAN)
model.learn()
predicted_output_data_g = model.predict(input_data).ravel()

or a Python function:



In [None]:
def rbf(self, r: float) -> float:
    """Evaluate a cubic RBF.

    An RBF must take 2 arguments, namely `(self, r)`.

    Args:
        r: The radius.

    Returns:
        The RBF value.
    """
    return r**3


model = create_regression_model("RBFRegressor", training_dataset, function=rbf)
model.learn()
predicted_output_data_c = model.predict(input_data).ravel()

We can see that the predictions are different:



In [None]:
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.plot(input_data.ravel(), predicted_output_data_g, label="Regression - Gaussian RBF")
plt.plot(input_data.ravel(), predicted_output_data_c, label="Regression - Cubic RBF")
plt.grid()
plt.legend()
plt.show()

### Epsilon

Some RBFs depend on an `epsilon` parameter
whose default value is the average distance between input data.
This is the case for the `"multiquadric"`, `"gaussian"` and `"inverse"` RBFs.
For example,
we can train a first multiquadric RBF model with `epsilon` set to 0.5,



In [None]:
model = create_regression_model("RBFRegressor", training_dataset, epsilon=0.5)
model.learn()
predicted_output_data_1 = model.predict(input_data).ravel()

a second one with an `epsilon` set to 1.0:



In [None]:
model = create_regression_model("RBFRegressor", training_dataset, epsilon=1.0)
model.learn()
predicted_output_data_2 = model.predict(input_data).ravel()

and a last one with an `epsilon` set to 2.0:



In [None]:
model = create_regression_model("RBFRegressor", training_dataset, epsilon=2.0)
model.learn()
predicted_output_data_3 = model.predict(input_data).ravel()

and see that this parameter controls the regularity of the regression model:



In [None]:
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.plot(input_data.ravel(), predicted_output_data_1, label="Regression - Epsilon(0.5)")
plt.plot(input_data.ravel(), predicted_output_data_2, label="Regression - Epsilon(1)")
plt.plot(input_data.ravel(), predicted_output_data_3, label="Regression - Epsilon(2)")
plt.grid()
plt.legend()
plt.show()
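
The three `epsilon`-dependent functions can be sketched as plain Python functions. The multiquadric formula is the one given earlier in this example; the Gaussian and inverse formulas are those documented for SciPy's legacy `Rbf` class, and we assume here that the corresponding `RBF` options use the same definitions. The radius value is illustrative.

```python
from math import exp, sqrt


def multiquadric(r, epsilon):
    """Evaluate sqrt((r / epsilon)^2 + 1)."""
    return sqrt((r / epsilon) ** 2 + 1)


def inverse(r, epsilon):
    """Evaluate 1 / sqrt((r / epsilon)^2 + 1)."""
    return 1 / sqrt((r / epsilon) ** 2 + 1)


def gaussian(r, epsilon):
    """Evaluate exp(-(r / epsilon)^2)."""
    return exp(-((r / epsilon) ** 2))


r = 0.2  # Illustrative radius.
for epsilon in (0.5, 1.0, 2.0):
    print(epsilon, multiquadric(r, epsilon), inverse(r, epsilon), gaussian(r, epsilon))
```

As `epsilon` grows, each function stays closer to its value at $r=0$ (namely 1), so the basis functions become flatter.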

### Smooth

By default,
an RBF model interpolates the training points.
This behavior is controlled by the `smooth` option, whose default value is 0.
We can increase the smoothness of the model by increasing this value:



In [None]:
model = create_regression_model("RBFRegressor", training_dataset, smooth=0.1)
model.learn()
predicted_output_data_ = model.predict(input_data).ravel()

and see that the model no longer interpolates the training points:



In [None]:
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.plot(input_data.ravel(), predicted_output_data_, label="Regression - Smooth")
plt.grid()
plt.legend()
plt.show()
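
To see why a positive smoothing value breaks interpolation, consider a textbook formulation in which the weights $w$ solve $(\Phi + \lambda I)w = y$, where $\Phi$ contains the RBF values between training points and $\lambda$ is the smoothing value. The sketch below applies it to two hypothetical training points with a Gaussian RBF. It is not the library's implementation, and the sign and scaling conventions used for `smooth` in SciPy may differ.

```python
from math import exp


def fit_two_points(y1, y2, g, smooth):
    """Solve [[1 + smooth, g], [g, 1 + smooth]] w = [y1, y2] by Cramer's rule."""
    d = 1 + smooth
    det = d * d - g * g
    return (d * y1 - g * y2) / det, (d * y2 - g * y1) / det


# Two hypothetical training points at distance 1, with illustrative outputs.
y1, y2 = 1.0, 3.0
# Gaussian RBF value at r = 1 with epsilon = 1; the value at r = 0 is 1.
g = exp(-1.0)

for smooth in (0.0, 0.1):
    w1, w2 = fit_two_points(y1, y2, g, smooth)
    # Model values at the two training points.
    fitted = (w1 + g * w2, g * w1 + w2)
    print(smooth, fitted)
```

With `smooth = 0.0` the fitted values equal the training outputs; with `smooth = 0.1` they differ from them by $\lambda w$.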

## Thin plate spline (TPS)

TPS regression is a specific case of RBF regression
where the RBF is the thin plate function $r^2\log(r)$.
The [TPSRegressor][gemseo.mlearning.regression.algos.thin_plate_spline.TPSRegressor] class
deriving from [RBFRegressor][gemseo.mlearning.regression.algos.rbf.RBFRegressor]
implements this case:



In [None]:
model = create_regression_model("TPSRegressor", training_dataset)
model.learn()
predicted_output_data_ = model.predict(input_data).ravel()

We can see the difference between this model
and the default multiquadric RBF model:



In [None]:
plt.plot(input_data.ravel(), reference_output_data, label="Reference")
plt.plot(input_data.ravel(), predicted_output_data, label="Regression - Basics")
plt.plot(input_data.ravel(), predicted_output_data_, label="Regression - TPS")
plt.grid()
plt.legend()
plt.show()
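
The thin plate function itself is easy to write down. In the sketch below, the function name is illustrative and the value at $r=0$ is set to the limit of $r^2\log(r)$, which is 0, since `log(0)` is undefined.

```python
from math import e, log


def thin_plate(r):
    """Evaluate r^2 log(r), extended by its limit 0 at r = 0."""
    return 0.0 if r == 0 else r * r * log(r)


for r in (0.0, 0.5, 1.0, 2.0):
    print(r, thin_plate(r))
```

Unlike the multiquadric and Gaussian functions, this function has no `epsilon` parameter, and it is negative for $0 < r < 1$.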

The [TPSRegressor][gemseo.mlearning.regression.algos.thin_plate_spline.TPSRegressor]
can be customized with the [TPSRegressor_Settings][gemseo.mlearning.regression.algos.thin_plate_spline_settings.TPSRegressor_Settings].

