Benefit of the gradients on the 2-dimensional Rosenbrock function #388

Closed · BenoitPauwels opened this issue Nov 24, 2022 · 9 comments

@BenoitPauwels

Hello,

I'm experimenting with KPLS and GEKPLS on the 2-dimensional Rosenbrock function by measuring their prediction accuracy (in terms of relative L2-distance).
In the results I get, GEKPLS does not seem to benefit from the gradients.
I was expecting GEKPLS to be significantly more accurate than KPLS.
You'll find my script and the results below. Am I doing something wrong?

from matplotlib import pyplot
from scipy.linalg import norm
from smt.surrogate_models import GEKPLS
from smt.surrogate_models import KPLS
from smt.problems import Rosenbrock
from smt.sampling_methods import LHS


class PredictionAccuracy:
    """Prediction accuracy of GEKPLS."""
    MODELS_CLASSES = [KPLS, GEKPLS]

    def __init__(self, problem, validation_size):
        self.__problem = problem
        self.__sampling_method = LHS(xlimits=problem.xlimits)
        self.__validation_inputs = self.__sampling_method(validation_size)
        self.__validation_outputs = self.__problem(self.__validation_inputs)

    def plot(self, sample_size, number_of_trainings, **options):
        """Plot the distributions of the prediction accuracy."""
        measures = self.__train_models(sample_size, number_of_trainings, **options)
        pyplot.boxplot(
            [measures[model_class] for model_class in self.MODELS_CLASSES],
            labels=[model_class.__name__ for model_class in self.MODELS_CLASSES],
            showmeans=True,
        )
        pyplot.ylim(bottom=0)
        pyplot.title("Distribution of the prediction accuracy")
        pyplot.savefig("prediction_accuracy.png")
        pyplot.close()

    def __train_models(self, sample_size, number_of_trainings, **options):
        """Train models and measure their prediction accuracy."""
        accuracy = {model_class: [] for model_class in self.MODELS_CLASSES}
        for _ in range(number_of_trainings):

            # Generate a sample of training inputs
            training_inputs = self.__sampling_method(sample_size)

            # Train the surrogate models and measure their prediction accuracy
            for model_class in self.MODELS_CLASSES:
                accuracy[model_class].append(
                    self.__measure_accuracy(
                        self.__train_model(
                            model_class,
                            training_inputs,
                            self.__problem(training_inputs),
                            **options
                        )
                    )
                )

        return accuracy

    def __train_model(self, model_class, inputs, outputs, **options):
        """Train a surrogate model."""
        # Prepare the options of the surrogate model
        model_options = dict(options)
        if model_class == GEKPLS:
            model_options["xlimits"] = self.__problem.xlimits

        # Set the training data
        model = model_class(**model_options)
        model.set_training_values(inputs, outputs)
        if model_class == GEKPLS:
            # Set the training derivatives
            for index in range(self.__problem.options["ndim"]):
                model.set_training_derivatives(
                    inputs, self.__problem(inputs, kx=index), index
                )

        # Train the model
        model.train()
        return model

    def __measure_accuracy(self, model):
        """Measure the prediction accuracy of a surrogate model."""
        return norm(
            model.predict_values(self.__validation_inputs) - self.__validation_outputs
        ) / norm(self.__validation_outputs)


if __name__ == "__main__":
    # Train the surrogate models and measure their prediction accuracy.
    PredictionAccuracy(Rosenbrock(ndim=2), validation_size=1000).plot(
        sample_size=10,
        number_of_trainings=20,
        n_comp=2,
        n_start=20,
    )

[figure: prediction_accuracy.png — boxplots of the relative L2 errors of KPLS and GEKPLS]

I've tried changing the number of starting points and switching the optimizer to TNC but the results were similar.

Thank you for your time,
Benoît

@relf (Member) commented Dec 7, 2022

Hi Benoit. Maybe Rosenbrock is not the best function to make GEK shine. The surface changes smoothly, plain kriging already fits it reasonably well, and the derivatives do not bring much more information (at least they do not worsen the prediction 😅). I would say that the benefits should show up with a function that changes more strongly between training points.
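
For instance, one could re-run the same experiment on a less smooth test problem. A rough sketch, reusing the PredictionAccuracy class from the script above and assuming smt.problems.TensorProduct (with its func option) is available in the installed SMT version:

from smt.problems import TensorProduct

# Same experiment as above, but on a tensor-product test function that, depending
# on its parameters, varies more sharply between sample points than Rosenbrock.
PredictionAccuracy(TensorProduct(ndim=2, func="cos"), validation_size=1000).plot(
    sample_size=10,
    number_of_trainings=20,
    n_comp=2,
    n_start=20,
)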

@BenoitPauwels (Author)

Hi Rémi,

Thank you for your answer.

On the figure attached to my previous message, we can see that the median of the relative L2 errors of both KPLS and GEKPLS is about 50%. So neither KPLS nor GEKPLS seems to fit the Rosenbrock function very well.

You will find below a graph of the KPLS model that achieves median accuracy and of the GEKPLS model trained on the same inputs (in orange; they look almost identical), together with the Rosenbrock function (in blue).
[figure: models — KPLS and GEKPLS predictions (orange) vs. the Rosenbrock function (blue)]
As you can see, the KPLS model does not fit the Rosenbrock function very well; there is clearly room for improvement. Adding the derivatives to the training data with GEKPLS does not really improve the accuracy.

Isn't it surprising? I would expect the derivatives to bring a lot of information, especially for such a smooth function.

Thank you for your time,
Benoît

@relf (Member) commented Dec 15, 2022

We have detected some problems in GEKPLS in SMT GitHub master, not present in SMT 1.3. Do you use SMT 1.3?

@BenoitPauwels (Author)

Yes, I use SMT 1.3.

@Paul-Saves (Contributor) commented Dec 15, 2022

Hello Benoit,

  • KPLS and GEKPLS are meant to reduce the dimension of the problem. Here you chose n_comp=2 but Rosenbrock(ndim=2). Therefore, the PLS matrices are just the identity matrix, up to numerical errors, for both KPLS and GEKPLS (see the sketch below).

  • You chose to build a model with only 10 points (sample_size=10). But the Rosenbrock function grows quickly on the sides. Therefore, if you have no points in those zones, the model just can't predict them.

=> What you are comparing is therefore a prediction error related to the sample size and some noise from an ill-conditioned PLS matrix.
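
To illustrate the first point, here is a rough sketch with scikit-learn's PLSRegression, which is assumed here to behave like the PLS step inside KPLS: with as many components as input dimensions, the fitted rotation is a full-rank 2x2 matrix, i.e. a change of basis rather than an actual reduction.

from sklearn.cross_decomposition import PLSRegression
from smt.problems import Rosenbrock
from smt.sampling_methods import LHS

problem = Rosenbrock(ndim=2)
inputs = LHS(xlimits=problem.xlimits)(10)
outputs = problem(inputs)

# With n_components equal to the number of inputs, no direction is dropped:
# the "reduced" space is just a (numerically noisy) re-combination of x1 and x2.
pls = PLSRegression(n_components=2).fit(inputs, outputs)
print(pls.x_rotations_)  # full-rank 2x2 matrix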

If you choose an actual dimension reduction, from Rosenbrock(ndim=5) to n_comp=2, and a sample size of 50 points, you obtain the result below.

PredictionAccuracy(Rosenbrock(ndim=5), validation_size=1000).plot(
    sample_size=50,
    number_of_trainings=20,
    n_comp=2,
    n_start=20,
)
[figure: prediction_accuracy.png — boxplots for Rosenbrock(ndim=5) with sample_size=50]

So GEKPLS helps, but it needs a certain number of points and an effective dimension reduction in the model.

@BenoitPauwels (Author)

Hello Paul,

Thank you for your answer.

I totally understand your point about dimension reduction, but I was more interested in observing the contribution of the derivatives.

I chose a relatively small training sample on purpose: I wanted to measure the contribution of the derivatives when the data is scarce. I hoped that the derivatives could compensate for the data scarcity to some extent.
I tried doubling the size of the training sample for the 2-dimensional Rosenbrock function: KPLS becomes more accurate than GEKPLS.
[figure: prediction_accuracy.png — boxplots for the 2-dimensional case with a doubled training sample]
It's counterintuitive to me that the derivatives worsen the accuracy. (Maybe it comes from ill-conditioning.)

Thank you for the 5-dimensional example. I observe that we gain only 5% in accuracy at the cost of computing 50 gradients.

I take note that GEKPLS requires an effective dimension reduction.

Cheers,
Benoît

@Paul-Saves (Contributor)

Hi,

I totally understand your point. What you want to compare is kriging vs. gradient-enhanced kriging (GEK). In that case, we should indeed obtain better performance with GEK; you are totally right!

Unfortunately, GEK is not implemented in SMT (it has a lot of limitations for a large number of points or a high-dimensional problem).

You are also right about the bad accuracy: as we add more badly approximated directions with GEKPLS, the numerical errors add up and the accuracy decreases.

In fact, GEKPLS is based on the derivatives along the PLS-reduced principal components. In this case, not only are the PLS directions ill-conditioned, but the method does not correspond to GEK either, since GEK does not use any dimension reduction, at the price of being more expensive.

https://arxiv.org/pdf/1708.02663.pdf
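
If you still want to feed more of the gradient information into the model, GEKPLS has (to the best of my knowledge, in SMT 1.3) the extra_points and delta_x options, which add extra first-order Taylor points along the PLS directions. A rough sketch:

from smt.surrogate_models import GEKPLS
from smt.problems import Rosenbrock
from smt.sampling_methods import LHS

problem = Rosenbrock(ndim=5)
inputs = LHS(xlimits=problem.xlimits)(50)
outputs = problem(inputs)

model = GEKPLS(
    n_comp=2,
    xlimits=problem.xlimits,
    extra_points=1,  # extra Taylor-approximated point per training point (assumed option)
    delta_x=1e-2,    # step of the first-order Taylor approximation (assumed option)
)
model.set_training_values(inputs, outputs)
for kx in range(problem.options["ndim"]):
    model.set_training_derivatives(inputs, problem(inputs, kx=kx), kx)
model.train()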

@BenoitPauwels (Author)

Thank you Paul for the explanation.
Cheers,
Benoît

@relf (Member) commented Jan 11, 2023

Thanks Paul!

relf closed this as completed on Jan 11, 2023