
Figure is shown incorrectly #25755

Open · Mafmax opened this issue Mar 3, 2023 · 2 comments
Labels: Documentation, Needs Investigation (Issue requires investigation)

Comments

@Mafmax commented Mar 3, 2023

Describe the issue linked to the documentation

The second part of the figure doesn't show the sampled functions.

Suggest a potential alternative/fix

No response

@Mafmax added the Documentation and Needs Triage (Issue requires triage) labels Mar 3, 2023
@glemaitre (Member) commented
This is not an issue with the figure itself. It is linked to the ConvergenceWarning obtained when generating this example: https://scikit-learn.org/stable/auto_examples/gaussian_process/plot_gpr_prior_posterior.html

We need to investigate whether this is expected, or how we could remove the warning to get a better figure.
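
For reference, a minimal way to surface the warning locally (a sketch only; the training data below are a stand-in assumption, not copied from the gallery script):

import warnings

import numpy as np

from sklearn.exceptions import ConvergenceWarning
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct

# Stand-in training data; the example's exact X/y are assumed, not copied.
rng = np.random.RandomState(4)
X = rng.uniform(0, 5, 10).reshape(-1, 1)
y = np.sin((X[:, 0] - 2.5) ** 2)

# Dot-product kernel as defined in the example.
kernel = ConstantKernel(0.1, (0.01, 10.0)) * (
    DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2
)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)

# Record any ConvergenceWarning raised while fitting.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    gpr.fit(X, y)

for w in caught:
    print(w.message)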

@glemaitre added the Needs Investigation (Issue requires investigation) label and removed the Needs Triage (Issue requires triage) label Mar 9, 2023
@jmloyola (Member) commented

My understanding of Gaussian Processes is very limited, but I'll write down what I found in case it helps someone solve this.

It seems that the convergence problem has been present from the beginning and appears in all releases. (The rendered notebooks for v0.18, v0.19, and v0.21 don't show the warning, but when I ran the examples with v0.19 and v0.21 the warning did appear; for v0.18 I couldn't replicate the environment quickly.) The warning appears only for the dot-product kernel.

To avoid the warning, I passed an extra option to the L-BFGS-B optimizer; either one of the following works:

  • Increase the maximum number of line search steps (per iteration) to 40 (default is 20).
  • Remove the bounds.

Thus, we could edit the code for the GP with dot-product kernel like this:

import scipy.optimize
from sklearn.utils.optimize import _check_optimize_result


def my_optimizer(obj_func, initial_theta, bounds):
    opt_res = scipy.optimize.minimize(
        obj_func,
        initial_theta,
        method="L-BFGS-B",
        jac=True,  # obj_func returns both the objective value and its gradient
        bounds=bounds,  # the other option: pass bounds=None here instead of raising maxls
        options={"maxls": 40},  # raise the max line-search steps per iteration (default is 20)
    )
    _check_optimize_result("lbfgs", opt_res)
    return opt_res.x, opt_res.fun

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct

kernel = ConstantKernel(0.1, (0.01, 10.0)) * (
    DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2
)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, optimizer=my_optimizer)

...
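
For completeness, a sketch of how the quantities reported below can be printed after fitting. It reuses `gpr` and `kernel` from the snippet above; `X_train`/`y_train` are hypothetical stand-ins for the example's data:

import numpy as np

# Hypothetical training data; the example's exact X/y are not reproduced here.
rng = np.random.RandomState(4)
X_train = rng.uniform(0, 5, 10).reshape(-1, 1)
y_train = np.sin((X_train[:, 0] - 2.5) ** 2)

gpr.fit(X_train, y_train)  # `gpr` and `kernel` from the snippet above
print("Kernel parameters before fit:", kernel)
print("Kernel parameters after fit:", gpr.kernel_)
print("Log-likelihood: %.3f" % gpr.log_marginal_likelihood(gpr.kernel_.theta))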

Nevertheless, the GP fit does not appear to be good. It is actually fairly similar to the GP that raises the warning.

With the warning, the GP has:

Kernel parameters before fit:
0.316**2 * DotProduct(sigma_0=1) ** 2
Kernel parameters after fit:
0.674**2 * DotProduct(sigma_0=2.13) ** 2
Log-likelihood: -7957695978.947

[figure: GP fit with the warning]

Without the warning, the GP has:

Kernel parameters before fit:
0.316**2 * DotProduct(sigma_0=1) ** 2
Kernel parameters after fit:
0.674**2 * DotProduct(sigma_0=2.13) ** 2
Log-likelihood: -7957695978.947

[figure: GP fit without the warning]

On the other hand, the mean function from the GP posterior rendered in the notebook is different from what I got when I ran it.
I ran the code to replicate the graphs and these were the results:

[figure: posterior mean and samples from my local run]

While the rendered notebook shows:

[figure: posterior mean and samples from the rendered notebook]
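
For anyone wanting to reproduce the comparison, a minimal sketch of how the posterior mean and samples can be regenerated, reusing the fitted `gpr` from above (the evaluation range is an assumption):

import matplotlib.pyplot as plt
import numpy as np

X_plot = np.linspace(0, 5, 100).reshape(-1, 1)  # assumed evaluation range
y_mean, y_std = gpr.predict(X_plot, return_std=True)  # posterior mean and std
y_samples = gpr.sample_y(X_plot, n_samples=5, random_state=0)  # posterior draws

plt.plot(X_plot, y_mean, color="black", label="posterior mean")
plt.plot(X_plot, y_samples, alpha=0.5)
plt.fill_between(X_plot.ravel(), y_mean - y_std, y_mean + y_std, alpha=0.2)
plt.legend()
plt.show()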

Is this behavior expected?
