
Prevent division by zero in GPR when y_train is constant #19703

Merged
merged 10 commits into scikit-learn:main on Apr 21, 2021

Conversation

afonari
Contributor

@afonari afonari commented Mar 17, 2021

This PR merges two earlier PRs: #18388 and #19361.
It fixes: #18318.
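For context, the failure mode and the shape of the guard can be sketched in a few lines of NumPy. This is a simplified illustration, not the exact patch (the real change lives in `GaussianProcessRegressor.fit` when `normalize_y=True`, and the helper name below is hypothetical): when `y_train` is constant its standard deviation is zero, so normalizing by it produces NaNs; substituting 1.0 for a zero std leaves the centered column at exactly zero instead.

```python
import numpy as np

def safe_normalize(y_train):
    # Hypothetical helper illustrating the guard: replace a zero
    # standard deviation with 1.0 before dividing, so a constant
    # target normalizes to all zeros instead of NaN.
    y_mean = np.mean(y_train, axis=0)
    y_std = np.std(y_train, axis=0)
    y_std = np.where(y_std == 0.0, 1.0, y_std)
    return (y_train - y_mean) / y_std

y = np.full(6, 2.0)         # constant training target, std == 0
y_norm = safe_normalize(y)  # finite everywhere, no division by zero
```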

@ogrisel
Member

ogrisel commented Mar 17, 2021

Also please fix the linting problem reported by the CI and expand the test by testing with multi-target data: a Y matrix where, for instance, one column is a constant 2.0 and the other is random normal data:

import numpy as np  # needed by the snippet below

n_samples = X.shape[0]  # X is the training input defined in the test module
rng = np.random.RandomState(0)
Y = np.concatenate([
    rng.normal(size=(n_samples, 1)),  # non-constant target
    np.full(shape=(n_samples, 1), fill_value=2.0)  # constant target
], axis=1)
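Made self-contained (the snippet above relies on `X` from the test module; the sample count here is an arbitrary stand-in), the construction and the degenerate property it exercises look like this:

```python
import numpy as np

n_samples = 10  # stands in for X.shape[0] from the test module
rng = np.random.RandomState(0)
Y = np.concatenate([
    rng.normal(size=(n_samples, 1)),                # non-constant target
    np.full(shape=(n_samples, 1), fill_value=2.0),  # constant target
], axis=1)

# The second column has zero standard deviation: exactly the case in
# which normalizing the targets used to divide by zero.
per_column_std = Y.std(axis=0)
```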

@cmarmo cmarmo added this to the 0.24.2 milestone Mar 17, 2021
Member

@ogrisel ogrisel left a comment


Thanks for improving the tests. LGTM. Just a few more suggestions below.

I am no GPR specialist so I would appreciate it if others (e.g. @jaburke166 @boricles, @sobkevich, @jmetzen, @plgreenLIRU) could have a look.

@afonari
Contributor Author

afonari commented Mar 19, 2021

Added a commented test as discussed.

Member

@ogrisel ogrisel left a comment


LGTM.

@afonari
Contributor Author

afonari commented Mar 24, 2021

Can this be merged?

Member

@cmarmo cmarmo left a comment


Last detail, then LGTM.

Co-authored-by: Chiara Marmo <cmarmo@users.noreply.github.com>
@sobkevich

> Thanks for improving the tests. LGTM. Just a few more suggestions below.
>
> I am no GPR specialist so I would appreciate it if others (e.g. @jaburke166 @boricles, @sobkevich, @jmetzen, @plgreenLIRU) could have a look.

Hi,
with the current approach I had problems with the test test_predict_cov_vs_std(kernel) if a constant y is added to it.
If this test is run with a constant y, it fails with:

(test_predict_cov_vs_std[kernel1-y1])
kernel = RBF(length_scale=1), y = array([1., 1., 1., 1., 1., 1.])

    @pytest.mark.parametrize('kernel,y',
                             list(product(kernels, [y, y_with_zero_std])))
    def test_predict_cov_vs_std(kernel, y):
        if sys.maxsize <= 2 ** 32 and sys.version_info[:2] == (3, 6):
            pytest.xfail("This test may fail on 32bit Py3.6")
    
        # Test that predicted std.-dev. is consistent with cov's diagonal.
        gpr = GaussianProcessRegressor(kernel=kernel).fit(X, y)
        y_mean, y_cov = gpr.predict(X2, return_cov=True)
        y_mean, y_std = gpr.predict(X2, return_std=True)
>       assert_almost_equal(np.sqrt(np.diag(y_cov)), y_std)
E       AssertionError: 
E       Arrays are not almost equal to 7 decimals
E       
E       Mismatch: 100%
E       Max absolute difference: 0.00182264
E       Max relative difference: inf
E        x: array([6.5100511e-06, 4.4185900e-06, 4.1690509e-06, 4.8057573e-06,
E              5.8756981e-06])
E        y: array([1.5440852e-03, 1.8270599e-03, 0.0000000e+00, 1.0005933e-05,
E              1.4597007e-03])

But I really don't know whether it is important to preserve this equality in the case where y is constant.

@afonari
Contributor Author

afonari commented Mar 30, 2021

> with the current approach I had problems with the test test_predict_cov_vs_std(kernel) if a constant y is added to it.

Note that for the fixed kernel RBF(length_scale=1.0, length_scale_bounds="fixed") it passes. I also tried setting length_scale_bounds="fixed" on the other kernels from the kernels list, and those pass too. So maybe it is expected to fail for kernels with a variable length scale.

@sobkevich

> with the current approach I had problems with the test test_predict_cov_vs_std(kernel) if a constant y is added to it.
>
> Note that for the fixed kernel RBF(length_scale=1.0, length_scale_bounds="fixed") it passes. I also tried setting length_scale_bounds="fixed" on the other kernels from the kernels list, and those pass too. So maybe it is expected to fail for kernels with a variable length scale?

Maybe :)

@afonari
Contributor Author

afonari commented Apr 6, 2021

Anything to be done here?

@afonari
Contributor Author

afonari commented Apr 19, 2021

Ping.

@glemaitre glemaitre self-requested a review April 21, 2021 15:08
Member

@glemaitre glemaitre left a comment


I merged main into the branch. LGTM. I just moved the multi-target test code into the related PR.

@glemaitre glemaitre merged commit b84afe5 into scikit-learn:main Apr 21, 2021
@afonari
Contributor Author

afonari commented Apr 21, 2021

Thanks a lot everyone!

@glemaitre
Member

The bug regarding the covariance and standard deviation is solved here: #19939

@glemaitre glemaitre added the "To backport" label (PR merged in master that needs a backport to a release branch, defined based on the milestone) Apr 21, 2021
@glemaitre glemaitre mentioned this pull request Apr 22, 2021
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Apr 22, 2021
…n#19703)

Co-authored-by: Sasha Fonari <fonari@schrodinger.com>
Co-authored-by: Chiara Marmo <cmarmo@users.noreply.github.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
glemaitre added a commit that referenced this pull request Apr 28, 2021
Co-authored-by: Sasha Fonari <fonari@schrodinger.com>
Co-authored-by: Chiara Marmo <cmarmo@users.noreply.github.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Labels
Bug, module:gaussian_process, Regression, To backport (PR merged in master that needs a backport to a release branch, defined based on the milestone)
Successfully merging this pull request may close these issues.

Regression in GP standard deviation where y_train.std() == 0
5 participants