FIX add attributes X_offset_ and X_scale_ in ARDRegression needed in predict #18607

fhaselbeck · 2020-10-12T13:55:26Z

ARDRegression is crashing during prediction when normalize is set true as in predict self.X_offset_ and self.X_scale_ are used to transform but are not set in fit method (in contrast to BayesianRidge).

Added this part to fit just like in BayesianRidge:

self.X_offset_ = X_offset_
self.X_scale_ = X_scale_

…in fit method of ARDRegression

NicolasHug

Thanks for the PR @fhaselbeck , looks good.

We would need a non-regression test for this. Also, we'll need to document the attributes in the Attributes section of the docstring.

glemaitre · 2020-10-12T15:25:59Z

FYI I posted a non-regression test in the associated issue.

…and ARDRegression

fhaselbeck · 2020-10-13T05:46:00Z

Just added the comments in the Attributes section.

I saw that there is already a test for return_std in test_bayes.py, so I did not add another one.

def test_return_std():
    # Test return_std option for both Bayesian regressors
    def f(X):
        return np.dot(X, w) + b

    def f_noise(X, noise_mult):
        return f(X) + np.random.randn(X.shape[0]) * noise_mult

    d = 5
    n_train = 50
    n_test = 10

    w = np.array([1.0, 0.0, 1.0, -1.0, 0.0])
    b = 1.0

    X = np.random.random((n_train, d))
    X_test = np.random.random((n_test, d))

    for decimal, noise_mult in enumerate([1, 0.1, 0.01]):
        y = f_noise(X, noise_mult)

        m1 = BayesianRidge()
        m1.fit(X, y)
        y_mean1, y_std1 = m1.predict(X_test, return_std=True)
        assert_array_almost_equal(y_std1, noise_mult, decimal=decimal)

        m2 = ARDRegression()
        m2.fit(X, y)
        y_mean2, y_std2 = m2.predict(X_test, return_std=True)
        assert_array_almost_equal(y_std2, noise_mult, decimal=decimal)

NicolasHug · 2020-10-13T13:11:48Z

There seem to be some linting errors (probably white spaces). Could you please check them?

We'd still need a non-regression test, for example as proposed by @glemaitre in the original issue. Alternatively you mayy parametrize the test you mentionned with

@pytest.mark.parametrize('normalize', (True, False))

(We can even parametrize it with the Estimator class)

ogrisel

LGTM assuming the comments by @NicolasHug are addressed.

ogrisel · 2020-10-14T14:25:22Z

sklearn/linear_model/_bayes.py

+        if normalized, offset subtracted for centering data to mean zero
+
+    X_scale_ : float
+        if normalized, scaling parameter for division


Do we need to add the .. versionadded:: 0.24 directive for those attributes? It's a bugfix, they should always had been there so I am not sure.

I'd say we don't want them since it's indeed a bugfix

BTW @fhaselbeck we'll also need a bugfix entry in doc/whats_new/v0.24.rst ;)

cmarmo · 2020-10-27T11:25:21Z

Hi @fhaselbeck, thanks for your work so far. Will you be able to address comments? Please let us know if you need any help.

fhaselbeck · 2020-10-27T11:41:44Z

Sorry. I am currently very busy with the preparation of a paper. I will finalize this stuff afterwards, but it may take some time.

reshamas · 2020-10-27T17:17:33Z

@fhaselbeck @ogrisel
@Mariam-ke and I are reviewing this PR for learning purposes.

1) test proposed

We looked at the request for a test. Is this what is needed?
In file: sklearn/linear_model/tests/test_bayes.py, here is a proposed test:

def test_bayesian_ard_object(): 
    # Test BayesianRegression ARD classifier 
    X = np.array([[1], [2], [6], [8], [10]]) 
    y = np.array([1, 2, 6, 8, 10]) 
    clf = ARDRegression(normalize=True) 
    clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2]) 
    clf.predict([[1, 1]], return_std=True) 
    
    # Check that the model could approximately learn the identity function 
    test = [[1], [3], [4]] 
    assert_array_almost_equal(clf.predict(test), [1, 3, 4], 2)

2) bug fix entry proposed

in file: doc/whats_new/v0.24.rst

Here is a proposed entry:

:mod:`sklearn.linear_model`

- |Fix| Fixes bug in :class:`linear_model.ARDRegression` where ARDRegression is crashing during prediction when `normalize=True`. `self.X_offset_` and `self.X_scale_` are used to transform but are not set in fit method. :pr:`18607` by :user:`fhaselbeck`.

NicolasHug · 2020-10-27T17:40:39Z

@reshamas @Mariam-ke regarding the test, a simpler one is available on the original issue as mentioned above. Yours is good but we don't need this part:

     # Check that the model could approximately learn the identity function 
    test = [[1], [3], [4]] 
    assert_array_almost_equal(clf.predict(test), [1, 3, 4], 2)

because the failure wasn't about prediction quality.

Also a more explicit name and comment would be something like:

def test_ard_reg_predict_normalize_true(): 
    # Make sure predict doesn't raise any error when normalize=True and return_std=True
	# Non regression test for https://github.com/scikit-learn/scikit-learn/issues/18605

Note the mention that this is a non-regression test, with reference to the original issue

Re whatsnew message: we usually don't explain the details of the fix (this is often too complex, and irrelevant to users):

- |Fix| Fixed a bug that would result in an error when calling `predict(return_std=True)` on :class:`linear_model.ARDRegression`, when `normalize=True`. :pr:`18607` by :user:`fhaselbeck`.

glemaitre · 2020-10-27T20:28:34Z

I added the missing test and the what's new entry. When the CIs are green I will merge.

reshamas · 2020-10-27T20:50:49Z

Thanks @NicolasHug for the explanation.

fhaselbeck · 2020-10-28T08:34:48Z

Thanks a lot for finishing up and sorry that I had no time to do it on my own

fixed issue scikit-learn#18605, set self.X_offset_ and self.X_scale_ …

f0661a7

…in fit method of ARDRegression

github-actions bot added the module:linear_model label Oct 12, 2020

NicolasHug reviewed Oct 12, 2020

View reviewed changes

adding X_offset_ and X_scale_ to attributes section of BayesianRidge …

607ba68

…and ARDRegression

ogrisel approved these changes Oct 14, 2020

View reviewed changes

NicolasHug added this to the 0.24 milestone Oct 15, 2020

glemaitre self-assigned this Oct 27, 2020

glemaitre changed the title ~~fixed issue #18605, set self.X_offset_ and self.X_scale_ in fit metho…~~ FIX add attributes X_offset_ and X_scale_ in ARDRegression needed in predict Oct 27, 2020

finish up

abb5992

glemaitre merged commit bb9a8f0 into scikit-learn:master Oct 28, 2020

marenwestermann mentioned this pull request Nov 4, 2020

Ensure all attributes are documented #14312

Closed

cmarmo mentioned this pull request Dec 13, 2020

[MRG][DOC] Issue #14312 - Ensure All Attributes are Documented: Bayesian Ridge #16417

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

FIX add attributes X_offset_ and X_scale_ in ARDRegression needed in predict #18607

FIX add attributes X_offset_ and X_scale_ in ARDRegression needed in predict #18607

Uh oh!

fhaselbeck commented Oct 12, 2020

Uh oh!

NicolasHug left a comment

Uh oh!

glemaitre commented Oct 12, 2020

Uh oh!

fhaselbeck commented Oct 13, 2020

Uh oh!

NicolasHug commented Oct 13, 2020

Uh oh!

ogrisel left a comment

Uh oh!

ogrisel Oct 14, 2020

Uh oh!

NicolasHug Oct 14, 2020

Uh oh!

cmarmo commented Oct 27, 2020

Uh oh!

fhaselbeck commented Oct 27, 2020

Uh oh!

reshamas commented Oct 27, 2020

Uh oh!

NicolasHug commented Oct 27, 2020

Uh oh!

glemaitre commented Oct 27, 2020

Uh oh!

reshamas commented Oct 27, 2020

Uh oh!

fhaselbeck commented Oct 28, 2020

Uh oh!

Uh oh!

Uh oh!

FIX add attributes X_offset_ and X_scale_ in ARDRegression needed in predict #18607

FIX add attributes X_offset_ and X_scale_ in ARDRegression needed in predict #18607

Uh oh!

Conversation

fhaselbeck commented Oct 12, 2020

Uh oh!

NicolasHug left a comment

Choose a reason for hiding this comment

Uh oh!

glemaitre commented Oct 12, 2020

Uh oh!

fhaselbeck commented Oct 13, 2020

Uh oh!

NicolasHug commented Oct 13, 2020

Uh oh!

ogrisel left a comment

Choose a reason for hiding this comment

Uh oh!

ogrisel Oct 14, 2020

Choose a reason for hiding this comment

Uh oh!

NicolasHug Oct 14, 2020

Choose a reason for hiding this comment

Uh oh!

cmarmo commented Oct 27, 2020

Uh oh!

fhaselbeck commented Oct 27, 2020

Uh oh!

reshamas commented Oct 27, 2020

1) test proposed

2) bug fix entry proposed

Uh oh!

NicolasHug commented Oct 27, 2020

Uh oh!

glemaitre commented Oct 27, 2020

Uh oh!

reshamas commented Oct 27, 2020

Uh oh!

fhaselbeck commented Oct 28, 2020

Uh oh!

Uh oh!