MNT Use check_scalar in AdaBoostRegressor #21605

genvalen · 2021-11-09T05:39:45Z

Reference Issues/PRs

Addresses #20724
#DataUmbrella

What does this implement/fix? Explain your changes.

Summary of changes to AdaBoostRegressor:

Add tests to ensure the estimator raises the proper errors when invalid arguments are passed in.
Use the helper function check_scalar from sklearn.utils to validate the scalar parameters.

Test and validation progress:

n_estimators
learning_rate

References

Any other comments?

genvalen · 2021-11-11T00:59:05Z

I noticed the BaseWeightBoosting class' fit method has one parameter validation starting on line 114.

        # Check parameters
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be greater than zero")

Should I delete this one?
Or should I update it to be consistent with the check_scalar error message?

f"learning_rate == {self.learning_rate}, must be > 0."
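To make the difference concrete, the two message styles can be compared side by side. This is only a sketch: `check_scalar_sketch` and `message_of` are hypothetical helpers written for this comparison, not scikit-learn code, and the stand-in only mimics the lower-bound message format quoted above so the snippet runs without scikit-learn installed.

```python
import numbers


def old_check(learning_rate):
    # Hand-written check quoted above from BaseWeightBoosting.fit.
    if learning_rate <= 0:
        raise ValueError("learning_rate must be greater than zero")


def check_scalar_sketch(x, name, *, target_type, min_val, include_boundaries):
    # Hypothetical stand-in, NOT sklearn.utils.check_scalar: it covers only
    # a type check and a lower bound.
    if not isinstance(x, target_type):
        raise TypeError(f"{name} must be an instance of {target_type.__name__}.")
    closed = include_boundaries in ("left", "both")
    below = x < min_val if closed else x <= min_val
    if below:
        op = ">=" if closed else ">"
        raise ValueError(f"{name} == {x}, must be {op} {min_val}.")


def message_of(func, *args, **kwargs):
    # Return the ValueError message raised by func, or None if it passes.
    try:
        func(*args, **kwargs)
    except ValueError as exc:
        return str(exc)
    return None


old_msg = message_of(old_check, 0)
new_msg = message_of(
    check_scalar_sketch,
    0,
    "learning_rate",
    target_type=numbers.Real,
    min_val=0,
    include_boundaries="neither",
)
print(old_msg)
print(new_msg)
```

The second message carries the offending value and the bound, which is what makes it easier to assert on in tests.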

thomasjpfan · 2021-11-24T14:27:46Z

sklearn/ensemble/_weight_boosting.py

+        check_scalar(
+            self.n_estimators,
+            "n_estimators",
+            target_type=numbers.Integral,
+            min_val=1,
+            include_boundaries="left",
+        )
+
+        check_scalar(
+            self.learning_rate,
+            "learning_rate",
+            target_type=numbers.Real,
+            min_val=0,
+            include_boundaries="neither",
+        )


Should I delete this one?

We can move both of these checks into BaseWeightBoosting, and replace the check for self.learning_rate:

scikit-learn/sklearn/ensemble/_weight_boosting.py

Lines 115 to 116 in c9e5067

        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be greater than zero")

Thank you. In that case I will remove the checks I added to AdaBoostClassifier too. They are the exact same checks as what I am about to move into BaseWeightBoosting.
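As a rough illustration of why the duplicated checks become redundant, here is a minimal sketch of validation living in a shared base class. All class and method names (`BaseBoostingSketch`, `RegressorSketch`, `ClassifierSketch`, `_check_params`) are hypothetical and the checks are written inline; the actual change uses check_scalar inside BaseWeightBoosting.

```python
import numbers


class BaseBoostingSketch:
    """Hypothetical stand-in for a shared base class; not scikit-learn code."""

    def __init__(self, n_estimators=50, learning_rate=1.0):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate

    def _check_params(self):
        # Validation lives here once, so subclasses do not repeat it.
        if not isinstance(self.n_estimators, numbers.Integral):
            raise TypeError("n_estimators must be an integer.")
        if self.n_estimators < 1:
            raise ValueError(
                f"n_estimators == {self.n_estimators}, must be >= 1."
            )
        if not isinstance(self.learning_rate, numbers.Real):
            raise TypeError("learning_rate must be a real number.")
        if self.learning_rate <= 0:
            raise ValueError(
                f"learning_rate == {self.learning_rate}, must be > 0."
            )

    def fit(self, X, y):
        # Every subclass goes through the same validation before fitting.
        self._check_params()
        return self


class RegressorSketch(BaseBoostingSketch):
    pass


class ClassifierSketch(BaseBoostingSketch):
    pass
```

With this layout, `RegressorSketch(learning_rate=0).fit(X, y)` and `ClassifierSketch(learning_rate=0).fit(X, y)` raise the same ValueError without either subclass defining its own check.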

reshamas · 2021-12-10T19:40:27Z

@genvalen After discussion with @glemaitre, this PR should be prefixed with "MAINT" (and yes, confirming that it does stand for "maintenance").

genvalen · 2021-12-10T19:48:28Z

Okay! So just to be clear, the proper abbreviation is "MAINT" and not "MNT"? Or is either one fine?

reshamas · 2021-12-10T19:54:07Z

I am seeing both here:

glemaitre

All is good for me. LGTM.
Thanks @genvalen

thomasjpfan

Thanks for following up!

thomasjpfan

LGTM

reshamas · 2022-01-07T15:14:07Z

@glemaitre @thomasjpfan @jjerphan @ogrisel
I am looking at review comments on my PR and comparing with @genvalen.
I would like to document some consistency points for the check_scalar PRs:

PR prefix should be MAINT (not MNT)
check_scalar calls should pass name explicitly as a keyword (e.g. name="n_estimators", not the positional "n_estimators", )
Interval ranges should use the text must be (not should be)
Any others?

jjerphan · 2022-01-07T15:18:42Z

@glemaitre @thomasjpfan @jjerphan @ogrisel I am looking at review comments on my PR and comparing with @genvalen. I would like to document some consistency points for the check_scalar PRs:
1. PR prefix should be `MAINT` (not `MNT`)

2. `check_scalar` call should explicitly include `name="n_estimators",` (not `"n_estimators",`)

3. Interval ranges should use the text `must be` (not `should be`)

4. any others?

Thanks for following up, @reshamas.

1., 2., 3. look good to me. I would also make sure error messages in tests are present.

reshamas · 2022-01-07T15:22:01Z

@jjerphan
I am not sure I understand the below comment. Aren't error messages needed when a test is defined?

I would also make sure error messages in tests are present.

jjerphan · 2022-01-07T15:24:38Z

sklearn/ensemble/tests/test_weight_boosting.py

+    reg = AdaBoostRegressor(loss="foo")
+    with pytest.raises(ValueError):
+        reg.fit(X, y_class)
+
+    clf = AdaBoostClassifier(algorithm="foo")
+    with pytest.raises(ValueError):
+        clf.fit(X, y_class)


@reshamas: I should have been clearer.

Here for instance, there's a check for ValueErrors being raised but their error messages aren't checked.

Should we re-open this one? Error messages are important and helpful, correct?

Here we go: #22144
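For readers following along, the distinction is between checking only the exception type and also checking its message. In pytest the latter is written `with pytest.raises(ValueError, match=msg):`, where `match` is searched as a regular expression in the message. The sketch below mimics that with the standard library only; `fit_with_loss`, `raises_with_message`, and the message text are hypothetical and invented for this example.

```python
import re


def fit_with_loss(loss):
    # Hypothetical stand-in for an estimator's fit; the message text is
    # invented for this example and is not scikit-learn's.
    if loss not in ("linear", "square", "exponential"):
        raise ValueError(
            f"loss must be 'linear', 'square', or 'exponential', got {loss!r}"
        )


def raises_with_message(exc_type, pattern, func, *args):
    """Return True only if func raises exc_type and pattern is found in its message."""
    try:
        func(*args)
    except exc_type as exc:
        return re.search(pattern, str(exc)) is not None
    return False


# Type-only check: passes for any ValueError, whatever the message says.
type_only = raises_with_message(ValueError, "", fit_with_loss, "foo")

# Message check: passes only if the expected text appears in the message.
with_message = raises_with_message(ValueError, "loss must be", fit_with_loss, "foo")
wrong_message = raises_with_message(ValueError, "n_estimators", fit_with_loss, "foo")
```

A type-only test would keep passing even if the wrong ValueError were raised, which is why asserting the message is the stricter check.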

reshamas · 2022-01-07T17:50:04Z

FYI, for reference, @genvalen
PR #22144

genvalen · 2022-01-07T18:43:14Z

@reshamas Noted. Thanks! I'll incorporate these consistency points moving forward.

genvalen added 2 commits November 8, 2021 23:31

Add tests

12e1627

Validate scalar parameters with check_scalar

8482313

github-actions bot added the module:ensemble label Nov 9, 2021

thomasjpfan added the No Changelog Needed label Nov 9, 2021

reshamas added the Sprint label Nov 16, 2021

thomasjpfan reviewed Nov 24, 2021

genvalen added 2 commits November 24, 2021 12:43

Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…

68fd69c

…nto AdaBoostRegressor_add_check_scaler

Move checks from AdaBoostRegressor to BaseWeightBoosting

9b8a673

reshamas mentioned this pull request Dec 8, 2021

Use the function check_scalar for parameters validation #21927

Closed

41 tasks

genvalen added 4 commits December 10, 2021 17:38

Update tests

28e2af0

Remove redundant checks from AdaBoostClassifer

0335ea2

Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…

94d420d

…nto AdaBoostRegressor_add_check_scaler

Minor clean up and format with black

2cabe56

genvalen requested a review from thomasjpfan December 10, 2021 23:39

glemaitre approved these changes Dec 14, 2021

thomasjpfan reviewed Dec 14, 2021

genvalen and others added 3 commits December 14, 2021 17:27

Update sklearn/ensemble/tests/test_weight_boosting.py

563c5f0

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

Update sklearn/ensemble/tests/test_weight_boosting.py

2ae607d

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

Ensure validation happens in

360e7d0

cmarmo added the Waiting for Reviewer label Dec 15, 2021

thomasjpfan approved these changes Dec 16, 2021

thomasjpfan merged commit f21f1d7 into scikit-learn:main Dec 16, 2021

genvalen deleted the AdaBoostRegressor_add_check_scaler branch December 17, 2021 03:56

genvalen mentioned this pull request Dec 30, 2021

DOC Update documentation of weight boosting estimators w/ranges #22097

Merged

reshamas removed the Waiting for Reviewer label Jan 7, 2022

jjerphan reviewed Jan 7, 2022

jjerphan added a commit to jjerphan/scikit-learn that referenced this pull request Jan 7, 2022

TST Assert error messages for AdaBoost estimators

25edd20

Follow-up of scikit-learn#21605

jjerphan mentioned this pull request Jan 7, 2022

TST Assert error messages for AdaBoost estimators #22144

Merged

This was referenced Jan 8, 2022

MAINT Use check_scalar in BaseDecisionTree #21990

Merged

[MRG] DOC Update documentation of weight boosting estimators w/ ranges #22155

Merged

mathijs02 pushed a commit to mathijs02/scikit-learn that referenced this pull request Dec 27, 2022

MNT Use check_scalar in AdaBoostRegressor (scikit-learn#21605)

95f0a48

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
