Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT Deprecate SAMME.R algorithm from AdaBoostClassifier #26830

Merged
merged 9 commits into from Sep 7, 2023

Conversation

StefanieSenger
Copy link
Contributor

@StefanieSenger StefanieSenger commented Jul 13, 2023

Reference Issues/PRs

Fixes #26784

What does this implement/fix? Explain your changes.

This PR deprecates the SAMME.R algorithm as an algorithm in the AdaBoostClassifier as discussed in issue #26784.

@StefanieSenger StefanieSenger changed the title deprecate SAMME.R algorithm MNT Deprecate SAMME.R algorithm from AdaBoostClassifier Jul 13, 2023
@github-actions
Copy link

github-actions bot commented Jul 13, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 0e5656e. Link to the linter CI: here

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise LGTM.

sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
doc/whats_new/v1.4.rst Outdated Show resolved Hide resolved
Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrinjalali
Copy link
Member

Could you also make sure this parameter is set in all examples correctly so that examples and our documentation won't raise a warning?

@glemaitre glemaitre self-requested a review July 24, 2023 08:47
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good. We need to make the changes suggested by @adrinjalali.

I think that we should remove the following example: https://scikit-learn.org/dev/auto_examples/ensemble/plot_adaboost_hastie_10_2.html#sphx-glr-auto-examples-ensemble-plot-adaboost-hastie-10-2-py. We need to change the user guide figure and probably redirect this example to the one that you edited earlier.

The redirection is done in the doc/conf.py file:

scikit-learn/doc/conf.py

Lines 283 to 302 in 75f3e47

# redirects dictionary maps from old links to new links
redirects = {
"documentation": "index",
"auto_examples/feature_selection/plot_permutation_test_for_classification": (
"auto_examples/model_selection/plot_permutation_tests_for_classification"
),
"modules/model_persistence": "model_persistence",
"auto_examples/linear_model/plot_bayesian_ridge": (
"auto_examples/linear_model/plot_ard"
),
"auto_examples/model_selection/grid_search_text_feature_extraction.py": (
"auto_examples/model_selection/plot_grid_search_text_feature_extraction.py"
),
"auto_examples/miscellaneous/plot_changed_only_pprint_parameter": (
"auto_examples/miscellaneous/plot_estimator_representation"
),
"auto_examples/decomposition/plot_beta_divergence": (
"auto_examples/applications/plot_topics_extraction_with_nmf_lda"
),
}

we need to map the previous example name to the new one.

sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/tests/test_weight_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/_weight_boosting.py Outdated Show resolved Hide resolved
@StefanieSenger
Copy link
Contributor Author

Thank you, @glemaitre and @adrinjalali, for the review and the directions. I have tried to implement this; lets hope most of it correctly.

Some comments about some things I was unsure about:

I deleted the whole example plot-adaboost-hastie-10-2-py. Is that correct?

I changed where AdaBoostClassifier was instantiated with the default algorithms to now use “SAMME”, but was not sure about benchmarks/bench_20newsgroups.py, this one should stay as it is, right?

I modified the expected output lines of the example in the docstring of AdaBoostClassifier by hand so that it passes the doctest. Is this the correct way of dealing with this? I investigating how to run this automatic, but I didn’t manage to find out.

In the user guide (doc/modules/ensemble.rst), li. 1580-1581 needs to be changed in version 1.6. How would we document this, so it’s not forgotten (since .. /n TODO(1.6) will again show up on IDEs?)? Those two lines I mean:

For multi-class classification, :class:`AdaBoostClassifier` implements
AdaBoost-SAMME and AdaBoost-SAMME.R [ZZRH2009]_.

Looking forward to your further reviews!

@adrinjalali
Copy link
Member

I deleted the whole example plot-adaboost-hastie-10-2-py. Is that correct?

Sounds good.

I changed where AdaBoostClassifier was instantiated with the default algorithms to now use “SAMME”, but was not sure about benchmarks/bench_20newsgroups.py, this one should stay as it is, right?

I think we should fix the benchmark as well.

I modified the expected output lines of the example in the docstring of AdaBoostClassifier by hand so that it passes the doctest. Is this the correct way of dealing with this? I investigating how to run this automatic, but I didn’t manage to find out.

We basically copy the code to a python REPL (not ipython),

In the user guide (doc/modules/ensemble.rst), li. 1580-1581 needs to be changed in version 1.6. How would we document this, so it’s not forgotten (since .. /n TODO(1.6) will again show up on IDEs?)? Those two lines I mean:

For multi-class classification, :class:`AdaBoostClassifier` implements
AdaBoost-SAMME and AdaBoost-SAMME.R [ZZRH2009]_.

We can already remove SAMME.R from the docs, our user guides shouldn't mention something which is already deprecated.

@StefanieSenger
Copy link
Contributor Author

I believe the tests will only be changed, once SAMME.R is entirely abandoned in 1.6, right?

@glemaitre
Copy link
Member

Some of the failures are still legitimate. You can reproduce the error locally by running pytest with the following option: -Werror::FutureWarning:

pytest -vsl sklearn -Werror::FutureWarning

Regarding the fixes:

  • you need to either set algorithm="SAMME" when it makes sense
  • or decorate a test to filter the warning using @pytest.mark.filterwarnings("ignore:The SAMME.R algorithm") and add a # TODO(remove in 1.6) to remember to remove and change the test in the future.

For the docstring test, we will need to modify the instantiation of AdaBoostClassifier to use "SAMME".

@glemaitre
Copy link
Member

@StefanieSenger would you mind resolving the conflicts? I can make a final review after.

@StefanieSenger
Copy link
Contributor Author

Sure, @glemaitre, I have just resolved the conflicts. :) I hadn't noticed them before.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I merge main in this branch. Let see if the CI will pass.

@StefanieSenger
Copy link
Contributor Author

Thanks, @glemaitre. The CI failure is not related, I believe?

@glemaitre
Copy link
Member

Nop something link to GBDT and the new loss. We are tracking there: #27312

@glemaitre glemaitre merged commit c634b8a into scikit-learn:main Sep 7, 2023
21 of 24 checks passed
@glemaitre
Copy link
Member

Merging then.

@glemaitre
Copy link
Member

Thanks @StefanieSenger

@StefanieSenger StefanieSenger deleted the deprecate_samme.r branch September 7, 2023 20:40
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…#26830)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC deprecate the SAMME.R algorithm in AdaBoostClassifier
3 participants