[MRG] Enable percentage for n_features_to_select in RFE(Fix PR #16228) #17090
Hi @lschwetlick, do you mind fixing the conflicts? Your PR will close a number of stalled PRs, so I think we should push for it to be reviewed. Thanks!
Yes, I hope it's fixed now!
Minor comments otherwise LGTM. Thanks @lschwetlick !
Please add an entry to the change log at doc/whats_new/v0.24.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.
Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
Alright, I integrated your suggestions, @rth. Looks good?
LGTM, thanks @lschwetlick !
Thank you for the PR @lschwetlick !
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Hopefully @thomasjpfan is there to catch all the things I miss :)
…learn into n_features_to_select
@thomasjpfan did I get everything?
def test_rfe_negative_n_features():
    clf = SVC(kernel="linear")
    with pytest.raises(ValueError, match=r"n_features_to_select must be *"):
        RFE(estimator=clf, n_features_to_select=-1, step=0.1).fit(...)
My fit(...) suggestion means that ... should be replaced with something meaningful. In this case:
    iris = load_iris()
    rfe = RFE(estimator=clf, n_features_to_select=-1, step=0.1)
    with pytest.raises(ValueError, match=r"n_features_to_select must be *"):
        rfe.fit(iris.data, iris.target)
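As a side note on the match argument above: pytest.raises applies the pattern with re.search, so the pattern only needs to match a fragment of the error message. The sketch below uses only the standard library to show what that pattern accepts. The message text and the helper name message_matches are made up for illustration and are not the wording or code used in scikit-learn.

```python
import re

# Pattern copied from the test above. The trailing " *" means
# "zero or more spaces", so it adds no real constraint.
PATTERN = r"n_features_to_select must be *"


def message_matches(pattern, message):
    """Hypothetical helper: mimic the re.search check done by pytest.raises(match=...)."""
    return re.search(pattern, message) is not None


# Illustrative message only; the real error text may differ.
example_message = "n_features_to_select must be a positive integer or a fraction"

print(message_matches(PATTERN, example_message))
print(message_matches(PATTERN, "step must be >0"))
```

Because the check is a search rather than a full match, a more specific pattern would be needed to pin down the exact wording of the error.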
Ah, yes of course, sorry, I should have seen that!
I made a few small updates to this PR. I have one more thought regarding the error raised when n_features_to_select is a float and > 1.0.
After taking a look at the trees code, I made the error message more explicit and added the case you asked for!
Thank you @lschwetlick !
LGTM
…earn#16228) (scikit-learn#17090)
* added code from PR scikit-learn#14627
* fixed error handling None as n_features_to_select
* added test for error message and percentage passing
* linting
* more linting
* even more linting
* sklearn/feature_selection/_rfe.py make int casting simpler
Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
* exchange redundant test case with small float
* add what's new
* removed useless import
* lint
* Fix whats_new/v0.24.rst
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
* move whats new section
* disambiguate choice between 1 feature and 100% of features
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
* move checking for negatives to _fit
* update error to include nones
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
* markdown style in what's new
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
* Update sklearn/feature_selection/tests/test_rfe.py
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
* forgot to delete line
* fix test for negative n_features error
* BUG Fixes issues
* added case for float >1 and more detailed error messages
* lint
* more lint
* more lint
* CLN Minor adjustments
* BLD Force build on ci
Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
This brings PR #16228 up to date, adds tests and fixes some issues with it.
That PR was meant to Fix #14567, close #14627 and close #15529 by adding the option to select n features by percentage instead of by absolute number.
The changes PR #16228 made to isotonic.py have since been merged separately and are not included in this PR. Here I fix some bugs from PR #16228 and add tests for the new functionality. Ping @noatamir
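To make the intended behaviour concrete, here is a minimal standalone sketch of how an n_features_to_select value could be turned into a feature count, following the cases discussed in this thread (None, negative values, integers, floats up to 1.0, floats above 1.0). The function name resolve_n_features, the error wording, and the "half of the features" default for None are assumptions for illustration; this is not the code in sklearn/feature_selection/_rfe.py.

```python
import numbers


def resolve_n_features(n_features_to_select, n_features):
    """Hypothetical helper: map the parameter to an absolute feature count."""
    if n_features_to_select is None:
        # Assumed default: keep half of the features.
        return n_features // 2
    if n_features_to_select < 0:
        raise ValueError(
            "n_features_to_select must be None, a positive integer, "
            "or a float in (0, 1]; got %r" % (n_features_to_select,)
        )
    if isinstance(n_features_to_select, numbers.Integral):
        # An integer is an absolute count, so 1 means "one feature".
        return int(n_features_to_select)
    if n_features_to_select > 1.0:
        raise ValueError(
            "a float n_features_to_select must not exceed 1.0; got %r"
            % (n_features_to_select,)
        )
    # A float is a fraction of the features, so 1.0 means "all features".
    return int(n_features * n_features_to_select)


print(resolve_n_features(0.5, 10))  # fraction of 10 features
print(resolve_n_features(1, 10))    # integer: one feature
print(resolve_n_features(1.0, 10))  # float: all features
```

Checking the type before interpreting the value is what separates the integer 1 (one feature) from the float 1.0 (100% of the features).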