Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API Replacing grid_scores_ by cv_results_ in _rfe.py and test_rfe.py #20161

Conversation

wowry
Copy link
Contributor

@wowry wowry commented May 28, 2021

Reference Issues/PRs

Continues #16961
Partially fixes #11198
Based on #16392

What does this implement/fix? Explain your changes.

This PR replaces grid_scores_ by cv_results_ in _rfy.py and test_rfe.py.

@wowry wowry changed the title Replacing grid_scores_ by cv_results_ in _rfe.py and test_rfe.py [WIP] Replacing grid_scores_ by cv_results_ in _rfe.py and test_rfe.py May 28, 2021
@wowry wowry changed the title [WIP] Replacing grid_scores_ by cv_results_ in _rfe.py and test_rfe.py [MRG] Replacing grid_scores_ by cv_results_ in _rfe.py and test_rfe.py May 28, 2021
@wowry wowry requested a review from ogrisel May 29, 2021 21:36
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a few minor remarks below (ignore the third that I just left for other reviewers).

sklearn/feature_selection/tests/test_rfe.py Outdated Show resolved Hide resolved
sklearn/feature_selection/tests/test_rfe.py Outdated Show resolved Hide resolved
scores_rev = scores[:, ::-1]
self.cv_results_ = {}
self.cv_results_["mean_score"] = np.mean(scores_rev, axis=0)
self.cv_results_["std_score"] = np.std(scores_rev, axis=0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should not use mean_test_score / std_test_score / split{i}_test_score instead.

  1. this would be more consistent with GridSearchCV
  2. this would be more future proof in case we want to add an option to also compute the training score in RFECV although this is probably a case of YAGNI.

Edit actuall ignore that remark. #16392 already uses the mean_score notation...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being consistent with GridSearchCV looks like a net win to me. I am okay with deprecating mean_score in GraphicalLassoCV and switch to using *_test_score instead.

@ogrisel Do you think deprecating would be too much of a hassle?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with both approaches.

@ogrisel
Copy link
Member

ogrisel commented Jun 1, 2021

We probably need an entry to document the deprecation attribute in doc/whats_new/v1.0.rst.

@wowry wowry requested a review from ogrisel June 12, 2021 09:19
@wowry
Copy link
Contributor Author

wowry commented Jun 26, 2021

The black code style has been applied.

@wowry wowry requested a review from thomasjpfan June 30, 2021 03:59
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still fine with this PR but I think you would get a faster second reviewer approval if you address this comment: https://github.com/scikit-learn/scikit-learn/pull/20161/files#r646027286

The handling of the deprecation cycle of the same attributes in GraphicalLassoCV can be tackled in a separate PR.

@thomasjpfan
Copy link
Member

I am +1 on changing the names of the keys as described in #20161 (comment)

Would it be okay to make this change in this PR?

@wowry
Copy link
Contributor Author

wowry commented Jul 22, 2021

@thomasjpfan
Certainly. I appreciate your help.

Comment on lines 732 to 733
self.cv_results_["mean_score"] = np.mean(scores_rev, axis=0)
self.cv_results_["std_score"] = np.std(scores_rev, axis=0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that @thomasjpfan mentioned in its previous post was to change both entries using mean_test_score and std_test_score here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I was mistaken about that.
Would it be okay to just change the names?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it should be enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. Thank you so much.

sklearn/feature_selection/_rfe.py Outdated Show resolved Hide resolved
sklearn/feature_selection/tests/test_rfe.py Outdated Show resolved Hide resolved
@glemaitre
Copy link
Member

LGTM. I pushed a tiny nitpick for the f-string styling.
I will merge once the CIs are green.

@wowry
Copy link
Contributor Author

wowry commented Jul 30, 2021

Thank you for all of you.

@glemaitre glemaitre merged commit daec880 into scikit-learn:main Jul 30, 2021
@wowry wowry deleted the Replacing-grid_scores_-by-cv_results-in-_rfe.py branch July 30, 2021 16:16
samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021
…cikit-learn#20161)

Co-authored-by: arka204 <kmichalik204@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
gregcaporaso added a commit to qiime2/q2-sample-classifier that referenced this pull request Apr 30, 2024
gregcaporaso added a commit to qiime2/q2-sample-classifier that referenced this pull request May 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deprecate grid_scores_ everywhere, replace by cv_results_
5 participants