FEA Add metadata routing to SelectFromModel #27490

Merged
glemaitre merged 16 commits into scikit-learn:main from the routing_SelectFromModel branch on Oct 30, 2023

Conversation

StefanieSenger
Contributor

Reference Issues/PRs

Towards #22893

What does this implement/fix? Explain your changes.

Adds metadata routing to SelectFromModel.
The routing is added in the fit and partial_fit methods.


github-actions bot commented Sep 28, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2479530. Link to the linter CI: here

Member

@adrinjalali left a comment

Also, could you please move the estimator from the unsupported list to the supported list in metadata_routing.rst?

Member

@adrinjalali left a comment

LGTM.

@glemaitre @OmarManzoor another easy one to review.

@glemaitre glemaitre self-requested a review October 5, 2023 09:49
Member

@glemaitre left a comment

LGTM. Only a small change in the documentation and a more general question.

@@ -387,7 +408,7 @@ def threshold_(self):
         # SelectFromModel.estimator is not validated yet
         prefer_skip_nested_validation=False
     )
-    def partial_fit(self, X, y=None, **fit_params):
+    def partial_fit(self, X, y=None, **params):
Member

are we making the call consistent by calling it **params instead of **fit_params?

Contributor Author

In my opinion, naming those params after the method they are routed from (partial_fit_params, in this case) might avoid confusion. When several sets of params are routed between several methods, this would make the code easier to read.

I'm not sure if Adrin shares the same sentiment about the term 'partial_fit_params,' @adrinjalali?

For me, the more explicit, the better.

Member

For me, fit_params or partial_fit_params makes sense when those parameters are only dispatched to the fit or partial_fit of the underlying learners. If there is a scorer, for instance, I am fine with params.

That is why I was a bit surprised here: we only pass it to the partial_fit of the sub-estimator.

Member

I personally don't care enough about these names really, users can't pass them explicitly anyway.

Member

So let's be intuitive and consistent for the sake of the documentation :).

Comment on lines +435 to +437
`**partial_fit_params` are routed to the sub-estimator, if
`enable_metadata_routing=True` is set via
:func:`~sklearn.set_config`, which allows for aliasing.
Member

Suggested change
- `**partial_fit_params` are routed to the sub-estimator, if
- `enable_metadata_routing=True` is set via
- :func:`~sklearn.set_config`, which allows for aliasing.
+ Only available if `enable_metadata_routing=True`,
+ which can be set by using
+ ``sklearn.set_config(enable_metadata_routing=True)``.
+ See :ref:`Metadata Routing User Guide <metadata_routing>` for
+ more details.

Contributor Author

Actually, I would suggest mostly leaving out the part below `.. versionchanged:: 1.4`. We already say all of this before and after it, and it looks difficult to read.

What about this?

         **partial_fit_params : dict
            - If `enable_metadata_routing=False` (default):

                Parameters directly passed to the `partial_fit` method of the
                sub-estimator. They are ignored if `prefit=True`.

            - If `enable_metadata_routing=True`:

                Parameters safely routed to the `partial_fit` method of the
                sub-estimator. They are ignored if `prefit=True`.

                .. versionchanged:: 1.4
                    See :ref:`Metadata Routing User Guide <metadata_routing>` for
                    more details.

Member

I agree.

Member

@adrinjalali left a comment

@glemaitre you okay with this version?

@adrinjalali
Member

CI failing @StefanieSenger

Member

@glemaitre left a comment

Yes I am fine with this version. We can merge once the CI is green.

@StefanieSenger
Contributor Author

I've dealt with the errors in the docstring and hope the CI will now be all green.
Thanks for reviewing! :)

@glemaitre glemaitre merged commit 2af3fb8 into scikit-learn:main Oct 30, 2023
27 checks passed
@glemaitre
Member

Thanks @StefanieSenger Merging.

RUrlus pushed a commit to RUrlus/scikit-learn that referenced this pull request Oct 30, 2023
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2023
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
@StefanieSenger StefanieSenger deleted the routing_SelectFromModel branch November 11, 2023 14:28
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>