Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT Deprecates _estimator_type and replaces by a estimator tag #17806

Open
wants to merge 49 commits into
base: main
Choose a base branch
from

Conversation

alfaro96
Copy link
Member

@alfaro96 alfaro96 commented Jul 1, 2020

Reference Issues/PRs

Closes #16469.

What does this implement/fix? Explain your changes.

This PR deprecates the _estimator_type private attribute, replaced by a newer estimator tag.

@alfaro96 alfaro96 changed the title MNT Deprecate _estimator_type and replace by estimator_tags MNT Deprecated _estimator_type and replaced by a estimator tag Jul 1, 2020
@alfaro96 alfaro96 changed the title MNT Deprecated _estimator_type and replaced by a estimator tag [WIP] MNT Deprecated _estimator_type and replaced by a estimator tag Jul 1, 2020
@alfaro96 alfaro96 marked this pull request as draft July 1, 2020 17:31
@alfaro96 alfaro96 changed the title [WIP] MNT Deprecated _estimator_type and replaced by a estimator tag [WIP] MNT Deprecates _estimator_type and replaces by a estimator tag Jul 1, 2020
@rth
Copy link
Member

rth commented Jul 1, 2020

Thanks! There was a very similar PR by either @thomasjpfan or @NicolasHug and there were concerns there about how putting estimator tags could be potentially redundant with determination made by mixins (as far as I remember). I'm not saying it's a concern in this PR but it would be useful to find that discussion.

@alfaro96
Copy link
Member Author

alfaro96 commented Jul 2, 2020

Thanks! There was a very similar PR by either @thomasjpfan or @NicolasHug and there were concerns there about how putting estimator tags could be potentially redundant with determination made by mixins (as far as I remember). I'm not saying it's a concern in this PR but it would be useful to find that discussion.

Thanks for the advice @rth!

I have not found the PR that you mention, but I am +1 about the fact that _estimator_type private attribute or even using the estimator tags is redundant given the information provided by the mixins (ClassifierMixin, RegressorMixin, etc.).

Do you think that should be (or may be) replaced by the determination made with mixins?

@rth
Copy link
Member

rth commented Jul 3, 2020

I also cannot find it. Maybe I'm mis-remembering, too many PRs... Anyway, this looks very reasonable to me.

@rth
Copy link
Member

rth commented Jul 3, 2020

Please also add estimator_type to the list of estimator tags the documentation https://scikit-learn.org/stable/developers/develop.html#estimator-tags

@alfaro96
Copy link
Member Author

alfaro96 commented Jul 3, 2020

I also cannot find it. Maybe I'm mis-remembering, too many PRs... Anyway, this looks very reasonable to me.

Do not worry about the mis-remembering.

Maybe @thomasjpfan and @NicolasHug could help us to recover (remember) this discussion?

@alfaro96
Copy link
Member Author

alfaro96 commented Jul 3, 2020

Please also add estimator_type to the list of estimator tags the documentation https://scikit-learn.org/stable/developers/develop.html#estimator-tags

Sure, I will include the estimator_type tag to the list of estimator tags (alphabetically sorted, right?).

In fact, I still need to fix an issue regarding some estimators with multiple levels of inheritance (e.g., LogisticRegression).

I will commit these changes as soon as possible!

@NicolasHug
Copy link
Member

was a very similar PR by either @thomasjpfan or @NicolasHug and there were concerns there about how putting estimator tags could be potentially redundant with determination made by mixins (as far as I remember).

Maybe you were referring to this? #16622 (comment)

I personally don't see how mixins can help when we're dealing with execution-time properties: what about meta-estimators like e.g. GridSearchCV which can be both a regressor and a classifier depending on their base_estimator?

@alfaro96
Copy link
Member Author

alfaro96 commented Jul 3, 2020

was a very similar PR by either @thomasjpfan or @NicolasHug and there were concerns there about how putting estimator tags could be potentially redundant with determination made by mixins (as far as I remember).

Maybe you were referring to this? #16622 (comment)

I personally don't see how mixins can help when we're dealing with execution-time properties: what about meta-estimators like e.g. GridSearchCV which can be both a regressor and a classifier depending on their base_estimator?

I was thinking about using isinstance(estimator, ClassifierMixin) to determine whether an estimator is a classifier (similarly for regressor, outlier, etc.). However, the use isinstance may potentially introduce an undesired overhead.

@alfaro96 alfaro96 marked this pull request as ready for review July 3, 2020 15:51
@alfaro96 alfaro96 changed the title [WIP] MNT Deprecates _estimator_type and replaces by a estimator tag MNT Deprecates _estimator_type and replaces by a estimator tag Jul 3, 2020
@amueller
Copy link
Member

Sorry for coming late to the party, but I think this should be an actual backward compatible deprecation. Otherwise we're changing the results of downstream libraries without warning. This is private in the sense that it's hidden from users, but it's actually part of the contract with downstream libraries and we can't just change their results imho.

@alfaro96
Copy link
Member Author

The _estimator_type private property is now deprecated to avoid undesired errors in the downstream libraries.

@jnothman
Copy link
Member

Is the plan here to get #18143 right before continuing??

@alfaro96
Copy link
Member Author

Is the plan here to get #18143 right before continuing??

Yep, I am waiting for #18143 to get merged before continuing working on this PR. Thus, I can address the comments in #18143 to make easier the review of this PR.

WDYT?

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following most likely needs to inherit BaseEstimator for the other test to pass:

class MyClassifier(ClassifierMixin):
def fit(self, X, y):
self.classes_ = [0, 1]
return self

Comment on lines +634 to +635
estimator = self.steps[-1][1]
estimator_tags = estimator._get_tags()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test for this logic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not seen this logic in the tests for the pipeline module.

doc/glossary.rst Outdated Show resolved Hide resolved
sklearn/feature_selection/tests/test_rfe.py Outdated Show resolved Hide resolved
sklearn/model_selection/tests/test_search.py Outdated Show resolved Hide resolved
sklearn/feature_selection/tests/test_rfe.py Outdated Show resolved Hide resolved
@alfaro96
Copy link
Member Author

The remaining tests should pass when #18614 is merged.

Base automatically changed from master to main January 22, 2021 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deprecate _estimator_type, replace by estimator tag
6 participants