Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix: sklearn 0.22 #571

Merged
merged 6 commits into from
Dec 16, 2019
Merged

Bugfix: sklearn 0.22 #571

merged 6 commits into from
Dec 16, 2019

Conversation

BenjaminBossan
Copy link
Collaborator

Currently, skorch won't work with sklearn 0.22 because some code was changed that is more geared towards sklearn-internal use, which we, however, relied upon.

A list of failures:

  1. check_is_fitted was changed (see Deperecated check_is_fitted call #570) in a way that we won't raise a NotInitializedError anymore despite the net not being initialized. This is now solved by basically porting the old check_is_fitted behavior to skorch.

  2. safe_indexing was changed (probably here) in a way that it no longer works with tuples as indices; the following code works in sklearn 0.21 but not in 0.22:

from sklearn.utils import safe_indexing  # or _safe_indexing
import numpy as np


X = np.zeros((5, 3))
i = ([2, 4, 1, 0, 3], [1, 0, 2, 2, 1])
safe_indexing(X, i)

This is solved by not using safe_indexing in case of tuples anymore.

  1. Our convert_sklearn_metric_function now fails with make_scorer(...) because that returns an sklearn _PredictScorer object (or similar), which we don't detect. This is now detected.

The changes were tested with scikit-learn=0.21.3 and scikit-learn==0.22 (no GPU).

Furthermore, pylint now complains about imports that are not at top level (basically about all our tests), which is why I blankly removed that warning.

BenjaminBossan added 5 commits December 5, 2019 13:20
`check_is_fitted` was changed (see #570) in a way that we won't raise
a `NotInitializedError` anymore despite the net not being
initialized. This is now solved by basically porting the old
`check_is_fitted` behavior to skorch.
sklearn's (_)safe_indexing used to work with tuples as indices, now
doesn't.
In sklearn 0.22, make_scorer returns a special object that must be
treated differently in skorch.
@BenjaminBossan
Copy link
Collaborator Author

I'm not so happy about these breaking changes being made by sklearn, since they affect public API and, at least from skimming, are not announced in the changelog.

@thomasjpfan
Copy link
Member

The util module has been in this gray area regarding changing behavior, as stated in https://scikit-learn.org/stable/developers/utilities.html under “warning”. Historically this policy was put into place, because public functions were put into utils without throughly testing its functionality.

I see how these types of changes can be a bit painful. Going forward, if sklearn wants to better support an ecosystem of estimators, I think sklearn needs to consider some of its utils as “first class” (this means deprecations) and not this gray area. I’ll be sure to bring this up with the rest of the team.

@BenjaminBossan
Copy link
Collaborator Author

The util module has been in this gray area regarding changing behavior, as stated in https://scikit-learn.org/stable/developers/utilities.html under “warning”

Interesting. I didn't know that.

sklearn is a huge codebase, so these things are bound to happen. I think it's especially important to communicate these kind of changes in the release notes.

I’ll be sure to bring this up with the rest of the team.

Great, thanks!

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if (
hasattr(module, 'startswith') and
module.startswith('sklearn.metrics.') and
not module.startswith('sklearn.metrics.scorer') and
not module.startswith('sklearn.metrics.tests.')
not module.startswith('sklearn.metrics.tests.') and
not scoring.__class__.__name__ in scorer_names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the scoring.__class__.__name__ check needed? make_scorer returned these objects in 0.21 as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand your suggestion. We need this new line to to check if we get any of these objects. I tried to remove the new line and get failing tests with sklearn 0.22 but not 0.21.

@ottonemo ottonemo merged commit ab8e6ed into master Dec 16, 2019
@BenjaminBossan BenjaminBossan deleted the bugfix/sklearn-0.22 branch February 2, 2020 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants