Bugfix: sklearn 0.22 #571

Merged
merged 6 commits into from Dec 16, 2019

BenjaminBossan commented Dec 5, 2019

Currently, skorch doesn't work with sklearn 0.22 because sklearn changed some code that is geared towards sklearn-internal use but that we nevertheless relied upon.

A list of failures:

  1. check_is_fitted was changed (see #570) in a way that we won't raise a NotInitializedError anymore despite the net not being initialized. This is now solved by basically porting the old check_is_fitted behavior to skorch.

  2. safe_indexing was changed (probably here) in a way that it no longer works with tuples as indices; the following code works in sklearn 0.21 but not in 0.22:

from sklearn.utils import safe_indexing  # or _safe_indexing
import numpy as np


X = np.zeros((5, 3))
i = ([2, 4, 1, 0, 3], [1, 0, 2, 2, 1])
safe_indexing(X, i)

This is solved by not using safe_indexing in case of tuples anymore.

  3. Our convert_sklearn_metric_function now fails with make_scorer(...) because that returns an sklearn _PredictScorer object (or similar), which we don't detect. This is now detected.
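The first fix in the list above ports the older check_is_fitted behavior to skorch. A minimal sketch of that idea, assuming the older behavior amounts to checking explicitly named attributes with hasattr. NotInitializedError and DummyNet below are local stand-ins for illustration, not skorch's actual classes:

```python
class NotInitializedError(Exception):
    """Local stand-in for skorch's exception of the same name (assumption)."""


def check_is_fitted(estimator, attributes, msg=None, all_or_any=all):
    """Raise NotInitializedError unless the named attributes exist.

    Sketch of an attribute-based check; the message text is illustrative.
    """
    if msg is None:
        msg = (
            "This %(name)s instance is not initialized yet. Call "
            "'initialize' or 'fit' with appropriate arguments "
            "before using this method."
        )
    # Accept a single attribute name as well as a list or tuple of names.
    if isinstance(attributes, str):
        attributes = [attributes]
    if not all_or_any(hasattr(estimator, attr) for attr in attributes):
        raise NotInitializedError(msg % {"name": type(estimator).__name__})


class DummyNet:
    """Hypothetical estimator that only gains `module_` once initialized."""

    def initialize(self):
        self.module_ = object()
        return self


net = DummyNet()
try:
    check_is_fitted(net, "module_")
except NotInitializedError as exc:
    error_message = str(exc)
```

Checking named attributes keeps the decision about what counts as "initialized" in skorch's hands instead of depending on how sklearn infers fitted state.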

The changes were tested with scikit-learn=0.21.3 and scikit-learn==0.22 (no GPU).

Furthermore, pylint now complains about imports that are not at top level (which affects basically all our tests), which is why I disabled that warning across the board.

BenjaminBossan added 5 commits Dec 5, 2019

`check_is_fitted` was changed (see #570) in a way that we won't raise
a `NotInitializedError` anymore despite the net not being
initialized. This is now solved by basically porting the old
`check_is_fitted` behavior to skorch.

sklearn's (_)safe_indexing used to work with tuples as indices, now
doesn't.

In sklearn 0.22, make_scorer returns a special object that must be
treated differently in skorch.
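The safe_indexing commit above avoids sklearn's helper when the index is a tuple. One way this could look, as a sketch only: the helper name index_data is hypothetical, and np.take stands in for whatever sklearn utility handles the non-tuple case in the real code.

```python
import numpy as np


def index_data(X, i):
    """Index X with i, bypassing sklearn's indexing helper for tuples."""
    if isinstance(i, tuple):
        # Hand the tuple straight to NumPy, which treats it as a
        # multi-axis index (one entry per axis).
        return X[i]
    # Non-tuple indices select along the first axis. In the real code this
    # branch would call sklearn's helper; np.take is an assumption here.
    return np.take(X, i, axis=0)


X = np.arange(15).reshape(5, 3)
i = ([2, 4, 1, 0, 3], [1, 0, 2, 2, 1])

picked = index_data(X, i)     # elements X[2, 1], X[4, 0], X[1, 2], ...
rows = index_data(X, [0, 2])  # rows 0 and 2
```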
@BenjaminBossan BenjaminBossan added the bug label Dec 5, 2019
@BenjaminBossan BenjaminBossan requested review from thomasjpfan and ottonemo Dec 5, 2019
@BenjaminBossan BenjaminBossan self-assigned this Dec 5, 2019
BenjaminBossan commented Dec 5, 2019

I'm not so happy about these breaking changes being made by sklearn, since they affect public API and, at least from skimming, are not announced in the changelog.

thomasjpfan commented Dec 5, 2019

The util module has been in this gray area regarding changing behavior, as stated in https://scikit-learn.org/stable/developers/utilities.html under “warning”. Historically this policy was put into place because public functions were put into utils without thoroughly testing their functionality.

I see how these types of changes can be a bit painful. Going forward, if sklearn wants to better support an ecosystem of estimators, I think sklearn needs to consider some of its utils as “first class” (this means deprecations) and not this gray area. I’ll be sure to bring this up with the rest of the team.

BenjaminBossan commented Dec 5, 2019

The util module has been in this gray area regarding changing behavior, as stated in https://scikit-learn.org/stable/developers/utilities.html under “warning”

Interesting. I didn't know that.

sklearn is a huge codebase, so these things are bound to happen. I think it's especially important to communicate these kinds of changes in the release notes.

I’ll be sure to bring this up with the rest of the team.

Great, thanks!

thomasjpfan left a comment

LGTM

 if (
         hasattr(module, 'startswith') and
         module.startswith('sklearn.metrics.') and
         not module.startswith('sklearn.metrics.scorer') and
-        not module.startswith('sklearn.metrics.tests.')
+        not module.startswith('sklearn.metrics.tests.') and
+        not scoring.__class__.__name__ in scorer_names

thomasjpfan Dec 13, 2019

Is the scoring.__class__.__name__ check needed? make_scorer returned these objects in 0.21 as well.

BenjaminBossan Dec 14, 2019

I don't quite understand your suggestion. We need this new line to check whether we get any of these objects. When I remove the new line, tests fail with sklearn 0.22 but not with 0.21.
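To illustrate why the extra class-name check can matter, here is a sketch of the reviewed condition as a standalone function. It runs without scikit-learn by using stand-in objects. The contents of SCORER_NAMES and the module paths assigned to the stand-ins are assumptions (the hunk does not show how scorer_names is defined), and looks_like_plain_metric is a hypothetical name:

```python
# Assumed contents of `scorer_names`; not shown in the reviewed hunk.
SCORER_NAMES = ("_PredictScorer", "_ProbaScorer", "_ThresholdScorer")


def looks_like_plain_metric(scoring):
    """Mirror of the reviewed condition: True only for bare metric functions."""
    module = getattr(scoring, "__module__", None)
    return (
        hasattr(module, "startswith")
        and module.startswith("sklearn.metrics.")
        and not module.startswith("sklearn.metrics.scorer")
        and not module.startswith("sklearn.metrics.tests.")
        and scoring.__class__.__name__ not in SCORER_NAMES
    )


class _PredictScorer:
    """Stand-in for the object returned by make_scorer."""

    # Assumed module path under sklearn 0.22; note that it does not start
    # with 'sklearn.metrics.scorer', so the module check alone lets it pass.
    __module__ = "sklearn.metrics._scorer"


def accuracy_score(y_true, y_pred):
    """Stand-in metric function."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


# Assumed module path of a real metric function.
accuracy_score.__module__ = "sklearn.metrics._classification"

metric_detected = looks_like_plain_metric(accuracy_score)
scorer_detected = looks_like_plain_metric(_PredictScorer())
```

Under these assumptions, a scorer object whose module no longer matches the 'sklearn.metrics.scorer' prefix is only excluded by the class-name check, which would be consistent with the tests failing on 0.22 but not on 0.21.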

@ottonemo ottonemo merged commit ab8e6ed into master Dec 16, 2019
4 checks passed
Travis CI - Branch Build Passed
Travis CI - Pull Request Build Passed
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed