Bugfix: sklearn 0.22 #571

BenjaminBossan · 2019-12-05T12:28:39Z

Currently, skorch doesn't work with sklearn 0.22: some sklearn code that is geared more towards internal use was changed, and we relied on it.

A list of failures:

check_is_fitted was changed (see Deprecated check_is_fitted call #570) so that we no longer raise a NotInitializedError even when the net is not initialized. This is now solved by essentially porting the old check_is_fitted behavior to skorch.
safe_indexing was changed (probably here) so that it no longer works with tuples as indices; the following code works in sklearn 0.21 but not in 0.22:

from sklearn.utils import safe_indexing  # or _safe_indexing
import numpy as np


X = np.zeros((5, 3))
i = ([2, 4, 1, 0, 3], [1, 0, 2, 2, 1])
safe_indexing(X, i)

This is solved by no longer using safe_indexing when the index is a tuple.
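A minimal sketch of this kind of workaround, assuming a NumPy array as input: tuple indices are passed straight to the array's own indexing instead of going through sklearn's helper. The function name `index_with_tuple_support` is hypothetical and this is not skorch's actual implementation; the non-tuple branch uses `np.take` merely as a stand-in for `safe_indexing`.

```python
import numpy as np


def index_with_tuple_support(X, i):
    """Index X, passing tuple indices straight to NumPy.

    Hypothetical helper for illustration; not skorch's real code.
    """
    if isinstance(i, tuple):
        # A tuple of index lists is NumPy fancy indexing: one list per axis.
        return X[i]
    # Stand-in for sklearn's (_)safe_indexing on plain arrays.
    return np.take(X, i, axis=0)


X = np.arange(15).reshape(5, 3)
i = ([2, 4, 1, 0, 3], [1, 0, 2, 2, 1])
selected = index_with_tuple_support(X, i)   # picks X[2, 1], X[4, 0], ...
rows = index_with_tuple_support(X, [0, 2])  # picks rows 0 and 2
```

With a tuple, the result is one element per index pair rather than whole rows, which is why the two cases need separate handling.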

Our convert_sklearn_metric_function now fails with make_scorer(...) because that returns an sklearn _PredictScorer object (or similar), which we don't detect. This is now detected.
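As a rough sketch of the detection idea, the check can compare the object's class name against a set of known scorer class names, which avoids importing private sklearn classes. The names in the tuple below and the stand-in class are assumptions for illustration; only `_PredictScorer` is named in this PR, and the names actually checked (`scorer_names` in the reviewed diff) may differ.

```python
# Assumed class names; only _PredictScorer is mentioned in this PR.
SCORER_CLASS_NAMES = ("_PredictScorer", "_ProbaScorer", "_ThresholdScorer")


def is_sklearn_scorer(scoring):
    """Return True if `scoring` looks like an object built by make_scorer."""
    return type(scoring).__name__ in SCORER_CLASS_NAMES


class _PredictScorer:
    """Local stand-in for the object make_scorer returns (illustration only)."""


def plain_metric(y_true, y_pred):
    """Ordinary metric function, which should not be treated as a scorer."""
    return 0.0


scorer_detected = is_sklearn_scorer(_PredictScorer())
function_detected = is_sklearn_scorer(plain_metric)
```

Matching on the class name rather than on the module path keeps the check independent of where sklearn places its scorer classes in a given release.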

The changes were tested with scikit-learn==0.21.3 and scikit-learn==0.22 (no GPU).
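For the check_is_fitted item in the list above, here is a minimal sketch of what porting the old behavior could look like: the check takes explicit attribute names and raises if they are missing. The signature follows the sklearn 0.21 function; the exception class is a local stand-in for skorch's `NotInitializedError`, and `DummyNet` and the checked attribute name are used only as examples.

```python
class NotInitializedError(Exception):
    """Local stand-in for skorch's exception of the same name."""


def check_is_fitted(estimator, attributes, msg=None, all_or_any=all):
    """Raise NotInitializedError unless `estimator` has the given attributes."""
    if msg is None:
        msg = (
            "This %(name)s instance is not initialized yet. "
            "Call 'initialize' or 'fit' before using this method."
        )
    if isinstance(attributes, str):
        attributes = [attributes]
    if not all_or_any([hasattr(estimator, attr) for attr in attributes]):
        raise NotInitializedError(msg % {"name": type(estimator).__name__})


class DummyNet:
    """Hypothetical estimator used only for this example."""


net = DummyNet()
try:
    check_is_fitted(net, ["module_"])
    raised = False
except NotInitializedError:
    raised = True

net.module_ = object()
check_is_fitted(net, "module_")  # passes silently once the attribute exists
```

Keeping the check inside the library means its behavior no longer depends on how sklearn's utility function evolves.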

Furthermore, pylint now complains about imports that are not at top level (which affects basically all our tests), which is why I disabled that warning wholesale.

BenjaminBossan · 2019-12-05T12:31:22Z

I'm not so happy about these breaking changes being made by sklearn, since they affect public API and, at least from skimming, are not announced in the changelog.

thomasjpfan · 2019-12-05T14:09:51Z

The util module has been in this gray area regarding changing behavior, as stated in https://scikit-learn.org/stable/developers/utilities.html under “warning”. Historically, this policy was put into place because public functions were put into utils without thoroughly testing their functionality.

I see how these types of changes can be a bit painful. Going forward, if sklearn wants to better support an ecosystem of estimators, I think sklearn needs to consider some of its utils as “first class” (this means deprecations) and not this gray area. I’ll be sure to bring this up with the rest of the team.

BenjaminBossan · 2019-12-05T16:08:59Z

> The util module has been in this gray area regarding changing behavior, as stated in https://scikit-learn.org/stable/developers/utilities.html under “warning”

Interesting. I didn't know that.

sklearn is a huge codebase, so these things are bound to happen. I think it's especially important to communicate these kinds of changes in the release notes.

> I’ll be sure to bring this up with the rest of the team.

Great, thanks!

thomasjpfan

LGTM

thomasjpfan · 2019-12-13T17:14:14Z

skorch/callbacks/scoring.py

        if (
                hasattr(module, 'startswith') and
                module.startswith('sklearn.metrics.') and
                not module.startswith('sklearn.metrics.scorer') and
-                not module.startswith('sklearn.metrics.tests.')
+                not module.startswith('sklearn.metrics.tests.') and
+                not scoring.__class__.__name__ in scorer_names


Is the scoring.__class__.__name__ check needed? make_scorer returned these objects in 0.21 as well.

BenjaminBossan · 2019-12-14

I don't quite understand your suggestion. We need this new line to check whether we got any of these objects. I tried removing the new line and got failing tests with sklearn 0.22 but not with 0.21.

BenjaminBossan added 5 commits December 5, 2019 13:20

Remove import-outside-toplevel from pylint

47f0c5b

Very unnecessary :/

Make check_is_fitted work again

e9f35a9

`check_is_fitted` was changed (see #570) in a way that we won't raise a `NotInitializedError` anymore despite the net not being initialized. This is now solved by basically porting the old `check_is_fitted` behavior to skorch.

Fix multi_indexing with sklearn 0.22

43b2a59

sklearn's (_)safe_indexing used to work with tuples as indices, now doesn't.

Remove unnecessary import

af4068c

Make convert_sklearn_metric_function work again

9202c86

In sklearn 0.22, make_scorer returns a special object that must be treated differently in skorch.

BenjaminBossan added the bug label Dec 5, 2019

BenjaminBossan requested review from thomasjpfan and ottonemo December 5, 2019 12:28

BenjaminBossan self-assigned this Dec 5, 2019

Update CHANGELOG

658042e

BenjaminBossan mentioned this pull request Dec 6, 2019

Unify predict_proba for neural net classifiers #572

Closed

thomasjpfan approved these changes Dec 13, 2019

ottonemo approved these changes Dec 16, 2019

ottonemo merged commit ab8e6ed into master Dec 16, 2019

BenjaminBossan deleted the bugfix/sklearn-0.22 branch February 2, 2020 12:54
