[MRG+ ½] Fixing #7155 in stochastic_gradient.py #7159

yl565 · 2016-08-07T20:07:32Z

Fixes #7155
Now calling predict_proba/predict_log_proba of SGDClassifier throw the correct error when not fitted and calling predict_proba/predict_log_proba of GridSearchCV will not throw error when estimator=SGDClassifier and param_grid={'loss':['log']}

raghavrv · 2016-08-07T21:11:08Z

Thanks for the PR! Could you also add a non regression test please?

yl565 · 2016-08-08T00:17:11Z

Hi @raghavrv, thanks for your comments! It is exciting that I'm submitting my first PR to sklearn. I'm fairly new to the unit testing procedure. Do you mean add some test to the linear_model/tests/test_sgd.py and tests/test_grid_search.py for the bugs in #7155?

agramfort · 2016-08-08T12:58:59Z

sklearn/linear_model/stochastic_gradient.py

+                            all_or_any=all)
+            is_fitted = True
+        except _NotFittedError:
+            is_fitted = False            


I am not sure it's the right fix.

is check_is_fitted really needed to know if predict_proba is supported?

can't you just remove the call to check_is_fitted ?

If I remove the call, GridSearchCV would throw loss hinge error when predict_proba is called. The problem came from this line. It tried to get the predict_proba attribute of SGDClassifier class but the default loss is hinge. So I changed it to raise the probability estimates are not available for loss=hingeerror only if the estimator has already been fitted.

raghavrv · 2016-08-08T18:05:37Z

It is exciting that I'm submitting my first PR to sklearn.

Indeed it is ;)

I'm fairly new to the unit testing procedure. Do you mean add some test to the linear_model/tests/test_sgd.py and tests/test_grid_search.py for the bugs in #7155?

Adding a test along the lines of

def test_sgd_loss_param_grid_search():
    # Make sure the predict_proba works when loss is specified
    # as one of the parameters in the param_grid
    rng = np.random.RandomState(0)
    X = rng.random_sample([5, 2])
    y = rng.random_integers(0, 1, [5, ])
    param_grid = {
        'loss': ['log'],
    }
    clf_grid = grid_search.GridSearchCV(estimator=SGDClassifier(random_state=42),
                                        param_grid=param_grid)
    clf_grid.fit(X, y).predict_proba(X)

added to linear_model/tests/test_sgd.py would be good. (This is an issue with sgd (as we are hacking it a bit) and hence should go into sgd tests.)

yl565 · 2016-08-08T21:35:39Z

@raghavrv Your instruction is really helpful! I added the test along with a test to make sure when predict_proba is called the error raised is not fitted but not AttributeError: probability estimates are not available for loss='hinge'. Now running test_sgd.py with the original stochastic_gradient.py would give the following errors:

======================================================================
ERROR: test_loss_param_grid_search (sklearn.linear_model.tests.test_sgd.DenseSGDClassifierTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/tests/test_sgd.py", line 836, in test_loss_param_grid_search
    clf_grid.predict_proba(X)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/utils/metaestimators.py", line 35, in __get__
    self.get_attribute(obj)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 756, in predict_proba
    self._check_proba()
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 721, in _check_proba
    " loss=%r" % self.loss)
AttributeError: probability estimates are not available for loss='hinge'

======================================================================
ERROR: test_not_fitted_error (sklearn.linear_model.tests.test_sgd.DenseSGDClassifierTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/tests/test_sgd.py", line 846, in test_not_fitted_error
    clf.predict_proba, X)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 756, in predict_proba
    self._check_proba()
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 721, in _check_proba
    " loss=%r" % self.loss)
AttributeError: probability estimates are not available for loss='hinge'

======================================================================
ERROR: test_loss_param_grid_search (sklearn.linear_model.tests.test_sgd.SparseSGDClassifierTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/tests/test_sgd.py", line 837, in test_loss_param_grid_search
    clf_grid.predict_log_proba(X)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/utils/metaestimators.py", line 35, in __get__
    self.get_attribute(obj)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 822, in predict_log_proba
    self._check_proba()
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 721, in _check_proba
    " loss=%r" % self.loss)
AttributeError: probability estimates are not available for loss='hinge'

======================================================================
ERROR: test_not_fitted_error (sklearn.linear_model.tests.test_sgd.SparseSGDClassifierTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/tests/test_sgd.py", line 846, in test_not_fitted_error
    clf.predict_proba, X)
  File "/home/yichuanliu/anaconda3/envs/scikit-learn-dev/lib/python3.5/unittest/case.py", line 1258, in assertRaisesRegex
    return context.handle('assertRaisesRegex', args, kwargs)
  File "/home/yichuanliu/anaconda3/envs/scikit-learn-dev/lib/python3.5/unittest/case.py", line 176, in handle
    callable_obj(*args, **kwargs)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/tests/test_sgd.py", line 44, in predict_proba
    return super(SparseSGDClassifier, self).predict_proba(X)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 756, in predict_proba
    self._check_proba()
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/linear_model/stochastic_gradient.py", line 721, in _check_proba
    " loss=%r" % self.loss)
AttributeError: probability estimates are not available for loss='hinge'

----------------------------------------------------------------------
Ran 168 tests in 1.292s

FAILED (errors=4)

nelson-liu · 2016-08-08T21:40:52Z

sklearn/linear_model/stochastic_gradient.py

@@ -32,7 +32,7 @@
 from .sgd_fast import Huber
 from .sgd_fast import EpsilonInsensitive
 from .sgd_fast import SquaredEpsilonInsensitive
-
+from ..exceptions import NotFittedError as _NotFittedError


i think you should add a newline after this, to replace the one you removed.

raghavrv · 2016-08-08T22:36:21Z

sklearn/linear_model/stochastic_gradient.py

@@ -757,6 +764,7 @@ def predict_proba(self):
        return self._predict_proba

    def _predict_proba(self, X):
+        check_is_fitted(self, ["t_", "coef_", "intercept_"], all_or_any=all)


Now this gets checked twice no? Do we want that?

yl565 · 2016-08-08T22:38:52Z

@nelson-liu Thanks for the comments! Newlines added.

raghavrv · 2016-08-08T22:59:42Z

Also why do we even set the fit params like t_ at init? I think we should fix that too.

Additionally I think we could simplify this by combining the _check_proba, _predict_proba and predict_proba into a single method with a decorator. The decorator would be similar to the if_delegete_has_method decorator. It can be made to check the loss and raise AttributeError. That was the fix I had in mind. But I am not sure if that would work as I haven't tried it yet... Maybe you can try that. That will make the code more readable I feel...

yl565 · 2016-08-09T03:48:48Z

@raghavrv Following your idea, I used decorator on a single method predict_proba. Also removed setting t_ and coef_ at init. Seems to be working as expected now?

raghavrv · 2016-08-09T07:49:38Z

sklearn/linear_model/stochastic_gradient.py

@@ -496,7 +501,7 @@ def partial_fit(self, X, y, classes=None, sample_weight=None):
        Returns
        -------
        self : returns an instance of self.
-        """
+        """            


trailing white space

yl565 · 2016-08-09T14:06:10Z

@raghavrv PEP8 problem fixed!

raghavrv · 2016-08-09T14:27:51Z

Thanks! When loss='hinge' we would want an AttributeError instead of a NotFittedError error I think when the estimator is not fitted yet...

Before further reviews - @MechCoder and @agramfort does this approach seem okay?

raghavrv · 2016-08-09T14:50:01Z

@yl565 Could you revert back the PEP8 changes unrelated to this PR please?

yl565 · 2016-08-09T15:31:17Z

@raghavrv Thanks for all your help and comments! Will revert the PEP8 unrelated to this PR later tonight.

When loss='hinge' we would want an AttributeError instead of a NotFittedError error I think when the estimator is not fitted yet...

I have tried that but it will give the following testing error:

======================================================================
FAIL: sklearn.tests.test_common.test_non_meta_estimators('SGDClassifier', <class 'sklearn.linear_model.stochastic_gradient.SGDClassifier'>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yichuanliu/anaconda3/envs/scikit-learn-dev/lib/python3.5/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/utils/testing.py", line 343, in wrapper
    return fn(*args, **kwargs)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/utils/estimator_checks.py", line 1060, in check_estimators_unfitted
    est.predict_proba, X)
  File "/home/data/Dropbox/Python/scikit-learn/sklearn/utils/testing.py", line 424, in assert_raise_message
    (message, error_message))
AssertionError: Error message does not include the expected string: 'fit'. Observed error message: "probability estimates are notavailable for loss='hinge'

raghavrv · 2016-08-09T15:38:48Z

While initializing the SGDClassifier for that test, you will have to special case and set loss='log'...

BTW can you check why when loss='hinge', this line passes and invokes the test?

raghavrv · 2016-08-09T16:07:57Z

I was thinking along the lines of

from sklearn.model_selection import GridSearchCV
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted

class _CheckProba(object):
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args, **kw):
        return self.fn(*args, **kw)

    def __get__(self, obj, type=None):
        if obj.loss not in ("log", "modified_huber"):
            raise AttributeError("probability estimates are not available for"
                                 " loss=%r" % obj.loss)
        return self.__class__(self.fn.__get__(obj, type))

class Estim(object):
    def __init__(self, loss='hinge'):
        self.loss=loss

    @_CheckProba
    def predict_proba(self, X):
        check_is_fitted(self, 'coef_')
        return np.arange(X.shape[0])

    def fit(self, X, y=None):
        self.coef_ = 1
        return self

    def __repr__(self):
        return "Estim(loss=%r)" % self.loss

Test the following ...

>>> X = [[1, 2], [2, 3]]
>>> y = [0, 1]
>>> hasattr(Estim().fit(X, y), 'predict_proba')
False
>>> hasattr(Estim(loss='log').fit(X, y), 'predict_proba')
True
>>> Estim(loss='log').predict_proba(X)
NotFittedError

lesteve · 2016-09-06T11:36:29Z

sklearn/utils/metaestimators.py

+        fn_check_obj(obj) with obj as the metaestimator instance.
+        The function will be executed before ducktyping. Typically used for
+        checking if the estimator has been fitted.
+


I would leave the doctests as they were or add them as tests to sklearn/utils/tests/test_metaestimators.py.

You could also add additional tests in the same file as you did at some point of this PR. Adding tests for if_fitted_delegate_has_method would be good too.

amueller · 2016-09-06T16:28:54Z

Only being able to sniff out supported methods after fitting seems to be somewhat of a problem :( This "only" happens with GridSearchCV and RandomizedSearchCV, right?
This seems to have weird impact on the estimator tags (#6599), because they are now only defined after fitting... Basically all possible tags can change during fit based on what steps a grid-search put into a pipeline.

jnothman · 2016-09-07T00:02:49Z

It happens when capabilities can't be determined from parameters alone. Currently that means GridSearchCV and RandomizedSearchCV, I think.

jnothman · 2016-09-07T12:38:00Z

Other estimators which could delegate to a GridSearchCV may be a problem too. A pipeline with a gridsearch as its last step, for instance, has indeterminate predict_proba support until after fitting.

Then in terms of duck-typing I see two choices:

only ever test for attributes after fitting
specify a set of attributes that can be tested for before fitting, the remainder only after. The same may be said of tags.

For reference, here is a quick-and-dirty listing of attributes we hasattr (on things that mostly aren't data):

1   "Lock"
1   "_idf_diag"
1   "_optimizer"
1   "alpha"
1   "alphas"
1   "args"
1   "compute_labels"
1   "connected_components"
1   "f_cached"
1   "fit_predict"
1   "iteritems"
1   "kind"
1   "lsqr"
1   "min_weight_fraction_leaf"
1   "n_nonzero_coefs"
1   "penalty"
1   "predict_log_proba"
1   "random_state"
1   "score"
1   "sorted_indices"
1   "sparsify"
1   "split"
1   "toarray"
1   attr
1   hasattr(MetaEst(HasNoPredict(
1   hasattr(MetaEst(HasPredict(
1   reg
2   "dtype"
2   "get_feature_names"
2   "get_params"
2   "item"
2   "items"
2   "kernel"
3   "data"
3   "fit_transform"
3   "l1_ratio"
3   "n_iter"
3   "transform"
3   method
4   "fit"
4   "partial_fit"
5   "decision_function"
5   "n_clusters"
5   "predict"
6   "n_components"
8   "predict_proba"

For example: predict_proba is a "only after fit" case. I think predict should be stable over fitting.

yl565 · 2016-09-07T15:58:01Z

So maybe we apply @if_delegate_has_method to stable method like predict and apply @if_fitted_delegate_has_method to predict_proba? Or we specify in metaestimators.py a list of attributes that should be checked if fitted?

amueller · 2016-09-07T21:15:20Z

Sooo we want to warn people if they check hasattr before fit on the ones that are not stable?

predict can be not stable if I grid-search a Pipeline that can have as steps either PCA or SVC.

jnothman · 2016-09-07T22:09:15Z

More practically, can we separate the long-term solution from one which
supports the case broken in this bug, but does not break anything new? I
think the answer is the back-off delegation that was implemented in this PR
a few days ago...

On 8 September 2016 at 01:58, Yichuan Liu notifications@github.com wrote:

So maybe we apply @if_delegate_has_method to stable method like predict
and apply @if_fitted_delegate_has_method to predict_proba? Or we specify
in metaestimators.py a list of attributes that should be checked if
fitted?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#7159 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAEz66jUOpqd4w5lYdzAOJNqYvoxPmwOks5qnt8KgaJpZM4Jek2m
.

amueller · 2016-09-09T15:09:30Z

@jnothman hm it kinda solves the problem after fit. I'm ok with it. Should we still target that for 0.18?

yl565 · 2016-09-09T15:22:07Z

@jnothman Do you think it is better to revert back to 265fc70?

amueller · 2016-09-09T15:23:56Z

@yl565 I think that was the suggestion.

amueller · 2016-09-09T17:25:27Z

sklearn/utils/metaestimators.py

-    This allows ducktyping of the decorated method based on ``attribute_name``.
+    This allows ducktyping of the decorated method based on
+    ``delegate.attribute_name`` where ``delegate`` is the first item in
+    ``delegate_names`` that is an attribute of the base object.


I find this sentence hard to understand. maybe make it two sentences. And say "Here delegate is the first item in delegate_names for which hasattr(object, delegate) is true? not sure.

How about the following:

Using this class to create a decorator for a class method object.attribute_name.
An AttributeError will be raised if none of the delegates (specified in
delegate_names) is an attribute of object (i.e. hasattr(object, delegate )
is False) or none of the delegates has an attribute attribute_name (i.e.
hasattr(delegate, attribute_name) is False).

This allows ducktyping of the decorated method object.attribute_name based on
object.delegate.attribute_name where delegate is the first item in
delegate_names for which hasattr(object, delegate ) is True).

I don't like " for a class method object.attribute_name":

attribute_name can be any attribute, not necessarily a method

a class method named "object.attribute_name" is weird or even contradictory.

Let's just spit the second sentence as @amueller suggested.

ogrisel · 2016-09-11T12:55:21Z

Besides the unaddressed comments still visible in the diff view of this PR on github, LGTM.

jnothman · 2016-09-12T13:44:49Z

sklearn/utils/metaestimators.py

@@ -14,16 +14,22 @@ class _IffHasAttrDescriptor(object):
    """Implements a conditional property using the descriptor protocol.

    Using this class to create a decorator will raise an ``AttributeError``
-    if the ``attribute_name`` is not present on the base object.
+    if none of the delegates (specified in ``delegate_names``) is an attribute
+    of the base object or none of the delegates has an attribute


"none of the delegates has" -> "the first found delegate does not have"

ogrisel · 2016-09-12T18:27:17Z

@yl565 let me finish this PR for you. We want to have it merged into based before cutting the 0.18.X release branch tonight.

yl565 · 2016-09-12T18:44:47Z

@ogrisel Sorry for couldn't response earlier, have been busy preparing a paper. Please go head. Thanks.

ogrisel · 2016-09-12T18:49:41Z

No problem!

[ci skip]

ogrisel · 2016-09-13T07:28:40Z

#7401 has been merged. Thanks for the fix @yl565!

[ci skip]

@yl565

Update data to latest available on figshare (v5). Expression and mutation have not changed since v4, so do not expect changes from this upgrade. Update sklearn to 0.18.0. See what's new at http://scikit-learn.org/0.18/whats_new.html. Particularly exciting is: > Fix incomplete `predict_proba` method delegation from `model_selection.GridSearchCV` to `linear_model.SGDClassifier` (scikit-learn/scikit-learn#7159) by Yichuan Liu (@yl565). `3.TCGA-MLexample_Pathway.ipynb` had incorrect data location. Fixed so all root notebooks would execute.

[ci skip]

@yl565

* Upgrades: sklearn to 0.18.0, data to v5 Update data to latest available on figshare (v5). Expression and mutation have not changed since v4, so do not expect changes from this upgrade. Update sklearn to 0.18.0. See what's new at http://scikit-learn.org/0.18/whats_new.html. Particularly exciting is: > Fix incomplete `predict_proba` method delegation from `model_selection.GridSearchCV` to `linear_model.SGDClassifier` (scikit-learn/scikit-learn#7159) by Yichuan Liu (@yl565). `3.TCGA-MLexample_Pathway.ipynb` had incorrect data location. Fixed so all root notebooks would execute. * Rerun with up-to-date local conda env * Fix import deprecation warning * Fix deprecation warning with cv grid scores

[ci skip]

Fix scikit-learn#7155

5da25f5

yl565 changed the title ~~[MRG] Fixing #7155 in stochastic_gradient.py~~ [WIP] Fixing #7155 in stochastic_gradient.py Aug 7, 2016

Fix test errors

dcbe371

agramfort reviewed Aug 8, 2016
View reviewed changes

add_tests

3aa26ac

yl565 changed the title ~~[WIP] Fixing #7155 in stochastic_gradient.py~~ [MRG] Fixing #7155 in stochastic_gradient.py Aug 8, 2016

nelson-liu reviewed Aug 8, 2016
View reviewed changes

yl565 added 2 commits August 8, 2016 18:11

fix_py26_error

db59c22

add spaces

9aed2b5

raghavrv reviewed Aug 8, 2016
View reviewed changes

yl565 added 2 commits August 8, 2016 22:44

use decorator, do not set fit params at init

bc1a700

fix coef_ initialization error raised in passive_aggressive.py

91b648e

raghavrv reviewed Aug 9, 2016
View reviewed changes

fix pep8 and redundent _CheckProba in predict_log_proba

899d0d1

Fix pep8

62de9c3

lesteve reviewed Sep 6, 2016
View reviewed changes

revert back to 265fc70

fa6b974

amueller reviewed Sep 9, 2016
View reviewed changes

jnothman reviewed Sep 12, 2016
View reviewed changes

ogrisel mentioned this pull request Sep 12, 2016

[MRG] FIX 7155: GridSearchCV predict_proba delegation to SDGClassifier #7401

Merged

ogrisel added a commit that referenced this pull request Sep 13, 2016

DOC add what's new entry for #7159

3d8d0de

[ci skip]

ogrisel closed this Sep 13, 2016

rsmith54 pushed a commit to rsmith54/scikit-learn that referenced this pull request Sep 14, 2016

DOC add what's new entry for scikit-learn#7159

0b9f8d5

[ci skip]

dhimmel mentioned this pull request Sep 30, 2016

Upgrades: sklearn to 0.18.0, data to v5 cognoma/machine-learning#53

Merged

TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016

DOC add what's new entry for scikit-learn#7159

926db3b

[ci skip]

Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

DOC add what's new entry for scikit-learn#7159

a957c31

[ci skip]

paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

DOC add what's new entry for scikit-learn#7159

e0b7d47

[ci skip]

[MRG+ ½] Fixing #7155 in stochastic_gradient.py #7159

[MRG+ ½] Fixing #7155 in stochastic_gradient.py #7159

Conversation

yl565 commented Aug 7, 2016

raghavrv commented Aug 7, 2016

yl565 commented Aug 8, 2016

Choose a reason for hiding this comment

yl565 Aug 8, 2016 • edited

Choose a reason for hiding this comment

raghavrv commented Aug 8, 2016

yl565 commented Aug 8, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yl565 commented Aug 8, 2016

raghavrv commented Aug 8, 2016

yl565 commented Aug 9, 2016

Choose a reason for hiding this comment

yl565 commented Aug 9, 2016

raghavrv commented Aug 9, 2016 • edited

raghavrv commented Aug 9, 2016 • edited

yl565 commented Aug 9, 2016

raghavrv commented Aug 9, 2016

raghavrv commented Aug 9, 2016

lesteve Sep 6, 2016 • edited

Choose a reason for hiding this comment

amueller commented Sep 6, 2016

jnothman commented Sep 7, 2016

jnothman commented Sep 7, 2016

yl565 commented Sep 7, 2016

amueller commented Sep 7, 2016 • edited

jnothman commented Sep 7, 2016

amueller commented Sep 9, 2016

yl565 commented Sep 9, 2016

amueller commented Sep 9, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ogrisel commented Sep 11, 2016

Choose a reason for hiding this comment

ogrisel commented Sep 12, 2016

yl565 commented Sep 12, 2016

ogrisel commented Sep 12, 2016

ogrisel commented Sep 13, 2016

yl565 Aug 8, 2016 •

edited

raghavrv commented Aug 9, 2016 •

edited

raghavrv commented Aug 9, 2016 •

edited

lesteve Sep 6, 2016 •

edited

amueller commented Sep 7, 2016 •

edited