New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OVR(SVC()) fails with decision_function_shape="ovr" #5495

Closed
amueller opened this Issue Oct 20, 2015 · 10 comments

Comments

Projects
None yet
2 participants
@amueller
Member

amueller commented Oct 20, 2015

This fails in predict:

from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier

from sklearn.datasets import make_blobs

X, y = make_blobs()

clf = OneVsRestClassifier(SVC(decision_function_shape='ovr')).fit(X, y)
clf.predict(X)
@dohmatob

This comment has been minimized.

Show comment
Hide comment
@dohmatob

dohmatob Oct 20, 2015

Contributor

The _ovr_decision_function says it requires the confidences param to be of shape (n_samples, n_classifiers), but (n_samples,) is propagated instead.

Contributor

dohmatob commented Oct 20, 2015

The _ovr_decision_function says it requires the confidences param to be of shape (n_samples, n_classifiers), but (n_samples,) is propagated instead.

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Oct 20, 2015

Member

Is there a test for the output shape of decision_function in test_common? Is it (n_samples, 1) usually? Then that is a bug in decision_function_shape='ovr' and should be fixed and back ported.

Member

amueller commented Oct 20, 2015

Is there a test for the output shape of decision_function in test_common? Is it (n_samples, 1) usually? Then that is a bug in decision_function_shape='ovr' and should be fixed and back ported.

@amueller amueller added this to the 0.17 milestone Oct 20, 2015

@dohmatob

This comment has been minimized.

Show comment
Hide comment
@dohmatob

dohmatob Oct 20, 2015

Contributor

Yes, the decision_function method of BaseSVC needs more coverage, especially the part invoking _ovr_decision_function. Also, the latter should be reviewed by someone more familiar with the code base.

Contributor

dohmatob commented Oct 20, 2015

Yes, the decision_function method of BaseSVC needs more coverage, especially the part invoking _ovr_decision_function. Also, the latter should be reviewed by someone more familiar with the code base.

@dohmatob

This comment has been minimized.

Show comment
Hide comment
@dohmatob

dohmatob Oct 21, 2015

Contributor

Shouldn't the for loops (on the 'classifiers') written in the _ovr_decision_function be in OneVsRestClassifer.predict method instead ? The OneVsRestClassifer.predict and the _ovr_decision_function function need to be reconciled (they seem incoherent as they stand).

Contributor

dohmatob commented Oct 21, 2015

Shouldn't the for loops (on the 'classifiers') written in the _ovr_decision_function be in OneVsRestClassifer.predict method instead ? The OneVsRestClassifer.predict and the _ovr_decision_function function need to be reconciled (they seem incoherent as they stand).

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Oct 21, 2015

Member

I'm not sure I understand your last comment.
for the use with OneVsRestClassifier the _ovr_decision_function is invoked with only two classes. But in general that is not true.

Member

amueller commented Oct 21, 2015

I'm not sure I understand your last comment.
for the use with OneVsRestClassifier the _ovr_decision_function is invoked with only two classes. But in general that is not true.

@dohmatob

This comment has been minimized.

Show comment
Hide comment
@dohmatob

dohmatob Oct 21, 2015

Contributor

Sorry, the comment is confusing indeed. Repaired.

Contributor

dohmatob commented Oct 21, 2015

Sorry, the comment is confusing indeed. Repaired.

@dohmatob

This comment has been minimized.

Show comment
Hide comment
@dohmatob

dohmatob Oct 21, 2015

Contributor

I'll look at this issue. See WIP here: PR #5508

Contributor

dohmatob commented Oct 21, 2015

I'll look at this issue. See WIP here: PR #5508

@dohmatob

This comment has been minimized.

Show comment
Hide comment
@dohmatob

dohmatob Oct 21, 2015

Contributor

BTW, this is a detailed traceback:

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.pyc in _ovr_decision_function(predictions, confidences, n_classes)
480 for i in range(n_classes):
481 for j in range(i + 1, n_classes):
--> 482 sum_of_confidences[:, i] -= confidences[:, k]
483 sum_of_confidences[:, j] += confidences[:, k]
484 votes[predictions[:, k] == 0, i] += 1

IndexError: too many indices

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.py(482)_ovr_decision_function()
481 for j in range(i + 1, n_classes):
--> 482 sum_of_confidences[:, i] -= confidences[:, k]
483 sum_of_confidences[:, j] += confidences[:, k]

ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/svm/base.py(553)decision_function()
552 if self.decision_function_shape == 'ovr':
--> 553 return ovr_decision_function(dec < 0, dec, len(self.classes))
554 return dec

ipdb> dec.shape
(100,)
ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.py(85)_predict_binary()
84 # probabilities of the positive class
---> 85 score = estimator.predict_proba(X)[:, 1]
86 return score

ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.py(232)predict()
231 for i, e in enumerate(self.estimators_):
--> 232 pred = _predict_binary(e, X)
233 np.maximum(maxima, pred, out=maxima)

ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/tests/test_multiclass.py(551)test_ovr_svc_issue_5495()
549 X, y = make_blobs(random_state=0)
550 clf = OneVsRestClassifier(SVC(decision_function_shape="ovr")).fit(X, y)
--> 551 assert_equal(len(clf.predict(X)), len(y)) # just to invoke .predict meth

ipdb>

Contributor

dohmatob commented Oct 21, 2015

BTW, this is a detailed traceback:

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.pyc in _ovr_decision_function(predictions, confidences, n_classes)
480 for i in range(n_classes):
481 for j in range(i + 1, n_classes):
--> 482 sum_of_confidences[:, i] -= confidences[:, k]
483 sum_of_confidences[:, j] += confidences[:, k]
484 votes[predictions[:, k] == 0, i] += 1

IndexError: too many indices

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.py(482)_ovr_decision_function()
481 for j in range(i + 1, n_classes):
--> 482 sum_of_confidences[:, i] -= confidences[:, k]
483 sum_of_confidences[:, j] += confidences[:, k]

ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/svm/base.py(553)decision_function()
552 if self.decision_function_shape == 'ovr':
--> 553 return ovr_decision_function(dec < 0, dec, len(self.classes))
554 return dec

ipdb> dec.shape
(100,)
ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.py(85)_predict_binary()
84 # probabilities of the positive class
---> 85 score = estimator.predict_proba(X)[:, 1]
86 return score

ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/multiclass.py(232)predict()
231 for i, e in enumerate(self.estimators_):
--> 232 pred = _predict_binary(e, X)
233 np.maximum(maxima, pred, out=maxima)

ipdb> u

/home/elvis/CODE/FORKED/scikit-learn/sklearn/tests/test_multiclass.py(551)test_ovr_svc_issue_5495()
549 X, y = make_blobs(random_state=0)
550 clf = OneVsRestClassifier(SVC(decision_function_shape="ovr")).fit(X, y)
--> 551 assert_equal(len(clf.predict(X)), len(y)) # just to invoke .predict meth

ipdb>

@dohmatob

This comment has been minimized.

Show comment
Hide comment
@dohmatob

dohmatob Oct 21, 2015

Contributor

The issue turned out to be caused by the fact that _ovr_decision_function was being called in binary classif scenario.

Contributor

dohmatob commented Oct 21, 2015

The issue turned out to be caused by the fact that _ovr_decision_function was being called in binary classif scenario.

amueller added a commit that referenced this issue Oct 23, 2015

Merge pull request #5508 from dohmatob/fixes-5495
[WIP] FIX  fails with decision_function_shape="ovr" [#5495]
@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Oct 23, 2015

Member

fixed in #5508. Thanks

Member

amueller commented Oct 23, 2015

fixed in #5508. Thanks

@amueller amueller closed this Oct 23, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment