CalibratedClassifierCV in conjunction with SVC and multiclass results in index out of bounds error #4634

Closed
adjgiulio opened this issue Apr 26, 2015 · 11 comments
@adjgiulio

Reported this on StackOverflow:
http://stackoverflow.com/questions/29873981/error-with-sklearn-calibratedclassifiercv-and-svm

kf = StratifiedShuffleSplit(y, n_iter=1, test_size=0.2)
clf = svm.SVC(C=1, probability=True)
sig_clf = CalibratedClassifierCV(clf, method="isotonic", cv=kf)
sig_clf.fit(X, y)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/g/anaconda/lib/python2.7/site-packages/sklearn/calibration.py", line 166, in fit
    calibrated_classifier.fit(X[test], y[test])
  File "/home/g/anaconda/lib/python2.7/site-packages/sklearn/calibration.py", line 309, in fit
    calibrator.fit(this_df, Y[:, k], sample_weight)
IndexError: index 9 is out of bounds for axis 1 with size 9

This is not an issue with any other classifiers I have used, including RandomForestClassifier and LogisticRegression.

@adjgiulio adjgiulio changed the title CalibratedClassifierCV in conjunction with SVC and multiclass results is index out of bounds error CalibratedClassifierCV in conjunction with SVC and multiclass results in index out of bounds error Apr 27, 2015
@agramfort
Member

agramfort commented Apr 27, 2015 via email

@adjgiulio
Author

from sklearn.cross_validation import StratifiedShuffleSplit
from sklearn.calibration import CalibratedClassifierCV
from sklearn import svm, datasets
import numpy as np

# Synthetic 9-class problem
X, y = datasets.make_classification(n_samples=10000, n_features=80,
                                    n_informative=70, n_redundant=10,
                                    n_classes=9)

# Single stratified 80/20 split used as the calibration CV
kf = StratifiedShuffleSplit(y, n_iter=1, test_size=0.2, random_state=42)
clf = svm.SVC(C=5, random_state=42)
sig_clf = CalibratedClassifierCV(clf, method="isotonic", cv=kf)

sig_clf.fit(X, y)

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-5-0d4781c68bdb> in <module>()
      1 
----> 2 sig_clf.fit(X, y)

/home/g/anaconda/lib/python2.7/site-packages/sklearn/calibration.pyc in fit(self, X, y, sample_weight)
    164                                               sample_weight[test])
    165                 else:
--> 166                     calibrated_classifier.fit(X[test], y[test])
    167                 self.calibrated_classifiers_.append(calibrated_classifier)
    168 

/home/g/anaconda/lib/python2.7/site-packages/sklearn/calibration.pyc in fit(self, X, y, sample_weight)
    307                 raise ValueError('method should be "sigmoid" or '
    308                                  '"isotonic". Got %s.' % self.method)
--> 309             calibrator.fit(this_df, Y[:, k], sample_weight)
    310             self.calibrators_.append(calibrator)
    311 

IndexError: index 9 is out of bounds for axis 1 with size 9

@amueller
Member

I can reproduce on master.

@amueller
Member

This is caused by the weird OvO decision function of SVC: it returns one column per pair of classes rather than one column per class. I think we should make SVC return a decision function of the proper shape. Maybe add an `aggregate_decision_function=None` option and change the default from False to True over a deprecation cycle?
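The shape mismatch can be seen directly. The following is a minimal sketch, assuming a scikit-learn release that already has the `decision_function_shape` parameter of `SVC` (the parameter mentioned at the end of this thread); the dataset sizes are illustrative assumptions, not the ones from the report. With 9 classes, the OvO output has 9 * 8 / 2 = 36 columns, while the binarized label matrix used for calibration has only 9, which is consistent with the `IndexError` at index 9.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic 9-class problem (sizes chosen for speed, not from the report).
n_classes = 9
X, y = make_classification(n_samples=450, n_features=20, n_informative=15,
                           n_redundant=5, n_classes=n_classes, random_state=0)

# One-vs-one output: one column per pair of classes.
ovo = SVC(decision_function_shape="ovo").fit(X, y)
ovo_shape = ovo.decision_function(X).shape

# One-vs-rest style output: one column per class.
ovr = SVC(decision_function_shape="ovr").fit(X, y)
ovr_shape = ovr.decision_function(X).shape

print(ovo_shape)  # second dimension is n_classes * (n_classes - 1) / 2
print(ovr_shape)  # second dimension is n_classes
```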

@amueller amueller added the Bug label Apr 27, 2015
@amueller
Member

As I said on SO, you can work around this by using OneVsRestClassifier(SVC()) for now.
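A minimal sketch of that workaround, written against a recent scikit-learn release rather than the version in the traceback. The dataset sizes and the integer `cv=3` are assumptions made for illustration; the original report used a `StratifiedShuffleSplit` object from the old `sklearn.cross_validation` module.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Small synthetic multiclass problem (illustrative sizes).
X, y = make_classification(n_samples=300, n_features=20, n_informative=15,
                           n_redundant=5, n_classes=4, random_state=42)

# Wrapping SVC in OneVsRestClassifier gives a decision function with
# one column per class, which is what the calibrator expects.
ovr_svc = OneVsRestClassifier(SVC(C=5))
calibrated = CalibratedClassifierCV(ovr_svc, method="isotonic", cv=3)
calibrated.fit(X, y)

proba = calibrated.predict_proba(X)
print(proba.shape)
```

Note that this fits one binary SVC per class, so it is a different multiclass strategy from SVC's built-in one-vs-one training, not just a reshaping of the same scores.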

@adjgiulio
Author

Thanks

@raghavrv
Member

> This is caused by the weird OvO decision function of SVC.

@amueller You expressed the same concern at #3891 (the OvO decision function PR)... Should I make the shape of SVC's decision function consistent, as you and @mblondel wanted? :)

@amueller
Member

@ragv see #4638. I want this done, but maybe we should finish your pending projects first.

@raghavrv
Member

yep!! sorry didn't know this was a "moderate" level issue ;)

@amueller
Member

I think you should have no problem fixing it; still, I think there is enough work in your existing PRs ;)

@amueller
Member

amueller commented Jun 8, 2015

Fixed via #4714. You can pass decision_function_shape='ovr' to make this work in master.
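For illustration, a sketch of the original reproduction with that parameter set, assuming a recent scikit-learn release. The smaller dataset and the integer `cv=3` are assumptions chosen so the example runs quickly; they are not taken from the report.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# 9 classes, as in the report, but with fewer samples and features.
X, y = make_classification(n_samples=450, n_features=20, n_informative=15,
                           n_redundant=5, n_classes=9, random_state=42)

# 'ovr' makes decision_function return one column per class.
clf = SVC(C=5, decision_function_shape="ovr")
sig_clf = CalibratedClassifierCV(clf, method="isotonic", cv=3)
sig_clf.fit(X, y)

proba_shape = sig_clf.predict_proba(X).shape
print(proba_shape)
```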

@amueller amueller closed this as completed Jun 8, 2015