
[MRG+1] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) #9593

Merged

merged 42 commits into scikit-learn:master from reiinakano:fix-cross-val-predict-decision-function on Oct 19, 2017

4 participants
reiinakano (Contributor) commented Aug 21, 2017

Reference Issue

Fixes #9589

What does this implement/fix? Explain your changes.

I figured that the purpose of this code (original)

if method in ['decision_function', 'predict_proba', 'predict_log_proba']:
    n_classes = len(set(y))
    predictions_ = np.zeros((X_test.shape[0], n_classes))
    if method == 'decision_function' and len(estimator.classes_) == 2:
        predictions_[:, estimator.classes_[-1]] = predictions
    else:
        predictions_[:, estimator.classes_] = predictions
    predictions = predictions_

is to handle cases where the cross-validation done in cross_val_predict doesn't properly stratify the classes, i.e. one fold in the CV may not have all classes present.

However, it misses an edge case for decision_function when len(estimator.classes_) == 2 (the train split has only 2 classes) and n_classes == 2 (the whole dataset has only 2 classes). It wrongly assumes that the output will be of shape (n_samples, n_classes), when in fact that is only true for decision_function if n_classes > 2. For the correct behavior, we must first check whether the total number of classes is greater than 2, and if it is not, stick to an output shape of (n_samples,).
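As a quick illustration of that shape difference (my own sketch, not part of the PR, using a two-class slice of iris):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_bin, y_bin = X[y < 2], y[y < 2]          # keep only classes 0 and 1
clf = LogisticRegression().fit(X_bin, y_bin)
print(clf.decision_function(X_bin).shape)  # (100,)   -> 1d for a binary problem
print(clf.predict_proba(X_bin).shape)      # (100, 2) -> always one column per class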

Any other comments?

While this PR fixes the case raised in #9589, it still doesn't take into account that for sklearn.svm.SVC, decision_function does not always follow the shape (n_samples, n_classes) or (n_samples,). If decision_function_shape is not set to 'ovr', the shape is (n_samples, n_classes * (n_classes - 1) / 2), and if different train splits have different numbers of classes (not stratified), this inconsistency can't be fixed by the approach cross_val_predict currently takes.
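To make those shapes concrete, here is a small sketch (mine, on a synthetic 4-class problem, so ovr gives 4 columns while ovo gives 4 * 3 / 2 = 6):

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_informative=8, n_classes=4,
                           random_state=0)
print(SVC(decision_function_shape='ovr').fit(X, y).decision_function(X).shape)
# (200, 4) -> (n_samples, n_classes)
print(SVC(decision_function_shape='ovo').fit(X, y).decision_function(X).shape)
# (200, 6) -> (n_samples, n_classes * (n_classes - 1) / 2)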

I would also like to point out my personal opinion that cross_val_predict should probably just throw an error if the classes are not stratified properly (by this I mean some classes are entirely absent from some splits) instead of handling it automatically as it tries to do now. Metrics like log loss and ROC might be affected if, for some splits, entire columns are zeroed out. Idk, it just feels inconsistent. Of course, you guys have the final say on this.

What I mean by inconsistent:

from sklearn.model_selection import cross_val_predict, KFold
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
cross_val_predict(LogisticRegression(), X, y, cv=KFold().split(X), method='predict_proba')

returns (array cut down in size)

array([[  0.00000000e+00,   9.99987841e-01,   1.21585683e-05],
       [  0.00000000e+00,   9.99963190e-01,   3.68104260e-05],
       [  0.00000000e+00,   9.99970224e-01,   2.97757119e-05],
       [  0.00000000e+00,   9.99978481e-01,   2.15185386e-05],
       [  0.00000000e+00,   9.99993136e-01,   6.86428459e-06],
       [  1.20030529e-01,   0.00000000e+00,   8.79969471e-01],
       [  1.13183668e-01,   0.00000000e+00,   8.86816332e-01],
       [  6.75506225e-02,   0.00000000e+00,   9.32449377e-01],
       [  1.07273496e-01,   0.00000000e+00,   8.92726504e-01],
       [  1.26854747e-01,   0.00000000e+00,   8.73145253e-01],
       [  1.16881976e-01,   0.00000000e+00,   8.83118024e-01],
       [  2.45077442e-04,   9.99754923e-01,   0.00000000e+00],
       [  2.70475201e-04,   9.99729525e-01,   0.00000000e+00],
       [  6.59523734e-04,   9.99340476e-01,   0.00000000e+00],
       [  4.07574999e-04,   9.99592425e-01,   0.00000000e+00],
       [  2.42451670e-03,   9.97575483e-01,   0.00000000e+00],
       [  2.03959503e-03,   9.97960405e-01,   0.00000000e+00]])

Notice how there are "zeroed out" columns.

And even worse:

from sklearn.model_selection import cross_val_predict, KFold
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
cross_val_predict(LogisticRegression(), X, y, cv=KFold().split(X), method='decision_function')

returns (array cut down in size)

array([[  0.        ,   0.        , -11.31746426],
       [  0.        ,   0.        , -10.20969263],
       [  0.        ,   0.        , -10.42178776],
       [  0.        ,   0.        ,  -9.60397656],
       [  0.        ,   0.        , -11.30005126],
       [  0.        ,   0.        , -11.1906238 ],
       [  0.        ,   0.        , -10.05510072],
       [  0.        ,   0.        , -10.74657422],
       [  0.        ,   0.        ,  -9.20295991],
       [  0.        ,   8.3136912 ,   0.        ],
       [  0.        ,   6.77281268,   0.        ],
       [  0.        ,   7.79874366,   0.        ],
       [  0.        ,   7.29615134,   0.        ],
       [  0.        ,   7.9199709 ,   0.        ],
       [  0.        ,   9.163118  ,   0.        ],
       [  0.        ,   5.8858718 ,   0.        ],
       [  0.        ,   6.37425957,   0.        ],
       [  0.        ,   6.66261924,   0.        ],
       [  0.        ,   6.19296233,   0.        ]])

which is hardly of use to anyone.

jnothman (Member) commented Aug 21, 2017

Thanks!

Running ROC over the output of cross_val_predict is a really bad idea in any case...

jnothman (Member) commented Aug 21, 2017

Yes, perhaps 0 isn't the best pick, particularly for predict_proba. NaN?

jnothman (Member) commented Aug 21, 2017

I think better not to error in the context of using this in ClassifierChain or a stacking meta-estimator.

reiinakano (Contributor) commented Aug 21, 2017

> I think better not to error in the context of using this in ClassifierChain or a stacking meta-estimator.

Hmm, you're right. One note, though: I can't speak for ClassifierChain, but for stacking, these zeroed-out columns would mess with how the secondary learner learns, and I wouldn't trust the results. Perhaps display a warning?

> Yes, perhaps 0 isn't the best pick, particularly for predict_proba. NaN?

NaN makes it clearer that these columns weren't considered at all for this particular split, but wouldn't that break things? For instance, I frequently use np.argmax(cross_val_predict(..., method='predict_proba')) to recover the predicted labels for calculating accuracy, etc. (sketched below). Of course, I always make sure my folds are stratified, so I'm probably not going to face it, but still...

Also, wouldn't that break ClassifierChain and stacking meta-estimators anyway?
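For reference, the argmax pattern is roughly this (my own sketch; with stratified folds every training fold sees all classes, whereas NaN fills would make the argmax over affected rows meaningless):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = load_iris(return_X_y=True)
# the default CV for classifiers is stratified, so no class goes missing
proba = cross_val_predict(LogisticRegression(), X, y, method='predict_proba')
y_pred = np.argmax(proba, axis=1)  # recover the predicted labels
print((y_pred == y).mean())        # cross-validated accuracy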

jnothman (Member) commented Aug 21, 2017

Could you please add a non-regression test.

reiinakano added some commits Aug 22, 2017

reiinakano (Contributor) commented Aug 22, 2017

I've added the necessary unit tests.

Although at this point, I'd like to point out there's another regression for the case of SVC with decision_function_shape='ovo'.

Code to test:

from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
cross_val_predict(SVC(kernel='linear', decision_function_shape='ovo'), X, y, method='decision_function').shape

Expected Results

In previous versions, this is (1797, 45)

Actual Results

ValueError: shape mismatch: value array of shape (602,45) could not be broadcast to indexing result of shape (10,602)

This error is due to the shape of an SVC decision function with decision_function_shape set to 'ovo', which is (n_samples, n_classes * (n_classes - 1) / 2).

Should I include the fix in this PR?


@jnothman changed the title from [WIP] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) to [MRG+1] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) Aug 22, 2017

@jnothman added this to the 0.19.1 milestone Aug 22, 2017

@reiinakano changed the title from [MRG+1] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) to [WIP] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) Aug 22, 2017

reiinakano (Contributor) commented Aug 22, 2017

Hi @jnothman, I changed this back to WIP since there was a conflict with a recent merge. I also figured out a better fix that also fixes the SVC "ovo" regression.

jnothman (Member) commented Aug 22, 2017

Okay. Let me know when it's MRG, i.e. ready for review.

reiinakano added some commits Aug 22, 2017

Merge remote-tracking branch 'origin/fix-cross-val-predict-decision-function' into fix-cross-val-predict-decision-function

@reiinakano changed the title from [WIP] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) to [MRG] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) Aug 22, 2017

reiinakano (Contributor) commented Aug 22, 2017

@jnothman Ready for review

@jnothman (on sklearn/model_selection/_validation.py)

Did you mean to add a test for the OvO case?

reiinakano added some commits Aug 22, 2017

reiinakano (Contributor) commented Aug 22, 2017

Yup, sorry. Added unit test and addressed comments.

@jnothman changed the title from [MRG] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) to [MRG+1] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) Aug 22, 2017

amueller (Member) commented Aug 30, 2017

A solution to some more of this will be in #9532, but that will probably still take a while.

amueller (Member) reviewed Aug 30, 2017

I think the fix is good, but I'd like to see a more explicit test for SVC(decision_function_shape='ovo') and some more guards ensuring that what we do is actually correct.

Diff context under review (sklearn/model_selection/_validation.py):

    else:
        predictions_[:, estimator.classes_] = predictions
    predictions = predictions_
    if n_classes != len(estimator.classes_):

amueller (Member) commented Aug 30, 2017

Should we raise a warning in this case?

amueller (Member) commented Aug 30, 2017

This implicitly checks whether n_classes > 2, right? If there was an estimator that would work for 1 class, the bug would still appear, right? Should we add an explicit check that n_classes > 2 to guard against that?

amueller (Member) commented Aug 30, 2017

I don't see where the ovo case is addressed. We should also raise an error if predictions.shape[1] != len(estimator.classes_), right?

The test for this case is in test_cross_val_predict_with_method. I suggest we add a test with SVC(decision_function_shape='ovo') and make sure a helpful error is thrown.
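Something along these lines, perhaps (just a sketch; 'Output shape' is a placeholder fragment for whatever message we settle on):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.svm import SVC
from sklearn.utils.testing import assert_raises_regex

X, y = load_digits(return_X_y=True)
order = np.argsort(y)  # sort by class so plain KFold misses classes per fold
X, y = X[order], y[order]
est = SVC(kernel='linear', decision_function_shape='ovo')
assert_raises_regex(ValueError, 'Output shape',  # placeholder message fragment
                    cross_val_predict, est, X, y,
                    cv=KFold(2), method='decision_function')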

reiinakano (Contributor) commented Aug 30, 2017

@amueller I agree that a warning should be raised in this case.

I don't follow your statement about n_classes > 2. Which bug are you referring to? Could you give a specific example?

For your third point, what I've fixed here is the following regression of cross_val_predict behavior when decision_function_shape='ovo'.

Code to test:

from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict

X, y = load_digits(return_X_y=True)
cross_val_predict(SVC(kernel='linear', decision_function_shape='ovo'), X, y, method='decision_function').shape

Expected Results

In previous versions, this is (1797, 45)

Actual Results

ValueError: shape mismatch: value array of shape (602,45) could not be broadcast to indexing result of shape (10,602)

I've also included the test for the ovo case in lines 792-798 of test_validation. Should I move this somewhere else?

Basically, the logic is that if the folds are properly stratified (n_classes == len(estimator.classes_)), the predictions don't need to go through the rest of the code (which handles the special cases where n_classes != len(estimator.classes_)) and can be passed on as is, without reshaping (sketched below). Applying the reshaping unconditionally is what caused the two regressions referenced in this PR.

I also acknowledge that the ovo case still throws an error if the folds are not properly stratified. However, this has always been the case even in previous versions, and is due to the weird shape of decision_function for ovo.
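In sketch form, the check looks like this (simplified; not the exact diff):

if method in ['decision_function', 'predict_proba', 'predict_log_proba']:
    n_classes = len(set(y))
    if n_classes != len(estimator.classes_):
        # some classes were absent from this training fold: pad/reshape the
        # fold's output into the full (n_samples, n_classes) layout
        ...
    # otherwise predictions pass through unchanged, which also preserves
    # SVC's ovo output of shape (n_samples, n_classes * (n_classes - 1) / 2)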

amueller (Member) commented Aug 30, 2017

> I also acknowledge that the ovo case still throws an error if the folds are not properly stratified. However, this has always been the case even in previous versions, and is due to the weird shape of decision_function for ovo.

That was the case I meant. I just wanted a better error in this case, and I want a test for that error.

amueller (Member) commented Aug 30, 2017

> I don't follow your statement about n_classes > 2. Which bug are you referring to? Could you give a specific example?

from sklearn.linear_model import RidgeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_predict, KFold

X, y = load_iris(return_X_y=True)
X = X[:100]
y = y[:100]
res = cross_val_predict(RidgeClassifier(), X, y, method='decision_function', cv=KFold(2))

ValueError: shape mismatch: value array of shape (50,) could not be broadcast to indexing result of shape (1,50)

amueller (Member) commented Aug 30, 2017

This is the case when some classifiers have 1 class and some have 2 classes.

reiinakano (Contributor) commented Aug 30, 2017

Got it. I didn't know some classifiers could be fit on a single class. So that's another regression. How do you want to handle this case? It sounds similar to the case if method == 'decision_function' and len(estimator.classes_) == 2, which is currently handled by doing predictions_[:, estimator.classes_[-1]] = predictions.

If that's fine, I think the fix is to simply change the former line to if method == 'decision_function' and len(estimator.classes_) <= 2, as in the sketch below.

This would fix the errors, but frankly, I'd still prefer a stern warning that something went wrong during cross-validation using cross_val_predict.

For the error in the ovo case, should I include that in this PR as well? I didn't include it previously since this PR primarily fixes regressions and I didn't want to include a new "feature".
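Sketch of that one-line change in context (the pre-fix code path, not the final merged diff):

if method == 'decision_function' and len(estimator.classes_) <= 2:
    # folds fit on 1 or 2 classes both yield a 1d decision_function, which
    # maps onto the column of the last (positive) class seen in the fold
    predictions_[:, estimator.classes_[-1]] = predictions
else:
    predictions_[:, estimator.classes_] = predictions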

amueller (Member) commented Aug 30, 2017

Not sure if the 1-class thing is a regression or was always broken ;)
And I think I'd like the error / test for OvO in this PR.

Sorry, I don't understand your proposal for the 1-class / 2-class case. The result in the 2-class case should be 1d, not 2d. That's the original bug this is trying to solve, right?

reiinakano (Contributor) commented Aug 30, 2017

It's a regression. The code you wrote above only breaks on 0.19. :)

> Sorry, I don't understand your proposal for the 1-class / 2-class case. The result in the 2-class case should be 1d, not 2d. That's the original bug this is trying to solve, right?

The case I'm talking about is when the original dataset has >2 classes, a particular fold has <=2 classes, and the method is decision_function. The original bug is still fixed by the if n_classes != len(estimator.classes_): check.

reiinakano (Contributor) commented Oct 10, 2017

How about erroring out on imbalanced classes in decision_function, but zeroing out columns in predict_proba and predict_log_proba? Literally all the complication in this PR is because of the different weird ways decision_function behaves.


lesteve (Member) commented Oct 10, 2017

I read a bit of the discussion in the related PRs; sorry it's taking me some time to catch up. Basically, here is my summary:

  • I would be in favour of a warning when the training fold does not contain all the classes. #8773 (comment) agrees with that as well.
  • The fix for the wrong shape in cross_val_predict with method='decision_function' (which consists in only applying the logic from #7889 when n_classes != n_classes_train) is fine.
  • method='decision_function' has a number of edge cases. Raising an exception for method='decision_function' if n_classes_train != n_classes may be an easy way to avoid tackling all these special cases:
    1. SVC with ovo, with an uncommon number of columns for the decision function: n_classes_train * (n_classes_train - 1) / 2 rather than n_classes. Tackled in this PR already.
    2. n_classes_train <= 2 and n_classes >= 3: you get a 1d array as the output of decision_function, and then it is not clear what to do with it. I would be in favour of raising an exception in this case.
    3. n_classes_train == 1 and n_classes == 2. No idea what to do in this case. I don't know how common it is for a classifier to be fittable with a single class and have decision_function. I quickly looked using all_estimators and I found ['AdaBoostClassifier', 'LinearDiscriminantAnalysis', 'RidgeClassifier', 'RidgeClassifierCV'].
  • predictions should be initialized to 0 if method='predict_proba' but -inf (or maybe np.finfo(X.dtype).min) if method='predict_log_proba'; see the sketch below.
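A sketch of the initialization I have in mind (illustrative only; fill_value is a made-up helper name):

import numpy as np

def fill_value(method, dtype=np.float64):
    # value assigned to classes that are absent from a training fold
    if method == 'predict_proba':
        return 0.0  # an unseen class gets probability 0
    # finite stand-in for -inf for predict_log_proba (and possibly
    # decision_function), so downstream consumers don't choke on inf
    return np.finfo(dtype).min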
reiinakano (Contributor) commented Oct 10, 2017

There is another case for decision_function:

iv. n_classes_train > 2 and n_classes > 2, but n_classes_train != n_classes. Currently we set missing classes to 0. It has been mentioned earlier that this does not make sense. How should I handle this?

lesteve (Member) commented Oct 11, 2017

> iv. n_classes_train > 2 and n_classes > 2, but n_classes_train != n_classes. Currently we set missing classes to 0. It has been mentioned earlier that this does not make sense. How should I handle this?

For me this is not really an edge case; this is the main case we are trying to solve. I was proposing to do this:

> predictions should be initialized to 0 if method='predict_proba' but -inf (or maybe np.finfo(X.dtype).min) if method='predict_log_proba'.

For decision_function, either we do the same as predict_log_proba or we decide that we always raise an exception, given that there are many edge cases for decision_function...

reiinakano added some commits Oct 11, 2017

raise error in decision_function special cases. change predict_log_proba missing classes to minimum numpy value
reiinakano (Contributor) commented Oct 11, 2017

I think I've mostly sorted this out. For the non-special case of decision_function (iv. n_classes_train > 2 and n_classes > 2, but n_classes_train != n_classes), I handle it the same as predict_log_proba. For the other special cases, there is an error message.

Apologies for the confusing code previously; I should've done it this way in the first place, but I didn't expect decision_function to have so many edge cases.

@lesteve

I think we still need a warning in any case when n_classes_training != n_classes.

I would still like feedback from some core devs about #9593 (comment), especially about the default value for missing classes for predict_log_proba and decision_function (personally I would favour -inf, but maybe there are consumers of cross_val_predict which would raise an error with infinite values).

reiinakano added some commits Oct 12, 2017

@caioaao referenced this pull request Oct 13, 2017

Closed: [MRG+1] Stacking classifier with pipelines API #8960 (7 of 7 tasks complete)
@lesteve

> I think we still need a warning in any case when n_classes_training != n_classes.

You have not tackled this comment AFAICT. It would be good to add a test for this with assert_warns as well.
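For example (a sketch; RuntimeWarning and the 'classes' message fragment are only what I would expect, to be matched to the actual implementation):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.utils.testing import assert_warns_message

X, y = load_iris(return_X_y=True)
# iris is ordered by class, so plain KFold leaves one class out of each
# training fold, which should trigger the warning
assert_warns_message(RuntimeWarning, 'classes',
                     cross_val_predict, LogisticRegression(), X, y,
                     cv=KFold(3), method='predict_proba')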

reiinakano added some commits Oct 13, 2017

codecov bot commented Oct 17, 2017

Codecov Report

Merging #9593 into master will decrease coverage by 0.18%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #9593      +/-   ##
==========================================
- Coverage   96.16%   95.97%   -0.19%     
==========================================
  Files         336      336              
  Lines       62442    62566     +124     
==========================================
+ Hits        60046    60049       +3     
- Misses       2396     2517     +121

Impacted Files                                     Coverage Δ
sklearn/model_selection/_validation.py             96.87% <100%> (+0.08%) ⬆️
sklearn/model_selection/tests/test_validation.py   98.43% <100%> (-0.34%) ⬇️
sklearn/utils/fixes.py                             65.76% <0%> (-15.32%) ⬇️
sklearn/utils/tests/test_deprecation.py            85.29% <0%> (-14.71%) ⬇️
sklearn/utils/deprecation.py                       85.41% <0%> (-10.42%) ⬇️
sklearn/utils/tests/test_utils.py                  91.19% <0%> (-8.81%) ⬇️
sklearn/_build_utils/__init__.py                   55.1% <0%> (-6.13%) ⬇️
sklearn/utils/tests/test_estimator_checks.py       90.72% <0%> (-5.97%) ⬇️
sklearn/datasets/mldata.py                         91.25% <0%> (-5%) ⬇️
sklearn/utils/extmath.py                           93.77% <0%> (-2.4%) ⬇️
... and 31 more

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 534f68b...bc405c8.

jnothman (Member) commented Oct 17, 2017

I would like to see this merged and 0.19.1 released by the end of the week. Is that reasonable, @amueller?

Diff context (docstring note under review):

    default score needs to be assigned to all instances for that class if
    ``method`` produces columns per class, as in {'decision_function',
    'predict_proba', 'predict_log_proba'}. For ``predict_proba`` this value is
    0. In order to ensure finite output, we approximate negative infinity by

jnothman (Member) commented Oct 19, 2017

I think this behaviour and note is sufficient for a first release.

@amueller merged commit ebc8730 into scikit-learn:master Oct 19, 2017

6 checks passed:

ci/circleci: Your tests passed on CircleCI!
codecov/patch: 100% of diff hit (target 96.16%)
codecov/project: 96.17% (+0.01%) compared to 534f68b
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
lgtm analysis (Python): No alert changes

@reiinakano deleted the reiinakano:fix-cross-val-predict-decision-function branch Oct 19, 2017

jnothman added a commit to jnothman/scikit-learn that referenced this pull request Oct 20, 2017

[MRG+1] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) (#9593)

* fix cross_val_predict for binary classification in decision_function

* Add unit tests

* Add unit tests

* Add unit tests

* better fix

* fix conflict

* fix broken

* only calculate n_classes if one of 'decision_function', 'predict_proba', 'predict_log_proba'

* add test for SVC ovo in cross_val_predict

* flake8 fix

* fix case of ovo and imbalanced folds for binary classification

* change assert_raises to assert_raise_message for ovo case

* fix flake8 linetoo long

* add comments and clearer tests

* improve comments and error message for OvO

* fix .format error with L

* use assert_raises_regex for better error message

* raise error in decision_function special cases. change predict_log_proba missing classes to minimum numpy value

* fix broken tests due to special cases of decision_function

* add modified test for decision_function behavior that does not trigger edge cases

* fix typos

* fix typos

* escape regex .

* escape regex .

* address comments. one unaddressed comment

* simplify code

* flake

* wrong classes range

* address comments. adjust error message

* add warning

* change warning to runtimewarning

* add test for the warning

* Use assert_warns_message rather than assert_warns

Other minor fixes

* Note on class-absent replacement values

* Improve error message

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG+1] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) (#9593)

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[MRG+1] Fix cross_val_predict behavior for binary classification in decision_function (Fixes #9589) (#9593)