Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] SKF raises error if all n_labels for individual classes <n_folds #6182

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
4 participants
@dsquareindia
Copy link
Contributor

dsquareindia commented Jan 18, 2016

Addresses #6177.

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 18, 2016

@rvraghav93 @MechCoder could you please check this? Thanks!

@dsquareindia dsquareindia changed the title SKF raises error if n_labels<n_folds for individual classes SKF raises error if all n_labels for individual classes <n_folds Jan 18, 2016

" members, which is too few. The minimum"
" number of labels for any class cannot"
" be less than n_folds=%d."
% (min_labels, self.n_folds))

This comment has been minimized.

Copy link
@MechCoder

MechCoder Jan 18, 2016

Member

It should be np.all(self.n_folds > label_counts). It is fine if the least populated class has fewer than n_folds. It is a problem if all the classes have n_labels lesser than n_folds.

@dsquareindia dsquareindia changed the title SKF raises error if all n_labels for individual classes <n_folds [MRG] SKF raises error if all n_labels for individual classes <n_folds Jan 19, 2016

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 19, 2016

@MechCoder is this fine?
cc: @rvraghav93

@raghavrv

This comment has been minimized.

Copy link
Member

raghavrv commented Jan 19, 2016

This looks good! Thanks. @MechCoder or @amueller for a second review?

@raghavrv

This comment has been minimized.

Copy link
Member

raghavrv commented Jan 19, 2016

(Squash the commits please)

assert_raises(ValueError, cval.StratifiedKFold, y, 0)
assert_raises(ValueError, cval.StratifiedKFold, y, 1)
assert_raises(ValueError, cval.StratifiedKFold, y2, 0)
assert_raises(ValueError, cval.StratifiedKFold, y2, 1)

This comment has been minimized.

Copy link
@raghavrv

raghavrv Jan 19, 2016

Member

Could you check for the err message maybe? (assert_raises_msg)

This comment has been minimized.

Copy link
@raghavrv

raghavrv Jan 19, 2016

Member

(As the ValueError could sometimes be from a different error)

raise ValueError("All the n_labels for individual classes"
" are less than %d folds."
% (self.n_folds))
elif self.n_folds > min_labels:

This comment has been minimized.

Copy link
@MechCoder

MechCoder Jan 19, 2016

Member

elif -> if

@MechCoder

This comment has been minimized.

Copy link
Member

MechCoder commented Jan 19, 2016

Please update whatsnew

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 19, 2016

@rvraghav93 @MechCoder is this fine?

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 20, 2016

@rvraghav93 @MechCoder how can I restart this build? I this test failing is not related to my pr :/

@raghavrv

This comment has been minimized.

Copy link
Member

raghavrv commented Jan 20, 2016

Ah don't mind about that. appveyor gets whacky at times. (And I've confirmed that the current appveyor failure is unrelated to your PR).

As the commit hash is dependent on the commit time you could reset, re-commit and force push for the build to get restarted.

@@ -92,6 +92,9 @@ Enhancements
Bug fixes
.........

- :class:`StratifiedKFold` now raises error if all n_labels for individual classes is less than n_folds.
(`#6182 <https://github.com/scikit-learn/scikit-learn/pull/6182>`_) by `Raghav R V`_, `Manoj Kumar`_ and `Devashish Deshpande`_.

This comment has been minimized.

Copy link
@raghavrv

raghavrv Jan 20, 2016

Member

Whoa that was generous ;) Please add your name alone :)

This comment has been minimized.

Copy link
@dsquareindia

dsquareindia Jan 20, 2016

Author Contributor

Hahaha I wouldn't really like to do that though 🍻


skf_3 = StratifiedKFold(3)
skf_2 = StratifiedKFold(2)

This comment has been minimized.

Copy link
@raghavrv

raghavrv Jan 20, 2016

Member

Maybe keep this as 3, and instead adjust the y1 accordingly? (or maybe it doesn't matter as long as it raises the correct error message?)


# Error when number of folds is <= 1
assert_raises(ValueError, cval.KFold, 2, 0)
assert_raises(ValueError, cval.KFold, 2, 1)
assert_raises(ValueError, cval.StratifiedKFold, y, 0)
assert_raises(ValueError, cval.StratifiedKFold, y, 1)
# String to check if ValueError is not from a different error

This comment has been minimized.

Copy link
@raghavrv

raghavrv Jan 20, 2016

Member

This comment can be removed I think, as we use assert_raises_message for this reason only


cv = assert_warns_message(Warning, "The least populated class",
cval.StratifiedKFold, y, 3)
cval.StratifiedKFold, y1, 2)

This comment has been minimized.

Copy link
@raghavrv

raghavrv Jan 20, 2016

Member

Same comment as above. Maybe keep it 3 and adjust the test accordingly?

@raghavrv

This comment has been minimized.

Copy link
Member

raghavrv commented Jan 20, 2016

Thanks for addressing the comments. Looks good to me apart from the nitpicks. Wait for @jnothman or @MechCoder

@@ -519,6 +519,12 @@ def __init__(self, y, n_folds=3, shuffle=False,
unique_labels, y_inversed = np.unique(y, return_inverse=True)
label_counts = bincount(y_inversed)
min_labels = np.min(label_counts)
# Raise error when all the n_labels for individual classes
# are less than n_folds

This comment has been minimized.

Copy link
@MechCoder

MechCoder Jan 20, 2016

Member

Umm I think the code is self explanatory


# When n is not integer:
assert_raises(ValueError, cval.KFold, 2.5, 2)

# When n_folds is not integer:
assert_raises(ValueError, cval.KFold, 5, 1.5)
assert_raises(ValueError, cval.StratifiedKFold, y, 1.5)
assert_raises(ValueError, cval.StratifiedKFold, y2, 1.5)

This comment has been minimized.

Copy link
@MechCoder

MechCoder Jan 20, 2016

Member

you can change this back to y

" train / test split")
assert_raise_message(ValueError, error_string,
cval.StratifiedKFold, y2, 0)
assert_raise_message(ValueError, error_string,

This comment has been minimized.

Copy link
@MechCoder

MechCoder Jan 20, 2016

Member

can you change this back to y? In general, we don't modify previous tests unless absolutely required.

@MechCoder

This comment has been minimized.

Copy link
Member

MechCoder commented Jan 20, 2016

lgtm pending nitpicks

@MechCoder MechCoder changed the title [MRG] SKF raises error if all n_labels for individual classes <n_folds [MRG+!] SKF raises error if all n_labels for individual classes <n_folds Jan 20, 2016

@MechCoder MechCoder changed the title [MRG+!] SKF raises error if all n_labels for individual classes <n_folds [MRG+1] SKF raises error if all n_labels for individual classes <n_folds Jan 20, 2016

@MechCoder

This comment has been minimized.

Copy link
Member

MechCoder commented Jan 20, 2016

ping @TomDLT to just verify?

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 21, 2016

Done :)

@TomDLT

This comment has been minimized.

Copy link
Member

TomDLT commented Jan 21, 2016

if all the n_labels for individual classes are less than n_folds

I don't find the word n_labels very clear, it sounds like the number of classes, which is confusing.
What do you think of "if all the classes have less than n_folds samples"?

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 22, 2016

@TomDLT yeah it's concise too. Should I go ahead and make the changes?
cc: @MechCoder @rvraghav93

@raghavrv

This comment has been minimized.

Copy link
Member

raghavrv commented Jan 22, 2016

Please do!

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 22, 2016

Done!

@MechCoder

This comment has been minimized.

Copy link
Member

MechCoder commented Jan 23, 2016

Thanks !! 🍹 🍷

@MechCoder MechCoder closed this Jan 23, 2016

@dsquareindia

This comment has been minimized.

Copy link
Contributor Author

dsquareindia commented Jan 23, 2016

No problem :) Thanks for the help!

@dsquareindia dsquareindia deleted the dsquareindia:skf_fix branch Jan 23, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.