[MRG+1] SKF raises error if all n_labels for individual classes <n_folds #6182

Closed
wants to merge 1 commit into from

Conversation

devashishd12
Contributor

Addresses #6177.

@devashishd12
Contributor Author

@rvraghav93 @MechCoder could you please check this? Thanks!

@devashishd12 devashishd12 changed the title SKF raises error if n_labels<n_folds for individual classes SKF raises error if all n_labels for individual classes <n_folds Jan 18, 2016
" members, which is too few. The minimum"
" number of labels for any class cannot"
" be less than n_folds=%d."
% (min_labels, self.n_folds))
Member

It should be np.all(self.n_folds > label_counts). It is fine if the least populated class has fewer than n_folds members. It is a problem if all the classes have fewer than n_folds members.
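A minimal sketch of the check described above, written in pure Python so that it runs without NumPy. The helper name `check_class_counts` and the wording of the message are assumptions for illustration, not the code merged in this PR; `all(...)` over the per-class counts stands in for `np.all(self.n_folds > label_counts)`.

```python
from collections import Counter


def check_class_counts(y, n_folds):
    """Hypothetical helper: fail only if EVERY class is smaller than n_folds."""
    label_counts = Counter(y)  # maps class label -> number of samples
    # Pure-Python stand-in for np.all(n_folds > label_counts).
    if all(n_folds > count for count in label_counts.values()):
        raise ValueError(
            "All the classes have fewer than n_folds=%d samples." % n_folds
        )
    # A single small class is tolerated; report its size to the caller.
    return min(label_counts.values())
```

With `y = [0, 0, 0, 1]` and `n_folds=3`, class `1` has only one sample, but class `0` has three, so no error is raised. With `y = [0, 0, 1, 1]` and `n_folds=3`, every class is too small and the sketch raises `ValueError`.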

@devashishd12 devashishd12 changed the title SKF raises error if all n_labels for individual classes <n_folds [MRG] SKF raises error if all n_labels for individual classes <n_folds Jan 19, 2016
@devashishd12
Contributor Author

@MechCoder is this fine?
cc: @rvraghav93

@raghavrv
Member

This looks good! Thanks. @MechCoder or @amueller for a second review?

@raghavrv
Member

(Squash the commits please)

assert_raises(ValueError, cval.StratifiedKFold, y, 0)
assert_raises(ValueError, cval.StratifiedKFold, y, 1)
assert_raises(ValueError, cval.StratifiedKFold, y2, 0)
assert_raises(ValueError, cval.StratifiedKFold, y2, 1)
Member

Could you also check the error message, maybe? (assert_raises_msg)

Member

(As the ValueError could sometimes come from a different check.)
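To illustrate the reviewer's point, here is a rough sketch of why matching the message matters. `build_splitter` is a hypothetical stand-in that can raise `ValueError` for two unrelated reasons, and `raises_value_error_matching` is an assumed helper, not scikit-learn's testing utility; the `assert_raises_msg` mentioned above presumably refers to a message-checking helper of this kind.

```python
import re
from collections import Counter


def build_splitter(y, n_folds):
    """Hypothetical stand-in with two distinct ValueError paths."""
    if n_folds <= 1:
        raise ValueError("n_folds must be at least 2, got n_folds=%d." % n_folds)
    if all(n_folds > count for count in Counter(y).values()):
        raise ValueError(
            "All the classes have fewer than n_folds=%d samples." % n_folds
        )
    return n_folds


def raises_value_error_matching(pattern, func, *args):
    """Return True only if func(*args) raises ValueError whose text matches."""
    try:
        func(*args)
    except ValueError as exc:
        return re.search(pattern, str(exc)) is not None
    return False
```

Calling `build_splitter([0, 0, 1, 1], 1)` raises `ValueError`, so a bare type check passes, yet the error comes from the `n_folds` validation rather than the class-count check. Matching on `"fewer than"` tells the two cases apart.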

@MechCoder
Member

Please update whatsnew

@devashishd12
Contributor Author

@rvraghav93 @MechCoder is this fine?

@devashishd12
Contributor Author

@rvraghav93 @MechCoder how can I restart this build? I think this failing test is not related to my PR :/

@raghavrv
Member

Ah, don't worry about that. AppVeyor gets wacky at times. (And I've confirmed that the current AppVeyor failure is unrelated to your PR.)

As the commit hash depends on the commit time, you could reset, re-commit and force-push to restart the build.
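As a rough illustration of why re-committing produces a new hash: Git names a commit by the SHA-1 of its serialized object, and that object embeds the author and committer timestamps. The sketch below builds a simplified commit object (no parent line, same identity for author and committer); the function name, the identity string and the timestamps are all made up for the example.

```python
import hashlib

# Well-known id of Git's empty tree; AUTHOR is a placeholder identity.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
AUTHOR = "A U Thor <author@example.com>"


def commit_hash(tree, author, timestamp, message, tz="+0000"):
    """Hash a simplified, parentless commit object the way Git does."""
    body = (
        "tree %s\n"
        "author %s %d %s\n"
        "committer %s %d %s\n"
        "\n"
        "%s\n" % (tree, author, timestamp, tz, author, timestamp, tz, message)
    ).encode("utf-8")
    # Git prefixes the payload with "commit <size>\0" before hashing.
    header = b"commit %d\x00" % len(body)
    return hashlib.sha1(header + body).hexdigest()


first = commit_hash(EMPTY_TREE, AUTHOR, 1453200000, "Fix SKF check")
second = commit_hash(EMPTY_TREE, AUTHOR, 1453200060, "Fix SKF check")
```

Here `first` and `second` differ even though the tree and message are identical, which is why a fresh commit followed by a force-push looks like a new revision to the CI service.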

@@ -92,6 +92,9 @@ Enhancements
Bug fixes
.........

- :class:`StratifiedKFold` now raises error if all n_labels for individual classes is less than n_folds.
(`#6182 <https://github.com/scikit-learn/scikit-learn/pull/6182>`_) by `Raghav R V`_, `Manoj Kumar`_ and `Devashish Deshpande`_.
Member

Whoa that was generous ;) Please add your name alone :)

Contributor Author

Hahaha I wouldn't really like to do that though 🍻

@raghavrv
Member

Thanks for addressing the comments. Looks good to me apart from the nitpicks. Wait for @jnothman or @MechCoder

@@ -519,6 +519,12 @@ def __init__(self, y, n_folds=3, shuffle=False,
unique_labels, y_inversed = np.unique(y, return_inverse=True)
label_counts = bincount(y_inversed)
min_labels = np.min(label_counts)
# Raise error when all the n_labels for individual classes
# are less than n_folds
Member

Umm, I think the code is self-explanatory

@MechCoder
Member

lgtm pending nitpicks

@MechCoder MechCoder changed the title [MRG] SKF raises error if all n_labels for individual classes <n_folds [MRG+!] SKF raises error if all n_labels for individual classes <n_folds Jan 20, 2016
@MechCoder MechCoder changed the title [MRG+!] SKF raises error if all n_labels for individual classes <n_folds [MRG+1] SKF raises error if all n_labels for individual classes <n_folds Jan 20, 2016
@MechCoder
Member

ping @TomDLT to just verify?

@devashishd12
Contributor Author

Done :)

@TomDLT
Member

TomDLT commented Jan 21, 2016

if all the n_labels for individual classes are less than n_folds

I don't find the word n_labels very clear; it sounds like the number of classes, which is confusing.
What do you think of "if all the classes have less than n_folds samples"?

@devashishd12
Contributor Author

@TomDLT yeah it's concise too. Should I go ahead and make the changes?
cc: @MechCoder @rvraghav93

@raghavrv
Member

Please do!

@devashishd12
Contributor Author

Done!

@MechCoder
Member

Thanks !! 🍹 🍷

@MechCoder MechCoder closed this Jan 23, 2016
@devashishd12
Contributor Author

No problem :) Thanks for the help!

4 participants