[MRG] Raise ValueError when trainset is empty in CVSplitters #12861

NicolasHug · 2018-12-24T21:56:27Z

Reference Issues/PRs

Closes #11028, following #11028 (comment)

What does this implement/fix? Explain your changes.

A ValueError is now raised in the CVSplitters if the resulting train set is empty. As far as I can tell, the test sets are never empty, so there is no need to check them.

There is also no need to check the KFold splitters or some of the *Group* splitters, which have internal checks that prevent the train set from being empty.
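As a rough illustration of the kind of check described above, here is a minimal pure-Python sketch. The helper name `split_sizes` is hypothetical, and rounding the test-set size up with `ceil` is an assumption made for the example; this is not the actual scikit-learn code. The error message mirrors the one matched in this PR's tests.

```python
from math import ceil


def split_sizes(n_samples, test_size):
    """Hypothetical helper: return (n_train, n_test) for a float test_size."""
    # Assumption: the test-set size is rounded up to a whole number of samples.
    n_test = ceil(test_size * n_samples)
    n_train = n_samples - n_test
    if n_train == 0:
        # Fail early instead of silently yielding an empty train set.
        raise ValueError(
            'With n_samples={}, test_size={} and train_size=None, '
            'the resulting train set will be empty'.format(
                n_samples, test_size))
    return n_train, n_test


print(split_sizes(10, 0.25))  # (7, 3)

try:
    split_sizes(1, 0.99)  # a single sample cannot be split
except ValueError as exc:
    print(exc)
```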

Any other comments?

thomasjpfan

Cosmetics

thomasjpfan · 2019-02-13T16:08:23Z

sklearn/model_selection/tests/test_split.py

+def test_shuffle_split_empty_trainset(CVSplitter):
+    cv = CVSplitter(test_size=.99)
+    X, y = [[1]], [0]  # 1 sample
+    assert_raises_regexp(


We seem to be moving to using pytest.raises for testing exceptions.

thomasjpfan · 2019-02-13T16:16:10Z

sklearn/model_selection/_split.py

+                'p={} must be strictly less than the number of '
+                'samples={}'.format(self.p, n_samples)
+            )
+        for combination in combinations(range(n_samples), self.p):
            yield np.array(combination)


Since we are on Python >= 3 now, maybe:

yield from map(np.array, combinations(range(n_samples), self.p))

NicolasHug · 2019-02-13 (edited)

I'd rather not, since this is not related to the PR.
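For reference, the two forms discussed in this thread should produce the same sequence. The sketch below is a standalone approximation: the function names are hypothetical, the `n_samples <= p` condition is inferred from the error message in the diff, and plain lists stand in for `np.array` so that it runs with the standard library only.

```python
from itertools import combinations


def _check_p(n_samples, p):
    # Condition inferred from the message: p must be strictly smaller.
    if n_samples <= p:
        raise ValueError(
            'p={} must be strictly less than the number of '
            'samples={}'.format(p, n_samples))


def leave_p_out_loop(n_samples, p):
    """Explicit for-loop form, as in the diff (lists instead of arrays)."""
    _check_p(n_samples, p)
    for combination in combinations(range(n_samples), p):
        yield list(combination)


def leave_p_out_yield_from(n_samples, p):
    """Suggested `yield from` form; yields the same items in the same order."""
    _check_p(n_samples, p)
    yield from map(list, combinations(range(n_samples), p))


print(list(leave_p_out_loop(3, 2)))        # [[0, 1], [0, 2], [1, 2]]
print(list(leave_p_out_yield_from(3, 2)))  # [[0, 1], [0, 2], [1, 2]]
```

Because both functions are generators, the check only runs once iteration starts, not when the function is called.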

thomasjpfan

LGTM

NicolasHug · 2019-03-21T16:34:45Z

This is +2, @thomasjpfan can you merge this?

thomasjpfan · 2019-03-22T01:33:35Z

sklearn/model_selection/tests/test_split.py

+            ValueError,
+            match='With n_samples=1, test_size=0.99 and train_size=None, '
+            'the resulting train set will be empty'):
+        train_test_split(X, test_size=.99)


Given that train_test_split is a commonly used function, a non-trivial test such as X=[[1],[1],[1]] and test_size=0.67 would be useful.
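A quick arithmetic check of why that case is interesting, assuming the test-set size is obtained by rounding `test_size * n_samples` up (an assumption for this sketch, not a statement about the exact implementation):

```python
from math import ceil

n_samples, test_size = 3, 0.67

# 0.67 * 3 is slightly above 2, so rounding up assigns all 3 samples to test.
n_test = ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_test, n_train)  # 3 0
```

Unlike the single-sample case, this input looks like it leaves one sample for training, yet under that rounding rule the train set still ends up empty.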

NicolasHug added 2 commits December 24, 2018 16:50

Raise ValueError when trainset is empty in CVSplitters

874589f

Updated whatsnew

5a178db

thomasjpfan reviewed Feb 13, 2019

used pytest.raises

bd801d2

thomasjpfan approved these changes Feb 13, 2019

jnothman approved these changes Feb 13, 2019

Merge branch 'master' into cv_iterators_empty_set

bb11b8d

thomasjpfan reviewed Mar 22, 2019

added 2/3rd test

d4554bf

thomasjpfan merged commit 2718d62 into scikit-learn:master Mar 23, 2019

NicolasHug mentioned this pull request Apr 16, 2019

[MRG+2] Faster Gradient Boosting Decision Trees with binned features #12807

Merged

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

ENH Raise ValueError when trainset is empty in CVSplitters (scikit-le…

9314ff7

…arn#12861)

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "ENH Raise ValueError when trainset is empty in CVSplitters (s…

c3647b7

…cikit-learn#12861)" This reverts commit 9314ff7.

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "ENH Raise ValueError when trainset is empty in CVSplitters (s…

a27bf00

…cikit-learn#12861)" This reverts commit 9314ff7.

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019

ENH Raise ValueError when trainset is empty in CVSplitters (scikit-le…

e66453e

…arn#12861)
