
[MRG+1] fix P/R/F for truncated range(n_labels) #10377

Merged
4 commits merged into scikit-learn:master on Jan 11, 2018

Conversation

@gxyd (Contributor) commented Dec 27, 2017

What does this implement/fix? Explain your changes.

Fixes #10307

For example, if n_labels = 5, then passing labels = [0, 1, 2] currently gives the same results as labels = [0, 1, 2, 3, 4]; the value of labels is effectively ignored.

Currently on the master branch:

>>> import numpy as np
>>> from sklearn.metrics import precision_recall_fscore_support
>>> y_true = np.array([[0, 1, 1], [1, 0, 0]])
>>> y_pred = np.array([[1, 1, 1], [1, 0, 1]])
>>> precision_recall_fscore_support(y_true, y_pred, average='samples', labels=[0, 1])
(0.58333333333333326, 1.0, 0.73333333333333339, None)
>>> precision_recall_fscore_support(y_true, y_pred, average='samples', labels=[1, 0])
(0.75, 1.0, 0.83333333333333326, None)

I'm not sure whether this will require any changes to test_common, since labels=[0, 1] and labels=[1, 0] should give the same result for average='samples' (at least for average='samples'; I haven't thought about other averages). In that sense the metric should be commutative w.r.t. the order of labels.
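
With the fix, both orderings should agree. A minimal sketch of the intended behavior (the expected values are taken from the regression test added in this PR, not from official documentation):

import numpy as np
from sklearn.metrics import precision_recall_fscore_support

y_true = np.array([[0, 1, 1], [1, 0, 0]])
y_pred = np.array([[1, 1, 1], [1, 0, 1]])

# Restricting the computation to a truncated (and possibly reordered)
# subset of labels should no longer fall back to using all labels, so
# the result becomes independent of the order in which labels are given.
for labels in ([0, 1], [1, 0]):
    p, r, f, _ = precision_recall_fscore_support(y_true, y_pred,
                                                 average='samples',
                                                 labels=labels)
    print(labels, (p, r, f))  # expected: (0.75, 1.0, 0.8333...) both times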

@gxyd gxyd changed the title fix P/R/F for truncated range(n_labels) [WIP] fix P/R/F for truncated range(n_labels) Dec 27, 2017
@gxyd gxyd changed the title [WIP] fix P/R/F for truncated range(n_labels) [MRG] fix P/R/F for truncated range(n_labels) Dec 27, 2017
@jnothman (Member) left a comment:

otherwise LGTM

@@ -197,6 +197,13 @@ def test_precision_recall_f_extra_labels():
    assert_raises(ValueError, recall_score, y_true_bin, y_pred_bin,
                  labels=np.arange(-1, 4), average=average)

    y_true = np.array([[0, 1, 1], [1, 0, 0]])
@jnothman (Member) commented:

Because it's hard to see what this is testing, it would be good to add a comment saying that it tests non-regression on issue #xxx.

@gxyd (Contributor, Author) replied:

There isn't any issue for this; should I instead refer to this PR itself?

@jnothman (Member):

Please add an entry to the change log at doc/whats_new/v0.20.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

@jnothman jnothman changed the title [MRG] fix P/R/F for truncated range(n_labels) [MRG+1] fix P/R/F for truncated range(n_labels) Dec 31, 2017
@gxyd (Contributor, Author) commented Dec 31, 2017

I've made the requested change, though I've referenced the PR number instead of an issue number in the comment. I think that is fine as well.

@gxyd (Contributor, Author) commented Dec 31, 2017

Fixes #10307

@jnothman (Member):

Flake8 errors?

@gxyd (Contributor, Author) commented Jan 1, 2018

It doesn't seem so to me. The log says:

/home/travis/miniconda/envs/testenv/lib/python3.4/site-packages/_pytest/python.py:625: PendingDeprecationWarning: This usage is deprecated, please use pytest.Function instead
  l.append(self.Function(name, self, args=args, callobj=call))
  [the two lines above repeat many times while pytest collects tests]
/home/travis/build/scikit-learn/scikit-learn/sklearn/utils/deprecation.py:58: DeprecationWarning: Class VBGMM is deprecated; The `VBGMM` class is not working correctly and it's better to use `sklearn.mixture.BayesianGaussianMixture` class with parameter `weight_concentration_prior_type='dirichlet_distribution'` instead. VBGMM is deprecated in 0.18 and will be removed in 0.20.
  warnings.warn(msg, category=DeprecationWarning)
collected 8423 items

....repeated lines

The job exceeded the maximum time limit for jobs, and has been terminated.

Here is the link to the failing Travis job: https://travis-ci.org/scikit-learn/scikit-learn/jobs/323659077. Some random failure?

@jnothman (Member) commented Jan 1, 2018 via email

    p, r, f, _ = precision_recall_fscore_support(y_true, y_pred,
                                                 average='samples',
                                                 labels=[0, 1])
    assert_almost_equal(np.array([p, r, f]), np.array([3. / 4, 1., 5. / 6]))
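
For the record, these expected values can be checked by hand (a sketch, assuming the same y_pred = np.array([[1, 1, 1], [1, 0, 1]]) as in the PR description): with labels=[0, 1] only the first two columns count. Sample 1 has indicator [0, 1] in y_true against [1, 1] in y_pred, giving precision 1/2, recall 1/1 = 1 and F1 = 2/3; sample 2 has [1, 0] against [1, 0], giving 1 for all three. Averaging over the two samples yields p = 3/4, r = 1 and f = (2/3 + 1)/2 = 5/6.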
@glemaitre (Member) commented:

We have imported division from __future__ at the top, so let's remove the useless . for the float.
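
For context, a minimal sketch of what that import changes (this concerns Python 2 semantics only; in Python 3, / is always true division):

from __future__ import division

# In Python 2, 3 / 4 is integer (floor) division and evaluates to 0
# unless one operand is a float, hence the defensive 3. / 4 in the test.
# With the __future__ import in effect module-wide, / is true division,
# so the trailing dots are redundant.
print(3 / 4)   # 0.75
print(3 // 4)  # 0; floor division remains available via //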

@glemaitre glemaitre merged commit 60b0cf8 into scikit-learn:master Jan 11, 2018
@glemaitre (Member):
@gxyd Thanks!!!

@gxyd (Contributor, Author) commented Jan 11, 2018

@glemaitre were the changes in the commit you added necessary?

@gxyd (Contributor, Author) commented Jan 11, 2018

@glemaitre never mind, I just saw your comment here: #10377 (comment).

Thanks @jnothman @glemaitre for the review.

@glemaitre (Member):
@gxyd I decided it was not worth another round of changes/review for such a nitpick :)
Regarding the change itself, I would think that we should try to write Python 3 code that is backward compatible, but this is just a detail.

@jnothman (Member) commented Jan 11, 2018 via email

Merging this pull request closes: BUG Inconsistent f1_score behavior when combining label indicator input with labels attribute (#10307)