precision_recall_curve should raise UndefinedMetricWarning when there are no true samples #13043

zjpoh · 2019-01-24T18:14:04Z

Description

When precision_recall_curve is computed with no true samples, a division by 0 raises a RuntimeWarning and the returned recall contains nan. However, when recall_score is computed with no true labels, an UndefinedMetricWarning is raised because there are no true samples.

For consistency, I think precision_recall_curve should raise the same warning and return 0.

Steps/Code to Reproduce

from sklearn.metrics import precision_recall_curve, recall_score

# y_true contains no positive samples, so recall is undefined in both calls.
print(precision_recall_curve([0, 0], [0, 1]))
print(recall_score([0, 0], [0, 1]))

Expected Results

/Users/.../scikit-learn/sklearn/metrics/classification.py:1293: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 due to no true samples.
(array([0., 1.]), array([0.,  0.]), array([1]))
/Users/.../scikit-learn/sklearn/metrics/classification.py:1293: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 due to no true samples.
  'recall', 'true', average, warn_for)
0.0

Actual Results

/Users/.../scikit-learn/sklearn/metrics/ranking.py:601: RuntimeWarning: invalid value encountered in true_divide
  recall = tps / tps[-1]
(array([0., 1.]), array([nan,  0.]), array([1]))
/Users/.../scikit-learn/sklearn/metrics/classification.py:1293: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 due to no true samples.
  'recall', 'true', average, warn_for)
0.0
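
To make the requested behavior concrete, here is a minimal pure-Python sketch of guarding the `tps / tps[-1]` division shown in the traceback. It is not the scikit-learn implementation: `recall_per_threshold` is a hypothetical helper, the list-based `tps` is a simplification of the cumulative true-positive counts, and the local `UndefinedMetricWarning` class only stands in for `sklearn.exceptions.UndefinedMetricWarning` so the snippet runs without scikit-learn installed.

```python
import warnings


class UndefinedMetricWarning(UserWarning):
    """Local stand-in (assumption) for sklearn.exceptions.UndefinedMetricWarning."""


def recall_per_threshold(tps):
    """Hypothetical helper: recall at each threshold from cumulative TP counts.

    `tps` is a list of cumulative true-positive counts; its last entry is
    the total number of positive samples.
    """
    total_positives = tps[-1]
    if total_positives == 0:
        # No true samples: warn and fall back to 0.0 instead of producing nan.
        warnings.warn(
            "Recall is ill-defined and being set to 0.0 due to no true samples.",
            UndefinedMetricWarning,
        )
        return [0.0 for _ in tps]
    return [tp / total_positives for tp in tps]


# With no positives, every recall value is 0.0 and a warning is emitted.
print(recall_per_threshold([0, 0]))
```

With this guard the caller sees the same kind of warning that recall_score emits, rather than a RuntimeWarning from the division.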

Versions

System:
    python: 3.7.2 (default, Dec 29 2018, 00:00:04)  [Clang 4.0.1 (tags/RELEASE_401/final)]
executable: /Users/.../anaconda3/envs/sklearn-py37/bin/python
   machine: Darwin-18.2.0-x86_64-i386-64bit

BLAS:
    macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
  lib_dirs: /Users/.../anaconda3/envs/sklearn-py37/lib
cblas_libs: mkl_rt, pthread

Python deps:
       pip: 18.1
setuptools: 40.6.3
   sklearn: 0.21.dev0
     numpy: 1.15.4
     scipy: 1.1.0
    Cython: 0.29.2
    pandas: 0.23.4

jnothman · 2019-01-25T06:33:30Z

I think #8280 is meant to fix this, but it got stuck somewhere. Mind giving it a look?

zjpoh · 2019-01-25T16:12:30Z

Sure.

robguinness · 2021-04-29T12:45:15Z

Any updates on this issue? I think that average_precision_score also has the same issue. I am getting sklearn/metrics/_ranking.py:817: RuntimeWarning: invalid value encountered in true_divide when calling this function.

cmarmo · 2022-02-16T22:40:39Z

Related to #19085.

zjpoh mentioned this issue Jan 24, 2019

Precision-Recall Curve does not show class labels DistrictDataLabs/yellowbrick#601

Closed

cmarmo added Enhancement module:metrics labels Feb 16, 2022
