precision_recall_curve should raise UndefinedMetricWarning when there are no true samples #13043

Open
zjpoh opened this issue Jan 24, 2019 · 4 comments


@zjpoh
Contributor

zjpoh commented Jan 24, 2019

Description

When precision_recall_curve is computed with no true samples, a RuntimeWarning is raised because of a division by zero, and nan is returned. However, when recall_score is computed with no true labels, an UndefinedMetricWarning is raised because there are no true samples.

For consistency, I think precision_recall_curve should raise the same warning and return 0.

Steps/Code to Reproduce

from sklearn.metrics import precision_recall_curve, recall_score
print(precision_recall_curve([0, 0], [0, 1]))
print(recall_score([0, 0], [0, 1]))

Expected Results

/Users/.../scikit-learn/sklearn/metrics/classification.py:1293: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 due to no true samples.
(array([0., 1.]), array([0.,  0.]), array([1]))
/Users/.../scikit-learn/sklearn/metrics/classification.py:1293: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 due to no true samples.
  'recall', 'true', average, warn_for)
0.0
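
The behaviour requested above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not scikit-learn's implementation: `recall_from_tps` is a hypothetical helper, and the local `UndefinedMetricWarning` class is a stand-in for the one in `sklearn.exceptions`, defined here only so the snippet runs without scikit-learn.

```python
import warnings

import numpy as np


class UndefinedMetricWarning(UserWarning):
    """Stand-in for sklearn.exceptions.UndefinedMetricWarning (assumption)."""


def recall_from_tps(tps):
    """Hypothetical helper: recall per threshold from cumulative true positives."""
    tps = np.asarray(tps, dtype=float)
    if tps[-1] == 0:
        # No positive samples: warn and return 0.0 instead of dividing 0 by 0.
        warnings.warn(
            "Recall is ill-defined and being set to 0.0 due to no true samples.",
            UndefinedMetricWarning,
        )
        return np.zeros_like(tps)
    return tps / tps[-1]


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    recall = recall_from_tps([0, 0])  # y_true = [0, 0] gives no true positives

print(recall)
print([w.category.__name__ for w in caught])
```

When at least one positive sample exists, the helper falls through to the ordinary division, so the guard only changes the undefined case.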

Actual Results

/Users/.../scikit-learn/sklearn/metrics/ranking.py:601: RuntimeWarning: invalid value encountered in true_divide
  recall = tps / tps[-1]
(array([0., 1.]), array([nan,  0.]), array([1]))
/Users/.../scikit-learn/sklearn/metrics/classification.py:1293: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 due to no true samples.
  'recall', 'true', average, warn_for)
0.0
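
The nan in the actual output follows directly from the division shown in the warning. A minimal NumPy-only sketch, assuming the cumulative true-positive array `tps` is all zeros (as it would be for `y_true = [0, 0]`), reproduces the RuntimeWarning and the nan without calling scikit-learn:

```python
import warnings

import numpy as np

# Assumed cumulative true-positive counts for y_true = [0, 0]:
# no positives are found at any threshold.
tps = np.array([0.0, 0.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    recall = tps / tps[-1]  # elementwise 0 / 0

print(recall)  # [nan nan]
print([w.category.__name__ for w in caught])
```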

Versions

System:
    python: 3.7.2 (default, Dec 29 2018, 00:00:04)  [Clang 4.0.1 (tags/RELEASE_401/final)]
executable: /Users/.../anaconda3/envs/sklearn-py37/bin/python
   machine: Darwin-18.2.0-x86_64-i386-64bit

BLAS:
    macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
  lib_dirs: /Users/.../anaconda3/envs/sklearn-py37/lib
cblas_libs: mkl_rt, pthread

Python deps:
       pip: 18.1
setuptools: 40.6.3
   sklearn: 0.21.dev0
     numpy: 1.15.4
     scipy: 1.1.0
    Cython: 0.29.2
    pandas: 0.23.4
@jnothman
Member

I think #8280 is meant to fix this, but it got stuck somewhere. Mind giving it a look?

@zjpoh
Contributor Author

zjpoh commented Jan 25, 2019

Sure.

@robguinness

Any updates on this issue? I think that average_precision_score also has the same issue. I am getting sklearn/metrics/_ranking.py:817: RuntimeWarning: invalid value encountered in true_divide when calling this function.
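
Until the library handles this case, one caller-side workaround is to check for positive labels before computing the metric. The sketch below is an assumption, not part of scikit-learn: `score_or_default` is a hypothetical wrapper, the toy metric only stands in for a real one, and the choice of `default` for the undefined case is left to the caller.

```python
import math


def score_or_default(metric, y_true, y_score, pos_label=1, default=math.nan):
    """Hypothetical wrapper: skip `metric` when y_true has no positive samples."""
    if not any(label == pos_label for label in y_true):
        # The metric is undefined here; return the caller's default instead.
        return default
    return metric(y_true, y_score)


# Toy metric standing in for e.g. sklearn.metrics.average_precision_score.
def mean_score_of_positives(y_true, y_score):
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    return sum(positives) / len(positives)


print(score_or_default(mean_score_of_positives, [0, 0], [0, 1], default=0.0))
print(score_or_default(mean_score_of_positives, [0, 1], [0, 1]))
```

With scikit-learn installed, `average_precision_score` could be passed as `metric` in the same way.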

@cmarmo
Member

cmarmo commented Feb 16, 2022

Related to #19085.
