Jaccard UndefinedMetricWarning gives bad advice #17826

crypdick opened this issue Jul 3, 2020 · 4 comments · Fixed by #17866


crypdick commented Jul 3, 2020

Describe the bug

If you run jaccard_score on input where a label has no true or predicted samples, it gives the following warning:

sklearn/metrics/ UndefinedMetricWarning:
  Jaccard is ill-defined and being set to 0.0 in labels with no true or predicted samples. Use `zero_division` parameter to control this behavior.

I tried to heed the warning's advice by adding zero_division=1, but then got this error:

 TypeError: jaccard_score() got an unexpected keyword argument 'zero_division'

Given the first warning, I expect jaccard to have a zero_division kwarg.
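
To make the expected behavior concrete, here is a minimal pure-Python sketch of a binary Jaccard score that accepts a zero_division argument. It is not scikit-learn's implementation: the function name jaccard_binary is hypothetical, and the accepted values ("warn", 0, 1) are assumed from the wording of the warning above.

```python
import warnings


def jaccard_binary(y_true, y_pred, pos_label=1, zero_division="warn"):
    """Hypothetical binary Jaccard score; NOT scikit-learn's implementation.

    zero_division is assumed to accept "warn", 0 or 1.
    """
    if zero_division not in ("warn", 0, 1):
        raise ValueError("zero_division must be 'warn', 0 or 1")

    pairs = list(zip(y_true, y_pred))
    # Intersection: samples that are positive in both y_true and y_pred.
    intersection = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    # Union: samples that are positive in y_true or in y_pred.
    union = sum(1 for t, p in pairs if t == pos_label or p == pos_label)

    if union == 0:
        # No true or predicted positives: the ratio 0/0 is undefined.
        if zero_division == "warn":
            warnings.warn(
                "Jaccard is ill-defined and being set to 0.0; "
                "use zero_division to control this behavior.",
                UserWarning,
            )
            return 0.0
        return float(zero_division)
    return intersection / union
```

With this sketch, jaccard_binary([0, 0, 0], [0, 0, 0], zero_division=1) returns the requested value without warning, while the default still warns and falls back to 0.0.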


import sklearn; sklearn.show_versions()                                                       

    python: 3.7.4 (default, Aug 13 2019, 20:35:49)  [GCC 7.3.0]
executable: /home/richard/src/anaconda3/bin/python
   machine: Linux-5.4.0-37-generic-x86_64-with-debian-bullseye-sid

Python dependencies:
          pip: 20.1.1
   setuptools: 41.4.0
      sklearn: 0.23.1
        numpy: 1.17.2
        scipy: 1.5.0
       Cython: 0.29.13
       pandas: 0.25.1
   matplotlib: 3.1.1
       joblib: 0.13.2
threadpoolctl: 2.1.0

Built with OpenMP: True
@crypdick crypdick added the Bug: triage Reported bugs that are not confirmed label Jul 3, 2020

crypdick commented Jul 3, 2020

Suggested patch:

  Index: ml/lib/python3.7/site-packages/sklearn/metrics/
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
--- ml/lib/python3.7/site-packages/sklearn/metrics/	(date 1593790417022)
+++ ml/lib/python3.7/site-packages/sklearn/metrics/	(date 1593790417022)
@@ -680,7 +680,7 @@
 def jaccard_score(y_true, y_pred, labels=None, pos_label=1,
-                  average='binary', sample_weight=None):
+                  average='binary', sample_weight=None, zero_division='warn'):
     """Jaccard similarity coefficient score
     The Jaccard index [1], or Jaccard similarity coefficient, defined as
@@ -803,7 +803,7 @@
         denominator = np.array([denominator.sum()])
     jaccard = _prf_divide(numerator, denominator, 'jaccard',
-                          'true or predicted', average, ('jaccard',),)
+                          'true or predicted', average, ('jaccard',), zero_division=zero_division)
     if average is None:
         return jaccard
     if average == 'weighted':
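
The idea behind the patch is that the public function takes zero_division and forwards it to the helper that performs the division. A rough per-label sketch of that forwarding pattern, in pure Python, might look like the following. The names safe_divide and jaccard_per_label are hypothetical stand-ins and do not reproduce the real helper's signature or behavior.

```python
import warnings


def safe_divide(numerators, denominators, zero_division="warn"):
    """Hypothetical helper: element-wise division with a 0/0 fallback."""
    fallback = 0.0 if zero_division == "warn" else float(zero_division)
    results = []
    undefined = False
    for num, den in zip(numerators, denominators):
        if den == 0:
            undefined = True
            results.append(fallback)
        else:
            results.append(num / den)
    # Warn once, and only when the caller did not pick an explicit value.
    if undefined and zero_division == "warn":
        warnings.warn(
            "Jaccard is ill-defined and being set to 0.0 in labels "
            "with no true or predicted samples.",
            UserWarning,
        )
    return results


def jaccard_per_label(y_true, y_pred, labels, zero_division="warn"):
    """Hypothetical per-label Jaccard that forwards zero_division."""
    pairs = list(zip(y_true, y_pred))
    numerators = [
        sum(1 for t, p in pairs if t == label and p == label) for label in labels
    ]
    denominators = [
        sum(1 for t, p in pairs if t == label or p == label) for label in labels
    ]
    # The keyword is passed straight through, as in the suggested patch.
    return safe_divide(numerators, denominators, zero_division=zero_division)
```

Here a label that appears in neither y_true nor y_pred receives the caller's chosen value instead of a hard-coded 0.0, and the warning fires only in the default "warn" mode.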


jnothman commented Jul 4, 2020 via email

@adrinjalali adrinjalali added Bug and removed Bug: triage Reported bugs that are not confirmed labels Jul 5, 2020

Is there any objection if I work on this?


crypdick commented Jul 8, 2020

@josephwillard please do! I am too busy at work to write unit tests anytime soon. I just created a PR with a partial fix.
