
BUG fix behaviour in confusion_matrix with empty array-like as input #16442

Merged
merged 12 commits into from Feb 23, 2020

Conversation

parsons-kyle-89
Contributor

@parsons-kyle-89 parsons-kyle-89 commented Feb 14, 2020

Reference Issues/PRs

Fixes #16432

What does this implement/fix? Explain your changes.

Currently on `master`, `confusion_matrix` raises a `ValueError` when `y_true` and `y_pred` are empty and `labels` is not `None`.

```python
from sklearn.metrics import confusion_matrix

confusion_matrix([], [])
# returns a 0x0 numpy array of dtype int64

confusion_matrix([], [], labels=[True, False])
# raises ValueError
```

With the change in this PR, the second case returns an n x n numpy array of zeros, where n is the number of labels. This is correct with respect to the definition of the confusion matrix given in the documentation.
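A quick check of the fixed behaviour (assuming a scikit-learn release that includes this PR, i.e. 0.23 or later):

```python
from sklearn.metrics import confusion_matrix

# Empty inputs with explicit labels now yield an n x n zero matrix
# (n = number of labels) instead of raising a ValueError.
cm = confusion_matrix([], [], labels=[True, False])
print(cm.shape)  # (2, 2)
print(cm.sum())  # 0
```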

Any other comments?

This change should be especially useful to users calling `confusion_matrix` in automated reporting. Automated reports are sometimes (correctly) run on datasets of size 0; letting `confusion_matrix` return an n x n matrix of zeros rather than raise eliminates the need for users to write a special case.
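To illustrate the reporting use case, here is a sketch of such a helper; `summarize_batch` and its output layout are hypothetical, made up for this example:

```python
from sklearn.metrics import confusion_matrix

def summarize_batch(y_true, y_pred, labels=(False, True)):
    """Hypothetical daily-report helper. With this fix, an empty batch
    produces an all-zero confusion matrix, so no special case is needed."""
    cm = confusion_matrix(y_true, y_pred, labels=list(labels))
    tn, fp, fn, tp = cm.ravel()  # safe: cm is always 2x2 here
    return {"tn": int(tn), "fp": int(fp), "fn": int(fn), "tp": int(tp)}

# A day with no observations still yields a well-formed report.
print(summarize_batch([], []))  # {'tn': 0, 'fp': 0, 'fn': 0, 'tp': 0}
```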

Contributor

@glemaitre glemaitre left a comment


Could you add the following:

  • an entry to the change log at doc/whats_new/v0.23.rst under bug fixes. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:
  • a similar test to check that plot_confusion_matrix is giving the appropriate result as well. The file is located in sklearn/metrics/_plot/tests/test_plot_confusion_matrix.py.

@glemaitre glemaitre added this to REVIEWED AND WAITING FOR CHANGES in Guillaume's pet Feb 14, 2020
parsons-kyle-89 and others added 6 commits February 14, 2020 17:52
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
…d' of github.com:parsons-kyle-89/scikit-learn into fix-confusion_matrix-0-observations-and-labels-specified
@parsons-kyle-89
Contributor Author

* a similar test to check that `plot_confusion_matrix` is giving the appropriate result as well. The file is located in `sklearn/metrics/_plot/tests/test_plot_confusion_matrix.py`.

I wasn't quite sure what form such a test should have, but I copied the form of another test of text color. Let me know if you think there should be a more thorough test here.

@glemaitre glemaitre self-requested a review February 15, 2020 08:43
Contributor

@glemaitre glemaitre left a comment


Last changes, otherwise LGTM.

@glemaitre glemaitre moved this from REVIEWED AND WAITING FOR CHANGES to TO BE MERGED in Guillaume's pet Feb 17, 2020
@glemaitre glemaitre moved this from TO BE MERGED to REVIEWED AND WAITING FOR CHANGES in Guillaume's pet Feb 19, 2020
@glemaitre glemaitre self-assigned this Feb 23, 2020
@glemaitre
Contributor

I pushed my last comments and split out one of the tests to focus only on the errors raised in confusion_matrix. I will merge if the CIs turn green.

@glemaitre glemaitre moved this from REVIEWED AND WAITING FOR CHANGES to TO BE MERGED in Guillaume's pet Feb 23, 2020
@glemaitre glemaitre changed the title [MRG] Fix ValueError in confusion_matrix when y_true / y_pred are length zero and labels are not None BUG fix behaviour in confusion_matrix with with empty list as input Feb 23, 2020
@glemaitre glemaitre changed the title BUG fix behaviour in confusion_matrix with with empty list as input BUG fix behaviour in confusion_matrix with with empty array-like as input Feb 23, 2020
@glemaitre glemaitre merged commit 7366a5a into scikit-learn:master Feb 23, 2020
@glemaitre
Contributor

@parsons-kyle-89 Thanks for raising and contributing

@glemaitre glemaitre moved this from TO BE MERGED to MERGED in Guillaume's pet Feb 23, 2020
@parsons-kyle-89
Contributor Author

> @parsons-kyle-89 Thanks for raising and contributing

👍 Glad to contribute. Thanks for being responsive and getting it over the finish line.

panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
Successfully merging this pull request may close these issues.

confusion_matrix errors with empty y_true/y_pred
3 participants