Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Add parameter in colorbar in plot_confusion_matrix #17192

Merged
merged 16 commits into from Jun 26, 2020

Conversation

avigupta2612
Copy link
Contributor

Reference Issues/PRs

Added feature to make colorbar optional as mentioned in #16519

What does this implement/fix? Explain your changes.

  • Added an argument colorbar in plot_confusion_matrix function and Set the colorbar to True by Default
  • Made same changes to the Confusion matrix display class's plot function and changed
    '
    fig.colorbar(self.im_, ax=ax)
    '
    to
    '
    if self.colorbar:
    fig.colorbar(self.im_, ax=ax)
    '

Any other comments?

@avigupta2612
Copy link
Contributor Author

Can anyone tell what what went wrong in this and how can I correct this?

@cmarmo
Copy link
Member

cmarmo commented May 12, 2020

Hi @avigupta2612, your code have a linting issue.
scikit-learn use lint flake8 directives: in order to check your code locally you can run

$ flake8 <your_modified_file>.py

Hope this can help.

@avigupta2612
Copy link
Contributor Author

So this is my first PR. Plot_confusion_matrix tests are failing. So, do I need to edit the tests? If so, how can I do it?

@cmarmo
Copy link
Member

cmarmo commented May 12, 2020

Tests related to each module of scikit-learn are in sklearn/<name_of_the_module>/tests/ directory. The build logs (e.g. this one) tell you which test is failing and the specific error.
In particular, you are in the process of adding an argument to plot_confusion_matrix, you should then check where in the tests this argument needs to be explicitly added.

Copy link
Member

@cmarmo cmarmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @avigupta2612! Also, please add a new entry in the "what's new" file for version 0.24, describing your addition.

@@ -70,6 +71,7 @@ def plot(self, *, include_values=True, cmap='viridis',
Axes object to plot on. If `None`, a new figure and axes is
created.

colorbar : matplotlib colorbar, default=True
Returns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you check the rendering of the documentation you see that you need a blank line before the "Returns" paragraph in order to keep the correct indentation.

@@ -70,6 +71,7 @@ def plot(self, *, include_values=True, cmap='viridis',
Axes object to plot on. If `None`, a new figure and axes is
created.

colorbar : matplotlib colorbar, default=True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are adding a new parameter, a useful resource is the scikit-learn contributing documentation, in order to standardize the parameter description.
Also, could you please add a versionadded directive? You can check #15426 for examples about how this should be addressed.

@avigupta2612
Copy link
Contributor Author

Thanks @cmarmo for reviewing. I have made some changes as you suggested, please have a look at them.

@cmarmo
Copy link
Member

cmarmo commented May 13, 2020

Last comments on my side.
I think that the colorbar parameter should be described as a boolean, this is the way you are using it in the code.
I'm not sure about the label to attach to this feature in the "what's new".

Let's wait for other reviewers.
And, please, be patient: the release is just out and people are really busy fixing all comments about the upgrade.
Thanks for your work!

Copy link
Member

@amueller amueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Maybe check that by default the colorbar is not None?

doc/whats_new/v0.24.rst Outdated Show resolved Hide resolved
Co-authored-by: Venkatachalam N <venky.yuvy@gmail.com>
@glemaitre glemaitre changed the title Added argument in plot_confusion_matrix to make colorbar optional ENH Add parameter in colorbar in plot_confusion_matrix May 27, 2020
Copy link
Contributor

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can improve the test mainly just to cover all possible use cases.

sklearn/metrics/_plot/confusion_matrix.py Outdated Show resolved Hide resolved
sklearn/metrics/_plot/confusion_matrix.py Outdated Show resolved Hide resolved
@@ -173,6 +173,9 @@ def test_confusion_matrix_display(pyplot, data, fitted_clf, y_pred, n_classes):
image_data = disp.im_.get_array().data
assert_allclose(image_data, cm)

disp.plot(colorbar=False)
assert disp.im_.colorbar is None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should add a separate test to check the effect of the parameter.
Indeed, it is appearing in disp.plot and plot_confusion_matrix so we should check that the behaviour is correct.

@pytest.mark.parametrize("colorbar, [True, False])
def test_plot_confusion_matrix_colorbar(colorbar):
    disp = plot_confusion_matrix(..., colorbar=colorbar)
    assert disp.im_.colorbar ...
    
     # attempt a plot with the opposit effect
     disp.plot(colorbar=not colorbar)
    assert disp.im_.colorbar ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs to be addressed.

@glemaitre glemaitre added this to REVIEWED AND WAITING FOR CHANGES in Guillaume's pet May 27, 2020
@cmarmo
Copy link
Member

cmarmo commented Jun 21, 2020

Hi @avigupta2612 , you already have one approval. Do you have some time to fix conflicts and apply last comments? Thanks for your work!

@avigupta2612
Copy link
Contributor Author

@cmarmo I made the changes as per this 9fcc4d9 commit.

@cmarmo
Copy link
Member

cmarmo commented Jun 25, 2020

@glemaitre or perhaps @thomasjpfan, all green here and one approval already. Do you mind having a look? Thanks!

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @avigupta2612 !

@@ -173,6 +173,9 @@ def test_confusion_matrix_display(pyplot, data, fitted_clf, y_pred, n_classes):
image_data = disp.im_.get_array().data
assert_allclose(image_data, cm)

disp.plot(colorbar=False)
assert disp.im_.colorbar is None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs to be addressed.

@avigupta2612
Copy link
Contributor Author

I created a separate test for colorbar as suggestes by @glemaitre. I am running lint test and it is working fine but when I push the changes the tests are faling can you help here @cmarmo

@cmarmo
Copy link
Member

cmarmo commented Jun 26, 2020

Some conflicts with upstream raised up, in the meanwhile. Merging with upstream (and solving conflicts), will solve your issues.

Comment on lines 237 to 238
<<<<<<< HEAD
=======
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those two lines should be removed: they are the relic of the automatic git diff with upstream

:mod:`sklearn.datasets`
.......................
- |Feature| :func:`datasets.fetch_openml` now validates md5checksum of arff
files downloaded or cached to ensure data integrity.
:pr:`14800` by :user:`Shashank Singh <shashanksingh28>` and `Joel Nothman`_.

>>>>>>> upstream/master
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: to be removed.

Comment on lines 239 to 244
:mod:`sklearn.datasets`
.......................
- |Feature| :func:`datasets.fetch_openml` now validates md5checksum of arff
files downloaded or cached to ensure data integrity.
:pr:`14800` by :user:`Shashank Singh <shashanksingh28>` and `Joel Nothman`_.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed the entire block should be removed: it is already present on the top.
Modules are in alphabetical order in the what's new file.

@cmarmo
Copy link
Member

cmarmo commented Jun 26, 2020

@thomasjpfan all is green here but I did a mistake: the current what's new page contains two entries for the sklearn.datasets module (see here). I asked @avigupta2612 to remove the second one but it wasn't a duplicate. Can we merge this PR and I will propose a fix for my mistake? I don't have edit rights and I think that a lot of syncs have already be done in this PR. Thanks for your understanding.

@avigupta2612
Copy link
Contributor Author

Thanks @cmarmo for your help. The whats_new file I updated (here) and the main page here both contain same entries for sklearn.datasets module. I guess the entry I removed in the previous commit was a third entry in sklearn.datasets

@cmarmo
Copy link
Member

cmarmo commented Jun 26, 2020

Thanks @cmarmo for your help. The whats_new file I updated (here) and the main page here both contain same entries for sklearn.datasets module. I guess the entry I removed in the previous commit was a third entry in sklearn.datasets

Right, if you still have some time do you mind re-adding it at the beginning of the file? Then I hope that you don't need to re-synchronize again before approval. Thanks for your patience.

@avigupta2612
Copy link
Contributor Author

avigupta2612 commented Jun 26, 2020

@cmarmo I guess this third entry for sklearn.datasets has been updated incorrectly in the main master branch here. Should I change it accoring to the master branch or should I include it in the sklearn.datasets section?
Or If I can give you the edit rights to you tell me how to do it so that you can change it accordingly.

@cmarmo
Copy link
Member

cmarmo commented Jun 26, 2020

@cmarmo I guess this third entry for sklearn.datasets has been updated incorrectly in the main master branch here. Should I change it accoring to the master branch or should I include it in the sklearn.datasets section?

I think the best option is keep it where it is in master and fix this in another pull request. Thanks.

Or If I can give you the edit rights to you tell me how to do it so that you can change it accordingly.

You are very reactive and you know how to fix the issue, no need for that. :)

@glemaitre glemaitre self-requested a review June 26, 2020 18:22
Copy link
Contributor

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 changes and we are good to go.

@@ -230,12 +234,6 @@ Changelog
:meth:`tree.DecisionTreeRegressor.fit`, and has not effect.
:pr:`17614` by :user:`Juan Carlos Alfaro Jiménez <alfaro96>`.

:mod:`sklearn.datasets`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should reintroduce these lines (233-238) at the exact same position.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have included entry for my change at lines (145-148). So shouldn't I include skleanrn.datasets entry at lines 237-242?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are red in the diff because you removed them (probably not on purpose but when fixing a conflict). Thus, you should reinsert such that we don't see any diff on GitHub (meaning that you did not edit the lines apart of your additional entry).

Comment on lines 252 to 257
disp = plot_confusion_matrix(fitted_clf, X, y, colorbar=True)
assert disp.im_.colorbar is not None

# attempt a plot with the opposit effect
disp.plot(colorbar=False)
assert disp.im_.colorbar is None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
disp = plot_confusion_matrix(fitted_clf, X, y, colorbar=True)
assert disp.im_.colorbar is not None
# attempt a plot with the opposit effect
disp.plot(colorbar=False)
assert disp.im_.colorbar is None
def _check_colorbar(disp, has_colorbar):
if has_colorbar:
assert disp.im_.colorbar is not None
assert disp.im_.colorbar.__class__.__name__ == "Colorbar"
else:
assert disp.im_.colorbar is None
disp = plot_confusion_matrix(fitted_clf, X, y, colorbar=colorbar)
_check_colorbar(disp, colorbar)
# attempt a plot with the opposite effect of colorbar
disp.plot(colorbar=not colorbar)
_check_colorbar(disp, not colorbar)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use the colorbar created by the parametrization.

@avigupta2612
Copy link
Contributor Author

@glemaitre I have made all the changes as you suggested thanks for your help.

@glemaitre glemaitre merged commit 8dc143e into scikit-learn:master Jun 26, 2020
@glemaitre
Copy link
Contributor

@avigupta2612 Thanks for the work

@glemaitre glemaitre moved this from REVIEWED AND WAITING FOR CHANGES to MERGED in Guillaume's pet Jul 7, 2020
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Jul 17, 2020
…17192)

Co-authored-by: Venkatachalam N <venky.yuvy@gmail.com>
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
…17192)

Co-authored-by: Venkatachalam N <venky.yuvy@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

None yet

6 participants