-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Fix format of values in confusion matrix plot. #16159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fix will now show values up to 1e7 in integer values, and afterwards in .2g"
@jnothman Hey Joel, I made the fix name a bit clearer now. Could you help me understand why the codecov/patch and codecov/project tests are not passing? |
to check for array_equal."
counted in the if statement."
Well that certainly was a journey, but it seems to be working now. @NicolasHug What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Rick-Mackenbach , made a few comments
Just FYI, you don't need to close and open new PR, you can simply push changes to your already-existing PR. It makes it easier for us to keep track of it ;)
ha="center", va="center", | ||
color=color) | ||
|
||
if (values_format is None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No parenthesis
color = cmap_max if cm[i, j] < thresh else cmap_min | ||
|
||
self.text_[i, j] = ax.text( | ||
j, i, format(cm[i, j], values_format), | ||
ha="center", va="center", | ||
color=color) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines are duplicated.
Let's put the if value_format is None
condition inside of the for loop to avoid duplication.
if (values_format is None and cm.dtype.kind == "f"): | ||
values_format = '.2g' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge this condition with the rest below? It's better to have all related logic at the same place
text_text = np.array([ | ||
t.get_text() for t in disp.text_.ravel()]) | ||
assert_array_equal(expected_text, text_text) | ||
if values_format is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a big if/else statement like this kinda defeats the purpose of parametrizing the test.
Maybe keep this test unchanged and create a new one test_confusion_matrix_text_format_default
which would only check the default behavior.
…ikit-learn into confusion_fix
I don't think that merge worked well... The diff now includes all changes. You should have been able to do something like:
not sure that'll work now. You might want to reset back to your last good commit using It might be better to lose your history by doing something like:
|
Thanks @jnothman for the fix! |
I hope it kept all your changes!
|
Seems like it did from what I can see :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks almost done!
color=color) | ||
|
||
fmt = values_format | ||
if values_format is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think trying .2g
and then trying d
if applicable, is slightly clearer
This suggestion is to address #16159 (comment)
if values_format is None:
text_cm = format(cm[i, j], '.2g')
if cm.dtype.kind != 'f':
text_d = format(cm[i, j], 'd')
if len(text_d) < len(text_cm):
text_cm = text_d
else:
text_cm = format(cm[i, j], values_format)
The test would need to updated to reflect this behavior.
@thomasjpfan Not sure if this is what you meant? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your continued work and patience on this PR.
format(cm[i, j], 'd')): | ||
fmt = '.2g' | ||
|
||
self.text_[i, j] = ax.text( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I was not more clear. The current PR may call format
three times. I am trying to push for calling it at most twice:
if values_format is None:
text_cm = format(cm[i, j], '.2g')
if cm.dtype.kind != 'f':
text_d = format(cm[i, j], 'd')
if len(text_d) < len(text_cm):
text_cm = text_d
else:
text_cm = format(cm[i, j], values_format)
self.text_[i, j] = ax.text(j, i, text_cm, ...)
@thomasjpfan My pleasure, thanks for you continuous feedback on this :). I wasn't looking correctly regarding the correction, I see now that the formatting is done in the if-statement, and you just pass this formatted text to self.text_ Should be correct now 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Please add an entry to the change log at |
@thomasjpfan Not sure why the tests have failed here, could you help me out a bit? :) |
I pushed a change to fix up the PR. It should be ready now. |
Thanks @Rick-Mackenbach! |
Reference Issues/PRs
In reference to this, #16127
What does this implement/fix? Explain your changes.
The formatting of values in the plot of the confusion matrix were set to '.2g' if no other format was specified. As of now, if no format is provided by the user, the format will be 'd' if the value is <1e7, and '.2g' otherwise.