-
Notifications
You must be signed in to change notification settings - Fork 34
IMP: Add ability to adjust heatmap colorscale #170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Oddant1 ! I pulled down these changes and tested this locally — all seems to work as intended.
However, these parameters make it easy to abuse or (worse yet) unintentionally mask data. I wonder whether we should at least describe this behavior in the parameter descriptions or even raise a warning if vmin/vmax are set to values internal to the value range (I am not too keen to bog down the code with a validation test so much prefer the former to the latter). @thermokarst any thoughts?
E.g., I made this plot from the moving pictures data and set vmin=0.9 and vmax=0.91:
this just causes all lower and higher values to be colored uniformly, which is very misleading. Compare to vmin/vmax="auto":
q2_sample_classifier/plugin_setup.py
Outdated
parameter_descriptions={ | ||
'truth': 'Metadata column (true values) to plot on y axis.', | ||
'missing_samples': parameter_descriptions['base']['missing_samples'], | ||
'vmin': 'The minimum color value for the heatmap', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we modify from the seaborn descriptions:
'vmin': 'The minimum value to use for anchoring the colormap. If "auto", vmin is set to the minimum value in the data.',
'vmax': 'The maximum value to use for anchoring the colormap. If "auto", vmax is set to the maximum value in the data.'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Oddant1 please see this old change needed
@nbokulich - the first thing that comes to my mind RE the unintentional masking of data is to promote that to a real mask (using the |
I suppose that works. @Oddant1 what do you think about adding a mask to |
@nbokulich and @thermokarst, if I'm reading this correctly, what's happening is it's too easy to set |
@Oddant1 thank you for asking — I scratched my head for a bit first at @thermokarst 's suggestion. There are really two problems here: vmin/vmax can lead to unintentional (or nefarious) "masking" (in my parlance) of data that are off-scale, causing two distinct issues:
masking (in seaborn parlance, not mine) will allow values lower than vmin to be dropped, so that the square appears white instead of whatever the lowest color value is. This is effectively a fix for issue 2, but unfortunately not for 1. However, issue 1 might not be our job to solve just as it's not seaborn's job; e.g., users should be responsible and pay attention, it is a mostly foolproof step, and we can't run too much of a nanny framework here. Even if a user accidentally uses an inappropriate vmin value (e.g., they are copying/pasting from an existing workflow), the masked values will make it apparent to them that there's a problem and they need to re-run that step. So this is a "good enough" fix in my mind, even if it's not perfect. |
@nbokulich so it wouldn't eliminate the issue, but it would make it obvious that the issue had occurred which would hopefully get the user to rerun and pay closer attention to their |
yep! I am also on the fence about whether we should expose a |
One way to expose it in a limited manner is to use A, B, __ = TypeMap({
(Float, Bool % Choices(False)): Visualization,
(Str % Choices('auto'), Bool): Visualization,
})
...
parameters={
'foo': A,
'bar': B
... This produces a signature like this: Let me know if you want the example expanded on. |
So in that example, |
@Oddant1 what is the status of this PR? My understanding is that you are going to refactor to toss the TypeMap and masking, but keep vmin/vmax and guard against vmin > min val and vmax < max value. Is that still the plan? (or maybe that message never made it through). I am reassigning to you to make these changes — let's connect on slack if you have questions. |
@nbokulich I'm not aware of any plan to get rid of TypeMap? Where did you message me? I probably just missed it. |
977b1bb Not sure why the tests in this commit fail (the regexes aren't matching for some reason, I'm sure I'm just missing something stupid), but is this this along the lines of what we want? |
yep! that looks like what we need — but pull out the vmin/vmax check into a separate utility function and call it as early as possible in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @Oddant1 ! just 3 tiny language tweaks and this is ready to go!
q2_sample_classifier/plugin_setup.py
Outdated
parameter_descriptions={ | ||
'truth': 'Metadata column (true values) to plot on y axis.', | ||
'missing_samples': parameter_descriptions['base']['missing_samples'], | ||
'vmin': 'The minimum color value for the heatmap', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Oddant1 please see this old change needed
q2_sample_classifier/visuals.py
Outdated
error = '' | ||
if vmin is not None: | ||
if vmin > lowest_frequency: | ||
error += ('Your vmin value must be less than or equal to the ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please change to:
"vmin must be less than or equal to the lowest predicted class frequency"
q2_sample_classifier/visuals.py
Outdated
if vmax < highest_frequency: | ||
if error: | ||
error += '\n' | ||
error += ('Your vmax value must be greater than or equal to the ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change to:
"vmax must be greater than or equal to the highest predicted class frequency"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks @Oddant1 !
Closes #148 Adds ability to adjust colorscale of heatmaps derived from confusion matrices via seaborn's vmin and vmax parameters. Is this along the lines of what you want @nbokulich? Once the parameters are properly implemented I can add some tests.