Qualitative colormaps re-map discrete color values #21786

Open
jonahpearl opened this issue Nov 28, 2021 · 7 comments

Comments

@jonahpearl

Bug summary

When I started using qualitative colormaps, I expected to be able to pass integers to the c argument of plt.scatter() like keys in a dictionary: integer in, color out, same relationship every time. Instead, the values get rescaled based on their min/max, the same as with quantitative colormaps. I think this is kind of silly: the whole point of a qualitative colormap is that you can pick distinct values and pass them in to represent some kind of category, without worrying about the scale of the numbers. It can even be misleading: if you pass in, say, integer color values ranging from 1 to 20 to the Set1 colormap, it will group some of them under the same color, when imo it should raise an error and tell you that you picked a colormap that doesn't have enough colors.

I suspect I'm opening up a can of worms here, but a quick search didn't reveal any prior discussion, so figured I'd ask.

Also: I do realize that I can get the desired behavior by passing in actual RGB values for each point. But it took me a while to figure out why the heck the above strategy wasn't working, and I want to save everyone else that time!
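
For reference, the RGB workaround mentioned above might look something like the following. This is only a sketch: `categorical_colors` is a hypothetical helper name, not a Matplotlib function, and it assumes the colormap is a ListedColormap (as Set1 is) so that its `colors` list can be indexed directly. It raises an error instead of reusing colors when a label falls outside the colormap.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def categorical_colors(labels, cmap_name="Set1"):
    """Hypothetical helper: look up one RGB color per integer label."""
    cmap = plt.get_cmap(cmap_name)
    labels = np.asarray(labels)
    if not np.issubdtype(labels.dtype, np.integer):
        raise TypeError("labels must be integers")
    # Refuse labels that the colormap has no distinct color for.
    if labels.size and (labels.min() < 0 or labels.max() >= cmap.N):
        raise ValueError(
            f"{cmap_name} has only {cmap.N} colors; labels must be in 0..{cmap.N - 1}"
        )
    # ListedColormap.colors is the list of RGB entries; index it like a lookup table.
    return np.asarray(cmap.colors)[labels]


x = np.arange(3)
colors = categorical_colors([0, 1, 2])

fig, ax = plt.subplots()
# Passing explicit RGB rows bypasses the norm/cmap pipeline entirely.
ax.scatter(x, x, c=colors)
```

With this approach, label 0 gets the first Set1 color regardless of which other labels appear in the data, and labels ranging from 1 to 20 raise a ValueError.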

Code for reproduction

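import matplotlib.pyplot as plt
import numpy as np

x = np.arange(3)
y = x
c = np.array([0, 1, 2])

# Left: c is rescaled to its own min/max (0..2), so the three points
# get the first, middle, and last Set1 colors.
plt.subplot(121)
plt.scatter(x, y, c=c, cmap='Set1')  # weird

# Right: pinning vmin/vmax to Set1's index range (0..8) gives each
# integer its own Set1 entry.
plt.subplot(122)
plt.scatter(x, y, c=c, cmap='Set1', vmin=0, vmax=8)  # the expected behavior
plt.show()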

Actual outcome

image

Expected outcome

See above

Additional information

No response

Operating system

OSX

Matplotlib Version

3.4.3

Matplotlib Backend

module://matplotlib_inline.backend_inline

Python version

3.9.7

Jupyter version

6.4.6

Installation

conda

@story645
Member

> I suspect I'm opening up a can of worms here, but a quick search didn't reveal any prior discussion, so figured I'd ask.

There's a related discussion around categorical colormaps that I think boils down to Matplotlib doing colormapping in two stages (norm to 0-1, then cmap to a continuous colormapping), whereas category/discrete-value-based colormapping is a one-stage process. Long term, the goal is to address this in a library re-architecture; we've been doing a few years of research on what that would look like, and the implementation phase should start next year.

Short term, I think it could be useful to have an explicit "DiscreteNorm" that says every value is discrete and should be mapped to a distinct color, especially if paired with metadata on colormaps that identifies them as discrete or continuous, which I think is something we don't have either.

@jklymak
Member

jklymak commented Nov 29, 2021

We already have a discrete norm - BoundaryNorm.

@story645
Member

BoundaryNorm requires bins. Yes, you can construct fake bins to include all your data, but then you also have to shift the colorbar ticks & labels to the middle of each bin, so it isn't quite the same. I thought NoNorm would/should do the trick, but it didn't.
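
The fake-bins approach described above could be sketched roughly as follows. The choice of half-integer bin edges is an assumption made for illustration (it puts integer k in the middle of bin k), and the colorbar ticks are placed by hand at those bin centers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm

cmap = plt.get_cmap("Set1")

# Edges at -0.5, 0.5, ..., N - 0.5: one bin per colormap entry,
# with integer k sitting in the middle of bin k.
boundaries = np.arange(cmap.N + 1) - 0.5
norm = BoundaryNorm(boundaries, ncolors=cmap.N)

x = np.arange(3)
c = np.array([0, 1, 2])

fig, ax = plt.subplots()
sc = ax.scatter(x, x, c=c, cmap=cmap, norm=norm)

# Ticks have to be moved to the bin centers manually.
cbar = fig.colorbar(sc, ax=ax, ticks=np.arange(cmap.N))
```

Because the number of bins equals the number of colors here, the norm returns the bin index itself, so each integer selects the colormap entry with the same index.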

@jklymak
Member

jklymak commented Nov 29, 2021

Matplotlib has a very flexible continuous colormap model. If someone wants to make a DiscreteNorm, and add the hooks to colorbar to make it work, they should feel free to design and implement that.

OTOH, I don't think there is anything we should do to automatically decide to use that norm, at least with integers. Perhaps if the user supplied an object array of strings - I could see something automatically happening there, as per @dstansby's PR #20962 and of course they should be allowed to choose that norm manually if they want an array of integers.

@dstansby
Member

> Matplotlib has a very flexible continuous colormap model. If someone wants to make a DiscreteNorm, and add the hooks to colorbar to make it work, they should feel free to design and implement that.

I thought I'd echo that this sounds like a good idea to me 👍

> OTOH, I don't think there is anything we should do to automatically decide to use that norm, at least with integers. Perhaps if the user supplied an object array of strings - I could see something automatically happening there, as per @dstansby's PR #20962 and of course they should be allowed to choose that norm manually if they want an array of integers.

Yes, I think my PR would solve this, but it's been a while since I looked at it, and I don't think it has any chance of being merged in the near future.

@tacaswell
Member

There is also https://matplotlib.org/stable/api/_as_gen/matplotlib.colors.NoNorm.html which as the name suggests just passes the values through (but it has to return ints and you have to make sure they play well with the colormap LUT).

@krabo0om

Just encountered the same issue as described by OP. I also used NoNorm but, as stated by @tacaswell, it does not work as expected because values are converted from int to float. Passing vmin and vmax without any norm argument solves the issue, but it is somewhat counterintuitive and not very elegant.
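
To illustrate why the int-versus-float distinction matters, here is a small sketch that calls the colormap object directly, outside of scatter(). It only shows how a Colormap treats the two dtypes; whether and where scatter() converts the values to float is taken from the comment above, not verified here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

cmap = plt.get_cmap("Set1")

# Integer input is used as a direct index into the colormap's lookup table.
as_int = cmap(np.array([0, 1, 2]))

# Float input is interpreted on a 0-1 scale: 1.0 means "the last color",
# and anything above 1.0 gets the "over" color.
as_float = cmap(np.array([0.0, 1.0, 2.0]))

print(as_int[:, :3])
print(as_float[:, :3])
```

The integer call returns the first three Set1 entries, while the float call returns the first entry followed by colors from the top end of the map, so the same labels end up with different colors once they have been converted to float.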
