Problem with discrete ListedColormaps when more than 4 colors are present #7103

Closed
3 of 5 tasks
rasbt opened this issue Sep 13, 2016 · 11 comments

@rasbt
Contributor

rasbt commented Sep 13, 2016

To help us understand and resolve your issue please check that you have provided
the information below.

  • Matplotlib version, Python version and Platform (Windows, OSX, Linux ...)
    • MacOSX
    • matplotlib 1.5.1
    • Python 3.5.2
  • How did you install Matplotlib and Python (pip, anaconda, from source ...)
    • I installed it via Miniconda and updated it to the most recent version, i.e., conda update matplotlib
  • If possible please supply a Short, Self Contained, Correct, Example
    that demonstrates the issue, i.e., a small piece of code which reproduces the issue
    and can be run without any other (or as few as possible) external dependencies.
    • Code that works correctly:
from matplotlib.colors import ListedColormap
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

def plot_decision_regions(X, y, classifier, resolution=0.1):

    # setup marker generator and color map
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))+1])

    # plot the decision surface
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # plot class samples
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1],
                    alpha=0.8, c=colors[idx],
                    marker=markers[idx], label=cl)


# Loading some example data
iris = datasets.load_iris()
X = iris.data[:, [0,2]]
y = iris.target
y = np.concatenate((y, np.ones(50)+2))
y = y.astype(int)
X = np.concatenate((X, X[:50]*2))

lr = LogisticRegression(solver='newton-cg', multi_class='multinomial')
lr.fit(X, y)
plot_decision_regions(X, y, classifier=lr)

[image]

  • The error occurs if 5 distinct regions are present:
iris = datasets.load_iris()
X = iris.data[:, [0,2]]
y = iris.target
y = np.concatenate((y, np.ones(50)+2, np.ones(50)+3))
y = y.astype(int)
X = np.concatenate((X, X[:50]*2, X[:50]*3))

lr = LogisticRegression(solver='newton-cg', multi_class='multinomial')
lr.fit(X, y)
plot_decision_regions(X, y, classifier=lr)

[image]

However, a continuous colormap such as viridis works fine, so I suspect that there's a bug in ListedColormap.

For example, using viridis in plt.contourf(xx1, xx2, Z, alpha=0.4, cmap='viridis') produces the expected result:

[image]

  • If this is an image generation bug attach a screenshot demonstrating the issue.
    • I attached the images above after the respective code examples
  • If this is a regression (Used to work in an earlier version of Matplotlib), please
    note where it used to work.
@efiring
Member

efiring commented Sep 13, 2016

That's a pretty long and complicated SSCCE!
I think the problem here is a misunderstanding of how color mapping works. See http://matplotlib.org/users/colormapnorms.html. If your Z is, or can be converted to, a sequence of integers starting at 0, so that it can index directly into the colormap's lookup table, then you could use norm=matplotlib.colors.NoNorm.
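
To make the lookup described above concrete, here is a minimal sketch (not from the thread) of how a ListedColormap turns values into colors with and without normalization. The five color names are borrowed from the original report; the variable names are illustrative. It only exercises the colormap lookup itself and says nothing about how contourf chooses its levels.

```python
import numpy as np
from matplotlib.colors import ListedColormap, NoNorm, Normalize, to_rgba

colors = ['red', 'blue', 'lightgreen', 'gray', 'cyan']
cmap = ListedColormap(colors)

values = np.arange(5)  # integer labels 0..4

# Integer input indexes the colormap's lookup table directly.
direct = cmap(values)

# NoNorm passes the integers through unchanged, so the lookup is the same.
via_nonorm = cmap(NoNorm()(values))

# Normalize rescales the values to floats in [0, 1] before the lookup;
# with vmin=0 and vmax=4 each label still lands on its own color.
via_normalize = cmap(Normalize(vmin=0, vmax=4)(values))

print(direct[2], to_rgba('lightgreen'))
```

In this particular setup all three routes pick the same color for each label, which is why the mapping works when the values reach the colormap unchanged.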

@rasbt
Contributor Author

rasbt commented Sep 14, 2016

Thanks for looking into this, I really appreciate it. Sorry about the lengthy example; I tried to come up with a smaller toy example but couldn't reproduce the issue that I encountered in the longer example I posted. The reason I thought this was a bug is that it seemed to work when fewer than 5 unique integers were present in the Z array.

I tried your suggestions but couldn't get it to work quite yet.
As you suggested, Z is already an integer array with values 0-4:

>>> Z.dtype
int64
>>> np.unique(Z)
[0 1 2 3 4]

I tried the NoNorm like so

plt.contourf(xx1, xx2, Z, 
             alpha=0.4, 
             cmap=cmap, 
             norm=matplotlib.colors.NoNorm(vmin=0, vmax=4, clip=True))

(tried both clip=True and clip=False), but somehow it always results in the following figure:

[image]

I may have to read more on this and the alternative solutions. Right now, pcolormesh seems to work fine for this application although less visually appealing.

Sorry for the additional trouble, but if you think that this behavior is indeed not a bug, please close this issue.

@tacaswell tacaswell added this to the 2.0.1 (next bug fix release) milestone Sep 14, 2016
@tacaswell
Member

At a minimum it should be better documented.

@rasbt
Contributor Author

rasbt commented Sep 14, 2016

@tacaswell Thanks for the feedback!

At a minimum it should be better documented.

Are you referring to the ListedColormap or my lengthy example? I'd be happy to add more comments if necessary!

@tacaswell
Member

Sorry, I meant ListedColormap and NoNorm, not your example.

@rasbt
Contributor Author

rasbt commented Sep 14, 2016

Oh, I see! So this is indeed expected behavior, and I was just lucky that it worked for the first example (up to 4 integers in that array)? I'll look into other solutions then. I wish I could help with the documentation of ListedColormap & NoNorm, but I still don't have a good understanding of how it's supposed to work.

@tacaswell
Member

I am also not sure if this is the expected behavior, hence it should be documented better ;)

You may want to look at BoundaryNorm here as well.

@efiring
Member

efiring commented Sep 14, 2016

Sorry, I was not thinking about the contourf case even though it was right in front of me.

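import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

y = np.arange(50)
x = np.arange(40)
X, Y = np.meshgrid(x, y)
Z = ((X + Y)//10) % 5  # integer values 0..4 in diagonal bands
cmap = mpl.colors.ListedColormap(['r', 'g', 'b', 'c', 'm'])
# one level boundary on each side of every integer: -0.5, 0.5, ..., 4.5
plt.contourf(X, Y, Z, levels=np.arange(Z.max() + 2) - 0.5, cmap=cmap)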

This illustrates contouring when you have a listed colormap of 5 colors, and you have 5 consecutive integer values starting from zero. With contouring, it is almost always a good idea to supply the levels (the contour boundaries) explicitly.
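
As a hedged sketch of how the same idea might carry over to integer class labels like those in the original report: place one boundary halfway between each pair of consecutive labels. The helper name class_levels and the synthetic Z are assumptions made for illustration; they are not part of matplotlib or of the thread.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


def class_levels(n_classes):
    """Hypothetical helper: boundaries bracketing the labels 0..n_classes-1."""
    return np.arange(n_classes + 1) - 0.5


colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')

# Synthetic stand-in for classifier.predict(...): vertical bands labelled 0..4.
X, Y = np.meshgrid(np.arange(50), np.arange(30))
Z = X // 10

n_classes = len(np.unique(Z))
cmap = ListedColormap(colors[:n_classes])

fig, (ax_auto, ax_fixed) = plt.subplots(1, 2)

# For comparison, let contourf pick its own levels; their number is not
# tied to the number of colors in the colormap.
auto = ax_auto.contourf(X, Y, Z, alpha=0.4, cmap=cmap)

# Explicit half-integer levels give exactly one filled region per label.
fixed = ax_fixed.contourf(X, Y, Z, levels=class_levels(n_classes),
                          alpha=0.4, cmap=cmap)

print("automatic levels:", auto.levels)
print("explicit levels: ", fixed.levels)
```

Printing both sets of levels is a quick way to check whether the number of filled regions matches the number of colors you supplied.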

@rasbt
Contributor Author

rasbt commented Sep 14, 2016

@efiring great, it works perfectly. Thanks a lot!

@tacaswell
Member

@efiring Is there obvious documenting to be done here or should this be closed with no action?

@efiring
Member

efiring commented Sep 15, 2016

I was forgetting that the easy way to handle this is to let contourf take care of making the ListedColormap and the norm. See the middle example in http://matplotlib.org/examples/pylab_examples/contourf_demo.html. All it requires is the list of colors and a corresponding list of boundaries ('levels'). So the real need here was for better documentation about how to use contourf in various situations. Essentially, there is a major section of the User's Guide that never got written, covering contouring and pcolor-like plots. (I think I was supposed to write it, long ago...) Most of the information needed is probably scattered among the examples and some of the existing user docs, but "scattered" is the operative word.
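
A minimal sketch of that approach, assuming the same kind of integer-valued Z as in the earlier example: pass the colors and the levels straight to contourf and skip building the colormap by hand. The data is synthetic and the color names are borrowed from the original report, purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(40)
y = np.arange(50)
X, Y = np.meshgrid(x, y)
Z = ((X + Y) // 10) % 5  # integer values 0..4

colors = ['red', 'blue', 'lightgreen', 'gray', 'cyan']
# One more boundary than colors: -0.5, 0.5, ..., 4.5
levels = np.arange(len(colors) + 1) - 0.5

fig, ax = plt.subplots()
# contourf builds the colormap and norm itself from colors and levels.
cs = ax.contourf(X, Y, Z, levels=levels, colors=colors, alpha=0.4)
fig.colorbar(cs, ax=ax)
```

The only bookkeeping left to the caller is keeping the list of levels one entry longer than the list of colors.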
For ListedColormap itself, there is one example illustrating its use with BoundaryNorm, but it is for making a standalone colorbar, which one would rarely if ever do: http://matplotlib.org/examples/api/colorbar_only.html. This is not relevant to contouring, but it is relevant to pcolor-type plots.
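
For the pcolor-type case mentioned here, a rough sketch of pairing ListedColormap with BoundaryNorm might look like the following. The half-integer boundaries and the synthetic Z are assumptions chosen for illustration, not something taken from the linked example.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm, ListedColormap

cmap = ListedColormap(['r', 'g', 'b', 'c', 'm'])

# Six boundaries define five bins, one per color: [-0.5, 0.5), ..., [3.5, 4.5)
boundaries = np.arange(cmap.N + 1) - 0.5
norm = BoundaryNorm(boundaries, cmap.N)

X, Y = np.meshgrid(np.arange(40), np.arange(50))
Z = ((X + Y) // 10) % 5  # integer values 0..4

fig, ax = plt.subplots()
mesh = ax.pcolormesh(Z, cmap=cmap, norm=norm)
fig.colorbar(mesh, ax=ax, ticks=np.arange(cmap.N))

# The norm maps each integer value to the index of its color.
indices = norm(np.arange(5))
print(indices)
```

Here the norm, rather than contour levels, decides which value falls into which color bin.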

@efiring efiring closed this as completed Sep 15, 2016
@ghost ghost assigned efiring Sep 15, 2016
@QuLogic QuLogic modified the milestones: unassigned, 2.0.1 (next bug fix release) Dec 7, 2016