Add colormap= argument to DataFrame plotting methods #3860

Merged
merged 1 commit into from
Jun 27, 2013


Contributor

@qwhelan qwhelan commented Jun 12, 2013

I frequently plot DataFrames with a large number of columns and generally have difficulty distinguishing series due to the short cycle length of the default color scheme.

Especially when the ordering of columns carries significant information, the ideal way to color the series is with a matplotlib colormap that spaces the colors uniformly. This is straightforward with pyplot, but annoying to have to do repeatedly.
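
A minimal sketch of the manual pyplot approach described above, for comparison with the colormap= argument. The data, the variable names, and the choice of 'jet' are illustrative assumptions and not part of the patch; a NumPy array stands in for the DataFrame columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 10 cumulative random walks, one per "column".
rng = np.random.default_rng(0)
data = rng.standard_normal((100, 10)).cumsum(axis=0)
num_series = data.shape[1]

# Sample the colormap at evenly spaced points, one color per series.
cmap = plt.get_cmap("jet")
colors = [cmap(x) for x in np.linspace(0, 1, num_series)]

fig, ax = plt.subplots(figsize=(10, 5))
for i, color in enumerate(colors):
    # Pass the color explicitly instead of relying on the default cycle.
    ax.plot(data[:, i], color=color, label=str(i))
ax.legend()
```

Because the sample points depend on the number of series, this boilerplate has to be repeated for every plot, which is the repetition the patch is meant to remove.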

This patch modifies DataFrame plotting functions to take a colormap= argument consisting of either a str name of a matplotlib colormap or a colormap object itself.

df.cumsum().plot(colormap='jet', figsize=(10,5))

jet_10

KDE plot:

df.plot(kind='kde', colormap='jet', figsize=(10,5))

kde

Some colormaps don't work as well on a white background (the 0 column is white):

df.cumsum().plot(colormap=cm.Greens, figsize=(10,5))

greens_10

But work better for other graph types:

df.plot(kind='bar', colormap='jet', figsize=(10,5))

greens_bar

Parallel coordinates on the iris dataset:

parallel_coordinates(iris, 'Name', colormap='gist_rainbow')

iris_parallel

Andrews curves (I'd appreciate someone double checking this one; don't think I have it quite right):

andrews_curves(iris, 'Name', colormap='winter')

andrews_winter

I've included some test coverage and unified all the color creation code into one method _get_standard_colors(). I started adding to the documentation but ran into a weird issue with the sphinx plot output. When adding this to visualization.rst:

.. ipython:: python

   from matplotlib import cm

   df = DataFrame(randn(1000, 4), index=ts.index, columns=list('ABCD'))
   df = df.cumsum()

   plt.figure()

   @savefig greens.png width=6in
   df.plot(colormap=cm.Greens)

I get this output (the lines should be white->green):
greens

My first thought was that it was the options.display.mpl_style = 'default', but plots render fine in IPython with this setting. My guess is something in @savefig, but is anyone familiar with what might be happening here?
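
The idea of a single color-producing helper, as mentioned in the description above, can be sketched roughly as follows. This is a hypothetical illustration and not the code in the patch: the name standard_colors_sketch, its signature, and the precedence of color over colormap are all assumptions.

```python
import itertools

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def standard_colors_sketch(num_colors, colormap=None, color=None):
    """Return a list of `num_colors` colors (hypothetical helper)."""
    if color is not None:
        # Assumption: explicit colors take precedence over a colormap.
        base = [color] if isinstance(color, str) else list(color)
    elif colormap is not None:
        # Accept either a colormap name or a Colormap object.
        cmap = plt.get_cmap(colormap) if isinstance(colormap, str) else colormap
        # Evenly spaced samples across the whole colormap.
        return [cmap(x) for x in np.linspace(0, 1, num_colors)]
    else:
        # Fall back to matplotlib's default color cycle.
        base = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    # Repeat the base colors if there are fewer than requested.
    return list(itertools.islice(itertools.cycle(base), num_colors))
```

With a sequential map such as 'Greens', the first sample sits at the lightest end of the map, which is consistent with the first column being hard to see on a white background.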

@jtratner
Contributor

I like this. :+1: on the idea of extending plotting. That said, I'm hoping that long-term we follow the matplotlib convention and pass all extra kwargs to matplotlib's OOP interface (in other words, if you get cmap=xyz, you call set_cmap(xyz) on the axes). We'd also need to keep a list of aliases around (e.g., colormap: cmap).
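
The alias list suggested here could look something like the sketch below. Everything in it is hypothetical: KWARG_ALIASES, normalize_kwargs, and the decision to raise on conflicting spellings are illustrative assumptions, not an existing pandas or matplotlib API.

```python
# Hypothetical alias table: pandas-style name -> matplotlib-style name.
KWARG_ALIASES = {"colormap": "cmap"}


def normalize_kwargs(kwargs, aliases=KWARG_ALIASES):
    """Return a copy of `kwargs` with aliased keys renamed (hypothetical)."""
    normalized = {}
    for key, value in kwargs.items():
        canonical = aliases.get(key, key)
        if canonical in normalized:
            # Both spellings were passed, e.g. colormap= and cmap=.
            raise TypeError(f"got multiple values for {canonical!r}")
        normalized[canonical] = value
    return normalized
```

For example, normalize_kwargs({"colormap": "jet", "figsize": (10, 5)}) renames the first key to cmap and leaves figsize untouched.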

Btw - What should happen when you pass an invalid colormap?

@jtratner
Contributor

That said, I think this is a great addition!

@qwhelan
Contributor Author

qwhelan commented Jun 19, 2013

@jtratner My impression is that set_cmap() only applies to luminosities, not plots; see matplotlib/matplotlib#484 (comment) . I'd love to be wrong.

As for what happens when you pass a string that does not correspond to a matplotlib colormap, you just get matplotlib's error:

ValueError: Colormap foo is not recognized
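
A small sketch of that failure mode, using matplotlib's own lookup rather than the pandas code path. The exact wording of the message differs between matplotlib versions, so the sketch only captures the message and does not assume a particular text.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

try:
    plt.get_cmap("foo")  # "foo" is not a registered colormap name
except ValueError as exc:
    # matplotlib rejects the unknown name with a ValueError.
    error_message = str(exc)
else:
    error_message = None
```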

@jtratner
Contributor

@qwhelan Thanks for clarifying on this :), didn't realize that this wasn't a straight to mpl case. Can you add a test case for the failure condition?

@qwhelan
Contributor Author

qwhelan commented Jun 20, 2013

@jtratner Done.

I've worked out my Sphinx issue, but it'll be a few days before I can get back to it. I can do that in a separate pull if that's more convenient.

@jtratner
Contributor

@qwhelan thanks for adding that test in. in regards to your question - probably easier to just include the changes with this pull request.

@jtratner
Contributor

@jreback @cpcloud thoughts on this? Don't want this to get lost in the shuffle...looks like it could be useful (though I don't know if there are more changes to plotting that would make it so we'd want to wait on this). Maybe could be experimental or something?

@jreback
Contributor

jreback commented Jun 24, 2013

I think it's natural to pass any option that is not specifically picked up by pandas directly to matplotlib.
Doesn't that solve the problem and make this very simple?

of course pandas can be smarter and intercept certain arguments

but that is a different issue

@jtratner
Contributor

@jreback, no, as @qwhelan points out, you have to manipulate the axis directly to do it. I do think it makes the graphs far more visually appealing.

@jreback
Contributor

jreback commented Jun 24, 2013

we can push to 0.12 though I don't see any harm in adding now

@jtratner
Contributor

your call - might be better to push to 0.12, in case there are other plotting changes...

@qwhelan
Contributor Author

qwhelan commented Jun 24, 2013

@jtratner I've added some documentation to the visualization.rst page. I don't have any further changes planned.

@jtratner
Contributor

Thanks... I meant more along the lines of broader changes to plotting.
Anyways, @jreback knows way better than me.

@qwhelan
Contributor Author

qwhelan commented Jun 24, 2013

@jtratner Sorry for the confusion, I wasn't responding to your previous post. I was following up on the comment I made several days ago regarding additional changes to this pull request.

@jreback
Contributor

jreback commented Jun 24, 2013

@qwhelan

I am fine with this...there might be other changes to plotting in 0.12, but this is 'independent' AFAIK...

can you rebase on current master?

@cpcloud
Member

cpcloud commented Jun 24, 2013

is this compatible with mpltools?

@cpcloud
Member

cpcloud commented Jun 25, 2013

why not call it cmap?

@qwhelan
Contributor Author

qwhelan commented Jun 25, 2013

@cpcloud Should be compatible, colors are being generated the exact same way. The only difference is that mpltools calls set_color_cycle() on the axis, which will give different results if you add more series to the plot (mpltools will automatically cycle, this won't; not sure which is better).
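
The cycling behavior described here can be sketched with plain matplotlib. This is an illustration under assumptions: it uses set_prop_cycle(), the counterpart of set_color_cycle() in current matplotlib releases, and the three hex colors are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

colors = ["#1b9e77", "#d95f02", "#7570b3"]  # three arbitrary colors

fig, ax = plt.subplots()
ax.set_prop_cycle(color=colors)  # install the colors as the axes' cycle

# Plot five series without passing color=; the axes cycle supplies it,
# so the fourth and fifth series wrap around to the first two colors.
lines = [ax.plot([0, 1], [i, i + 1])[0] for i in range(5)]
cycled = [line.get_color() for line in lines]
```

By contrast, passing an explicit color per series fixes the colors at plot time, so series added to the axes later fall back to whatever cycle the axes already had.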

As for colormap vs cmap, I have no preference. I usually just tab-complete in IPython anyway.

@cpcloud
Member

cpcloud commented Jun 25, 2013

@qwhelan can u rebase?

@qwhelan
Contributor Author

qwhelan commented Jun 26, 2013

@jreback @cpcloud Rebased.

@jreback
Contributor

jreback commented Jun 26, 2013

@qwhelan can you hook up to travis? (see contributing in main pandas dir)

pls also add a release notes entry, plus an entry in v0.11.1 (will be changed to 0.12, as that's going to be the release version, but not merged yet)

and then squash down commits to a few

thanks

@cpcloud
Member

cpcloud commented Jun 26, 2013

@qwhelan trying to close out issues for 0.12...status of the above?

@cpcloud
Member

cpcloud commented Jun 26, 2013

alternatively we could push to 0.13...

@jreback
Contributor

jreback commented Jun 26, 2013

this just needs release notes and entry in 0.12 and test on travis...

@cpcloud
Member

cpcloud commented Jun 26, 2013

i'm running the branch on travis

@jreback
Contributor

jreback commented Jun 26, 2013

ok...perfect

@qwhelan
Contributor Author

qwhelan commented Jun 26, 2013

Thanks. I can take care of the squash/release notes by 8pm PST.

@jreback
Contributor

jreback commented Jun 26, 2013

@qwhelan great...

you will need to rebase to master first (and see if you can setup travis for the future in any event)

Refactor color and colormap option handling

Add tests for colormap=

Add tests for colormap=

Add colormap documentation to visualization.rst

Add release notes
@qwhelan
Contributor Author

qwhelan commented Jun 27, 2013

@jreback Added release notes, squashed, and travis is passing.

Should be good to go.

jreback added a commit that referenced this pull request Jun 27, 2013
Add colormap= argument to DataFrame plotting methods
@jreback jreback merged commit 60bb880 into pandas-dev:master Jun 27, 2013
@jreback
Contributor

jreback commented Jun 27, 2013

looks great
thanks!
