
truncated violin plots #1494

Closed
ricky-r opened this issue Mar 19, 2014 · 8 comments

@ricky-r

commented Mar 19, 2014

Hi all,

I have a problem with violin plots when I use a log scale on the y axis. The underlying coloured area goes missing in one or two of the plots as soon as I turn on the log scale.
[Image: truncated_violin]
The warning I get on the terminal is:
/usr/local/lib/python2.7/dist-packages/matplotlib/axes.py:1171: UserWarning: aspect is not supported for Axes with xscale=linear, yscale=log
'yscale=%s' % (xscale, yscale))

Is there anything I can do to fix this?

Thanks!

@josef-pkt

Member

commented Mar 19, 2014

@phobson Do you have an idea what the warning might mean?

@ricky-r Can you provide a reproducible example, with real or made up data?
Can you check the smallest values in the first violin plot? Try to clip the smallest values to something larger.
I'm just guessing; we need to figure out whether this comes from the data, from the way the violins are calculated, or from something in matplotlib.

Is this with default settings or did you set any options?

It looks strange to me that the violins are not smooth in this case.

@phobson

Contributor

commented Mar 19, 2014

My guess is that matplotlib thinks (and I agree) that setting an aspect ratio on an axes is ambiguous if the scales aren't the same.

If I had to guess, the funky drawing and the warning are unrelated. What you're seeing here might be a result of the KDE reaching down into negative values. Like josef said, a working example would help greatly.
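To illustrate the guess about the KDE reaching below zero, here is a minimal sketch using `scipy.stats.gaussian_kde` on made-up lognormal data. It is not the code path the violin plot in this issue uses, and the sample, seed, and evaluation point are assumptions chosen only for illustration. It shows that a Gaussian KDE fitted to strictly positive data can still assign density to negative values, which a log axis cannot display.

```python
import numpy as np
from scipy import stats

# Made-up, strictly positive sample (assumption: lognormal, fixed seed).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=200)

# Gaussian KDE with the default bandwidth rule.
kde = stats.gaussian_kde(data)

# Density evaluated just below zero, where no data exists.
density_at_negative = kde(np.array([-0.1]))[0]

# Probability mass the smoothed estimate places on (-inf, 0].
mass_below_zero = kde.integrate_box_1d(-np.inf, 0.0)

print("smallest observation:", data.min())
print("KDE density at -0.1:", density_at_negative)
print("KDE mass below zero:", mass_below_zero)
```

Whether a given violin reaches into that region depends on the range over which the plotting code evaluates the KDE.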

@josef-pkt

Member

commented Mar 19, 2014

Thanks Paul,

If the kde has negative values, then we would need an option for users to truncate the violin instead of using the full default range.

possible enhancement to kde:
If the non-smooth shape comes from the log-scale transformation of the original kde, then we could try to get the kde in the transformed space.
One possibility might be to call the violin plots with the transformed data, and then adjust the tick labels on the y axis.
Another possibility, which I never checked: treat the kde as the density of a non-linearly transformed variable. That density is not just a rescaling of the axis.
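The last point can be checked numerically. The sketch below assumes a lognormal variable X with hypothetical parameters `mu` and `sigma`, so that Y = log(X) is normal. Re-plotting the density of X at exp(y) does not give the density of Y; multiplying by the Jacobian exp(y) does, since f_Y(y) = f_X(exp(y)) * exp(y).

```python
import numpy as np
from scipy import stats

# Hypothetical parameters: X is lognormal, so Y = log(X) ~ Normal(mu, sigma).
mu, sigma = 1.5, 0.8
x_dist = stats.lognorm(s=sigma, scale=np.exp(mu))
y_dist = stats.norm(loc=mu, scale=sigma)

# Grid on the log (transformed) axis.
y = np.linspace(-2.0, 5.0, 50)

# Naive approach: reuse the density of X, only relabelling the axis.
naive = x_dist.pdf(np.exp(y))

# Change of variables: include the Jacobian dx/dy = exp(y).
with_jacobian = naive * np.exp(y)

# True density of Y for comparison.
target = y_dist.pdf(y)

print("naive matches:", np.allclose(naive, target))
print("with Jacobian matches:", np.allclose(with_jacobian, target))
```

A KDE estimated on the original scale would need the same Jacobian factor before being drawn against a log axis as a density of the transformed variable.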

@ricky-r

Author

commented Mar 19, 2014

Can I just give you guys a pickled plot? If not, where can I upload the data from my plot?

Anyway, the data in the first violin plot should be comparable with that in the second one in my example. The ranges should be very similar.

@phobson

Contributor

commented Mar 19, 2014

A pickled plot won't do anything for us. We need code that generates the data and constructs the figure demonstrating this problem. Creating a gist at https://gist.github.com/ would be great.

@phobson

Contributor

commented Mar 19, 2014

I'm increasingly convinced this is the result of lognormally-distributed data:
http://nbviewer.ipython.org/gist/phobson/9650257

edit: updated the notebook with a quick example of taking the log of the data and redefining the y-labels accordingly
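For readers without access to the notebook, the idea of taking the log of the data and redefining the y-labels might look roughly like the sketch below. It is not the notebook's code: it uses matplotlib's own `Axes.violinplot` as a stand-in for the violin plot function discussed in this issue, and the data, seed, and group positions are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Made-up positive, right-skewed groups (assumption: lognormal data).
rng = np.random.default_rng(0)
groups = [rng.lognormal(mean=m, sigma=1.0, size=200) for m in (0.0, 1.0, 2.0)]

# Estimate and draw the violins in log10 space instead of using a log axis.
log_groups = [np.log10(g) for g in groups]
fig, ax = plt.subplots()
parts = ax.violinplot(log_groups, positions=[1, 2, 3], showmedians=True)

# Put ticks at whole powers of ten and label them in the original units.
low = np.floor(min(g.min() for g in log_groups))
high = np.ceil(max(g.max() for g in log_groups))
ticks = np.arange(low, high + 1)
labels = [f"{10.0 ** t:g}" for t in ticks]
ax.set_yticks(ticks)
ax.set_yticklabels(labels)

fig.canvas.draw()  # render once to confirm the figure builds
```

Because the KDE is computed on the transformed values, the violins stay smooth and never extend to non-positive values on the original scale.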

@ricky-r

Author

commented Mar 19, 2014

Ok guys, it was a lot simpler than what I thought. I went through the data that I'm plotting and found out that whenever my values included a zero, the corresponding violin plots had the behaviour in the figure above. In any case the boxplot was correct, just the area around it (the violin plot) got messed up a bit.

I have no problem removing these values from my dataset, so no big deal. I'll make a wild guess here: does matplotlib remove all zeros by default when in log scale?

(The warning was indeed unrelated to this issue. Nothing appears when I plot the same data in interactive mode)
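On the zeros: log10(0) is negative infinity, so a zero has no position on a log axis regardless of how the plotting library handles it. A minimal sketch of dropping non-positive values before plotting, with made-up values:

```python
import numpy as np

# Made-up sample containing zeros.
values = np.array([0.0, 0.5, 2.0, 0.0, 10.0, 100.0])

# Zeros map to -inf under the log transform.
with np.errstate(divide="ignore"):
    logs = np.log10(values)
print(logs)

# Keep only strictly positive values for a log-scaled violin plot.
positive = values[values > 0]
print(positive)
```

Dropping the zeros changes the sample, so this only makes sense when, as here, those values can be left out of the analysis.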

@phobson

Contributor

commented Mar 20, 2014

@ricky-r can you close this?

@ricky-r ricky-r closed this Mar 21, 2014

@josef-pkt josef-pkt added the PR label Apr 14, 2014
