I have a problem with violin plots when I use the log scale on the y axis. The underlying coloured area goes missing in one or two of the plots as soon as I turn on the log scale.
The warning I get on the terminal is:
/usr/local/lib/python2.7/dist-packages/matplotlib/axes.py:1171: UserWarning: aspect is not supported for Axes with xscale=linear, yscale=log
'yscale=%s' % (xscale, yscale))
Is there anything I can do to fix this?
@phobson Do you have an idea what the warning might mean?
@ricky-r Can you provide a reproducible example, with real or made up data?
Can you check the smallest values in the first violin plot? Try to clip the smallest values to something larger.
I'm just guessing, we need to figure out if this is the data, the way the violins are calculated or something with matplotlib.
Is this with default settings or did you set any options?
It looks strange to me that the violins are not smooth in this case.
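A rough sketch of the check suggested above: look at the smallest value in each group, count the non-positive ones, and clip to a small floor. The `groups` dict, its values, and the `floor` value are made up for illustration; the real data is not shown in this thread.

```python
import numpy as np

# Hypothetical stand-in for the per-violin samples (not the real data)
groups = {
    "a": np.array([0.0, 0.2, 1.5, 3.0, 40.0]),
    "b": np.array([0.1, 0.4, 2.0, 5.0, 35.0]),
}

# Report the smallest value and the number of non-positive values per group
for name, values in groups.items():
    n_nonpositive = int((values <= 0).sum())
    print(name, "min:", values.min(), "non-positive:", n_nonpositive)

# Clip everything below an arbitrary floor (assumption: 0.05 suits the data)
floor = 0.05
clipped = {name: np.clip(values, floor, None) for name, values in groups.items()}
```

If the odd-looking violin changes once the smallest values are clipped, that points at the data rather than at the plotting code.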
My guess is that matplotlib thinks (and I agree) that setting an aspect ratio on an axes is ambiguous if the scales aren't the same.
If I had to guess, the funky drawing and warning are unrelated. What you're seeing here might be a result of the KDE reaching down into negative values. Like josef said, a working example would help greatly.
If the kde has negative values, then we would need an option for users to truncate the violin instead of using the full default range.
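To illustrate the point about the KDE reaching below zero, here is a minimal sketch with a hand-written Gaussian KDE in plain NumPy. It is not the code the violin plot uses; `gaussian_kde_eval`, the made-up lognormal sample, and the rule-of-thumb bandwidth are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.lognormal(mean=0.0, sigma=1.0, size=300)  # made-up positive data


def gaussian_kde_eval(sample, grid, bandwidth):
    """Evaluate a plain Gaussian KDE of `sample` at the points in `grid`."""
    z = (grid[:, None] - sample[None, :]) / bandwidth
    norm = sample.size * bandwidth * np.sqrt(2 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm


# Normal-reference rule-of-thumb bandwidth (an assumption, not a library default)
bandwidth = 1.06 * data.std(ddof=1) * data.size ** (-1 / 5)

# A grid padded beyond the data range dips below zero for this sample
full_grid = np.linspace(data.min() - 3 * bandwidth,
                        data.max() + 3 * bandwidth, 400)
full_density = gaussian_kde_eval(data, full_grid, bandwidth)

# Truncating to the observed range keeps the whole violin on positive values
keep = (full_grid >= data.min()) & (full_grid <= data.max())
cut_grid, cut_density = full_grid[keep], full_density[keep]
```

The padded grid carries density at negative y values, which a log axis cannot draw. The truncated grid avoids that at the cost of a violin that ends abruptly at the data limits.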
possible enhancement to kde:
If the non-smooth shape comes from the log-scale transformation of the original kde, then we could try to get the kde in the transformed space.
One possibility might be to call the violin plots with the transformed data, and then adjust the tick labels on the y axis.
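A minimal sketch of that idea, assuming strictly positive data. It uses matplotlib's own `Axes.violinplot` rather than the violin plot function discussed in this thread, and the lognormal samples are made up.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

rng = np.random.default_rng(0)
# Made-up, strictly positive samples standing in for the real data
groups = [rng.lognormal(mean=m, sigma=1.0, size=200) for m in (0.0, 1.0, 2.0)]

# Estimate the violins in log10 space instead of log-scaling the axis
log_groups = [np.log10(g) for g in groups]

fig, ax = plt.subplots()
parts = ax.violinplot(log_groups, showmedians=True)

# Label each tick with the value in original units, i.e. 10**tick
ax.yaxis.set_major_formatter(FuncFormatter(lambda y, pos: "%g" % (10 ** y)))
ax.set_ylabel("value (original units)")
fig.canvas.draw()
```

The axis itself stays linear, so the KDE is smooth in the space that is actually drawn. Only the tick labels are translated back.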
Another possibility, which I never checked: treat the kde as the density of a non-linearly transformed variable. That density is not just a rescaling of the axis.
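For reference, this is the standard change-of-variables formula behind that remark, written for the natural log. The symbols X, Y, f_X and f_Y are introduced here for illustration only.

```latex
% X > 0 with density f_X, and Y = \log X
f_Y(y) = f_X\!\left(e^{y}\right) e^{y}
\qquad\Longleftrightarrow\qquad
f_X(x) = \frac{f_Y(\log x)}{x}, \quad x > 0
```

The extra factor e^y (or 1/x going the other way) is why a kde estimated on the original scale and then drawn on a log axis is not the same curve as a kde estimated on the log data.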
Can I just give you guys a pickled plot? If not, where can I upload the data from my plot?
Anyway, the data in the first violinplot should be comparable with that in the second one in my example. The ranges should be very similar.
A pickled plot won't do anything for us. We need code that generates the data and constructs the figure demonstrating this problem. Creating a gist at https://gist.github.com/ would be great.
I'm increasingly convinced this is the result of lognormally-distributed data:
edit: updated the notebook with a quick example of taking the log of the data and redefining the y-labels accordingly
Ok guys, it was a lot simpler than I thought. I went through the data that I'm plotting and found out that whenever my values included a zero, the corresponding violin plots had the behaviour in the figure above. In any case the boxplot was correct, just the area around it (the violin plot) got messed up a bit.
I have no problem removing these values from my dataset, so no big deal. I'll make a wild guess here: does matplotlib remove all zeros by default when in log scale?
(The warning was indeed unrelated to this issue. Nothing appears when I plot the same data in interactive mode)
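A small sketch of why a zero is a problem on a log axis, and of dropping such values before plotting. The `values` array is made up, and nothing here says what matplotlib itself does with zeros.

```python
import numpy as np

# Made-up sample containing a zero, like the problematic violins
values = np.array([0.0, 0.5, 1.0, 2.0, 10.0])

# log10(0) is -inf, so a zero has no finite position on a log axis
with np.errstate(divide="ignore"):
    logged = np.log10(values)
print(logged[0])

# Keep only strictly positive values before building the violin
positive = values[values > 0]
```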
@ricky-r can you close this?