Problems when setting positions in boxplot() (mainly on log-scale axis) #3566

Closed
leoluecken opened this issue Nov 20, 2023 · 9 comments
Comments

@leoluecken

leoluecken commented Nov 20, 2023

Hey everybody,
Thanks for adding native_scale to boxplot in 0.13!! I've been waiting for this! :)
Now I tried to dodge the box positions manually and encountered the following (I'm on seaborn 0.13.0 and matplotlib 3.7.2):

My code:

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

data = {
    "x":[0.1,0.1,0.1,0.1,1,1,1,1],
    "y":[1,2,3,4,1,2,3,4]
}

xvals = sorted(set(data["x"]))

# logscale dodging
boxpositions = [np.exp(np.log(x)+0.5) for x in xvals]

fig, ax = plt.subplots()
ax.set_xscale("log")
sns.boxplot(data, x="x", y="y", width=0.3, native_scale=True, positions=boxpositions, ax=ax)
plt.show()

produces the following image:

image

This doesn't get better when trying to use log_scale instead of native_scale:

sns.boxplot(data, x="x", y="y", width=0.3, log_scale=True, positions=boxpositions, ax=ax)

image

It works more or less fine on linear scale:

import seaborn as sns
import matplotlib.pyplot as plt

data = {
    "x":[0.1,0.1,0.1,0.1,1,1,1,1],
    "y":[1,2,3,4,1,2,3,4]
}

xvals = sorted(set(data["x"]))

# linscale dodging
boxpositions = [x+0.5 for x in xvals]

fig, ax = plt.subplots()
sns.boxplot(data, x="x", y="y", width=0.3, positions=boxpositions, ax=ax)
plt.show()

Only the xlim is not updated:

image

Cheers,
Leo

@mwaskom
Owner

mwaskom commented Nov 20, 2023

The problem is that positions is a keyword argument of the underlying matplotlib function. Matplotlib itself has no idea how to draw a boxplot on a log axis, so seaborn needs to do all of the positional adjustments itself to make things like dodging on a log scale work (it's quite complicated and annoying).
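
A minimal sketch of why widths need special handling on a log axis, assuming matplotlib places box edges at position ± width/2 in data units (my understanding, not checked against its source here). The helper names `linear_box_edges` and `log_box_edges` are hypothetical, and this is not seaborn's actual implementation; it only illustrates computing the edges in log10 space so that a box is symmetric around its center on a log axis.

```python
import math


def linear_box_edges(center, width):
    """Edges when the width is applied directly in data units."""
    return center - width / 2, center + width / 2


def log_box_edges(center, width):
    """Edges when the width is applied in log10 units (decades)."""
    log_center = math.log10(center)
    return 10 ** (log_center - width / 2), 10 ** (log_center + width / 2)


# Compare both approaches for the two x values used in the report.
for center in (0.1, 1.0):
    print(center, linear_box_edges(center, 0.3), log_box_edges(center, 0.3))
```

With a center of 0.1 and a width of 0.3, the linear left edge is negative, which cannot be drawn on a log axis. The log-space edges stay positive and span the same visual distance at every position.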

I don't think there's any simple way to make this work. It's possible that seaborn should just reject the positions keyword since it conflicts with the operations it is doing.

@leoluecken
Author

Hmm. At least the "center lines" were positioned correctly, it seems.

And seaborn makes nice boxwidths in logscale when I use native_scale without dodging:

fig, ax = plt.subplots()
ax.set_xscale("log")
sns.boxplot(data, x="x", y="y", width=0.3, native_scale=True, ax=ax)

image

That seems to happen in seaborn, doesn't it? Because matplotlib scales boxwidth logarithmically as it seems. At least,

fig, ax = plt.subplots()
ax.set_xscale("log")
ax.boxplot([[1,2,3,4], [1,2,3,4]], positions=[0.1, 1.0])

gives

image

That means, the math must be somewhere in there already :)

@mwaskom
Owner

mwaskom commented Nov 20, 2023

> That seems to happen in seaborn, doesn't it?

> That means, the math must be somewhere in there already :)

Not really sure I understand your question here. Yes, seaborn is handling setting the widths properly (for both the boxes and caps) on a log scale because matplotlib does not know how to. To accomplish that, it needs to modify the artists after matplotlib creates them. It does not need to modify the vertical line because that is width-independent. Feel free to look at the code if you would like to know more.

My point is that seaborn already has a way to manage the positions of the boxes on the categorical axis (i.e., the value of the x/y variable), and juggling the separate positions argument that matplotlib offers would add a lot of complexity without clear benefit. I don't really understand why you need to use positions to accomplish what you want to do now that native_scale exists; can't you just modify the x data?

@leoluecken
Author

leoluecken commented Nov 23, 2023

The problem in my case is that I want to do a boxplot for two categories with bars dodged side-by-side, but the values don't live on the same y-scale. So I am using two different axes (via ax2=ax.twinx()) to plot them. Otherwise I could use the hue parameter, of course.

But I understand that it is perhaps not seaborn's job to support this kind of hack.

I will see if I can find the code that dodges the bars on the native scale when there are several categories and the hue keyword is used. That would do just what I need here, I think.

@mwaskom
Owner

mwaskom commented Nov 23, 2023

I’m still not following why you can’t apply your positional adjustment to the x data.

Alternatively, you could use hue and hue order to dodge across the twinned axes.

@mwaskom
Owner

mwaskom commented Nov 23, 2023

e.g., something like

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")
x = 10 ** tips["size"]
f, ax1 = plt.subplots()
ax1.set_xscale("log")
ax2 = ax1.twinx()
sns.boxplot(tips, x=np.exp(np.log(x) - .45), y="total_bill", native_scale=True, width=.4, ax=ax1)
sns.boxplot(tips, x=np.exp(np.log(x) + .45), y="tip", native_scale=True, width=.4, color="C1", ax=ax2)

image

is what you're looking for, no?
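
The adjustment above works because adding a constant in log space multiplies x by a constant factor, so every box moves the same visual distance on a log axis. A small sketch of that identity follows; `shift_in_log_space` is a hypothetical helper name, and the conversion to decades assumes the axis uses base 10.

```python
import math


def shift_in_log_space(x, delta):
    """Shift x by `delta` natural-log units; equivalent to x * exp(delta)."""
    return math.exp(math.log(x) + delta)


delta = 0.45
for x in (10, 100, 1000):
    shifted = shift_in_log_space(x, delta)
    # The ratio shifted / x is the same for every x: exp(delta).
    print(x, shifted, shifted / x)

# Size of the same shift measured in decades on a base-10 log axis.
decades = delta / math.log(10)
print(decades)
```

Because the offsets in the example are given in natural-log units, a shift of 0.45 corresponds to roughly 0.2 decades on a base-10 axis.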

@mwaskom
Owner

mwaskom commented Nov 23, 2023

Alternatively,

tips = sns.load_dataset("tips")
x = 10 ** tips["size"]
f, ax1 = plt.subplots()
ax1.set_xscale("log")
ax2 = ax1.twinx()
sns.boxplot(tips, x=x, y="total_bill", hue=0, hue_order=[0, 1], dodge=True, legend=False, native_scale=True, ax=ax1)
sns.boxplot(tips, x=x, y="tip", hue=1, hue_order=[0, 1], dodge=True, legend=False, native_scale=True, ax=ax2)

image

@leoluecken
Author

Yes, you're right, of course! I didn't get that the first time - sorry! Thanks a lot for looking into this in such detail!

@mwaskom
Owner

mwaskom commented Nov 24, 2023

Great. If you're interested, I'd probably accept a PR to actively ignore the positions kwarg (either with a warning or an exception; we'd need to decide), but otherwise I'm going to close this. Thanks for stress-testing native_scale, and glad that it's solving problems for you.
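
One way such a PR could handle the keyword, sketched under the assumption that the extra keyword arguments are available as a plain dict before being forwarded to matplotlib. `drop_positions_kwarg` and its `strict` flag are hypothetical names, not part of seaborn; the flag covers both options mentioned above (a warning or an exception).

```python
import warnings


def drop_positions_kwarg(kwargs, strict=False):
    """Hypothetical helper: remove `positions` from passthrough kwargs.

    Returns a copy of `kwargs` without `positions`. Warns by default,
    or raises TypeError when `strict` is True.
    """
    cleaned = dict(kwargs)  # Copy so the caller's dict is left untouched.
    if "positions" in cleaned:
        msg = (
            "The `positions` keyword conflicts with the positional "
            "adjustments made here and is not supported; adjust the "
            "x/y data instead."
        )
        if strict:
            raise TypeError(msg)
        warnings.warn(msg, UserWarning, stacklevel=2)
        del cleaned["positions"]
    return cleaned


# Example: the keyword is dropped and a UserWarning is emitted.
print(drop_positions_kwarg({"positions": [0.1, 1.0], "widths": 0.3}))
```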
