Non-aligned x-axes when plotting two series on the same axes #6630

Open
fonnesbeck opened this issue Mar 13, 2014 · 5 comments
@fonnesbeck

I have the following DataFrame that I want to create a boxplot for:

estimates.head()

Out[162]:
            0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  \
DATE
2009-04-28   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
2009-04-29   6   6   6   6   6   6   6   6   6   6   6   6   6   6   6   6
2009-04-30  10  10  10  10  10   9   9   9   9   9   9   9   9   9   9   9
2009-05-13   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9
2009-05-14  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11

            16  17  18  19  20  21  22  23  24  25  26  27  28  29
DATE
2009-04-28   0   0   0   0   0   0   0   0   0   0   0   0   0   0 ...
2009-04-29   6   6   6   6   6   6   6   6   6   6   6   6   6   6 ...
2009-04-30   9   9   9   9   9   9   9   9   9   9   9   9   9   9 ...
2009-05-13   9   9   9   9   9   9   9   9   9   9   9   9   9   9 ...
2009-05-14  11  11  11  11  11  11  11  11  11  11  12  12  12  12 ...

along with a Series that I want to plot on the same axes:

counts.head()

Out[165]:
DATE
2009-04-28    0
2009-04-29    2
2009-04-30    4
2009-05-13    2
2009-05-14    5
Name: OBS1_NMANATEES, dtype: float64

They are both the same length and have the same index (in fact, to ensure the latter, I assigned the index of estimates to counts). However, when I plot them on the same set of axes, the plot for the counts is off by one on the x-axis:

fig, axes = plt.subplots(figsize=(18,6))
estimates.T.boxplot(ax=axes, grid=False, rot=90)
axes.plot(counts)
plt.tight_layout()
plt.xlabel('Survey')
plt.ylabel('Population estimate')

Notice that in the plot the line graph does not begin with zero at 2009-04-28, as is clearly shown in the data, and you can see the entire line is shifted. This looks like an alignment error, no?

@fonnesbeck
Author

Made this work by substituting my third line with:

axes.plot(axes.get_xticks(), counts)

but I should not have to do that, right?

@TomAugspurger
Contributor

So it looks like for axes.boxplot matplotlib has a positions keyword arg:

  *positions* : [ default 1,2,...,n ]
    Sets the horizontal positions of the boxes. The ticks and limits
    are automatically set to match the positions.

They decided to start the default positions at 1 (maybe so the boxes aren't drawn outside the figure? I have no real idea why...).

I think something like

>>> estimates.T.boxplot(ax=axes, grid=False, rot=90,
                        positions=range(len(estimates.T.columns)))
>>> axes.plot(counts)

should work for you.
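
The offset can be reproduced with plain matplotlib, independent of pandas. The following is a minimal sketch on synthetic data (`samples` and `medians` are made-up names, not from the original report): by default the boxes sit at 1..n, while a line plotted from y-values alone starts at x = 0, and passing zero-based `positions` brings the two into line.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=(30, 5))      # 5 columns -> 5 boxes (synthetic data)
medians = np.median(samples, axis=0)    # one value per box

fig, (ax_default, ax_zero) = plt.subplots(1, 2)

# Default behaviour: boxes are drawn at x = 1, 2, ..., n.
ax_default.boxplot(samples)
default_ticks = [float(t) for t in ax_default.get_xticks()]

# A line given only y-values is drawn at x = 0, 1, ..., n-1,
# so it ends up one step to the left of the boxes.
(line_default,) = ax_default.plot(medians)
line_x = [float(v) for v in line_default.get_xdata()]

# Zero-based positions put the boxes where the line already is.
ax_zero.boxplot(samples, positions=list(range(samples.shape[1])))
zero_ticks = [float(t) for t in ax_zero.get_xticks()]
ax_zero.plot(medians)

print("default box positions:", default_ticks)
print("line x-values:        ", line_x)
print("zero-based positions: ", zero_ticks)
```

This only covers the case where the line is plotted from bare values; a Series with a DatetimeIndex brings its own x-values, which is a separate problem.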


Thoughts on whether this is something we should change in pandas? I tend towards honoring matplotlib's defaults / parameter names, even if they seem odd at first.

@fonnesbeck
Author

I've also tried using the plot method on the Series instead of the pyplot.plot function, but even with the ax=axes argument set, it draws a new plot over the existing one, instead of on the same set of axes.

fig, axes = plt.subplots(figsize=(18,6))
estimates.T.boxplot(ax=axes, grid=False, rot=90)
counts.plot(ax=axes, style='ro')
plt.tight_layout()
plt.xlabel('Survey')
plt.ylabel('Population estimate')

@TomAugspurger
Contributor

I think I had the same problem when I was playing with it a few days ago. This one is a bug I think.

What's happening is that counts has a DatetimeIndex, and we do some special stuff with those to make the formatting look nice before handing things off to matplotlib. However, estimates.T.boxplot doesn't use a DatetimeIndex, since it's just plotting a box for each column (the column labels happen to be dates, but boxplot doesn't care).

You'll want to do counts.plot(ax=axes, style='ro', use_index=False) to just use the existing ticks. Since you've got everything aligned ahead of time, you should be good to go. Sorry if this is nonintuitive. I'm going to reword the issue to cover this bug if that's ok.
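
One caveat worth checking: with `use_index=False` the Series is plotted against 0..n-1, so it would still sit one step to the left of boxes drawn at the default 1..n positions. A minimal sketch that combines it with the zero-based `positions` suggested earlier in the thread, using synthetic stand-ins for `estimates` and `counts` (the random data and the helper names `box_ticks` and `count_x` are assumptions for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-ins for the data in the report (values are made up).
rng = np.random.default_rng(0)
dates = pd.date_range("2009-04-28", periods=5)
estimates = pd.DataFrame(rng.integers(0, 12, size=(5, 30)), index=dates)
counts = pd.Series(rng.integers(0, 6, size=5).astype(float), index=dates)

fig, axes = plt.subplots(figsize=(18, 6))

# One box per date, drawn at x = 0, 1, ..., n-1 instead of the default 1..n.
n_boxes = len(estimates.index)
estimates.T.boxplot(ax=axes, grid=False, rot=90,
                    positions=list(range(n_boxes)))
box_ticks = [float(t) for t in axes.get_xticks()]

# use_index=False ignores the DatetimeIndex and plots against 0, 1, ..., n-1.
counts.plot(ax=axes, style="ro", use_index=False)
count_x = [float(v) for v in axes.get_lines()[-1].get_xdata()]

print("box positions:  ", box_ticks)
print("count x-values: ", count_x)
```

Both lists should come out as 0 through 4, meaning the red markers land on the boxes for the same dates.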

@jreback jreback added this to the 0.16.1 milestone Mar 8, 2015
@jreback jreback modified the milestones: Next Major Release, 0.16.1 Apr 29, 2015
@seeM

seeM commented Oct 27, 2019

@TomAugspurger this is marked as a good first issue, but I'm not sure if that's still the case. If so, I'd like to look into it. With the following minimal example (pandas v0.25.2 and matplotlib v3.1.1):

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)
dates = pd.date_range('2019-01-01', '2019-01-07')
y = pd.DataFrame(np.random.randn(len(dates), 10), index=dates)
x = y.T.median()

the most intuitive approach raises ValueError: zero-size array to reduction operation fmin which has no identity.

fig, ax = plt.subplots()
y.T.boxplot(ax=ax, rot=90)
x.plot(ax=ax)

Reversing the order of the x and y plots above, as well as the following two variations, doesn't raise an error but does not produce the expected plot:

y.T.boxplot(ax=ax, rot=90)
ax.plot(x)

y.T.boxplot(ax=ax, rot=90, positions=range(len(y.T.columns)))
x.plot(ax=ax)

The only working approach is as suggested by @fonnesbeck:

y.T.boxplot(ax=ax, rot=90)
ax.plot(ax.get_xticks(), x)
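
For reference, the working approach can be put together with the minimal example above into one self-contained script. This is a sketch, not a fix for the underlying bug; the headless `Agg` backend, the `tick_positions` and `median_line` names, and the `x.to_numpy()` call are additions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(0)
dates = pd.date_range("2019-01-01", "2019-01-07")
y = pd.DataFrame(np.random.randn(len(dates), 10), index=dates)
x = y.T.median()  # one median per date

fig, ax = plt.subplots()
y.T.boxplot(ax=ax, rot=90)

# Reuse the tick positions that boxplot set, so the line is drawn
# at exactly the x-values where the boxes are.
tick_positions = ax.get_xticks()
(median_line,) = ax.plot(tick_positions, x.to_numpy())

print("box positions:", list(tick_positions))
print("line x-values:", list(median_line.get_xdata()))
```

Because the line's x-values are taken from the axes rather than from the Series index, the DatetimeIndex never reaches matplotlib, and the date labels come from the boxplot alone.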

@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022