BUG in plotting timeseries data with twinx (different data representation on each ax) #14322

Open
cygenb0ck opened this Issue Sep 29, 2016 · 8 comments

cygenb0ck commented Sep 29, 2016

I simplified the example.
While reporting this issue I updated pandas from 0.13 to 0.18.1. With 0.13 I could add the whole DataFrame to my plot; the error appeared only after I selected rows between two dates. After the update the behaviour became worse: adding the whole DataFrame to the plot now produces the error as well.

A small, complete example of the issue

import pandas
import dateutil.parser
import matplotlib.pyplot as plt

p_vals = {
    'x_vals' : [
        "2006-12-17 00:00:00+01:00",
        "2006-12-18 00:00:00+01:00",
        "2006-12-19 00:00:00+01:00",
        "2006-12-20 00:00:00+01:00",
        "2006-12-21 00:00:00+01:00",
        "2006-12-22 00:00:00+01:00",
        "2006-12-23 00:00:00+01:00",
        "2006-12-24 00:00:00+01:00",
        "2006-12-25 00:00:00+01:00",
        "2006-12-26 00:00:00+01:00",
    ],
    'y_vals' : [
        10,9,8,7,6,5,4,3,2,1
    ]
}

p_vals2 = {
    'x_vals' : [
        "2006-12-17 00:00:00+01:00",
        "2006-12-18 00:00:00+01:00",
        "2006-12-19 00:00:00+01:00",
        "2006-12-20 00:00:00+01:00",
        "2006-12-21 00:00:00+01:00",
    ],
    'y_vals' : [
        1,2,3,4,5
    ]
}

p_vals['x_vals'] = [ dateutil.parser.parse(x) for x in p_vals['x_vals'] ]
p_vals2['x_vals'] = [ dateutil.parser.parse(x) for x in p_vals2['x_vals'] ]

df = pandas.DataFrame(data = [1,2,3,4,5], index=["2006-12-17","2006-12-18","2006-12-19","2006-12-20","2006-12-21"])
df.index = pandas.to_datetime(df.index, format="%Y-%m-%d")

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

ax1.plot(p_vals['x_vals'], p_vals['y_vals'], color="r")
#ax2.plot(p_vals2['x_vals'], p_vals2['y_vals'], color="b") # works as intended, see second attached image
df.plot(ax=ax2, color="b") # hides data on ax1, see first image

plt.show()

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-69-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.1
pip: 8.1.2
setuptools: 3.3
Cython: None
numpy: 1.11.1
scipy: 0.13.3
statsmodels: None
xarray: None
IPython: 1.2.1
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: 3.1.1
numexpr: 2.2.2
matplotlib: 1.5.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.2.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

Attached images: pandas_how_it_looks (first image, current output), pandas_how_it_should_look (second image, expected output)

TomAugspurger Sep 29, 2016

Contributor

Pandas 0.13 is quite old, can you try with a more recent version? Also see if you can simplify your example a bit.

> also the x label look strange.

What do you mean by strange?

cygenb0ck Sep 29, 2016

I just updated to pandas 0.18.1; sorry for not trying a current pandas version first.
Now both of my plot calls hide the data on the first axis.

Sorry for my bad wording: by "strange" I just meant that it looks different.

jorisvandenbossche Sep 30, 2016

Member

@cygenb0ck Can you try to simplify the example? E.g. make it reproducible by creating the data in code instead of reading a CSV file. Also remove, as far as possible, anything that is not essential to the problem.
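
As a rough illustration of "create the data with code", a small time series can be built directly instead of being read from a file. This is only a sketch: the column name `value` and the date range are made up for the example and are not taken from the reporter's CSV.

```python
import pandas as pd

# Hypothetical stand-in for data that would otherwise come from a CSV file:
# five consecutive days with one numeric column.
index = pd.date_range("2006-12-17", periods=5, freq="D")
df = pd.DataFrame({"value": [1, 2, 3, 4, 5]}, index=index)

print(df)
```

Anyone can paste a snippet like this and run it without needing the original file, which is what makes the report reproducible.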

@cygenb0ck cygenb0ck changed the title from After selecting rows between dates and plotting with matplotlib, plotted rows hide first axis to Plotting DataFrame on second axis hides data on first axis - was: Plotting a DataFrame on second axis hides data on first axis Oct 1, 2016

cygenb0ck Oct 1, 2016

@jorisvandenbossche
I simplified the example and changed the title.

@cygenb0ck cygenb0ck changed the title from Plotting DataFrame on second axis hides data on first axis - was: Plotting a DataFrame on second axis hides data on first axis to Plotting DataFrame on second axis hides data on first axis - was: After selecting rows between dates and plotting with matplotlib, plotted rows hide first axis Oct 1, 2016

jorisvandenbossche Oct 1, 2016

Member

@cygenb0ck Thanks a lot! That let me look into it, and it turns out to be a bit of a gotcha with the dates.

To start, twinx on its own is not the problem. E.g. if you try the following similar example (but without datetimes), you will see it works as expected:

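import matplotlib.pyplot as plt
import pandas as pd

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

ax1.plot([1,3,2], color="r")  # plain integer x positions, no dates involved

df3 = pd.DataFrame({'col': [2,5,1]})
df3.plot(ax=ax2, color="b")
# df3['col'].plot(ax=ax2, color="b") # to plot one column not full dataframe

plt.show()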

The reason it does not work with your example data is not that the plot is overwritten, but that the data on the first ax now fall outside the visible range (if you zoom out far enough, you will see both lines). This is because the dates are handled differently in the two cases.
The underlying cause is a problem in pandas' plotting machinery when combining irregular and regular time series in one plot. Because your data on ax1 include a time-of-day component (although they have daily frequency), they are treated as irregular, whereas the data on ax2 are regular. Related issues are #6608, #9053, #13341. We should definitely solve this ...
However, in this case it also seems specific to twinx, since not using twinx also avoids the issue (the second data set is then plotted fine).
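
To make the regular/irregular distinction a bit more concrete, here is a minimal sketch using only pandas. It assumes that "regular" roughly corresponds to an index for which pandas can infer a single frequency, and that the time series plotting path positions such data by period ordinal. Both are a reading of the behaviour described above rather than something confirmed in this thread, and the variable names are illustrative.

```python
import pandas as pd

# Evenly spaced daily timestamps: pandas can infer one frequency.
regular = pd.to_datetime(
    ["2006-12-17", "2006-12-18", "2006-12-19", "2006-12-20", "2006-12-21"]
)
# Uneven gaps (1 day, then 2 days): no single frequency fits.
irregular = pd.to_datetime(["2006-12-17", "2006-12-18", "2006-12-20"])

regular_freq = regular.inferred_freq
irregular_freq = irregular.inferred_freq

# A period ordinal is an integer count of periods since 1970,
# so the same date maps to different numbers at different frequencies.
daily_ordinal = pd.Period("2006-12-17", freq="D").ordinal
monthly_ordinal = pd.Period("2006-12", freq="M").ordinal

print(regular_freq, irregular_freq, daily_ordinal, monthly_ordinal)
```

The daily ordinal is 13499 (days since 1970-01-01) while the monthly one is 443, so an x coordinate in one representation can land far outside an axis that was laid out in another. That would be consistent with the line "disappearing" rather than being overwritten.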

A workaround for now is to plot on ax2 with the matplotlib plot call as well:

fig, ax1 = plt.subplots()
ax1.plot(p_vals['x_vals'], p_vals['y_vals'], color="r")
ax2 = ax1.twinx()
ax2.plot(df.index, df[0].values, color="b")

@jorisvandenbossche jorisvandenbossche changed the title from Plotting DataFrame on second axis hides data on first axis - was: After selecting rows between dates and plotting with matplotlib, plotted rows hide first axis to BUG in plotting timeseries data with twinx (different data representation on each ax) Oct 1, 2016

jorisvandenbossche Oct 1, 2016

Member

Apparently, using x_compat=True is also a way to get this working:

fig, ax1 = plt.subplots()
ax1.plot(p_vals['x_vals'], p_vals['y_vals'], color="r")
ax2 = ax1.twinx()
df.plot(ax=ax2, x_compat=True, color="b")

It's mentioned in the docs: http://pandas.pydata.org/pandas-docs/stable/visualization.html#suppressing-tick-resolution-adjustment (although for a different purpose; I am not that familiar with this keyword).
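
For readers who want to try this without the full example from the top of the issue, below is a self-contained sketch of the same x_compat=True workaround. The data are rebuilt with illustrative names (`x1`, `y1`, the `value` column), and the `Agg` backend is chosen only so the script runs without a display; neither is part of the original report. The last lines read back the numeric x coordinates matplotlib stores for each line, as a way to check that both axes use the same date representation.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Series 1: timezone-aware datetimes handed straight to matplotlib.
tz = dt.timezone(dt.timedelta(hours=1))
x1 = [dt.datetime(2006, 12, 17, tzinfo=tz) + dt.timedelta(days=i) for i in range(10)]
y1 = list(range(10, 0, -1))

# Series 2: a DataFrame with a regular daily DatetimeIndex.
df = pd.DataFrame(
    {"value": [1, 2, 3, 4, 5]},
    index=pd.date_range("2006-12-17", periods=5, freq="D"),
)

fig, ax1 = plt.subplots()
ax1.plot(x1, y1, color="r")
ax2 = ax1.twinx()
df.plot(ax=ax2, x_compat=True, color="b")  # skip pandas' time series x handling

# Numeric x positions of the first point on each axis, in matplotlib units.
x_first_ax1 = ax1.get_lines()[0].get_xdata(orig=False)[0]
x_first_ax2 = ax2.get_lines()[0].get_xdata(orig=False)[0]
print(x_first_ax1, x_first_ax2)
```

If the two printed numbers differ by less than a day, both lines sit in the same coordinate system and should be visible together. To view the figure interactively, drop the `matplotlib.use("Agg")` line and call `plt.show()`.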

cygenb0ck Oct 3, 2016

@jorisvandenbossche
Thank you very much for the workaround with x_compat=True. I can finally plot my data and continue my project.

jorisvandenbossche Nov 26, 2016

Member

This was only partly fixed by #14330: this example still does not work when the irregular series is plotted first (#14330 added a test for that case but left it commented out).
