Calling Pandas plot method modifies Matplotlib Time Representation #18716

Closed
jlandercy opened this issue Dec 10, 2017 · 3 comments
Labels
Duplicate Report (duplicate issue or pull request), Visualization (plotting)

Comments

jlandercy commented Dec 10, 2017

Code Sample, a copy-pastable example if possible

# Imports:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Trial Dataframe:
t = pd.date_range('1940-01-01', '1940-08-01', freq='1H', closed='left')
df = pd.DataFrame(np.random.randn(t.size, 10), index=t)

# First plot: Matplotlib only
fig, axe0 = plt.subplots()
axe0.plot(df)
plt.show(axe0)
axe0.get_xticks()
#array([ -9.50000000e+17,  -9.47500000e+17,  -9.45000000e+17,
#       -9.42500000e+17,  -9.40000000e+17,  -9.37500000e+17,
#        -9.35000000e+17,  -9.32500000e+17,  -9.30000000e+17,
#        -9.27500000e+17,  -9.25000000e+17])

# Second Plot: Pandas DataFrame Plot
fig, axe1 = plt.subplots()
df.plot(ax=axe1)
plt.show(axe1)
axe1.get_xticks()
#array([-263247, -262992, -262248, -261552, -260808, -260088, -259344,
#       -258624, -257880, -257625])

# Third Plot: Matplotlib only again, but behaves differently
fig, axe2 = plt.subplots()
axe2.plot(df)
plt.show(axe2)
axe2.get_xticks()
#array([ 708205.,  708236.,  708265.,  708296.,  708326.,  708357.,
#        708387.,  708418.])

Problem description

After upgrading Matplotlib to 2.1.0, several of my plots had issues with the time axis and its formatter.
I already know that Matplotlib does not work well with np.datetime64 but works as expected with datetime.datetime, and that I can use to_pydatetime() to convert the index before plotting.

But here I am facing something different: if I plot with Matplotlib alone, I get one result.
If I then plot something else using pandas.DataFrame.plot, the next Matplotlib plot behaves differently.

In the example above, we see:

  • First plot has a float axis without Date Formatter, xticks are negative float;
  • Second plot is time formatted by Pandas, xticks are negative integer;
  • Third plot, same code as first, axis is time formatted by Matplotlib and xticks are positive rounded float.

My conclusion is that calling DataFrame.plot() modifies Matplotlib's behaviour.

Expected Output

I expected both Matplotlib-only plots to render identically, but they do not.
It seems that Matplotlib's time conventions and representation change after calling the pandas plot method.
I do not have enough insight into pandas to understand what is going on behind the scenes. Is this normal?

Why does this happen? Is it a pandas-related problem, or should I look at the Matplotlib side?

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-75-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 38.2.3
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: 1.1.15
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

jorisvandenbossche (Member) commented

@jlandercy see #18301 (and linked issues) and http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#restore-matplotlib-datetime-converter-registration

This will be fixed in the upcoming 0.21.1 release.
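
For readers who want to see the converter registration referred to above, here is a minimal sketch. It assumes a pandas version that exposes `pandas.plotting.register_matplotlib_converters` and `pandas.plotting.deregister_matplotlib_converters`, and it inspects `matplotlib.units.registry` directly; the helper `timestamp_converter_module` is made up for illustration and is not part of either library.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed

import matplotlib.units as munits
import pandas as pd
from pandas.plotting import (
    deregister_matplotlib_converters,
    register_matplotlib_converters,
)


def timestamp_converter_module():
    """Return the module of the converter registered for pd.Timestamp, or None."""
    conv = munits.registry.get(pd.Timestamp)
    return None if conv is None else type(conv).__module__


# Explicitly install the pandas date converters into Matplotlib's registry.
register_matplotlib_converters()
after_register = timestamp_converter_module()

# Remove them again; converters that were there before are restored.
deregister_matplotlib_converters()
after_deregister = timestamp_converter_module()

print(after_register, after_deregister)
```

Registering explicitly makes the global side effect visible in the script itself, instead of it happening implicitly the first time `DataFrame.plot` is called.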

> First plot has a float axis without Date Formatter, xticks are negative float;

This is as expected. Matplotlib cannot yet handle datetime64 data, so those timestamps are simply converted to their underlying integer representation. The upcoming feature release of Matplotlib should fix this, though.
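
As a quick check of that integer representation: datetime64[ns] values count nanoseconds since 1970-01-01, so the date range in the report should land between roughly -9.47e17 and -9.28e17, which is inside the tick range printed for the first plot. A minimal stdlib-only sketch (the helper name `ns_since_epoch` is made up for illustration):

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def ns_since_epoch(dt):
    """Nanoseconds between the Unix epoch and a naive datetime."""
    delta = dt - EPOCH
    # Integer arithmetic avoids float rounding at this magnitude.
    return (delta.days * 86400 + delta.seconds) * 10**9


start = ns_since_epoch(datetime(1940, 1, 1))
end = ns_since_epoch(datetime(1940, 8, 1))

print(start)  # -946771200000000000
print(end)    # -928368000000000000
```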

> Second plot is time formatted by Pandas, xticks are negative integer;

That is just the internals of the plotting machinery, and it is correct: Matplotlib plotting only deals with floats in the end, so everything has to be converted to floats.
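
The integers reported for the second plot are consistent with hours counted from 1970-01-01, which is what one would expect if pandas maps the hourly index to period ordinals. That reading is inferred from the numbers, not stated in this thread. A small stdlib-only sketch, with hypothetical helper names:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def hours_since_epoch(dt):
    """Whole hours between the Unix epoch and a naive datetime."""
    delta = dt - EPOCH
    return delta.days * 24 + delta.seconds // 3600


def from_hours(n):
    """Inverse mapping: hour count back to a datetime."""
    return EPOCH + timedelta(hours=n)


first = hours_since_epoch(datetime(1940, 1, 1))
print(first)                # -262992, one of the reported ticks
print(from_hours(-262248))  # 1940-02-01 00:00:00
print(from_hours(-257880))  # 1940-08-01 00:00:00
```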

> Third plot, same code as first, axis is time formatted by Matplotlib and xticks are positive rounded float.

Yes, pandas plotting currently modifies Matplotlib's behaviour. This will still be the case in 0.21.1, but we are exploring how to fix this in pandas 0.22.0.
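
The positive floats from the third plot match proleptic Gregorian day ordinals (0001-01-01 is day 1), which is the date numbering Matplotlib used by default in releases of that era, before the default epoch changed in 3.3. Decoding the reported ticks with the standard library shows that they fall on month starts:

```python
from datetime import date

# Tick values copied from the third plot in the report.
ticks = [708205, 708236, 708265, 708296, 708326, 708357, 708387, 708418]

# date.fromordinal uses the same day numbering (0001-01-01 -> 1).
decoded = [date.fromordinal(t) for t in ticks]

for tick, day in zip(ticks, decoded):
    print(tick, day.isoformat())
# first line: 708205 1940-01-01
# last line:  708418 1940-08-01
```

So the same data ends up on three different numeric scales (nanoseconds, hours, days), depending on which converter is active when the plot is drawn.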

jorisvandenbossche added the Duplicate Report and Visualization labels Dec 11, 2017
jorisvandenbossche added this to the No action milestone Dec 11, 2017

jlandercy (Author) commented

@jorisvandenbossche Thank you for your answer. I had missed that.

jorisvandenbossche (Member) commented

No problem!
