Plotting problem in pandas 0.15.1 for DataFrame with datetime-index #9012

Closed
davaco opened this issue Dec 5, 2014 · 9 comments
Labels
Bug; Regression (functionality that used to work in a prior pandas version); Visualization (plotting)

@davaco

davaco commented Dec 5, 2014

Hi, I have the following snippet:

from datetime import datetime
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

plot_df = pd.DataFrame(
    np.random.rand(21, 2),
    index=pd.bdate_range(datetime(2000, 1, 1), datetime(2000, 1, 31)),
    columns=['a', 'b'])

fig = plt.figure()
ax = fig.add_subplot(111)
ax = plot_df.plot(ax=ax)
plt.axhline(y=0)

This raises an exception in pandas 0.15.1, but not in pandas 0.14.1.
Trace:

Traceback (most recent call last):
  File "test_p15.py", line 14, in <module>
    plt.axhline(y=0)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2501, in axhline
    ret = ax.axhline(y=y, xmin=xmin, xmax=xmax, **kwargs)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 729, in axhline
    self.add_line(l)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 1486, in add_line
    self._update_line_limits(line)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 1497, in _update_line_limits
    path = line.get_path()
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/lines.py", line 871, in get_path
    self.recache()
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/lines.py", line 568, in recache
    xconv = self.convert_xunits(self._xorig)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/artist.py", line 163, in convert_xunits
    return ax.xaxis.convert_units(x)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/matplotlib/axis.py", line 1448, in convert_units
    ret = self.converter.convert(x, self.units, self)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/pandas/tseries/converter.py", line 121, in convert
    return PeriodIndex(values, freq=axis.freq).values
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/pandas/tseries/period.py", line 641, in __new__
    ordinal, freq = cls._from_arraylike(data, freq, tz)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/pandas/tseries/period.py", line 691, in _from_arraylike
    data = _get_ordinals(data, freq)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/pandas/tseries/period.py", line 507, in _get_ordinals
    return lib.map_infer(data, f)
  File "pandas/src/inference.pyx", line 1020, in pandas.lib.map_infer (pandas/lib.c:56502)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/pandas/tseries/period.py", line 503, in <lambda>
    f = lambda x: Period(x, freq=freq).ordinal
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/pandas/tseries/period.py", line 126, in __init__
    dt, _, reso = parse_time_string(value, freq)
  File "/Users/dcoevord/anaconda/envs/p15/lib/python2.7/site-packages/pandas/tseries/tools.py", line 460, in parse_time_string
    raise DateParseError(e)
pandas.tseries.tools.DateParseError: day is out of range for month

Version info:

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Darwin
OS-release: 14.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.15.1
nose: None
Cython: None
numpy: 1.9.1
scipy: 0.14.0
statsmodels: None
IPython: 2.3.1
sphinx: None
patsy: None
dateutil: 1.5
pytz: 2014.9
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.4.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: None
pymysql: None
psycopg2: None

jorisvandenbossche added this to the 0.15.2 milestone Dec 5, 2014
jorisvandenbossche added the Visualization (plotting) and Regression (functionality that used to work in a prior pandas version) labels Dec 5, 2014

@jorisvandenbossche
Member

I can confirm it is a regression compared to 0.14

@jreback
Contributor

jreback commented Dec 8, 2014

@jorisvandenbossche is this fixable in the next day or two?

@jorisvandenbossche
Member

I have not tracked the bug down yet (only confirmed it), so I don't really know.
Could someone look at this? @TomAugspurger @sinhrks @stevesimmons
I will also try to look at it tomorrow.

jreback modified the milestones: 0.16.0, 0.15.2 Dec 10, 2014
jorisvandenbossche modified the milestones: 0.15.2, 0.16.0 Dec 10, 2014

@jorisvandenbossche
Member

@jreback I just looked at this; it is due to a change you made in the Index-no-longer-subclass-array refactor here: 8d3cb3f#diff-c9c253d067464a66d3288308b3a00300R120

If I change PeriodIndex(values, freq=axis.freq).values back to the original [get_datevalue(x, axis.freq) for x in values], it works again.

It boils down to this: pd.PeriodIndex([0, 1], freq='D') raises an error on the 0 (see the error above, 'day is out of range for month'). The PeriodConverter receives these values from matplotlib's axhline. Previously, get_datevalue ensured that plain integers were passed through as-is instead of being converted to a PeriodIndex, so the error was not raised.

I will just do a PR with this, but do you remember there was a reason for this change?
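
The pass-through behaviour described above can be sketched roughly as follows. This is only an illustration, not the pandas implementation: `datevalue_sketch` and `EPOCH` are hypothetical names, the sketch is fixed to daily frequency, and it uses the day count since 1970-01-01 as a stand-in for a daily period ordinal. The point is that plain numbers, such as the 0 and 1 that axhline hands to the converter, are returned untouched, while real dates are converted.

```python
from datetime import date, datetime

# Assumption: daily period ordinals are modelled as days since 1970-01-01.
EPOCH = date(1970, 1, 1)


def datevalue_sketch(value):
    """Hypothetical per-value converter (daily frequency only)."""
    # Plain numbers are already axis coordinates: pass them through as-is.
    if isinstance(value, (int, float)):
        return value
    # datetime is a subclass of date; drop the time part before subtracting.
    if isinstance(value, datetime):
        value = value.date()
    if isinstance(value, date):
        return (value - EPOCH).days
    raise ValueError("Unrecognizable date %r" % (value,))


# The x values from the example above are not dates at all.
axhline_x = [0, 1]
converted = [datevalue_sketch(x) for x in axhline_x]

# A real date is still converted to a day count.
jan_3_2000 = datevalue_sketch(datetime(2000, 1, 3))
```

With a per-value check like this, `[0, 1]` is never parsed as a date, which is why the list-comprehension form does not hit the 'day is out of range for month' error.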

@jreback
Contributor

jreback commented Dec 10, 2014

The perf difference was huge.
You are calling all kinds of inference routines on every call, and each time they do the same work.

@jorisvandenbossche
Member

Is there a specific vbench for this?

@jorisvandenbossche
Member

BTW, see line https://github.com/pydata/pandas/blob/master/pandas/tseries/converter.py#L118: there is also an isinstance(values, Index) check with a values.map(..) path. Shouldn't that be removed now that you have the PeriodIndex(values, ...) two lines below?

@jreback
Contributor

jreback commented Dec 10, 2014

I think I added a benchmark but don't really remember.

@jreback
Contributor

jreback commented Dec 11, 2014

closed by #9050
