[Bug]: OverflowError in time series plot after frequency conversion #25600

Closed
mundus08 opened this issue Apr 2, 2023 · 6 comments

@mundus08

mundus08 commented Apr 2, 2023

Bug summary

After applying df1.resample('10S').asfreq() to a time series pandas DataFrame, I get an OverflowError when trying to format the x-axis.
Plotting works before the transformation.

Code for reproduction

# %%
from io import StringIO

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

import matplotlib
print(matplotlib.__version__)
import platform
print(platform.python_version())

# %%
df1 = pd.read_csv(StringIO(
    'ts,DE0008469008,MBI000000006\r\n2023-03-31 12:12:09,15576.57,15579.0\r\n2023-03-31 12:12:10,15576.66,\r\n2023-03-31 12:12:13,,15580.9\r\n2023-03-31 12:12:15,15576.74,15581.9\r\n2023-03-31 12:12:16,15577.0,15581.8\r\n2023-03-31 12:12:17,15577.3,15581.7\r\n2023-03-31 12:12:19,,15580.8\r\n2023-03-31 12:12:20,,15582.0\r\n2023-03-31 12:12:21,15577.68,15581.8\r\n2023-03-31 12:12:22,,15581.3\r\n2023-03-31 12:12:23,15577.64,\r\n2023-03-31 12:12:24,15577.82,\r\n2023-03-31 12:12:25,15577.92,15582.4\r\n2023-03-31 12:12:26,15578.19,\r\n2023-03-31 12:12:28,15578.69,15581.7\r\n2023-03-31 12:12:29,,15582.0\r\n2023-03-31 12:12:30,15578.85,\r\n2023-03-31 12:12:31,15578.76,\r\n2023-03-31 12:12:33,15578.57,\r\n2023-03-31 12:12:34,15578.65,\r\n2023-03-31 12:12:35,,15584.0\r\n2023-03-31 12:12:37,,15584.2\r\n2023-03-31 12:12:38,,15583.4\r\n2023-03-31 12:12:40,15578.57,\r\n2023-03-31 12:12:41,15578.49,15581.8\r\n2023-03-31 12:12:42,,15582.3\r\n2023-03-31 12:12:44,,15581.5\r\n2023-03-31 12:12:45,15578.54,15580.3\r\n2023-03-31 12:12:46,,15580.6\r\n2023-03-31 12:12:47,,15580.3\r\n'),
                  index_col='ts', parse_dates=['ts'])

# %%
df2 = df1.resample('10S').asfreq().ffill().dropna(axis=0, how='all')


# %%
def plotme(df):
    fig, ax = plt.subplots()
    df.plot(ax=ax, linewidth=0.5)
    hours = mdates.HourLocator(interval=1)
    h_fmt = mdates.DateFormatter('%H')
    ax.xaxis.set_major_locator(hours)
    ax.xaxis.set_major_formatter(h_fmt)


# %%
plotme(df1)

# %%
plotme(df2)

Actual outcome

    358 dt = (np.datetime64(get_epoch()) +
--> 359       np.timedelta64(int(np.round(x * MUSECONDS_PER_DAY)), 'us'))
    360 if dt < np.datetime64('0001-01-01') or dt >= np.datetime64('10000-01-01'):
    361     raise ValueError(f'Date ordinal {x} converts to {dt} (using '
    362                      f'epoch {get_epoch()}), but Matplotlib dates must be '
    363                      'between year 0001 and 9999.')

OverflowError: int too big to convert

Expected outcome

The transformed dataframe can be plotted

Additional information

No response

Operating system

Windows 11

Matplotlib Version

3.7.1

Matplotlib Backend

module://matplotlib_inline.backend_inline

Python version

3.9

Jupyter version

jupyterlab 3.6.3

Installation

conda

@tacaswell
Member

In the second case pandas installs a different unit converter than in the first:

In [34]: f1.gca().xaxis.converter
Out[34]: <pandas.plotting._matplotlib.converter.DatetimeConverter at 0x7f2d7f08faf0>

In [35]: f2.gca().xaxis.converter
Out[35]: <pandas.plotting._matplotlib.converter.PeriodConverter at 0x7f2d7e765c60>

I suspect (but have not tracked down) that the PeriodConverter returns time in a different base than DatetimeConverter and what the Matplotlib formatters/locators expect (my guess is that it is a nanosecond vs microsecond issue).

The fix for this is probably on the pandas side.
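
The unit mismatch suspected above can be illustrated with plain arithmetic. The sketch below assumes, as a hypothesis rather than something confirmed in this thread, that the PeriodConverter hands Matplotlib a count of seconds since 1970-01-01 for a '10S' frequency, while Matplotlib's date code treats the number as days since its default epoch (also 1970-01-01). The helper names are illustrative only; MUSECONDS_PER_DAY mirrors the constant named in the traceback.

```python
from datetime import datetime, timezone

# Same numeric value as the MUSECONDS_PER_DAY constant seen in the traceback.
MUSECONDS_PER_DAY = 24 * 60 * 60 * 1_000_000
# np.timedelta64 stores a signed 64-bit integer, so this is its upper bound.
INT64_MAX = 2**63 - 1


def seconds_since_unix_epoch(ts):
    """Whole seconds since 1970-01-01 (assumed PeriodConverter-style value)."""
    return int(ts.replace(tzinfo=timezone.utc).timestamp())


def days_since_unix_epoch(ts):
    """Fractional days since 1970-01-01 (what Matplotlib's date code expects)."""
    return seconds_since_unix_epoch(ts) / 86400


def as_microseconds(x):
    """Mimic the int(np.round(x * MUSECONDS_PER_DAY)) step from the traceback."""
    return int(round(x * MUSECONDS_PER_DAY))


first_sample = datetime(2023, 3, 31, 12, 12, 9)  # first timestamp in the CSV

# Interpreting a day count as days stays well inside the int64 range ...
fits_when_days = as_microseconds(days_since_unix_epoch(first_sample)) <= INT64_MAX
# ... but interpreting a second count as days does not.
fits_when_seconds = as_microseconds(seconds_since_unix_epoch(first_sample)) <= INT64_MAX

print(fits_when_days, fits_when_seconds)
```

If the assumption holds, the second value is far too large for a 64-bit timedelta, which would be consistent with the OverflowError raised before the year-range check on the following line is ever reached.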

@jklymak
Member

jklymak commented Apr 2, 2023

Yes, sadly you can't mix our Locators with pandas converters. pandas will either have to start converting to our epoch or set the epoch to theirs when they install their converters.

I feel it is quite confusing for downstream libraries to have converters that are not compatible with ours, because it leads to crossed expectations like this. We perhaps should consider some way to mark an axis as incompatible with certain locators...
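
One possible way to avoid the mix described above is to bypass DataFrame.plot and hand the datetime index to Matplotlib directly, so that Matplotlib's own date handling and its locators work in the same units. This is a minimal sketch under that assumption, not a fix proposed in this thread; the small DataFrame is made-up illustration data, and SecondLocator replaces the HourLocator from the report only because the sample spans less than a minute.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical, regularly spaced data similar in shape to the resampled frame.
idx = pd.date_range("2023-03-31 12:12:00", periods=5, freq="10s")
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

fig, ax = plt.subplots()
# ax.plot (not df.plot) keeps the x-axis on datetime values rather than periods.
ax.plot(df.index, df["value"], linewidth=0.5)
ax.xaxis.set_major_locator(mdates.SecondLocator(interval=10))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))

fig.canvas.draw()  # force tick computation; this is where the report failed
labels = [tick.get_text() for tick in ax.get_xticklabels()]
plt.close(fig)
```

The trade-off is that pandas' automatic legend and period-aware tick formatting are lost, so labels and legends have to be set by hand.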

@rcomer
Member

rcomer commented Apr 3, 2023

Would the work for #24951 help us create a more informative error message here?

@ford--prefect

If this is a pandas issue: was this reported to pandas already? If so, it might be helpful to provide a reference. If not, the issue should be raised there to enable solution-finding.
I'm bitten by the same bug, but since I have little knowledge about either internals, I feel I can't contribute much here. Feel free to correct me if I'm wrong.

@jklymak
Member

jklymak commented Jun 16, 2023

Yes, please open an issue at pandas and come back here if Matplotlib can help.

@ford--prefect

Turns out it is already being discussed in pandas here.

@QuLogic QuLogic removed this from the v3.7.2 milestone Jul 5, 2023