[Bug]: Plotting of Pandas DataFrame data with nanosecond timestamps #23493

Open
wolfig opened this issue Jul 26, 2022 · 3 comments


wolfig commented Jul 26, 2022

Bug summary

When plotting time-series data from a high-frequency data source, the plot looks distorted when the values are plotted against a nanosecond timestamp.

I am working in PyCharm 2021.2.2 Community Edition, without a Jupyter notebook.

Code for reproduction

import pandas as pd
from pandas import DataFrame
from matplotlib import pyplot as plt


if __name__ == '__main__':
    data = DataFrame(pd.read_hdf('YRT1DT2F_rawdata_series_2022-05-10_0h-6h.h5', 'YRT1DT2F'))
    data = data[pd.to_datetime('2022-05-10 04:28:32.573995', utc=False, unit='ns'):pd.to_datetime('2022-05-10 04:28:32.574', utc=False, unit='ns')]

    print(data)

    plt.plot(data['value']) # produces bad output
    plt.show()
    plt.plot(data['value'].values) # produces good output
    plt.show()

Actual outcome

Plot result of data['value'].values
[Image: screenshot, 2022-07-26 15:47:09]

Plot result of data['value']
[Image: screenshot, 2022-07-26 15:47:40]

Expected outcome

I would expect the plot result to look like the "good plot", but with a time-scaled x-axis.

Additional information

No response

Operating system

MacOS 12.4 (Monterey)

Matplotlib Version

3.5.2

Matplotlib Backend

MacOSX

Python version

3.9.13

Jupyter version

No response

Installation

pip

wolfig (Author) commented Jul 26, 2022

It seems that Matplotlib cannot handle nanosecond timestamps.
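One way to check this claim without pandas is to look at the numbers Matplotlib plots internally. The sketch below assumes that `matplotlib.dates.date2num` converts dates to float64 days since Matplotlib's epoch (1970-01-01 by default in recent versions); the sample timestamps are illustrative and are not taken from the HDF5 file above.

```python
import numpy as np
import matplotlib.dates as mdates

# Illustrative timestamps a few nanoseconds apart (not the reporter's data)
times = (np.array([5, 7, 15, 23, 68], dtype='timedelta64[ns]') +
         np.datetime64('2022-07-26T04:50:20', 'ns'))

# date2num returns float64 days since Matplotlib's epoch
nums = mdates.date2num(times)

# Gap between adjacent representable float64 values at this date, in nanoseconds
resolution_ns = np.spacing(nums[0]) * 86400 * 1e9

print("distinct timestamps:  ", len(np.unique(times)))
print("distinct float values:", len(np.unique(nums)))
print(f"float64 resolution near this date: ~{resolution_ns:.0f} ns")
```

If the float resolution is coarser than the sample spacing, several timestamps map to the same x value, which would be consistent with the distorted plot.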

jklymak (Member) commented Jul 26, 2022

Can you provide a self-contained example that doesn't involve pandas? Otherwise, please open an issue with them. Thanks.

jklymak added the "status: needs clarification" label Jul 26, 2022

jklymak (Member) commented Jul 26, 2022

However, I agree that just using Matplotlib does not give optimal results either:

import matplotlib.pyplot as plt
import numpy as np

# Five timestamps only a few nanoseconds apart
times = (np.array([5, 7, 15, 23, 68], dtype='timedelta64[ns]') +
         np.datetime64('2022-07-26T04:50:20', 'ns'))

fig, ax = plt.subplots()
ax.plot(times, np.array([5, 7, 15, 23, 68]))
plt.show()

I think there have been attempts to fix this in the past, but they usually bump up against the fact that nanoseconds are probably best plotted as deltas from a starting time.
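A minimal sketch of the "deltas from a starting time" idea, reusing the timestamps from the example above. Taking the first sample as the reference point and the variable name `offsets_ns` are illustrative choices, not an established Matplotlib recipe.

```python
import matplotlib.pyplot as plt
import numpy as np

times = (np.array([5, 7, 15, 23, 68], dtype='timedelta64[ns]') +
         np.datetime64('2022-07-26T04:50:20', 'ns'))
values = np.array([5, 7, 15, 23, 68])

# Integer nanosecond offsets from the first sample. These are small numbers
# that float64 represents exactly, unlike days-since-epoch date values.
offsets_ns = (times - times[0]).astype('timedelta64[ns]').astype(np.int64)

fig, ax = plt.subplots()
ax.plot(offsets_ns, values, marker='o')
# The absolute start time moves into the axis label instead of the tick values
ax.set_xlabel(f"nanoseconds since {times[0]}")
ax.set_ylabel("value")
# plt.show()  # uncomment to display interactively
```

The trade-off is that the x-axis no longer shows wall-clock times; the reference timestamp has to be carried in the label or title.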

jklymak added the "topic: date handling" label and removed the "status: needs clarification" label Oct 26, 2022