Issues with nanoseconds #25

Closed
slazarov opened this issue Sep 2, 2019 · 2 comments

Comments

@slazarov

slazarov commented Sep 2, 2019

I am writing data with nanosecond-precision timestamps, but when I read it back the precision has been lost:

import pystore
import pandas as pd

pystore.set_path("/tmp")
store = pystore.store('test')
collection = store.collection('tick')

ts = '2019-06-15 00:00:12.868214001+00:00'
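# %z consumes the "+00:00" offset; stricter pandas versions reject a format that leaves it unparsed
ts = pd.to_datetime(ts, format="%Y-%m-%d %H:%M:%S.%f%z")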
df = pd.DataFrame({'ts': [ts],
                   'data': [100]})
df.set_index('ts', inplace=True)

name = 'ns_test'
collection.write(name, df, overwrite=True)

result = collection.item(name).to_pandas()
print(f'Before write: {df.index}')
print(f'After read: {result.index}')
print(f'Difference: {df.index.values.astype(int) - result.index.values.astype(int)}')
----
Before write: DatetimeIndex(['2019-06-15 00:00:12.868214001+00:00'], dtype='datetime64[ns, UTC]', name='ts', freq=None)
After read: DatetimeIndex(['2019-06-15 00:00:12.868214+00:00'], dtype='datetime64[ns, UTC]', name='ts', freq=None)
Difference: [1]
@ChristophRose

You have to explicitly tell fastparquet to store timestamps as int96 when working with nanosecond-resolution timestamps; its default int64 encoding only keeps microsecond resolution.
Luckily, PyStore and Dask pass the argument through.

collection.write(name, df, overwrite=True, times="int96")
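
To see why the last digit disappears, here is a minimal pure-Python sketch of the two encodings. It is an illustration only, not fastparquet's actual code: the helper names are hypothetical, and it assumes the usual Parquet INT96 layout (8 bytes of nanoseconds within the day followed by a 4-byte Julian day) and a microsecond-resolution int64.

```python
import struct
from datetime import datetime, timedelta, timezone

NS_PER_DAY = 86_400 * 1_000_000_000
JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588  # Julian day number of 1970-01-01


def to_int64_micros(epoch_ns: int) -> int:
    """Hypothetical helper: microseconds since the epoch (drops sub-microsecond digits)."""
    return epoch_ns // 1_000


def from_int64_micros(micros: int) -> int:
    """Convert stored microseconds back to nanoseconds since the epoch."""
    return micros * 1_000


def to_int96(epoch_ns: int) -> bytes:
    """Hypothetical helper: 12 bytes = nanoseconds within the day + Julian day."""
    days, ns_in_day = divmod(epoch_ns, NS_PER_DAY)
    return struct.pack("<qi", ns_in_day, days + JULIAN_DAY_OF_UNIX_EPOCH)


def from_int96(raw: bytes) -> int:
    """Decode the 12-byte block back to nanoseconds since the epoch."""
    ns_in_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NS_PER_DAY + ns_in_day


# The timestamp from the report: 2019-06-15 00:00:12.868214001+00:00
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
whole_seconds = (datetime(2019, 6, 15, 0, 0, 12, tzinfo=timezone.utc) - epoch) // timedelta(seconds=1)
ts_ns = whole_seconds * 1_000_000_000 + 868_214_001

via_int64 = from_int64_micros(to_int64_micros(ts_ns))
via_int96 = from_int96(to_int96(ts_ns))

print("lost via int64 micros:", ts_ns - via_int64, "ns")
print("lost via int96:", ts_ns - via_int96, "ns")
```

The microsecond round trip loses exactly the trailing nanosecond reported above, while the 12-byte form keeps it.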

@slazarov

slazarov commented Sep 4, 2019

Thank you for the clarification.

Perhaps this could be added to the example file.
