Loss of datetime precision with Pandas < 0.20 #16

amacd31 · 2017-08-08T08:56:00Z

With Pandas < 0.20 the precision of datetimes can be lost when writing irregular series.

This can occur when the data to be written can be stored as float32 and Pandas treats the combined data as float32 instead of float64. The snippet below demonstrates the problem; the final assert fails with older versions of Pandas.


import numpy as np
import pandas as pd

# Ten float32 values on a one-minute DatetimeIndex.
sample = pd.DataFrame(
    np.array(
        [x + 0.1 for x in range(10)],
        dtype=np.float32
    ),
    index=pd.date_range(
        '2017-08-06 06:50:00+00:00',
        periods=10,
        freq='1min'
    )
)

# Add an int64 column of epoch seconds derived from the index.
sample['datestamp'] = pd.Series(
    sample.index.map(
        lambda dateval: dateval.value // 1000000000
    ),
    index=sample.index
)

# .values should upcast both columns to float64; Pandas < 0.20 uses float32.
first_datestamp = sample.values.astype(np.int64)[0][1]
print(first_datestamp)
assert first_datestamp == 1502002200

This issue stems from the fact that calling .values on a pandas.DataFrame should cast all columns to a common data type capable of holding them. However, a bug in Pandas casts the int64 column to float32, losing precision in the process, instead of upcasting everything to float64 (which can represent both the float32 values and these second-resolution timestamps exactly).
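To illustrate the size of the error, here is a minimal sketch using NumPy alone (no Pandas involved), assuming the first timestamp from the snippet above. It shows what a round trip through float32 does to that value and which dtype NumPy's own promotion rule selects for a float32/int64 pair.

```python
import numpy as np

epoch_seconds = 1502002200  # 2017-08-06 06:50:00 UTC, as in the snippet above

# float32 has a 24-bit significand, so representable integers near 1.5e9
# are 128 apart and the timestamp is rounded to the nearest one.
as_float32 = int(np.float32(epoch_seconds))
as_float64 = int(np.float64(epoch_seconds))

print(as_float32)  # 1502002176, i.e. 24 seconds early
print(as_float64)  # 1502002200, exact

# NumPy's promotion rule for the two column dtypes picks float64.
print(np.result_type(np.float32, np.int64))  # float64
```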

This bug can be worked around either by ensuring that the time series to be written has a dtype of float64 or by using Pandas >= 0.20.
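A rough sketch of the first workaround, for illustration only: the helper name `with_epoch_seconds` is hypothetical and is not taken from the fix in the linked commit. It upcasts the data to float64 before the integer datestamp column is added, so float64 is the only floating-point dtype present when .values is called.

```python
import numpy as np
import pandas as pd

def with_epoch_seconds(frame):
    """Return a float64 copy of `frame` plus an int64 'datestamp' column.

    Hypothetical helper, shown only to illustrate the workaround.
    """
    # Upcast first so no float32 column is left in the frame.
    result = frame.astype(np.float64)
    result['datestamp'] = pd.Series(
        result.index.map(lambda dateval: dateval.value // 1000000000),
        index=result.index
    )
    return result

sample = pd.DataFrame(
    np.array([x + 0.1 for x in range(10)], dtype=np.float32),
    index=pd.date_range('2017-08-06 06:50:00+00:00', periods=10, freq='1min')
)

safe = with_epoch_seconds(sample)
print(safe.values.dtype)
print(safe.values.astype(np.int64)[0][1])
```

The cast copies the data, which costs some memory for large series but keeps the caller's frame untouched.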

amacd31 added the bug label Aug 8, 2017

amacd31 closed this as completed in 719246b Aug 8, 2017