BUG: read_csv not correctly inferring datetime64[ns] with embedded nan/NaT #3062

Closed
jreback opened this issue Mar 15, 2013 · 1 comment · Fixed by #3621
Labels: Enhancement, IO Data


jreback commented Mar 15, 2013

This is a known issue, but I'm putting it up for reference.
There is an easy workaround (applied after reading), but the conversion could be done inline by read_csv.

In [28]: df = pd.DataFrame(dict({
  'A' : np.asarray(range(10),dtype='float64'), 
  'B' : pd.Timestamp('20010101') }))

In [29]: df.ix[3:6,:] = np.nan

In [30]: df
Out[30]: 
    A                   B
0   0 2001-01-01 00:00:00
1   1 2001-01-01 00:00:00
2   2 2001-01-01 00:00:00
3 NaN                 NaT
4 NaN                 NaT
5 NaN                 NaT
6 NaN                 NaT
7   7 2001-01-01 00:00:00
8   8 2001-01-01 00:00:00
9   9 2001-01-01 00:00:00

In [31]: df.dtypes
Out[31]: 
A           float64
B    datetime64[ns]
dtype: object

In [32]: df.to_csv('test.h5')

In [33]: df2 = pd.read_csv('test.h5',index_col=0)

In [34]: df2.dtypes
Out[34]: 
A    float64
B     object
dtype: object

To fix, force a conversion to datetimes after reading:

In [35]: df2['B'] = pd.to_datetime(df2['B'])

In [36]: df2
Out[36]: 
    A                   B
0   0 2001-01-01 00:00:00
1   1 2001-01-01 00:00:00
2   2 2001-01-01 00:00:00
3 NaN                 NaT
4 NaN                 NaT
5 NaN                 NaT
6 NaN                 NaT
7   7 2001-01-01 00:00:00
8   8 2001-01-01 00:00:00
9   9 2001-01-01 00:00:00

In [37]: df2.dtypes
Out[37]: 
A           float64
B    datetime64[ns]
dtype: object
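
For anyone landing here later, the round trip above can be sketched against a current pandas release roughly as follows. This is a minimal sketch and not the fix that went into pandas: it assumes a recent pandas where `.ix` is no longer available (so `.loc` is used instead), it writes to an in-memory buffer rather than `test.h5`, and it uses the `parse_dates` argument of `read_csv` as one way to do the conversion inline. The variable names (`plain`, `parsed`, `converted`) are just for illustration.

```python
import io

import numpy as np
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Rebuild the frame from the report: a float column and a datetime column.
df = pd.DataFrame({
    "A": np.arange(10, dtype="float64"),
    "B": pd.Timestamp("2001-01-01"),
})

# .loc slices by label and includes both endpoints, so rows 3..6 are blanked.
df.loc[3:6, "A"] = np.nan
df.loc[3:6, "B"] = pd.NaT

# Round-trip through CSV text in memory (to_csv returns a string when no path is given).
csv_text = df.to_csv()

# Plain read: column B is not given a datetime dtype.
plain = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Option 1: ask read_csv to parse the column while reading.
parsed = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["B"])

# Option 2: the post-hoc workaround from this issue.
converted = pd.to_datetime(plain["B"])

print(plain.dtypes)
print(parsed.dtypes)
print(converted.dtype)
```

Both options leave the four blanked rows as NaT. The checks use `is_datetime64_any_dtype` rather than comparing against `datetime64[ns]` exactly, since the datetime resolution pandas picks can differ between versions.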


wesm commented Mar 19, 2013

Marked as an "enhancement" so that we get better handling for this at some point.
