to_datetime %Y%m%d does not coerce correctly #7930

Closed
CarstVaartjes opened this issue Aug 4, 2014 · 5 comments · Fixed by #7931

Labels: Bug, IO Data, Timeseries

Comments

@CarstVaartjes

Hi,

see this example:

import numpy as np
import pandas as pd

# imagine a dataframe with from/to dates stored as integers
x_df = pd.DataFrame([[20120101, 20121231], [20130101, 20131231], [20140101, 20141231], [20150101, 99991231]])
x_df.columns = ['date_from', 'date_to']
date_def = '%Y%m%d'
# everything is fine for the from dates
x_df['date_from_2'] = pd.to_datetime(x_df['date_from'], format=date_def, coerce=True)
x_df['date_from_2'].dtype
list(x_df['date_from_2'])
# but with out-of-bounds dates it goes horribly wrong
x_df['date_to_2'] = pd.to_datetime(x_df['date_to'], format=date_def, coerce=True)
x_df['date_to_2'].dtype
list(x_df['date_to_2'])  # note the lack of NaTs and the conversion to datetime.datetime instead of np.datetime64
# now we can do this, which works great, but unfortunately pandas
# chose nanoseconds as its standard for date detail...
x_df['date_to_3'] = [np.datetime64(date_val, unit='s') for date_val in x_df['date_to_2']]
# ...so this breaks on the 99991231 example
x_df['date_to_3'] = [pd.Timestamp(date_val) for date_val in x_df['date_to_2']]
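
For illustration, here is a minimal standard-library sketch of what coercion would be expected to do with the `date_to` values. It is not the pandas implementation. It assumes a nanosecond-resolution timestamp stored in a signed 64-bit integer, which limits representable dates to roughly the years 1677 to 2262. `parse_yyyymmdd_or_none` is a hypothetical helper name, and `None` stands in for NaT.

```python
from datetime import datetime

# Assumption: timestamps are nanoseconds since the Unix epoch in a signed
# 64-bit integer, with the lowest value reserved as the NaT marker.
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63) + 1
EPOCH = datetime(1970, 1, 1)


def parse_yyyymmdd_or_none(value):
    """Hypothetical helper: parse a value such as 20121231 with '%Y%m%d'.

    Returns None (standing in for NaT) when the value cannot be parsed
    or when the date does not fit in the nanosecond range.
    """
    try:
        parsed = datetime.strptime(str(value), "%Y%m%d")
    except ValueError:
        return None
    delta = parsed - EPOCH
    nanos = (delta.days * 86400 + delta.seconds) * 10**9
    if not INT64_MIN <= nanos <= INT64_MAX:
        return None
    return parsed


dates_to = [20121231, 20131231, 20141231, 99991231]
coerced = [parse_yyyymmdd_or_none(v) for v in dates_to]
```

With this sketch the first three values parse normally and 99991231 is coerced to `None`, which is the NaT-style result the report expects from `coerce=True`.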

The to_datetime behaviour shown above is a bug, and it is also related to the discussion in #7307. It probably has to do with this note:
"Note: Specifying a format argument will potentially speed up the conversion considerably and on versions later than 0.13.0 explicitly specifying a format string of ‘%Y%m%d’ takes a faster path still."

jreback (Contributor) commented Aug 4, 2014

easy enough

@CarstVaartjes (Author)

Thanks! (insightful commit, I guess it's "easy" when you know the libtools and all places by heart hehe ;)

jreback (Contributor) commented Aug 4, 2014

haha...(and I wrote the original :)

jrovegno commented Nov 6, 2014

Where is the like button? When you search for an error and realize it is a bug that was fixed in a newer release. 👍

jreback (Contributor) commented Nov 7, 2014

@jrovegno thanks!
