BUG: to_datetime re-parsing Arrow-backed objects #53301
Could use a whatsnew too.
Also, does something similar happen to to_timedelta?
I think so; I'll try to get to it in a follow-up. I think in the future, it might also make sense for DF inputs with Arrow backed e.g.
should return something with dtype:
Main thing I'm worried about with this PR is doing the wrong thing for (cc @phofl in case of any thoughts and to double-check my logic).
very small comment
pandas/core/tools/datetimes.py
Outdated
elif isinstance(arg_dtype, ArrowDtype) and arg_dtype.kind == "M":
    # TODO: Combine with above if DTI/DTA supports Arrow timestamps
    if utc:
        arg = arg.astype("timestamp[ns, UTC][pyarrow]")
I'd rather use pd.ArrowDtype(pa...) here
.astype shouldn't allow converting tz-naive to tz-aware (or vice versa). I think this should use tz_convert/tz_localize instead.
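For readers following along, here is a minimal sketch of the difference between the two methods. It uses ordinary NumPy-backed pandas objects rather than the Arrow-backed ones this PR touches, and the sample timestamp is made up for illustration.

```python
import pandas as pd

# A tz-naive timestamp: wall-clock noon with no zone attached.
naive = pd.Series(pd.to_datetime(["2023-01-01 12:00"]))

# tz_localize attaches a zone and keeps the wall-clock reading,
# so this is noon UTC.
localized = naive.dt.tz_localize("UTC")

# tz_convert needs tz-aware input. It keeps the instant and changes
# the zone, so noon US/Central (UTC-6 in January) becomes 18:00 UTC.
central = naive.dt.tz_localize("US/Central")
converted = central.dt.tz_convert("UTC")

print(localized.iloc[0])
print(converted.iloc[0])
```

The two results differ by six hours, which is why picking the wrong method silently shifts the data.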
(btw if .astype works on ArrowEA to convert tznaive to tzaware we probably want to change that to match the non-arrow raising behavior)
> (btw if .astype works on ArrowEA to convert tznaive to tzaware we probably want to change that to match the non-arrow raising behavior)
Sure, I'll raise a PR after this one gets in.
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
pandas/core/tools/datetimes.py
Outdated
    arg = Index(arg_array._dt_tz_localize("UTC"))
else:
    # ArrowExtensionArray
    arg = arg._dt_tz_localize("UTC")
Is tz_localize always the right one, or do we need tz_convert if the input was timezone-aware to begin with?
In the test you've added, dti_arrow is always timezone-naive. Could we add a timezone-aware example too please, just to check this?
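The concern can be sketched as a small branch on whether the input already carries a timezone. `to_utc` is a hypothetical helper written for this illustration, not the code in the PR, and it is exercised here only on NumPy-backed Series.

```python
import pandas as pd


def to_utc(ser: pd.Series) -> pd.Series:
    """Hypothetical helper: return `ser` as tz-aware UTC."""
    if ser.dt.tz is None:
        # tz-naive: interpret the wall-clock values as UTC.
        return ser.dt.tz_localize("UTC")
    # tz-aware: keep the instants and re-express them in UTC.
    return ser.dt.tz_convert("UTC")


naive = pd.Series(pd.to_datetime(["2023-01-01 12:00"]))
aware = naive.dt.tz_localize("US/Central")

naive_utc = to_utc(naive)
aware_utc = to_utc(aware)

# Localizing data that is already tz-aware is rejected by pandas.
try:
    aware.dt.tz_localize("UTC")
    localize_raised = False
except TypeError:
    localize_raised = True
```

A test that only feeds tz-naive input never reaches the second branch, so a tz-aware case is needed to cover it.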
Thx for the catch.
I messed up the test (it does parametrize over UTC and US/Central), but I forgot to propagate the tz information in the tests.
Well done, looks good to me!
Thanks @lithomas1
Does the to_datetime docstring need to be updated to reflect non-DTI return?
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
Ideally, we could push this down into DTA/DTI, but that doesn't seem to support Arrow arrays ATM.