Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add __from_pyarrow__ support to DatetimeTZDtype #52201

Merged
merged 34 commits into from
Apr 17, 2023

Conversation

tswast
Copy link
Contributor

@tswast tswast commented Mar 25, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Closes #52200

@tswast tswast force-pushed the DatetimeTZDtype-from_arrow branch from 4d46462 to 6985319 Compare March 25, 2023 15:02
@tswast tswast requested a review from mroeschke March 25, 2023 22:09
@tswast tswast requested a review from jreback March 26, 2023 12:51
@mroeschke mroeschke added Datetime Datetime data dtype Arrow pyarrow functionality labels Mar 27, 2023
@tswast
Copy link
Contributor Author

tswast commented Mar 28, 2023

Any additional feedback on this? All checks have been green, and it's aligned with what we do for the other Extension Dtypes.

@jbrockmendel
Copy link
Member

I'd like to see more buy-in on ignoring iNaT

@mroeschke
Copy link
Member

I would be okay with ignoring iNaT for now. Would be good to write a comment in __from_arrow__

@tswast
Copy link
Contributor Author

tswast commented Mar 28, 2023

Thanks @mroeschke @jbrockmendel I've added a docstring to __from_arrow__ clarifying the iNaT behavior.

@mroeschke
Copy link
Member

Looks like test_astype_from_non_pyarrow is failing the pyarrow nightly build. Possibly this changes something with the upcoming pyarrow release?

@tswast
Copy link
Contributor Author

tswast commented Apr 11, 2023

Looks like test_astype_from_non_pyarrow is failing the pyarrow nightly build. Possibly this changes something with the upcoming pyarrow release?

Thanks for the FYI. Looks like maybe some different assumptions about timezones in latest pyarrow? I'll investigate.

@tswast
Copy link
Contributor Author

tswast commented Apr 13, 2023

Possible bug with: apache/arrow#34270 ? If the timezones aren't the same (or UTC and naive) then it shouldn't be no-copy, right?

@tswast
Copy link
Contributor Author

tswast commented Apr 14, 2023

@mroeschke I think I found the issue. I don't think I should be calling tz_localize, but there was a bug where the units weren't being copied over from the DatetimeArray constructor that I vaguely recall trying to work around. Should be fixed now.

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing! Will merge in a few days in there's no other comments

@mroeschke mroeschke added this to the 2.1 milestone Apr 17, 2023
@mroeschke mroeschke merged commit 436f5eb into pandas-dev:main Apr 17, 2023
@mroeschke
Copy link
Member

Thanks for sticking with this @tswast

@tswast tswast deleted the DatetimeTZDtype-from_arrow branch April 17, 2023 17:02
jorisvandenbossche pushed a commit to apache/arrow that referenced this pull request Jun 1, 2023
…35546)

### Rationale for this change
In pandas version 2.1.0 `__from_arrow__` method has been implemented for `DatetimeTZDtype` extension type. See pandas-dev/pandas#52201.

Due to this change, the conversion from pyarrow table column with datetime uses the newly implemented `__from_arrow__` method for the pandas dtype and `pandas_api.series` to construct the pandas series. We had to pass the name in the `pandas_api.series` also, to fix failing tests in #35248.

This PR adds explicit test for this change.
* Closes: #35250

Authored-by: Alenka Frim <frim.alenka@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arrow pyarrow functionality Datetime Datetime data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: DatetimeTZDtype is missing __from_arrow__
5 participants