[Python] zoneinfo timezones failing during type inference #31548

Closed
asfimport opened this issue Apr 7, 2022 · 3 comments · Fixed by #34394
Comments

@asfimport
Collaborator

The conversion itself works fine (e.g. when specifying type=pa.timestamp("us", tz="America/New_York") in the example below), but inferring the type and timezone from the first value fails if that value has a zoneinfo timezone:

In [53]: tz = zoneinfo.ZoneInfo(key='America/New_York')

In [54]: dt = datetime.datetime(2013, 11, 3, 10, 3, 14, tzinfo = tz)

In [55]: pa.array([dt])
....
ArrowInvalid: Object returned by tzinfo.utcoffset(None) is not an instance of datetime.timedelta
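
The error message refers to calling tzinfo.utcoffset(None). As a minimal standard-library sketch (no pyarrow involved, and not a claim about Arrow's internal code path), the snippet below shows what a zoneinfo timezone returns for that call compared with asking the datetime itself. The variable names are illustrative only, and it assumes the system has IANA timezone data available:

```python
import datetime
import zoneinfo

tz = zoneinfo.ZoneInfo("America/New_York")
dt = datetime.datetime(2013, 11, 3, 10, 3, 14, tzinfo=tz)

# Without a concrete datetime, a zone that observes DST has no single
# UTC offset, so zoneinfo returns None instead of a timedelta.
offset_without_dt = tz.utcoffset(None)

# With a concrete datetime, the offset is well defined
# (Eastern Standard Time, UTC-5, for this date and time).
offset_with_dt = dt.utcoffset()

print(offset_without_dt)
print(offset_with_dt)
```

The None result is consistent with the "is not an instance of datetime.timedelta" wording in the traceback above.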

cc @AlenkaF

Reporter: Joris Van den Bossche / @jorisvandenbossche
Watchers: Rok Mihevc / @rok

Note: This issue was originally created as ARROW-16140. Please see the migration documentation for further details.

@asfimport
Collaborator Author

Todd Farmer / @toddfarmer:
This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

@AlenkaF
Member

AlenkaF commented Jan 24, 2023

This seems to work correctly on the latest master:

import pyarrow as pa
import datetime
import zoneinfo

tz = zoneinfo.ZoneInfo(key='America/New_York')
dt = datetime.datetime(2013, 11, 3, 10, 3, 14, tzinfo=tz)
pa.array([dt])
# <pyarrow.lib.TimestampArray object at 0x12ca01280>
# [
#   2013-11-03 15:03:14.000000
# ]
pa.array([dt], type=pa.timestamp("us", tz="America/New_York"))
# <pyarrow.lib.TimestampArray object at 0x12ca01400>
# [
#   2013-11-03 15:03:14.000000
# ]
pa.__version__
# '11.0.0.dev477+gc84d2dabc.d20230119'

@jorisvandenbossche
Member

We should verify whether we have an explicit test for this; if not, we can open a PR to add one before closing the issue.

Searching for "zoneinfo", I only see it in the hypothesis strategies and in test_types.py (which tests more specifically the conversion of the timezone object itself to and from a string). I find it hard to see exactly what the hypothesis tests cover, so adding an explicit test might be a good idea.

In test_convert_builtin.py, we have a test, test_sequence_timestamp_with_timezone_inference, that uses pytz; we can add something similar with zoneinfo.

@AlenkaF AlenkaF self-assigned this Mar 1, 2023
jorisvandenbossche pushed a commit that referenced this issue Mar 14, 2023
…ype inference (#34394)

### What changes are included in this PR?

Explicit test for timestamp inference when creating a `pyarrow.Array` from a datetime that has a `zoneinfo` timezone specified.
* Closes: #31548

Authored-by: Alenka Frim <frim.alenka@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
@jorisvandenbossche jorisvandenbossche added this to the 12.0.0 milestone Mar 14, 2023