BUG/API: Index creation with different tz coerces DatetimeIndex #11696

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
2 participants
Member

sinhrks commented Nov 24, 2015

Closes #11488.

Based on the type check added in #11588, the change affects to fewer functions than I originally thought.

sinhrks added this to the 0.18.0 milestone Nov 24, 2015

@jreback jreback commented on an outdated diff Nov 27, 2015

pandas/tslib.pyx
@@ -1409,6 +1409,26 @@ cpdef inline object maybe_get_tz(object tz):
return tz
+cpdef is_datetime_with_singletz_array(ndarray[object] values):
+ """
+ Check values have the same tzinfo attribute.
+ Doesn't check values are datetime-like types.
+ """
+ cdef Py_ssize_t i, n = len(values)
+ cdef object base_tz, tz
+
+ if n == 0:
+ return False
+
+ base_tz = getattr(values[0], 'tzinfo', None)
+ base_tz = get_timezone(base_tz)
@jreback

jreback Nov 27, 2015

Contributor

I think this might fail with NaT

@jreback jreback commented on an outdated diff Nov 27, 2015

pandas/tslib.pyx
+ """
+ Check values have the same tzinfo attribute.
+ Doesn't check values are datetime-like types.
+ """
+ cdef Py_ssize_t i, n = len(values)
+ cdef object base_tz, tz
+
+ if n == 0:
+ return False
+
+ base_tz = getattr(values[0], 'tzinfo', None)
+ base_tz = get_timezone(base_tz)
+
+ for i in range(1, n):
+ tz = getattr(values[i], 'tzinfo', None)
+ if base_tz != get_timezone(tz):
@jreback

jreback Nov 27, 2015

Contributor

I think you can ignore if this is None (meaning its a NaT)

@jreback jreback and 1 other commented on an outdated diff Nov 27, 2015

pandas/tslib.pyx
+ Doesn't check values are datetime-like types.
+ """
+ cdef Py_ssize_t i, n = len(values)
+ cdef object base_tz, tz
+
+ if n == 0:
+ return False
+
+ base_tz = getattr(values[0], 'tzinfo', None)
+ base_tz = get_timezone(base_tz)
+
+ for i in range(1, n):
+ tz = getattr(values[i], 'tzinfo', None)
+ if base_tz != get_timezone(tz):
+ return False
+ return True
@jreback

jreback Nov 27, 2015

Contributor

I also don't think you actually need to use get_timezone, just check object equality no? (e.g. that the timezones are all the same object)

@jreback

jreback Nov 27, 2015

Contributor

I would also move this to inference.pyx I think

@sinhrks

sinhrks Nov 28, 2015

Member

get_timezone is needed to handle DST properly:

import pandas as pd
t1 = pd.Timestamp('2011-01-01', tz='US/Eastern')
t2 = pd.Timestamp('2011-08-01', tz='US/Eastern')

t1.tz == t2.tz
# False

from pandas.tslib import get_timezone
get_timezone(t1.tz) == get_timezone(t2.tz)
# True
@jreback

jreback Nov 28, 2015

Contributor

hmm ok

do an equality check on tz first then only check get_timezone if it's False (of course if that's False then return)

I think this would be much more performant

@jreback

jreback Nov 29, 2015

Contributor

is my comment above worth implementing? is this performant enough? (e.g. I dont' think get_timezone is particualy performant when called lots of times....)

@jreback

jreback Dec 6, 2015

Contributor

was this worth it?

Member

sinhrks commented Nov 28, 2015

Thanks, moved to inferrence.pyx. Also fixed and added tests for NaT.

Contributor

jreback commented Nov 29, 2015

lgtm. comment above.

Member

sinhrks commented Dec 10, 2015

Thanks, I've fixed the tz comparison logic based on your comment below.

do an equality check on tz first then only check get_timezone if it's False

Contributor

jreback commented Dec 10, 2015

merged via b80b5c7

thanks!

jreback closed this Dec 10, 2015

sinhrks deleted the sinhrks:index_new_tz branch Dec 10, 2015

sinhrks referenced this pull request Jul 12, 2016

Open

API: Datetime tz-aware/tz-naive ops consistency #11456

3 of 9 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment