API: Should extended dtype work as the same as np.dtype? #12619

sinhrks opened this issue Mar 14, 2016 · 3 comments



commented Mar 14, 2016

We can create / convert using the datetimetz dtype, but it doesn't work in some cases.

s = pd.Series([pd.Timestamp('2011-01-31', tz='US/Eastern')])
#0   2011-01-31 00:00:00-05:00
# dtype: datetime64[ns, US/Eastern]

# OK
s.astype('datetime64[ns, Asia/Tokyo]')
#0   2011-01-31 14:00:00+09:00
# dtype: datetime64[ns, Asia/Tokyo]


# numpy (OK)
#0   2011-01-31
# dtype: datetime64[ns]

# extended (NG)
pd.Series([1296432000000000000]).astype('datetime64[ns, Asia/Tokyo]')
# TypeError: Invalid datetime unit in metadata string "[ns, Asia/Tokyo]"

dtype arg

# extended (OK ? I think the result should be 2011-01-31 00:00:00-05:00... see below)
pd.Series([1296432000000000000], dtype='datetime64[ns, US/Eastern]')
#0   2011-01-31 05:00:00-05:00
# dtype: datetime64[ns, US/Eastern]

# ref
pd.Series([1296432000000000000], dtype='datetime64[ns]').dt.tz_localize('US/Eastern')
#0   2011-01-31 00:00:00-05:00
# dtype: datetime64[ns, US/Eastern]
# extended (NG)
pd.Series([pd.Timestamp('2011-01-01', tz='US/Eastern')], dtype='datetime64[ns, US/Eastern]')
# TypeError: data type not understood
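
The two readings of the integer in the `dtype` examples above can be separated without pandas. This is a minimal standard-library sketch, not pandas' implementation: `EST` is a fixed UTC-5 offset standing in for US/Eastern in January, and `as_utc_instant` / `as_wall_time` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for US/Eastern (valid for January).
EST = timezone(timedelta(hours=-5))
NS = 1296432000000000000  # the integer used in the examples above


def as_utc_instant(ns, tz):
    """Read ns as nanoseconds since the epoch in UTC, then view it in tz."""
    return datetime.fromtimestamp(ns // 10**9, tz=timezone.utc).astimezone(tz)


def as_wall_time(ns, tz):
    """Read ns as a naive wall-clock time and attach tz to it (localize)."""
    naive = datetime.fromtimestamp(ns // 10**9, tz=timezone.utc).replace(tzinfo=None)
    return naive.replace(tzinfo=tz)


converted = as_utc_instant(NS, EST)
localized = as_wall_time(NS, EST)

print(converted.isoformat())  # 2011-01-30T19:00:00-05:00
print(localized.isoformat())  # 2011-01-31T00:00:00-05:00

# Localizing moves the underlying instant by the UTC offset (5 hours here).
print(int(localized.timestamp()) - NS // 10**9)  # 18000
```

Only the first reading keeps the integer unchanged as the internal value; the second matches the `tz_localize` reference result but shifts the stored instant.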

@sinhrks sinhrks added this to the 0.18.1 milestone Mar 14, 2016


Member Author

commented Apr 2, 2016

Found that the dtype arg issue affects the boxing issue (#12752). This must be fixed first.

s = pd.Series([pd.Timestamp('2011-01-01 09:00', tz='US/Eastern')])
# 0   2011-01-01 09:00:00-05:00
# dtype: datetime64[ns, US/Eastern]

# array([1293890400000000000])

# NG, must be 2011-01-01 09:00:00-05:00
pd.Series(s._data._block.values.asi8, dtype='datetime64[ns, US/Eastern]')
# 0   2011-01-01 14:00:00-05:00
# dtype: datetime64[ns, US/Eastern]

@sinhrks sinhrks referenced this issue Apr 2, 2016


CLN: Move boxing logic to BlockManager #12752

5 of 5 tasks complete

Member Author

commented Apr 2, 2016

Timestamp/TDI hold their internal repr as an int, and it refers to absolute time in GMT.

# 1293840000000000000

int(pd.Timestamp('2011-01-01', tz='US/Eastern').asm8)
# 1293858000000000000

Thus, Timestamp creation using int should have the same internal repr.

# Timestamp('2011-01-01 05:00:00')

# 1293858000000000000

pd.Timestamp(1293858000000000000, tz='US/Eastern')
# Timestamp('2011-01-01 00:00:00-0500', tz='US/Eastern')

int(pd.Timestamp(1293858000000000000, tz='US/Eastern').asm8)
# 1293858000000000000

However, this rule is not applied to DTI. DTI must work the same as Timestamp; otherwise, boxing a scalar and boxing an array give different results.

# OK, without TZ
# DatetimeIndex(['2011-01-01 05:00:00'], dtype='datetime64[ns]', freq=None)

# array([1293858000000000000])

# NG, with TZ slides internal repr
pd.DatetimeIndex([1293858000000000000], tz='US/Eastern')
# DatetimeIndex(['2011-01-01 05:00:00-05:00'], dtype='datetime64[ns, US/Eastern]', freq=None)

pd.DatetimeIndex([1293858000000000000], tz='US/Eastern').asi8
# array([1293876000000000000])
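
One way to sidestep the ambiguity when building a tz-aware index from integers is to state the UTC interpretation explicitly and then convert the zone. This is a sketch of that workaround, assuming pandas with time zone data available; it is not the fix applied for this issue. `NS` is just a local name for the integer used above.

```python
import pandas as pd

NS = 1293858000000000000  # 2011-01-01 05:00:00 UTC

# Scalar: the int is taken as nanoseconds since the epoch in UTC.
ts = pd.Timestamp(NS, tz='US/Eastern')

# Index: declare the ints as UTC epoch nanoseconds, then convert the zone.
idx = pd.to_datetime([NS], unit='ns', utc=True).tz_convert('US/Eastern')

print(ts)                  # 2011-01-01 00:00:00-05:00
print(ts.value == NS)      # True: the internal value is unchanged
print(idx[0] == ts)        # True: scalar and index element agree
print(idx[0].value == NS)  # True: no slide in the internal repr
```

With this construction the scalar and the index element box to the same Timestamp, which is the invariant argued for above.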


commented Apr 3, 2016

yeah, I think there's a bug somewhere where I am converting on a localized UTC somewhere
