reset_index loses type information on time indices with time zone information #10762

Closed
ghost opened this issue Aug 7, 2015 · 3 comments
Labels
Timezones Timezone data dtype Usage Question

Comments


ghost commented Aug 7, 2015

import pandas as pd

ts1 = pd.Timestamp('2015-07-01 10:00:00')
ts2 = pd.Timestamp('2015-07-01 11:00:00')

df1 = pd.DataFrame([[0.1],[0.2]], index=[ts1, ts2], columns=['a'])
print(df1.index.dtype)
df1 = df1.reset_index(drop=False)
print(df1['index'].dtype)

df2 = pd.DataFrame([[0.1],[0.2]], index=[ts1, ts2], columns=['a'])
df2.index = df2.index.tz_localize('Europe/London')
print(df2.index.dtype)
df2 = df2.reset_index(drop=False)
print(df2['index'].dtype)

gives

datetime64[ns]
datetime64[ns]
datetime64[ns]
object

One would expect the last two lines to be identical.

@jorisvandenbossche
Member

This is because, at the moment, a Series cannot hold datetimes with timezone information in a proper numpy dtype. Therefore the object dtype is used, and the values are stored as pd.Timestamp objects (which can hold timezones).
This is being worked on (#10477), and the ability to store timezone information in a Series will be in pandas 0.17.
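
For anyone checking this on a later release: a minimal sketch, assuming pandas 0.17 or newer, that reruns the timezone-aware half of the original report and compares the dtype of the index with the dtype of the column produced by reset_index. The variable names index_dtype and column_dtype are only for illustration, and the exact dtype string printed may vary between pandas versions.

```python
import pandas as pd

ts1 = pd.Timestamp('2015-07-01 10:00:00')
ts2 = pd.Timestamp('2015-07-01 11:00:00')

# Same construction as df2 in the original report.
df2 = pd.DataFrame([[0.1], [0.2]], index=[ts1, ts2], columns=['a'])
df2.index = df2.index.tz_localize('Europe/London')

# dtype of the timezone-aware index before the reset.
index_dtype = df2.index.dtype

# dtype of the column that reset_index creates from that index.
column_dtype = df2.reset_index(drop=False)['index'].dtype

print(index_dtype)
print(column_dtype)
print(index_dtype == column_dtype)
```

If the two dtypes compare equal, the timezone survived the round trip from index to column, which is the behaviour the original report expected.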

@jorisvandenbossche jorisvandenbossche added this to the No action milestone Aug 7, 2015
@jorisvandenbossche
Member

You can also see what I described by comparing the .values of the two columns:

In [16]: df1['index'].values
Out[16]:
array(['2015-07-01T12:00:00.000000000+0200',
       '2015-07-01T13:00:00.000000000+0200'], dtype='datetime64[ns]')

In [17]: df2['index'].values
Out[17]:
array([Timestamp('2015-07-01 10:00:00+0100', tz='Europe/London'),
       Timestamp('2015-07-01 11:00:00+0100', tz='Europe/London')], dtype=object)
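
The two storage forms shown above can still be reproduced side by side. This is a rough sketch, assuming a recent pandas release in which a timezone-aware column has its own dtype: there, .values returns a plain numpy datetime64 array of UTC instants (the timezone is dropped), while astype(object) gives an object array of pd.Timestamp values like the df2 output above. The names col, as_numpy and as_objects are illustrative only.

```python
import pandas as pd

idx = pd.DatetimeIndex(['2015-07-01 10:00:00', '2015-07-01 11:00:00'])
idx = idx.tz_localize('Europe/London')

# Timezone-aware column obtained from the index via reset_index.
col = pd.DataFrame({'a': [0.1, 0.2]}, index=idx).reset_index(drop=False)['index']

# numpy datetime64 array: no timezone, values are the UTC instants.
as_numpy = col.values

# object array: each element is a pd.Timestamp carrying tz='Europe/London'.
as_objects = col.astype(object).values

print(as_numpy.dtype, as_numpy[0])
print(as_objects.dtype, as_objects[0])
```

London is on UTC+1 in July, so the first element is 09:00 in the numpy array and 10:00+01:00 as a Timestamp.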

Author

ghost commented Aug 7, 2015

Thanks. See related issue: #10763
