pivot function on timezone aware objects does not preserve timezone info in resulting dataframe index #5878

Closed
mirage007 opened this issue Jan 8, 2014 · 1 comment · Fixed by #7092
Labels: Bug; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode); Timezones (Timezone data dtype)

@mirage007

This bug is present in pandas 0.12.0.

Take an example DataFrame like the one below:

   col  data                       time
0    1     0  2013-03-22 11:00:00-04:00
1    2     1  2013-03-22 15:00:00-04:00
2    2     2  2013-03-22 11:00:00-04:00
3    1     3  2013-03-22 15:00:00-04:00

After pivoting, 0.10.1 correctly preserved the timezone info in the index, producing a DataFrame like this:

col                        1  2
time                           
2013-03-22 11:00:00-04:00  0  2
2013-03-22 15:00:00-04:00  3  1

However, in 0.12.0 this behavior is lost: the resulting index carries no timezone information, and its values are shifted to UTC (11:00-04:00 becomes 15:00):

col                  1  2
time                     
2013-03-22 15:00:00  0  2
2013-03-22 19:00:00  3  1
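
To make the shift in the index values concrete, here is a minimal standard-library sketch of the arithmetic. It is not pandas code: the fixed -04:00 offset `EDT` and the helper `to_naive_utc` are names made up for this illustration, and it only shows that the naive values printed by 0.12.0 match the UTC equivalents of the original timestamps.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset standing in for US/Eastern on 2013-03-22 (UTC-04:00).
EDT = timezone(timedelta(hours=-4))


def to_naive_utc(aware):
    """Convert an aware datetime to UTC, then drop the tzinfo."""
    return aware.astimezone(timezone.utc).replace(tzinfo=None)


local_times = [
    datetime(2013, 3, 22, 11, 0, tzinfo=EDT),
    datetime(2013, 3, 22, 15, 0, tzinfo=EDT),
]
naive_utc = [to_naive_utc(t) for t in local_times]

for local, utc in zip(local_times, naive_utc):
    print(local.isoformat(sep=" "), "->", utc.isoformat(sep=" "))
```

The two naive results are 15:00 and 19:00, the same values that appear in the 0.12.0 index.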

Below is the code to reproduce this issue:

import datetime
import pandas as pn
import pytz

print(pn.__version__)

# Two timezone-aware timestamps in US/Eastern
est = pytz.timezone('US/Eastern')
dt1 = est.localize(datetime.datetime(2013, 3, 22, 11, 0, 0))
dt2 = est.localize(datetime.datetime(2013, 3, 22, 15, 0, 0))
df = pn.DataFrame({'time': [dt1, dt2, dt1, dt2], 'col': [1, 2, 2, 1], 'data': range(4)})
# Pivot so that the timezone-aware 'time' column becomes the index
pivotDf = df.pivot(index='time', columns='col', values='data')
print(df)
print(pivotDf)
print(pivotDf.index)

The output from 0.10.1 is:

0.10.1
   col  data                       time
0    1     0  2013-03-22 11:00:00-04:00
1    2     1  2013-03-22 15:00:00-04:00
2    2     2  2013-03-22 11:00:00-04:00
3    1     3  2013-03-22 15:00:00-04:00
col                        1  2
time                           
2013-03-22 11:00:00-04:00  0  2
2013-03-22 15:00:00-04:00  3  1
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-03-22 11:00:00, 2013-03-22 15:00:00]
Length: 2, Freq: None, Timezone: US/Eastern

The output from 0.12.0 is:

0.12.0
   col  data                       time
0    1     0  2013-03-22 11:00:00-04:00
1    2     1  2013-03-22 15:00:00-04:00
2    2     2  2013-03-22 11:00:00-04:00
3    1     3  2013-03-22 15:00:00-04:00
col                  1  2
time                     
2013-03-22 15:00:00  0  2
2013-03-22 19:00:00  3  1
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-03-22 15:00:00, 2013-03-22 19:00:00]
Length: 2, Freq: None, Timezone: None

Notice the None in the "Timezone: " field of the DatetimeIndex.
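
Until a fix lands, one possible workaround is to re-attach the timezone after pivoting. The sketch below assumes that a naive index holds UTC values, as in the 0.12.0 output above. The helper name `restore_index_tz` is made up for this example, the code is written against a recent pandas, and the "broken" frame is simulated by hand rather than produced by the bug itself.

```python
import pandas as pd


def restore_index_tz(frame, tz):
    """Return a copy of frame whose DatetimeIndex is tz-aware in tz.

    Assumption: a naive index holds UTC values, as in the 0.12.0 output.
    An index that is already tz-aware is left untouched.
    """
    out = frame.copy()
    if out.index.tz is None:
        out.index = out.index.tz_localize("UTC").tz_convert(tz)
    return out


t1 = pd.Timestamp("2013-03-22 11:00", tz="US/Eastern")
t2 = pd.Timestamp("2013-03-22 15:00", tz="US/Eastern")
df = pd.DataFrame({"time": [t1, t2, t1, t2], "col": [1, 2, 2, 1], "data": range(4)})
pivoted = df.pivot(index="time", columns="col", values="data")

# Simulate the affected output: same data, but a naive index holding UTC values.
broken = pivoted.copy()
broken.index = pd.DatetimeIndex(
    ["2013-03-22 15:00:00", "2013-03-22 19:00:00"], name="time"
)

repaired = restore_index_tz(broken, "US/Eastern")
print(repaired.index)
```

On versions where pivot already preserves the timezone, the helper is a no-op, so it should be safe to leave in place across upgrades.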

@jreback
Contributor

jreback commented Jan 8, 2014

looks like a bug to me...present in master as well
