Taking first row from each group in groupby sometimes strips tzinfo #10668
Comments
louispotok
commented
Jul 24, 2015
|
And according to this comment the same thing happens if you replace cell [4] with the more pandonic line:
|
|
It's a bug. I thought we had an issue for this already, but I can't seem to find it.
jreback
added Bug Groupby Timezones
labels
Jul 24, 2015
jreback
added this to the
Next Major Release
milestone
Jul 24, 2015
|
I can also confirm bug for But does work correctly for |
|
This is all OK on master, so all this issue probably needs is a few confirming tests. @cfperez, @louispotok interested in a pull request?
|
jreback
added Testing and removed Bug
labels
Oct 26, 2015
jreback
modified the milestone: 0.17.1, Next Major Release
Oct 26, 2015
|
@jreback I'm working off the latest commit, and the problem now is that the timestamp is wrong (exactly 8 hours off, reflecting the timezone difference) even though the timezone is preserved. Note that Also, why don't these two methods return the same indices? In your example, I can add tests, but I still think this is a bug (and I'm unsure how deep the rabbit hole goes).
|
Ahh, I wasn't paying enough attention. OK, I will mark it as a bug again then.
sinhrks
referenced
this issue
Oct 28, 2015
Open
API: Datetime tz-aware/tz-naive ops consistency #11456
jreback
modified the milestone: Next Major Release, 0.17.1
Nov 13, 2015
This was referenced Feb 18, 2016
|
Dupe of #12716. |
jreback
added Bug Difficulty Intermediate Effort Medium
labels
Apr 6, 2016
jreback
modified the milestone: 0.18.1, Next Major Release
Apr 6, 2016
jreback
referenced
this issue
Apr 14, 2016
Closed
bugs when groupby/aggregate on columns with datetime64[ns, timezone] #12898
jreback
modified the milestone: 0.18.1, 0.18.2
Apr 26, 2016
jorisvandenbossche
modified the milestone: Next Major Release, 0.19.0
Aug 21, 2016
This was referenced Sep 8, 2016
jreback
added the
Duplicate
label
Feb 16, 2017
|
better example I think in #15426 |
louispotok commented Jul 24, 2015
xref #12898 (same fix)
(cf. http://stackoverflow.com/questions/31617084/how-to-have-groupby-first-not-remove-timezone-info-from-datetime-columns)
Take a dataframe with a column of tz-aware datetime.datetime objects, group it by a different column, and return the first row from each group. Some ways of doing this leave the datetime as it is; at least two others convert it to a tz-naive pandas Timestamp object.
And apparently grouped.apply(lambda x: x.iloc[0]) does the same as .first().
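For anyone who wants to check their own pandas version, here is a minimal sketch of the comparison described above. It is not the original notebook from this report: the column names `key` and `when`, the sample timestamps, the `America/Los_Angeles` timezone, and the `check` helper are all assumptions made for illustration. It selects the tz-aware column before taking the first row per group, then reports whether each method kept the timezone and the original values.

```python
import pandas as pd

# Illustrative data only: "key", "when", and the timezone are assumptions,
# not the data from the original report.
when = pd.to_datetime(
    ["2015-07-24 09:00", "2015-07-24 10:00", "2015-07-24 11:00"]
).tz_localize("America/Los_Angeles")
df = pd.DataFrame({"key": ["a", "a", "b"], "when": when})

# Reference answer built without groupby: first row seen for each key.
expected = df.drop_duplicates("key").set_index("key")["when"]

grouped = df.groupby("key")["when"]
candidates = {
    "first()": grouped.first(),
    "head(1)": grouped.head(1),
    "apply(iloc[0])": grouped.apply(lambda s: s.iloc[0]),
}


def check(result):
    """Return (tzinfo of the first element, whether values match expected)."""
    tz = result.iloc[0].tzinfo
    # Only compare values when the result is still tz-aware, so that a
    # tz-naive result is reported as a mismatch instead of being compared.
    match = tz is not None and list(result) == list(expected)
    return tz, match


for name, result in candidates.items():
    tz, match = check(result)
    print(f"{name:15s} tz={tz} values_match={match}")
```

On a version affected by this issue, one or more of these would be expected to report a missing timezone or mismatched values; on a fixed version all three should agree with the reference.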