groupby.apply datetime bug affecting 0.17 #11324
Comments
canonical way of selecting a max column (and way, way more efficient)
I guess this is a bug. You are doing a really, really odd thing here, though.
jreback
added Bug Prio-low Groupby Difficulty Novice Effort Low
labels
Oct 14, 2015
jreback
added this to the
Next Major Release
milestone
Oct 14, 2015
hadjmic
commented
Oct 14, 2015
Imagine it in the following context: the objective is to create a new dataframe to see what the users have done. Thus, you need to group by the user_identifier and somehow aggregate the events of each user. One of the things you need to find is the first and last timestamp at which the user interacted with the server. Hope this clarifies things a bit. Pandas is awesome by the way, you guys rule.
@hadjmic would df.groupby(['user_identifier']).timestamp.agg(['min', 'max']) work for you? You can also control the naming with
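The aggregation suggested above can be sketched roughly as follows. The sample data and the output column names (first_seen, last_seen) are made up for illustration, and the rename step is just one possible way to control the naming, not necessarily the one the commenter had in mind.

```python
import pandas as pd

# Hypothetical event log: only the column names user_identifier and
# timestamp come from the thread; the values are invented.
df = pd.DataFrame({
    "user_identifier": ["a", "a", "b"],
    "timestamp": pd.to_datetime([
        "2015-10-01 08:00",
        "2015-10-03 09:30",
        "2015-10-02 12:00",
    ]),
})

# One row per user, with the earliest and latest timestamp.
summary = df.groupby("user_identifier")["timestamp"].agg(["min", "max"])

# Assumed naming choice: rename the aggregate columns after the fact.
summary = summary.rename(columns={"min": "first_seen", "max": "last_seen"})
```

This avoids apply entirely, so it only covers the min/max part of the use case.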
did my example in [59] not clarify? my point is technically using apply like this is ok, but canonically it is quite confusing. |
hadjmic
commented
Oct 14, 2015
Perhaps it would have been clearer if I had said I have a processUserEvents function. The function takes a dataframe of user events as input (i.e. each group of the groupby operation) and returns a Series with specific user characteristics. Among those are the min and max of the timestamp, but there is a lot of other stuff involved, such as values extracted from url paths, query strings, flow paths, etc.
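A minimal sketch of the pattern described here, assuming a heavily simplified stand-in for processUserEvents. The function body, the url column, and the sample values are all hypothetical, and this is not the original reproduction code from the issue. On pandas releases that include the fix it should produce one row per user; per the report, 0.17.0 raised an exception in this situation.

```python
import pandas as pd

def process_user_events(events: pd.DataFrame) -> pd.Series:
    """Hypothetical stand-in: summarise one user's events as a Series."""
    return pd.Series({
        "first_seen": events["timestamp"].min(),  # new datetime value
        "last_seen": events["timestamp"].max(),   # new datetime value
        "n_events": len(events),
    })

# Invented sample data; the original frame has a datetime column.
df = pd.DataFrame({
    "user_identifier": ["a", "a", "b"],
    "timestamp": pd.to_datetime([
        "2015-10-01 08:00",
        "2015-10-03 09:30",
        "2015-10-02 12:00",
    ]),
    "url": ["/home", "/cart", "/home"],
})

# Select the non-key columns explicitly so each group passed to the
# function contains only the event data, then build one row per user.
result = (
    df.groupby("user_identifier")[["timestamp", "url"]]
      .apply(process_user_events)
)
```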
This was referenced Oct 28, 2015
jreback
modified the milestone: 0.17.1, Next Major Release
Oct 28, 2015
robdmc
added a commit
to robdmc/pandas
that referenced
this issue
Nov 4, 2015
robdmc
4791e15
jreback
added a commit
that referenced
this issue
Nov 13, 2015
robdmc + jreback
5df693f
closed by #11548
hadjmic commented Oct 14, 2015
An exception is raised when:
a) the original dataframe has a datetime column
b) the function passed to groupby.apply returns a Series object with a new datetime column
Code to reproduce:
This is a new issue affecting 0.17