PERF: DataFrame.groupby.nunique is non-performant #15197
Comments
jreback added the Difficulty Intermediate, Effort Low, Groupby, and Reshaping labels Jan 23, 2017
jreback added this to the 0.20.0 milestone Jan 23, 2017
cc @xflr6
jorisvandenbossche added the Performance label Jan 23, 2017
jreback added a commit to jreback/pandas that referenced this issue Jan 23, 2017 (jreback, 6d02616)
jreback closed this in dc40058 Jan 24, 2017
AnkurDedania added a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017 (jreback + AnkurDedania, 983fdd2)
jreback commented Jan 23, 2017 (edited)
xref #14376
`Series.groupby.nunique` has a very performant implementation, but the way `DataFrame.groupby.nunique` is implemented (via `.apply`), it ends up in a Python loop over the groups, which nullifies this.

should be straightforward to fix this. need to make sure to test with `as_index=True/False`
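A rough sketch of the idea, not the actual fix: instead of going through `.apply` (one Python-level call per group), call the existing `SeriesGroupBy.nunique` once per column and concatenate the results. The helper name `nunique_per_column`, its `key` and `as_index` parameters, and the sample frame are made up for illustration, and it assumes a single grouping column.

```python
import numpy as np
import pandas as pd


def nunique_per_column(df, key, as_index=True):
    """Hypothetical helper: one vectorized SeriesGroupBy.nunique call per column."""
    value_cols = [c for c in df.columns if c != key]
    grouped = df.groupby(key)
    # Each grouped[c].nunique() is a Series indexed by the group key.
    result = pd.concat([grouped[c].nunique() for c in value_cols], axis=1)
    if not as_index:
        # Mimic as_index=False by moving the key back into a column.
        result = result.reset_index()
    return result


# Illustrative data only: 100 groups, two value columns.
rng = np.random.RandomState(0)
df = pd.DataFrame({
    "key": rng.randint(0, 100, size=10_000),
    "a": rng.randint(0, 50, size=10_000),
    "b": rng.randint(0, 5, size=10_000),
})

# Per-column path: loops over columns, not over groups.
fast = nunique_per_column(df, "key")

# The pattern described above: .apply invokes the lambda once per group.
slow = df.groupby("key")[["a", "b"]].apply(lambda g: g.nunique())
```

Both paths should give the same counts; the difference is that the per-column path scales with the number of columns rather than the number of groups. Timing the two with `%timeit` on a frame with many groups is one way to check the gap locally.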