groupby().sum() very slow when applied to boolean columns #2692

Closed
lselector opened this issue Jan 14, 2013 · 1 comment
Labels: Bug, Performance (memory or execution speed performance)

Comments

@lselector

While upgrading pandas from 0.7.2 to 0.9.1, we ran into a slowdown in certain groupby().sum() operations. Here is a simple example (run in IPython):

from pandas import DataFrame

N = 10000
aa = DataFrame({'ii': range(N), 'bb': [True] * N})
%timeit aa.sum()                # fast
%timeit aa.groupby('bb').sum()  # fast
%timeit aa.groupby('ii').sum()  # very slow (~1000 times slower)
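For anyone who wants to check this outside IPython, here is a minimal sketch of the same comparison using the standard-library `timeit` module. The `best_time` helper and the integer-cast variant are assumptions added for illustration, not part of the original report. The cast is only a possible workaround to try, and whether it helps depends on the pandas version.

```python
import timeit

import pandas as pd

N = 10000
aa = pd.DataFrame({"ii": range(N), "bb": [True] * N})


def best_time(fn, repeat=3, number=1):
    """Hypothetical helper: best wall-clock time in seconds over `repeat` runs."""
    return min(timeit.repeat(fn, repeat=repeat, number=number))


# The three operations from the report.
timings = {
    "sum": best_time(lambda: aa.sum()),
    "groupby_bb": best_time(lambda: aa.groupby("bb").sum()),
    "groupby_ii": best_time(lambda: aa.groupby("ii").sum()),
}

# Assumed workaround to compare against: cast the boolean column to
# integers before grouping, so the sum runs on a numeric column.
as_int = aa.astype({"bb": "int64"})
timings["groupby_ii_int"] = best_time(lambda: as_int.groupby("ii").sum())
counts = as_int.groupby("ii")["bb"].sum()

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Comparing `groupby_ii` against `groupby_ii_int` on the affected version should show whether the boolean dtype is what triggers the slow path.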

@wesm
Member

wesm commented Jan 14, 2013

Strange. Thanks for letting me know. I will have a look.

ghost assigned wesm on Jan 19, 2013
wesm closed this as completed in b5b04e0 on Jan 19, 2013