groupby().sum() very slow when applied to boolean columns #2692

Closed
lselector opened this Issue Jan 14, 2013 · 1 comment

After upgrading pandas from 0.7.2 to 0.9.1, we found that certain groupby().sum() operations became very slow when the frame contains a boolean column. Here is a simple example (run in IPython):

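from pandas import DataFrame

N = 10000
# 'ii' holds N distinct integers; 'bb' is a boolean column that is always True
aa = DataFrame({'ii': range(N), 'bb': [True] * N})
%timeit aa.sum()                # fast
%timeit aa.groupby('bb').sum()  # fast: a single group
%timeit aa.groupby('ii').sum()  # very slow (~1000 times slower): N groups, boolean column summed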
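For readers without IPython, the same comparison can be sketched with the standard-library timeit module. The helper name time_call and the integer-cast variant at the end are illustrative assumptions, not something proposed in this thread; the cast is only a guess at a workaround, and the actual timings will depend on the pandas version installed.

```python
import timeit

import pandas as pd

N = 10000
aa = pd.DataFrame({'ii': range(N), 'bb': [True] * N})


def time_call(func, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` single runs.

    Hypothetical helper, introduced here only for illustration.
    """
    return min(timeit.repeat(func, number=1, repeat=repeat))


timings = {
    "sum": time_call(lambda: aa.sum()),
    "groupby('bb')": time_call(lambda: aa.groupby('bb').sum()),
    "groupby('ii')": time_call(lambda: aa.groupby('ii').sum()),
}

# Assumed workaround (not confirmed in this thread): cast the boolean
# column to integers before aggregating, then time the same groupby.
bb_as_int = aa.assign(bb=aa['bb'].astype('int64'))
timings["groupby('ii'), int bb"] = time_call(
    lambda: bb_as_int.groupby('ii').sum()
)

for label, seconds in timings.items():
    print(f"{label:24s} {seconds * 1e3:10.3f} ms")
```

If the slowdown is present, the "groupby('ii')" row should stand out against the others, and comparing it with the integer-cast row would indicate whether the boolean dtype is the cause.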

Owner

wesm commented Jan 14, 2013

Strange. Thanks for letting me know; I will have a look.

wesm was assigned Jan 19, 2013

wesm closed this in b5b04e0 Jan 19, 2013
