[QUESTION] Plans for an equivalent to pandas groupby? #341

Open
bscully27 opened this issue Apr 1, 2020 · 4 comments

@bscully27

I just started using this library, love it.

Quick question - are there any plans for an equivalent to pandas groupby?

Something like:
bn.group_by(matrix[:, :2]).reduce(matrix[:, -1], np.sum)
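
For reference, here is a minimal NumPy-only sketch of what that call would presumably compute: group rows by the first two columns and sum the last column within each group. `bn.group_by` is the hypothetical API proposed above and does not exist in bottleneck; `group_reduce_sum` and the sample `matrix` are illustrative names and data, not part of any library.

```python
import numpy as np


def group_reduce_sum(keys, values):
    """Sum `values` over rows of `keys` that are identical (illustrative helper)."""
    # Find the distinct key rows and, for every input row, the index of its group.
    unique_keys, inverse = np.unique(keys, axis=0, return_inverse=True)
    # ravel() guards against NumPy versions that return a non-1-D inverse with axis=0.
    sums = np.bincount(inverse.ravel(), weights=values, minlength=len(unique_keys))
    return unique_keys, sums


# Assumed sample data: two key columns followed by one value column.
matrix = np.array([
    [1, 1, 10.0],
    [1, 2, 20.0],
    [1, 1, 30.0],
    [2, 1, 40.0],
])

group_keys, group_sums = group_reduce_sum(matrix[:, :2], matrix[:, -1])
print(group_keys)  # distinct (col0, col1) pairs, sorted lexicographically
print(group_sums)  # one sum per distinct pair
```

This only covers a sum reduction; an arbitrary reducer such as the `np.sum` argument in the proposed call would need a different approach (for example sorting by group and splitting), which is likely where a C implementation would help most.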

@qwhelan
Collaborator

qwhelan commented Apr 2, 2020

To be honest, I hadn't considered it. Are you looking to avoid a pandas dependency, or do you see this as a way to get more performance?

@bscully27
Author

The latter, to get more performance. I believe pandas groupby has been optimized (not sure if via Cython), but I'd expect a bottleneck C function to provide substantial speed gains.

@qwhelan
Collaborator

qwhelan commented Apr 2, 2020

Okay, thanks for clarifying. I'll keep this open in case someone would like to try out PRs in this vein, but I probably won't take a more serious look at this myself until I clear out the backlog.

@max-sixty

FYI for anyone looking for these — numbagg has groupby functions. It makes a good complement to bottleneck...
