Avoid unnecessary step in pivot_table #5173

Merged
jcrist merged 1 commit into dask:master from dsaxton:pivot-mod
Jul 30, 2019

@dsaxton
Contributor

@dsaxton dsaxton commented Jul 30, 2019

  • [x] Passes black dask / flake8 dask

This PR avoids unnecessary calculation in dataframe.reshape.pivot_table when aggfunc is either "sum" or "count". Currently both aggregations (sums and counts) are calculated regardless of aggfunc, even though both are needed only when aggfunc == "mean".
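As a rough illustration of the idea (not the actual dask code), the sketch below builds only the partial aggregations that a given aggfunc needs. The function name `pivot_sketch`, the `(index, column, value)` row layout, and the use of plain dictionaries are all assumptions made for this example.

```python
from collections import defaultdict


def pivot_sketch(rows, aggfunc="mean"):
    """Pivot (index, column, value) rows, building only the partial
    aggregations that `aggfunc` needs. Hypothetical helper, not dask's API."""
    if aggfunc not in ("sum", "count", "mean"):
        raise ValueError("aggfunc must be 'sum', 'count' or 'mean'")

    # Only "mean" needs both partial results; "sum" and "count" need one each.
    need_sum = aggfunc in ("sum", "mean")
    need_count = aggfunc in ("count", "mean")

    sums = defaultdict(float) if need_sum else None
    counts = defaultdict(int) if need_count else None

    for index, column, value in rows:
        key = (index, column)
        if need_sum:
            sums[key] += value
        if need_count:
            counts[key] += 1

    if aggfunc == "sum":
        return dict(sums)
    if aggfunc == "count":
        return dict(counts)
    # Mean is the only case that combines both partial results.
    return {key: sums[key] / counts[key] for key in sums}


rows = [("a", "x", 1.0), ("a", "x", 3.0), ("a", "y", 5.0), ("b", "x", 2.0)]
print(pivot_sketch(rows, "sum"))
print(pivot_sketch(rows, "mean"))
```

With aggfunc set to "sum" or "count", the other accumulator is never created, which mirrors the step this PR skips.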

@TomAugspurger
Member

Just to be clear, we weren't doing the actual calculation unnecessarily. We just built the task graph unnecessarily.

The changes here seem fine though, if ever so slightly harder to follow. Will merge tomorrow if there aren't any objections.
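The distinction between building a task graph and doing the calculation can be sketched with a toy graph. Dask graphs are dictionaries that map keys to tasks, where a task is a tuple whose first element is a callable. The `get` function below is a deliberately simplified stand-in written for this example, not dask's scheduler, and the key names are hypothetical.

```python
def get(graph, key, calls):
    """Evaluate `key` recursively, recording which tasks actually ran.
    Simplified illustration only; not dask's scheduler."""
    task = graph[key]
    if isinstance(task, tuple) and callable(task[0]):
        func, *args = task
        # Arguments that name other graph keys are evaluated first.
        resolved = [
            get(graph, a, calls) if isinstance(a, str) and a in graph else a
            for a in args
        ]
        calls.append(key)
        return func(*resolved)
    # Anything that is not a task tuple is treated as literal data.
    return task


values = [1.0, 3.0, 5.0]
graph = {
    "data": values,
    "sum": (sum, "data"),
    "count": (len, "data"),  # present in the graph, but not needed for a sum
}

calls = []
result = get(graph, "sum", calls)
print(result, calls)
```

The "count" task is part of the graph but is never run when only "sum" is requested, so the cost of the extra entry is in constructing the graph rather than in computing a result.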

@dsaxton dsaxton changed the title from "Avoid unnecessary aggregation in pivot_table" to "Avoid unnecessary step in pivot_table" Jul 30, 2019
@jcrist jcrist merged commit e0a7723 into dask:master Jul 30, 2019
@dsaxton dsaxton deleted the pivot-mod branch July 30, 2019 15:56