Commit
Avoid groupby.agg(callable) in groupby-var (#4482)
This has two benefits:

1. It's much faster; the following benchmark shows a 5x improvement.
2. It doesn't require the pandas-like container to implement groupby.agg(callable), which helps cudf.

Benchmark
---------

I get five-ish seconds for this on master and less than one second on this branch.

```
from time import time
import dask

# Synthetic timeseries dataset, kept in memory so only the groupby is timed
df = dask.datasets.timeseries(dtypes={'id': int, 'data': float}).persist()

start = time()
for i in range(3):
    df.groupby('id').data.std().compute()
stop = time()
print(stop - start)
```
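To illustrate why dropping groupby.agg(callable) can help, here is a minimal sketch in plain pandas of a grouped variance built only from the built-in `count` and `sum` reductions. It is not the code from this commit: the helper name `groupby_var` and the sample frame are hypothetical, and the sum-of-squares formula is an assumption about the general approach.

```python
import numpy as np
import pandas as pd


def groupby_var(df, by, col, ddof=1):
    """Grouped variance using only built-in count/sum reductions (illustrative helper)."""
    g = df.groupby(by)[col]
    n = g.count()                              # observations per group
    s = g.sum()                                # sum(x) per group
    s2 = (df[col] ** 2).groupby(df[by]).sum()  # sum(x^2) per group
    # Sum of squared deviations: sum(x^2) - (sum(x))^2 / n
    ss = s2 - s ** 2 / n
    denom = n - ddof
    # Groups with too few observations get NaN, matching pandas' behaviour
    return ss / denom.where(denom > 0)


# Hypothetical sample data
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3],
    "data": [1.0, 2.0, 4.0, 3.0, 5.0, 7.0],
})

result = groupby_var(df, "id", "data")
expected = df.groupby("id")["data"].var()
print(np.allclose(result.values, expected.values, equal_nan=True))

# Standard deviation is the square root of the variance
std = np.sqrt(result)
```

Because every step is a named reduction rather than an arbitrary Python callable, a pandas-like container only needs to support `count` and `sum` on groups. One caveat of this formula is that it can lose precision when the mean is large relative to the spread of the data.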