Only rechunk for median if needed #7782

Merged
jsignell merged 4 commits into dask:main from jsignell:median
Jun 10, 2021

@jsignell
Member

From original issue:

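import dask.array as da
# 100000 x 100000 array; each chunk holds 10 complete rows
x = da.random.normal(10, 0.1, size=(100000, 100000), chunks=(10, 100000))
# Median along axis 1, which is already a single chunk per row block
y = da.reductions.median(x, axis=1)
# %time is an IPython magic, so run this line in IPython or Jupyter
%time z = y.compute()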

# CPU times: user 17min 32s, sys: 2.68 s, total: 17min 35s
# Wall time: 1min 36s
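
To illustrate the idea in the title, here is a minimal sketch of an "only rechunk if needed" check. It works on a chunks tuple shaped like a dask array's `.chunks` attribute (a tuple of per-axis chunk sizes). The helper names `needs_rechunk` and `chunks_for_median` are hypothetical and are not taken from this PR's diff; the sketch only assumes that a median along an axis needs that axis in a single chunk.

```python
from typing import Tuple

# Same shape as a dask array's `.chunks`: one tuple of chunk sizes per axis.
Chunks = Tuple[Tuple[int, ...], ...]


def needs_rechunk(chunks: Chunks, axis: int) -> bool:
    """Return True if the array is split into more than one chunk along `axis`."""
    return len(chunks[axis]) > 1


def chunks_for_median(chunks: Chunks, axis: int) -> Chunks:
    """Hypothetical helper: merge `axis` into one chunk only when it is split."""
    if not needs_rechunk(chunks, axis):
        # Already a single chunk along `axis`, so keep the layout untouched.
        return chunks
    merged = (sum(chunks[axis]),)
    return tuple(merged if i == axis else c for i, c in enumerate(chunks))


# Scaled-down version of the layout in the example: full rows, 10 rows per chunk.
example_chunks: Chunks = ((10,) * 4, (100,))

print(needs_rechunk(example_chunks, axis=1))      # one chunk along axis 1
print(needs_rechunk(example_chunks, axis=0))      # four chunks along axis 0
print(chunks_for_median(example_chunks, axis=0))  # axis 0 merged into one chunk
```

With `chunks=(10, 100000)` as in the example, axis 1 is already a single chunk, so a median along `axis=1` falls in the case where no rechunk is required.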

github-actions bot added the array label Jun 10, 2021
@mrocklin
Member

In general this looks good to me +1

jsignell and others added 3 commits June 10, 2021 12:06
Co-authored-by: Matthew Rocklin <mrocklin@gmail.com>
@jsignell
Member Author

Thanks for the review @mrocklin! I'll merge when green.

jsignell merged commit 7fe24a3 into dask:main Jun 10, 2021
jsignell deleted the median branch June 10, 2021 16:46

Development

Successfully merging this pull request may close these issues.

High memory usage for reductions.median but not reductions.mean