Add quadratic mean xarray workload #896

Merged
merged 1 commit into from
Jul 4, 2023
Member

@fjetter fjetter commented Jun 30, 2023

This is an xarray workload that is apparently a stereotypical example of workloads that don't run well on dask; see pangeo-data/distributed-array-examples#2

In its current form, it spills heavily and runs for about 2:30 min even though it should be an almost embarrassingly parallel reduction. The culprit is dask.order, see dask/dask#10384. (I have a somewhat messy dask.order prototype where this workload runs in ~30s.)
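For readers unfamiliar with the shape of this kind of workload, here is a minimal sketch of a quadratic-mean reduction in xarray on dask-backed arrays. It is not the code merged in this PR: the variable names (`anom_u`, `anom_v`, `uv`), dimension names, shapes, and chunking are assumptions chosen for illustration, loosely modeled on the linked pangeo example and shrunk so it runs locally in a moment.

```python
import dask.array as da
import xarray as xr

# Hypothetical, deliberately tiny shapes; a real benchmark would be far larger.
dims = ["time", "face", "j", "i"]
shape = (40, 1, 16, 32)
chunks = (4, 1, -1, -1)  # chunk along time only, full extent elsewhere

ds = xr.Dataset(
    {
        "anom_u": (dims, da.random.random(shape, chunks=chunks)),
        "anom_v": (dims, da.random.random(shape, chunks=chunks)),
    }
)

quad = ds**2                        # square of each variable
quad["uv"] = ds.anom_u * ds.anom_v  # cross term between the two variables
mean = quad.mean("time")            # lazy reduction over the time dimension

# Nothing is computed until here; the task ordering decides how much
# intermediate data has to be held in memory at once.
result = mean.compute()
```

Each time chunk can be squared and reduced independently, which is why the reduction should be close to embarrassingly parallel.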

I'm not writing this test case to adapt to cluster size, mostly because in my experience the automatic scaling functions do not do exactly what they are supposed to, and I don't consider the value-add worth the time spent tweaking this to get it into the same shape.

Member

@hendrikmakait hendrikmakait left a comment

Thanks, @fjetter!

@hendrikmakait hendrikmakait merged commit acbcd92 into main Jul 4, 2023
13 of 14 checks passed
@fjetter fjetter deleted the quadratic_mean branch July 5, 2023 09:26
2 participants