Integration test: common physical science workload #174

Closed · Tracked by #138

gjoseph92 opened this issue Jun 13, 2022 · 1 comment

Labels: stability (work related to stability), test idea

Comments

gjoseph92 (Contributor) commented Jun 13, 2022

dask/distributed#6571

I believe this could be simplified to something like:

import dask.array as da

# Two large 3D arrays and two 2D fields; chunks=100 gives
# 100-element chunks along every axis.
a = da.random.random((1000, 900, 800), chunks=100)
b = da.random.random((1000, 900, 800), chunks=100)
x = da.random.random((900, 800), chunks=100)
y = da.random.random((900, 800), chunks=100)

# The 2D fields broadcast against the trailing axes of the sliced
# 3D arrays; the mean then reduces everything to a scalar.
result = a[1:] * x + b[1:] * y
result.mean().compute()

This is basically the vorticity example from dask/distributed#6560 (comment). (Ideally downsized a bit, so it's faster and cheaper to run frequently.)

Add rechunks before the result operation for bonus complexity, as sketched below. (Note that the data and chunk sizes here are just representative and would need to be scaled up to something larger than the cluster's memory.)
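
A minimal sketch of that variant (the new chunk sizes are my own illustration, not from the original workload):

# Hypothetical extra step: rechunking inserts an all-to-all shuffle into
# the graph, giving the scheduler many more task dependencies to order.
a = a.rechunk((50, 200, 100))
b = b.rechunk((50, 200, 100))

result = a[1:] * x + b[1:] * y
result.mean().compute()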

This is expected to perform poorly right now (use a ton of memory, spill a ton to disk, and be really slow) because of the scheduling issues linked above.

As we work on fixes to those issues, we should hopefully see performance improve.

gjoseph92 (Contributor, Author) commented
I think I already did this in #243: https://github.com/coiled/coiled-runtime/blob/669a3d36eefec2c72af12dae08e2bee8db4a7de5/tests/benchmarks/test_array.py#L90-L133

I used the original vorticity example rather than the further-simplified one here, but I think they're equivalent in terms of the problems they exercise.

One big difference is that I did wait(arr_to_devnull(result)) instead of wait(result). In @TomNicholas's original workload, he wanted to write the array to zarr (I think, right?). In my benchmarks in dask/distributed#6560 (comment), the "something else is going on here" thing that confused me was that memory just went 📈 and ended at the same level in all cases.

I realized later that I was persisting the entire result, so dask had to keep all the output in memory; of course it used a lot. In #243 I instead simulated writing it to zarr, which in theory should work with a much-larger-than-memory array, but didn't due to all the problems listed.
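
For reference, here's a minimal sketch of what a helper like arr_to_devnull could look like; the actual helper in #243 may differ, and the Devnull class here is my own illustration:

import dask.array as da

class Devnull:
    """A write-only "store": accepts chunk assignments and discards the
    data, so each output chunk can be released as soon as it's written."""

    def __setitem__(self, key, value):
        pass

def arr_to_devnull(arr: da.Array):
    # compute=False returns a Delayed; computing (or wait-ing on) it streams
    # every chunk through Devnull without holding the full array in memory.
    return arr.store(Devnull(), lock=False, compute=False)

Unlike persist, nothing here asks dask to retain the computed chunks, so it mimics a zarr write without the actual I/O.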
