Support chunking the time/band dimensions #116

Merged
gjoseph92 merged 22 commits into main from from-array on Feb 2, 2022

Conversation

gjoseph92
Owner

Closes #106, #105, progress towards #26.

This lets you specify, for example, `chunksize=(20, 1, 2048, 2048)`, or even `chunksize="auto"`. I'm hoping this will help get graph size under control for large arrays. Particularly when you're doing composites, chunking a few assets together will greatly reduce graph size.

The neat thing is that through just this parameter, you can totally alter the parallelism strategy for a workflow. Are you doing something that's spatially embarrassingly parallel, where intuitively you'd just do a for-loop over tiles? Use chunksize=(-1, 1, 1024, 1024) to process every image within a 1024x1024 tile. Is it spatially parallel, but you want to batch by time as well to keep peak memory down? Try chunksize=(40, 1, 1024, 1024) to keep the 1024x1024 tiles, but only in stacks of 40 images at a time. Or even just chunksize="200MiB" to get reasonably-sized chunks every time.
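For concreteness, here's a minimal sketch of those three strategies (the `items` variable is a placeholder for whatever STAC items you've already searched for; it isn't part of this PR):

```python
import stackstac

# `items` is assumed to be a STAC ItemCollection / list of Items you've already fetched.

# Spatially embarrassingly parallel: every item within each 1024x1024 tile in one chunk.
skewers = stackstac.stack(items, chunksize=(-1, 1, 1024, 1024))

# Spatially parallel, but batched 40 items at a time to keep peak memory down.
batched = stackstac.stack(items, chunksize=(40, 1, 1024, 1024))

# Or just ask for reasonably-sized chunks every time.
auto = stackstac.stack(items, chunksize="200MiB")
```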

Implementation-wise, when there are multiple assets in a single chunk, we just load them serially within the task. I think this deserves some research; I have a feeling total runtime might be slower with fewer multi-asset chunks than with more single-asset chunks. A small threadpool to overlap the IO might be reasonable, along the lines of the sketch below.
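(To be clear about what "a small threadpool" could look like, here's a rough, hypothetical sketch. This is not what the PR does; the `fetch_chunk_threaded` helper and the `reader.read(window)` call are assumptions about the reader protocol, for illustration only.)

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def fetch_chunk_threaded(readers, windows, shape, dtype, fill_value, max_workers=4):
    """Hypothetical alternative to serial loading: overlap per-asset IO within one chunk."""
    out = np.full(shape, fill_value, dtype=dtype)

    def read_one(i):
        # Assumes each reader exposes a `read(window)` method returning an array
        # for that asset's window.
        return i, readers[i].read(windows[i])

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for i, data in pool.map(read_one, range(len(readers))):
            out[i] = data
    return out
```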

This also further speeds up array generation and reduces graph memory usage by using a `BlockwiseDep` (which will soon be available from dask/dask#7417) to represent the spatial component of the array, instead of the materialized `slices_from_chunks` graph we had before, which could get rather large for spatially-enormous arrays.
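For a sense of scale, here's roughly what the old approach had to hold up front (the mosaic size is a made-up example; `slices_from_chunks` is the dask internal the previous implementation used):

```python
from dask.array.core import slices_from_chunks

# e.g. a 262144x262144-pixel mosaic in 1024x1024 spatial chunks: 256 x 256 = 65,536 blocks
chunks = ((1024,) * 256, (1024,) * 256)
slices = slices_from_chunks(chunks)
print(len(slices))  # 65536 tuples of slice objects, all materialized in the graph up front
```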

And, finally, this adds some tests! They're not enabled in CI yet (that will be a follow-up PR), but there are Hypothesis tests for the core `to_dask` routine.
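(As a flavor of what such property tests look like, here's an illustrative sketch in the same spirit, checking a chunking invariant with dask's `normalize_chunks`; it is not the actual test code from this PR.)

```python
import hypothesis.strategies as st
from hypothesis import given
from dask.array.core import normalize_chunks

@given(
    shape=st.tuples(*[st.integers(min_value=1, max_value=50)] * 4),
    chunk=st.integers(min_value=1, max_value=50),
)
def test_chunks_always_tile_the_shape(shape, chunk):
    # Whatever chunksize is requested, the normalized chunks must exactly cover the array.
    chunks = normalize_chunks(chunk, shape=shape)
    assert tuple(sum(c) for c in chunks) == shape
```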

cc @TomAugspurger

Still pretty complicated. We probably need some helpers in order to test the more interesting stuff. It would also be nice to use Hypothesis for this.
Opening the asset table probably works; reads are still TODO. A few missteps along the way:
* `np.vectorize` on `asset_table_to_reader_and_window` is segfaulting. It may have to do with all the extra arguments? I don't care to gdb it; I just moved the vectorization into the function.
* Blockwise wants to fuse the reader table into the reads. I'm not sure it's good that blockwise optimization tries to fuse such a large expansion, but it probably makes some sense. Unfortunately, the way to disable it (annotations) feels like a bit of a hack.
Passing dtype and fill_value into `fetch_raster_window`, instead of storing them in the reader table, simplifies things a lot. It also lets us get rid of those fields on the reader protocol.
[PEP 585](https://www.python.org/dev/peps/pep-0585/) is only available in Python 3.9. We're still supporting 3.8 for now; the Planetary Computer uses it, as probably do many others.
@gjoseph92 gjoseph92 marked this pull request as ready for review February 1, 2022 00:25
@gjoseph92
Owner Author

Update: I've tested this branch out for making moderate-sized median composites, and it does help, but it's not a panacea.

I'm doing `stack(..., chunksize=(-1, 1, "128MiB", "128MiB")).median("time")` on a small cluster (4 workers, 16 CPUs, 128GiB memory), loading Sentinel-2 from the Planetary Computer.

Note that the -1 chunksize in the first dimension basically makes "spatial skewers": each chunk is a small spatial tile that contains every item along the time axis, i.e. one tall array per tile. This is great specifically for medians, since that's the chunk pattern you need for median("time") anyway, so it saves an expensive rechunking step.
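(A quick illustration of the two chunk structures; the shapes here are made up, and this just shows why the skewer layout skips the rechunk:)

```python
import dask.array as da

# 400 items over an 8192x8192 grid, mimicking stackstac's (time, band, y, x) layout
per_item = da.zeros((400, 1, 8192, 8192), chunks=(1, 1, 1024, 1024))
skewers = da.zeros((400, 1, 8192, 8192), chunks=(-1, 1, 1024, 1024))

# A median over time needs the whole time axis in one chunk, so the per-item
# layout has to pay for an all-to-all rechunk first; the skewers already have it.
rechunked = per_item.rechunk({0: -1})
assert rechunked.chunks[0] == skewers.chunks[0] == (400,)
```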

Here's cluster memory while computing this median composite (dropping each chunk immediately after it's computed, simulating a .to_zarr at the end or something):

[Figure: cluster memory usage over time while computing the median composite]

Blue is with -1 temporal chunks on this PR; orange is the current behavior with temporal chunks of 1.

Clearly cluster memory usage is way lower, and runtime is about the same—great!

The dashboard also looks a bit cleaner (fewer transfers). Plus, there are 10x fewer tasks with -1 chunks!

-1 temporal chunks with this PR:
[Dashboard screenshot: "skewers"]

Temporal chunks of 1, current behavior:
[Dashboard screenshot: "spatio-temporal"]


BUT as I said, it's not a panacea. I've tried doing some actually-big composites, like processing 30TiB, and the graph is still just way too big. When your input data is 200k chunks and you do three operations, you have a million tasks. After uploading 800MiB of graph to the scheduler (!!), I'm still waiting 20 minutes later for anything to happen.

Additionally, probably 95% of the data is NaN, generated by stackstac for the pixels out of bounds of each asset. stackstac is very efficient about generating these NaNs—it's super fast and they take up almost no space—but in the end, those fake broadcast-trick arrays will likely get expanded into full-size NumPy arrays by downstream operations. And they represent tons of useless tasks which just bog down the scheduler.
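(For anyone unfamiliar with the broadcast trick: the NaN fill is a single value viewed at full size via zero strides, so it costs almost nothing until something downstream writes out a real result. The shape below is arbitrary.)

```python
import numpy as np

# One 8-byte NaN, viewed as a (10, 1, 1024, 1024) array via zero strides.
nan_fill = np.broadcast_to(np.nan, (10, 1, 1024, 1024))
print(nan_fill.strides)  # (0, 0, 0, 0) -> no per-element storage

# But any downstream op that produces a real result materializes the full array:
materialized = nan_fill + 0.0
print(materialized.nbytes)  # 83886080 bytes (~80 MiB) of actual memory
```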

In the end, I just don't believe that (dask) arrays / DataArrays are a good data structure for geospatial operations at a large scale. At small to medium scale they're decent, since they're intuitive and have so much functionality built in. But at larger scales, the lack of sparseness becomes problematic, and you really need to push any compositing/filtering/grouping into the data-loading step. In theory this is the sort of thing optimizations should be good for, but in reality it would be very hard to express currently (dask/dask#7933).

@gjoseph92 gjoseph92 merged commit 0bc305f into main Feb 2, 2022
@gjoseph92 gjoseph92 deleted the from-array branch February 2, 2022 05:41