[ROI] Per-ROI parallelization introduces dask overhead (with respect to per-chunk parallelization) #26
Comments
Closing this issue would require a comparison of run times for the per-chunk and per-FOV illumination correction tasks on the same dataset.
Cool, very visual way to understand this, thanks @tcompa! Can we run e.g. the illumination correction for the 10-well, 5x5 example in the old and new setup? If they remain within a ~10% run-time range, I don't think we'd need to worry much about this. Also, this explains why it's safe to run multi-ROI in parallel within a single job, I'd guess :)
The current discussion (see #27) is rather about memory, while the run times of illumination correction are under control (in one per-ROI version they are even better than the old per-chunk ones). I think we can close this issue as soon as we are happy with #27 (because otherwise subsequent changes would require re-testing the run times).
TL;DR
Working on an array through ROI indices (rather than chunks) produces more complex dask graphs and has a (small?) time overhead. This is not surprising, given the increased flexibility we aim for.
With per-ROI parallelization, the elements of a dask array (identified by their indices) are populated with values obtained from a delayed function (e.g. illumination correction, or labeling). The parallelization based on this loop of assignments through indices is more complex than the one provided by `map_blocks`, which acts directly on chunks (see the sketch below).

As an example, here are two dask graphs for the OLD (per-chunk) and NEW (per-ROI) illumination-correction tasks (with `overwrite=False`, but that doesn't matter), acting on an artificial array with shape `(2, 2, 4320, 2560)`. The on-disk array has 8 chunks (2 channels, 2 Z planes and 2 images), and indeed we notice 8 branches in the OLD graph. The NEW graph also has 8 branches (one per ROI), but with a more complex structure.
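For illustration, here is a minimal, self-contained sketch of the two schemes (not the actual task code: `correct` is a hypothetical stand-in for the per-block correction function, and the ROIs are chosen to coincide with the 8 chunks):

```python
import numpy as np
import dask.array as da
from dask import delayed


def correct(img: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for the per-block illumination correction.
    return img * 1.0


# Artificial array as in the example above: 8 on-disk chunks
# (2 channels x 2 Z planes x 2 images stacked along Y).
data = da.zeros((2, 2, 4320, 2560), chunks=(1, 1, 2160, 2560), dtype=np.float32)

# OLD (per-chunk): map_blocks applies the function to each chunk
# directly, yielding one simple branch per chunk in the task graph.
result_old = data.map_blocks(correct)

# NEW (per-ROI): loop over ROI index ranges and assign delayed results
# into the array (requires a dask version with __setitem__ support).
# Here the 8 ROIs coincide with the chunks, but in general they need not.
result_new = data.copy()
for c in range(2):
    for z in range(2):
        for i in range(2):
            region = (
                slice(c, c + 1),
                slice(z, z + 1),
                slice(2160 * i, 2160 * (i + 1)),
                slice(0, 2560),
            )
            shape = tuple(s.stop - s.start for s in region)
            result_new[region] = da.from_delayed(
                delayed(correct)(data[region]), shape=shape, dtype=data.dtype
            )
```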
Timing these tests shows a small additional overhead in the ROI-based version (about 0.5 s, on a total runtime of about 5 s). This is not something we can fully get rid of, since we switched from the natural (per-chunk) parallelization scheme to an arbitrary one. Still, we should check that it remains under control for an example at scale.
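Assuming the `result_old` / `result_new` arrays from the sketch above, one way to see this overhead is to compare graph sizes and wall-clock times (the ~0.5 s figure will of course vary with hardware and scheduler):

```python
import time

# Graph size: the per-ROI version carries extra slicing/assignment
# tasks on top of the correction tasks themselves.
print("OLD graph tasks:", len(result_old.__dask_graph__()))
print("NEW graph tasks:", len(result_new.__dask_graph__()))

# Rough wall-clock comparison.
for label, arr in (("OLD", result_old), ("NEW", result_new)):
    t0 = time.perf_counter()
    arr.compute()
    print(f"{label}: {time.perf_counter() - t0:.2f} s")

# Graphs like the ones below can be rendered (with graphviz installed) via:
#   result_old.visualize("old.png")
#   result_new.visualize("new.png")
```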
OLD (per-chunk): [dask task-graph image]
NEW (per-ROI): [dask task-graph image]