Lazy assembly for hand_mfd dask path (#1416) #1417

Open
brendancol wants to merge 1 commit into main from issue-1416

@brendancol
Contributor

Summary

  • hand_mfd._hand_mfd_dask now assembles via da.map_blocks instead of
    computing every tile into a list and stitching with da.block. Driver
    memory scales with chunk size again, matching hand_dinf and the rest
    of the hydro subpackage.
  • The boundary-propagation phase was already streaming one tile at a
    time; only the assembly phase is changed.
  • Two regression tests verify the dask graph keeps one task per output
    chunk rather than collapsing to a single materialized array.

Closes #1416.

Test plan

  • pytest xrspatial/hydro/tests/test_hand_mfd.py — 20 passed
  • pytest xrspatial/hydro/tests/ — 774 passed
  • New regression tests fail on the previous implementation and pass
    on this branch.

The dask assembly phase in `hand_mfd._hand_mfd_dask` previously computed
every tile via `blocks[iy, ix].compute()` into a list-of-lists of numpy
arrays, then called `da.block(rows)`.  Driver memory held the full grid
at once, defeating the per-tile streaming used elsewhere in the hydro
subpackage.

Replace the eager block-list with `da.map_blocks` over `(flow_accum_da,
elev_da)`, slicing `fractions_da` per tile inside the closure.  The
converged `boundaries` and `frac_bdry` snapshots are captured by the
closure so each tile is recomputed lazily on demand.  Matches the
pattern already used by `hand_dinf._hand_dinf_dask`.
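
The contrast between the two assembly strategies can be sketched as follows. This is a minimal illustration, not the code in this PR: `assemble_eager`, `assemble_lazy`, and `tile_offsets` are hypothetical names, and a per-tile subtraction stands in for the real HAND kernel. `tile_offsets` plays the role of the converged boundary snapshot captured by the closure.

```python
import numpy as np
import dask.array as da


def assemble_eager(elev_da, tile_offsets):
    """Old pattern (sketch): compute every tile up front, then stitch."""
    n_rows, n_cols = elev_da.numblocks
    rows = []
    for iy in range(n_rows):
        row = []
        for ix in range(n_cols):
            # .compute() pulls the tile into driver memory immediately
            tile = elev_da.blocks[iy, ix].compute()
            row.append(da.from_array(tile - tile_offsets[iy, ix]))
        rows.append(row)
    # Every processed tile is already held in memory at this point
    return da.block(rows)


def assemble_lazy(elev_da, tile_offsets):
    """New pattern (sketch): one deferred task per chunk via map_blocks."""

    def _tile_fn(elev_tile, block_info=None):
        # chunk-location identifies which tile this call is processing,
        # so the closure can look up that tile's captured state
        iy, ix = block_info[0]["chunk-location"]
        return elev_tile - tile_offsets[iy, ix]

    # Passing meta avoids a trial call of _tile_fn during graph construction
    return da.map_blocks(
        _tile_fn,
        elev_da,
        dtype=elev_da.dtype,
        meta=np.array((), dtype=elev_da.dtype),
    )
```

Both functions return the same values. The difference is when the work happens: the eager version runs every tile while the graph is being built, whereas the lazy version runs a tile only when its chunk is requested.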

The boundary-propagation phase already streamed one tile at a time, so
it is unchanged.

Add two regression tests:
- `test_dask_result_is_lazy` checks that the result preserves chunking.
- `test_dask_assembly_does_not_materialize_all_tiles` checks the dask
  graph contains one task per output chunk rather than collapsing to a
  single materialized array.
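
A laziness check along these lines might look like the sketch below. It is an assumption about the general shape of such a test, not the tests added in this PR: `lazy_op` is a hypothetical stand-in for the dask path under test, and the call counter is one possible way to show that no tile is evaluated before `.compute()`.

```python
import numpy as np
import dask.array as da
from dask.core import flatten


def lazy_op(arr, calls):
    """Hypothetical stand-in for the dask code path under test."""

    def _tile_fn(tile):
        calls.append(tile.shape)  # record each real tile evaluation
        return tile * 2.0

    return da.map_blocks(
        _tile_fn, arr, dtype=arr.dtype, meta=np.array((), dtype=arr.dtype)
    )


def check_lazy_assembly():
    data = np.arange(36, dtype="f8").reshape(6, 6)
    arr = da.from_array(data, chunks=(3, 3))  # 2 x 2 grid of chunks
    calls = []
    result = lazy_op(arr, calls)

    # The result is still a dask array with the input chunking
    assert isinstance(result, da.Array)
    assert result.chunks == arr.chunks
    # Building the graph must not evaluate any tile
    assert calls == []

    # The graph holds one output key per chunk
    out_keys = list(flatten(result.__dask_keys__()))
    graph = dict(result.__dask_graph__())
    assert len(out_keys) == 4
    assert all(key in graph for key in out_keys)

    # Tiles are evaluated only once compute is requested
    computed = result.compute(scheduler="synchronous")
    assert len(calls) == 4
    np.testing.assert_array_equal(computed, data * 2.0)
    return len(out_keys), len(calls)
```

An eager implementation would fail the `calls == []` assertion, because its tiles are computed while the result is being constructed.
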
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 2, 2026
Development

Successfully merging this pull request may close these issues.

hand_mfd dask path materializes all tiles eagerly