
Guard stream_link_mfd() against unbounded memory allocations (#1337) #1341

Merged

brendancol merged 1 commit into main from issue-1337 on Apr 29, 2026
Conversation

@brendancol (Contributor)

Fixes #1337.

Summary

  • Add _BYTES_PER_PIXEL = 97 and _GPU_BYTES_PER_PIXEL = 100 to xrspatial/hydro/stream_link_mfd.py, plus the four helpers (_available_memory_bytes, _available_gpu_memory_bytes, _check_memory, _check_gpu_memory) mirrored from Guard stream_link_d8() against unbounded memory allocations (#1333) #1335; a sketch of this layout follows the list.
  • Wire _check_memory / _check_gpu_memory into the numpy and cupy branches of stream_link_mfd before any H*W allocation. Dask and dask+cupy are left untouched (per-tile allocation already bounded).
  • The guard uses a 50% threshold; the raised MemoryError names the grid dimensions and points to the dask alternative.
  • 5 new tests covering: oversize numpy raises, normal numpy succeeds, dask path skips the guard, error message includes the dimensions, oversize cupy raises (gated on GPU availability).
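
For readers outside the diff view, here is a minimal sketch of what that layout likely looks like. The constant values and helper names are taken from this PR; the bodies (psutil for host memory, cupy's runtime API for device memory) and the message wording are a reconstruction, not the merged code:

```python
import psutil  # assumed here; the merged helper may read MemAvailable differently

_BYTES_PER_PIXEL = 97       # CPU working set, derived under "Byte budget" below
_GPU_BYTES_PER_PIXEL = 100  # conservative GPU budget (~93 B/px measured)


def _available_memory_bytes() -> int:
    # psutil's "available" tracks /proc/meminfo MemAvailable on Linux.
    return psutil.virtual_memory().available


def _available_gpu_memory_bytes() -> int:
    import cupy
    free_bytes, _total_bytes = cupy.cuda.runtime.memGetInfo()
    return free_bytes


def _check_memory(height: int, width: int) -> None:
    projected = height * width * _BYTES_PER_PIXEL
    budget = _available_memory_bytes() // 2  # 50% threshold
    if projected > budget:
        raise MemoryError(
            f"stream_link_mfd: a {height}x{width} grid needs ~{projected / 1e9:.1f} GB, "
            f"over 50% of available RAM; consider the dask-backed input path."
        )


def _check_gpu_memory(height: int, width: int) -> None:
    projected = height * width * _GPU_BYTES_PER_PIXEL
    budget = _available_gpu_memory_bytes() // 2
    if projected > budget:
        raise MemoryError(
            f"stream_link_mfd: a {height}x{width} grid needs ~{projected / 1e9:.1f} GB "
            f"of GPU memory, over 50% of free VRAM; consider the dask+cupy path."
        )
```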

Byte budget

CPU peak working set per pixel:

  • frac input copy (8, H, W) float64 -> 64
  • stream_mask int8 -> 1
  • link_id float64 -> 8
  • in_degree int32 -> 4
  • orig_indeg int32 -> 4
  • queue_r int64 -> 8
  • queue_c int64 -> 8

Total ~97 B/px (vs. ~32 B/px for stream_link_d8). The (8, H, W) fractions copy dominates.

GPU peak working set per pixel:

  • fractions_f64 (8, H, W) -> 64
  • stream_mask_i8 -> 1
  • in_degree -> 4
  • orig_indeg -> 4
  • state -> 4
  • link_id -> 8
  • fa_cp -> 8

Total ~93 B/px. Use 100 B/px as the budget.
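
Both totals can be sanity-checked by summing the components above:

```python
# Per-pixel byte components taken verbatim from the two breakdowns above.
cpu = {"frac copy (8,H,W) f64": 64, "stream_mask i8": 1, "link_id f64": 8,
       "in_degree i32": 4, "orig_indeg i32": 4, "queue_r i64": 8, "queue_c i64": 8}
gpu = {"fractions_f64 (8,H,W)": 64, "stream_mask_i8": 1, "in_degree": 4,
       "orig_indeg": 4, "state": 4, "link_id": 8, "fa_cp": 8}

assert sum(cpu.values()) == 97  # used directly as _BYTES_PER_PIXEL
assert sum(gpu.values()) == 93  # rounded up to 100 for _GPU_BYTES_PER_PIXEL
```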

Test plan

  • pytest xrspatial/hydro/tests/test_stream_link_mfd.py (11 passed)
  • full hydro suite (635 passed; one unrelated pre-existing failure in test_basin_d8.py::test_basin_dask_temp_cleanup deselected)
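
The tests themselves aren't shown on this page; one plausible shape for the oversize-numpy case is to stub the availability helper so the test is independent of host RAM (the module path, the (height, width) argument order, and the message format are assumptions):

```python
import pytest

import xrspatial.hydro.stream_link_mfd as slm  # module path assumed


def test_oversize_numpy_raises(monkeypatch):
    # Pretend only 1 MiB of host RAM is available; at 97 B/px even a
    # modest grid then exceeds the 50% budget.
    monkeypatch.setattr(slm, "_available_memory_bytes", lambda: 1024 * 1024)
    with pytest.raises(MemoryError, match="1000x1000"):  # "HxW" in message assumed
        slm._check_memory(1000, 1000)
```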

Add a 50% RAM/VRAM check to the numpy and cupy branches of
stream_link_mfd before any H*W allocation. MFD's working set is
~97 B/px on the CPU (driven by the (8, H, W) fractions input copy)
and ~93 B/px on the GPU; the guard uses 100 B/px as a conservative
GPU budget.

Mirrors the helper layout from #1335 (stream_link_d8). The dask
and dask+cupy paths process per-tile and are not gated.
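
The wiring itself amounts to a pre-allocation check in the two eager branches only; roughly like the following (the dispatch shape and the private kernel names are illustrative, not the merged diff; _check_memory / _check_gpu_memory are as sketched earlier):

```python
import numpy as np
import dask.array as da

try:
    import cupy
except ImportError:
    cupy = None


# Placeholders standing in for the real eager/dask kernels:
def _stream_link_numpy(agg): ...
def _stream_link_cupy(agg): ...
def _stream_link_dask(agg): ...


def _dispatch(fractions_agg):
    data = fractions_agg.data
    _, height, width = data.shape  # fractions input is (8, H, W)

    if isinstance(data, np.ndarray):
        _check_memory(height, width)       # guard before any H*W allocation
        return _stream_link_numpy(fractions_agg)
    if cupy is not None and isinstance(data, cupy.ndarray):
        _check_gpu_memory(height, width)   # same idea, against free VRAM
        return _stream_link_cupy(fractions_agg)
    if isinstance(data, da.Array):
        # dask / dask+cupy: per-tile allocation is bounded by chunk size,
        # so these branches are deliberately not gated.
        return _stream_link_dask(fractions_agg)
    raise TypeError(f"unsupported array type: {type(data)}")
```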
github-actions bot added the performance label (PR touches performance-sensitive code) on Apr 29, 2026

brendancol merged commit f1a7fef into main on Apr 29, 2026

11 checks passed
brendancol added a commit that referenced this pull request Apr 29, 2026
…#1347)

Add memory guards to the eager numpy and cupy paths of stream_link_dinf,
mirroring the pattern from #1335 (stream_link_d8) and #1341
(stream_link_mfd).

D-inf encodes one continuous downstream angle per cell, so there is no
(8, H, W) per-neighbor weight buffer like in MFD; the working set matches
the d8 budget at 32 B/px (CPU) and 40 B/px (GPU). The guard raises
MemoryError when projected usage exceeds 50% of available host or
device memory, naming the dimensions and pointing to the dask path.

Dask and dask+cupy paths bypass the guard since they process per-tile
within chunk-bounded memory.
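
Those budgets translate directly into grid-size ceilings; for example, the largest square grid the eager CPU path accepts on a host with 16 GiB available:

```python
def max_square_side(bytes_per_pixel: int, available_bytes: int) -> int:
    budget = available_bytes // 2                  # the 50% threshold
    return int((budget / bytes_per_pixel) ** 0.5)

available = 16 * 1024**3                           # example: 16 GiB free host RAM
print(max_square_side(32, available))  # d8 / dinf budget -> 16384 px per side
print(max_square_side(97, available))  # mfd budget       -> ~9410 px per side
```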
brendancol added a commit that referenced this pull request Apr 29, 2026
…#1353)

Eager numpy and cupy backends allocated H*W kernel locals plus an
(8, H, W) float64 input copy with no upfront budget check. A 50000x50000
grid asks for ~232 GB before anything errors out.

Adds _check_memory and _check_gpu_memory helpers wired into the numpy
and cupy dispatch in flow_length_mfd(), with a 50% threshold against
MemAvailable (host) / CUDA free memory (device). Dask paths skip the
guard since per-tile allocations are bounded by chunk size.

Mirrors PR #1332 (flow_length_d8) and PR #1341 (stream_link_mfd).

Five tests: oversize numpy rejection, normal pass-through, dask bypass,
error message format, oversize cupy rejection.
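
The ~232 GB figure quoted above falls straight out of the per-pixel budget; with the ~93 B/px working set it implies:

```python
h = w = 50_000
projected = h * w * 93              # ~93 B/px implied by the ~232 GB figure
print(f"{projected / 1e9:.0f} GB")  # -> "232 GB", requested before any guard existed
```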
Linked issue: stream_link_mfd: no memory guard on H*W working arrays (#1337)