## Summary
`flow_path_mfd()` allocates several `(8, H, W)` and `(H, W)` float64 working arrays in the numpy and cupy backends without any pre-allocation memory check. Large rasters can exhaust host or device RAM and abort the process before any error message reaches the caller.
This continues the hydro memory-guard series (#1318, #1319, ..., #1351, #1355, #1357).
## Affected backends
- numpy (`flow_path_mfd` -> `_flow_path_mfd_cpu` via the public dispatch)
- cupy (`_flow_path_mfd_cupy` does `.get()` + `.astype(float64)` host-side)
The dask and dask+cupy paths process tiles and are bounded by chunk size, so they skip the guard.
## Working-memory accounting
CPU peak working set (B/px):

- `data.astype(np.float64)` copy of the `(8, H, W)` fractions: 64
- `np.asarray(sp_data, dtype=np.float64)` copy of start_points: 8
- `out` `(H, W)` float64 written by `_flow_path_mfd_cpu`: 8

Total: ~80 B/px.
GPU peak working set (B/px), `_flow_path_mfd_cupy`:

- Host-side `.get()` of the `(8, H, W)` fractions: 64
- Host-side `.astype(np.float64)` copy: up to 64 (skipped when the dtype already matches)
- Host-side `(H, W)` float64 output: 8
- Device-side `(H, W)` float64 output: 8

Conservative GPU budget: ~88 B/px.
## Worked example
A 30000 x 30000 fractions raster:

- numpy: 30000 * 30000 * 80 B/px = 72 GB peak working memory.
- cupy: 30000 * 30000 * 88 B/px = ~79 GB host residency during the host-side trace.
Both far exceed typical workstation RAM, and no `MemoryError` is raised before allocation.
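The arithmetic above is easy to check directly; a quick sketch (constant names are illustrative, values are the per-pixel budgets derived earlier):

```python
# Back-of-the-envelope check of the worked example; pure arithmetic,
# no dependency on the hydro module.
CPU_BYTES_PER_PIXEL = 80   # numpy backend budget
GPU_BYTES_PER_PIXEL = 88   # conservative cupy backend budget

H = W = 30_000
cpu_peak = H * W * CPU_BYTES_PER_PIXEL   # bytes
gpu_peak = H * W * GPU_BYTES_PER_PIXEL   # bytes
print(f"numpy: {cpu_peak / 1e9:.1f} GB, cupy: {gpu_peak / 1e9:.1f} GB")
# -> numpy: 72.0 GB, cupy: 79.2 GB
```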
## Proposed fix
Match the pattern from #1351 (`flow_length_mfd`) and #1337 (`stream_link_mfd`):
- Per-module `_BYTES_PER_PIXEL` and `_GPU_BYTES_PER_PIXEL` constants with a comment breaking down the count.
- `_available_memory_bytes()` reads `/proc/meminfo`, then falls back to `psutil`, then to a 2 GiB default.
- `_available_gpu_memory_bytes()` queries `cupy.cuda.runtime.memGetInfo`; returns 0 if unavailable.
- `_check_memory(H, W)` raises `MemoryError` if projected use exceeds 50% of available host RAM.
- `_check_gpu_memory(H, W)` mirrors the host check on the GPU; skips silently when CUDA is unavailable.
- Wire both checks into the public `flow_path_mfd` dispatch before the eager allocations. Dask and dask+cupy paths skip the guard.
## Tests
- numpy raster raises `MemoryError` when projected RAM exceeds the budget (mock `_available_memory_bytes`).
- numpy raster of normal size succeeds.
- dask path bypasses the guard even when memory is mocked low.
- error message mentions the grid dimensions and points to dask.
- cupy raster raises `MemoryError` when projected GPU RAM exceeds the budget (skipped without CUDA).
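The mocking pattern for the first case can be sketched self-contained with a stand-in guard; in the real suite the patch target would be the hydro module's `_available_memory_bytes` (exact module path not shown here, and the stand-in below is illustrative only):

```python
from unittest import mock
import types

# Stand-in for the module under test, mirroring the proposed guard.
guard = types.SimpleNamespace(
    _BYTES_PER_PIXEL=80,
    _available_memory_bytes=lambda: 32 * 1024 ** 3,  # pretend 32 GiB free
)

def _check_memory(h, w):
    if h * w * guard._BYTES_PER_PIXEL > guard._available_memory_bytes() // 2:
        raise MemoryError(f"{h} x {w} grid too large; consider dask")

_check_memory(1_000, 1_000)  # 80 MB projected vs a 16 GiB budget: passes

# Mock the availability probe low, as the first test case above describes.
with mock.patch.object(guard, "_available_memory_bytes", lambda: 1024 ** 2):
    try:
        _check_memory(1_000, 1_000)  # 80 MB projected vs a 0.5 MiB budget
        raised = False
    except MemoryError as e:
        raised = "1000 x 1000" in str(e)  # message names the grid dimensions
assert raised
```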