Description
flow_length_d8() on the numpy and cupy backends has no memory guard.
_flow_length_downstream_cpu and _flow_length_upstream_cpu (xrspatial/hydro/flow_length_d8.py) each allocate:
flow_len: np.empty((H, W), float64) -> 8 B/pixel
in_degree: np.zeros((H, W), int32) -> 4 B/pixel
valid: np.zeros((H, W), int8) -> 1 B/pixel
order_r/queue_r: np.empty(H*W, int64) -> 8 B/pixel
order_c/queue_c: np.empty(H*W, int64) -> 8 B/pixel
That is roughly 29 bytes/pixel of working memory plus the caller's input flow_dir array. A 50000x50000 numpy raster asks for ~72 GB of host memory before anything errors out.
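For reference, the per-pixel and total figures are just the sums above, shown as plain arithmetic:

```python
# Per-pixel working-set budget for the eager numpy path
# (sum of the allocations listed above).
BYTES_PER_PIXEL_CPU = 8 + 4 + 1 + 8 + 8  # flow_len + in_degree + valid + order/queue buffers = 29

rows, cols = 50_000, 50_000
working_set_gb = rows * cols * BYTES_PER_PIXEL_CPU / 1e9
print(f"~{working_set_gb:.1f} GB")  # ~72.5 GB, before counting the input flow_dir array
```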
_flow_length_cupy copies the input cupy flow_dir to the host via .get(), runs the CPU kernel, and moves the result back to the device with cp.asarray(). The device side therefore holds the input float64 array (8 B/px) plus the output (8 B/px), while the host side carries the same 29 B/px working set as the numpy path; ~32 B/px is a conservative device-side budget.
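A minimal sketch of that round-trip (illustrative only; the `kernel` argument stands in for the CPU implementation and is not the actual xrspatial signature):

```python
import cupy as cp

def _flow_length_cupy_sketch(flow_dir: cp.ndarray, kernel) -> cp.ndarray:
    # Device -> host copy; the device still holds the 8 B/px input.
    flow_dir_host = flow_dir.get()
    # The ~29 B/px working set lives on the host while the CPU kernel runs.
    flow_len_host = kernel(flow_dir_host)
    # Host -> device copy of the result adds another 8 B/px on the device.
    return cp.asarray(flow_len_host)
```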
The dask paths (_flow_length_dask_iterative, _flow_length_dask_cupy) are bounded per-tile by the user's chunk size and stay safe.
The same asymmetric-guard pattern (eager paths unguarded while dask paths are bounded) was fixed in flow_accumulation_d8 (#1319), flow_accumulation_mfd (#1324), flow_accumulation_dinf (#1325), hand_d8 (#1326), and the broader security sweep series (#1289, #1290, #1292, #1297, #1298, #1300, #1305).
Hydro is safety-critical, so the same guard pattern applies here.
Expected behavior
flow_length_d8() raises MemoryError with a clear message on the eager numpy and cupy backends when the projected working set exceeds available memory. Dask paths skip the guard.
Proposed fix
Add _available_memory_bytes(), _available_gpu_memory_bytes(), _check_memory(rows, cols), and _check_gpu_memory(rows, cols) helpers (29 B/pixel CPU budget, 32 B/pixel GPU budget, 50% threshold). Call them from the public flow_length_d8() dispatch before the eager numpy or cupy code paths run.
With this guard in place, a 50000x50000 raster (which needs ~72 GB of CPU working memory) would raise MemoryError on a typical 32 GB host before any allocation begins.
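A minimal sketch of what those helpers could look like, assuming psutil for the host check and cupy's runtime API for the device check (names and messages are placeholders, not the final implementation):

```python
import psutil

CPU_BYTES_PER_PIXEL = 29   # flow_len + in_degree + valid + order/queue buffers
GPU_BYTES_PER_PIXEL = 32   # input + output on device, rounded up as a conservative budget
MEMORY_FRACTION = 0.5      # refuse to claim more than half of what is currently free


def _available_memory_bytes() -> int:
    return psutil.virtual_memory().available


def _available_gpu_memory_bytes() -> int:
    import cupy as cp
    free_bytes, _total_bytes = cp.cuda.runtime.memGetInfo()
    return free_bytes


def _check_memory(rows: int, cols: int) -> None:
    required = rows * cols * CPU_BYTES_PER_PIXEL
    budget = _available_memory_bytes() * MEMORY_FRACTION
    if required > budget:
        raise MemoryError(
            f"flow_length_d8: a {rows}x{cols} raster needs ~{required / 1e9:.1f} GB "
            f"of working memory, but only ~{budget / 1e9:.1f} GB is safely available. "
            "Consider a dask-backed raster with smaller chunks."
        )


def _check_gpu_memory(rows: int, cols: int) -> None:
    required = rows * cols * GPU_BYTES_PER_PIXEL
    budget = _available_gpu_memory_bytes() * MEMORY_FRACTION
    if required > budget:
        raise MemoryError(
            f"flow_length_d8: a {rows}x{cols} raster needs ~{required / 1e9:.1f} GB "
            f"of GPU memory, but only ~{budget / 1e9:.1f} GB is safely available."
        )
```

The public flow_length_d8() dispatch would call _check_memory() before the eager numpy path and _check_gpu_memory() before the eager cupy path, leaving the dask paths untouched.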