
flow_accumulation(): numpy and cupy backends have no memory guard #1318

@brendancol

Description


flow_accumulation_d8() on the numpy and cupy backends has no memory guard.

_flow_accum_cpu (xrspatial/hydro/flow_accumulation_d8.py:148) allocates:

  • accum: np.empty((H, W), float64) -> 8 B/pixel
  • in_degree: np.zeros((H, W), int32) -> 4 B/pixel
  • valid: np.zeros((H, W), int8) -> 1 B/pixel
  • queue_r, queue_c: np.empty(H*W, int64) each -> 16 B/pixel combined

That is roughly 29 bytes/pixel of working memory on top of the caller's input. A 50000x50000 numpy raster therefore requests ~72 GB of host memory before any allocation fails.
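
A quick back-of-the-envelope check of that figure (pure arithmetic, no xrspatial code):

```python
# Working-set estimate for the eager numpy path (sizes from the list above).
H = W = 50_000
per_pixel = 8 + 4 + 1 + 8 + 8   # accum + in_degree + valid + queue_r + queue_c = 29 B
working_set = H * W * per_pixel
print(f"{working_set / 1e9:.1f} GB")  # -> 72.5 GB, before counting the input raster
```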

_flow_accum_cupy (line 340) mirrors this on the device: cp.zeros((H, W), float64) + cp.zeros((H, W), int32) + cp.zeros((H, W), int32) ~ 16 B/pixel of GPU memory, with no check.
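
A minimal sketch of the device-side guard, assuming the helper name from the proposed fix below; `cp.cuda.Device().mem_info` returns (free, total) bytes via cudaMemGetInfo:

```python
import cupy as cp

GPU_BYTES_PER_PIXEL = 16   # float64 accum + 2x int32, per the allocations above
GPU_MEMORY_FRACTION = 0.5  # refuse to claim more than half of free device memory

def _check_gpu_memory(rows: int, cols: int) -> None:
    free_bytes, _total = cp.cuda.Device().mem_info
    needed = rows * cols * GPU_BYTES_PER_PIXEL
    if needed > free_bytes * GPU_MEMORY_FRACTION:
        raise MemoryError(
            f"flow_accumulation_d8: a {rows}x{cols} raster needs ~{needed / 1e9:.1f} GB "
            f"of GPU memory, but only {free_bytes / 1e9:.1f} GB is free. "
            "Consider a dask-backed (chunked) input instead."
        )
```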

The dask paths (_flow_accum_dask_iterative, _flow_accum_dask_cupy) are bounded per-tile by the user's chunk size and stay safe.

The same asymmetric-guard pattern was fixed in sieve (#1298), kde (#1289), resample (#1297), focal (#1286), geodesic (#1285), mahalanobis (#1290), true_color (#1292), diffuse (#1268), erode (#1277), emerging_hotspots (#1276), dasymetric (#1263), sky_view_factor (#1300), surface_distance (#1305).

Hydro is safety-critical, so the same guard pattern applies here.

Expected behavior

flow_accumulation_d8() raises MemoryError with a clear message on the eager numpy and cupy backends when the projected working set exceeds available memory. Dask paths skip the guard.
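
A hedged sketch of a regression test for this behavior (module path assumed from the file referenced above; `_available_memory_bytes` is the helper named in the proposed fix below and is hypothetical until implemented; monkeypatching avoids needing a memory-starved machine):

```python
import numpy as np
import pytest
import xarray as xr

import xrspatial.hydro.flow_accumulation_d8 as fa_mod  # module path assumed

def test_numpy_guard_raises_before_allocating(monkeypatch):
    # Pretend only 1 GB of host memory is available (helper name hypothetical).
    monkeypatch.setattr(fa_mod, "_available_memory_bytes", lambda: 1 * 1024**3)
    fdr = xr.DataArray(np.zeros((10_000, 10_000), dtype=np.uint8), dims=["y", "x"])
    # 10_000 * 10_000 * 29 B ~ 2.9 GB projected > 0.5 GB allowed -> MemoryError
    with pytest.raises(MemoryError):
        fa_mod.flow_accumulation_d8(fdr)
```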

Proposed fix

Add _available_memory_bytes(), _available_gpu_memory_bytes(), _check_memory(rows, cols), and _check_gpu_memory(rows, cols) helpers (29 B/pixel CPU budget, 16 B/pixel GPU budget, 50% threshold). Call them from _flow_accum_cpu's eager wrapper and _flow_accum_cupy before the float64 cast and queue allocation.
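
A minimal sketch of the CPU-side helpers, assuming psutil for the availability probe (thresholds mirror the numbers above; names per the proposed fix):

```python
CPU_BYTES_PER_PIXEL = 29   # 8 (accum) + 4 (in_degree) + 1 (valid) + 16 (queues)
MEMORY_FRACTION = 0.5      # refuse projected working sets over half of available RAM

def _available_memory_bytes() -> int:
    try:
        import psutil
        return psutil.virtual_memory().available
    except ImportError:
        return 0  # unknown; treat as "cannot check"

def _check_memory(rows: int, cols: int) -> None:
    available = _available_memory_bytes()
    if available == 0:
        return  # availability unknown; skip the guard rather than block valid runs
    needed = rows * cols * CPU_BYTES_PER_PIXEL
    if needed > available * MEMORY_FRACTION:
        raise MemoryError(
            f"flow_accumulation_d8: a {rows}x{cols} raster needs ~{needed / 1e9:.1f} GB "
            f"of working memory, but only {available / 1e9:.1f} GB is available. "
            "Use a dask-backed raster with smaller chunks instead."
        )
```

Calling _check_memory at the top of the eager wrapper fails fast before any array is allocated; the dask paths never reach it.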

The dinf and mfd flow_accumulation variants share the same allocation pattern and will be addressed in separate follow-up PRs per the one-fix-per-security-PR policy.

Metadata


Labels: bug (Something isn't working), high-priority, oom (Out-of-memory risk with large datasets)
