flow_path_mfd: no memory guard on H*W working arrays #1365

@brendancol

Description

Summary

flow_path_mfd() allocates several (8, H, W) and (H, W) float64 working arrays in the numpy and cupy backends without any pre-allocation memory check. Large rasters can exhaust host or device RAM and abort the process before any error message reaches the caller.
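A repro sketch, assuming the public function takes a fractions aggregate plus an array of start points (the exact signature and argument order are illustrative, not confirmed by this issue):

```python
import numpy as np
import xarray as xr

H = W = 30_000
# The (8, H, W) float32 input alone is ~28.8 GB; the backend's eager
# float64 working copy adds another ~57.6 GB before the (H, W) output
# is even allocated.
fractions = xr.DataArray(np.zeros((8, H, W), dtype=np.float32))
start_points = np.array([[15_000.0, 15_000.0]])
out = flow_path_mfd(fractions, start_points)  # OOM-killed, no MemoryError
```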

This continues the hydro memory-guard series (#1318, #1319, ..., #1351, #1355, #1357).

Affected backends

  • numpy (flow_path_mfd -> _flow_path_mfd_cpu via the public dispatch)
  • cupy (_flow_path_mfd_cupy does .get() + .astype(float64) host-side)

The dask and dask+cupy paths process tiles and are bounded by chunk size, so they skip the guard.

Working-memory accounting

CPU peak working set (B/px):

  • data.astype(np.float64) copy of (8, H, W) fractions: 64
  • np.asarray(sp_data, dtype=np.float64) copy of start_points: 8
  • out (H, W) float64 written by _flow_path_mfd_cpu: 8

Total: ~80 B/px.
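Expressed as the per-module constant the fix proposes (a sketch following the #1351 pattern):

```python
# Peak host working set per pixel for the numpy path (breakdown above).
_BYTES_PER_PIXEL = (
    8 * 8  # data.astype(np.float64): float64 copy of the (8, H, W) fractions
    + 8    # np.asarray(sp_data, dtype=np.float64): start-points copy, budgeted per pixel
    + 8    # out: (H, W) float64 written by _flow_path_mfd_cpu
)  # = 80
```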

GPU peak working set (B/px), _flow_path_mfd_cupy:

  • Host-side .get() of (8, H, W) fractions: 64
  • Host-side .astype(np.float64) copy: up to 64 (skipped when dtype already matches)
  • Host-side output (H, W) float64: 8
  • Device-side output (H, W) float64: 8

Conservative GPU budget: ~88 B/px.
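A matching constant for the GPU path. How the items above collapse into 88 B/px is my reading (the conditional `.astype` duplicate treated as transient, plus the same per-pixel start-points allowance as the CPU path), and it is flagged as such in the comments:

```python
# Conservative host-residency budget per pixel for the cupy path. Assumes
# the conditional host-side .astype(np.float64) duplicate is transient and
# not counted at peak; that assumption is how ~88 B/px is reached.
_GPU_BYTES_PER_PIXEL = (
    8 * 8  # host-side .get() of the (8, H, W) fractions
    + 8    # host-side (H, W) float64 output
    + 8    # device-side (H, W) float64 output
    + 8    # start-points copy, budgeted per pixel as on the CPU path
)  # = 88
```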

Worked example

A 30000 x 30000 fractions raster:

  • numpy: 30000 * 30000 * 80 = 72 GB peak working memory.
  • cupy: 30000 * 30000 * 88 = ~79 GB host residency during the host-side trace.

Both far exceed typical workstation RAM, and nothing raises a catchable MemoryError before the allocations begin; the process is simply OOM-killed.
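The arithmetic, for quick re-checking:

```python
H = W = 30_000
print(f'{H * W * 80 / 1e9:.1f} GB')  # 72.0 GB, numpy peak working memory
print(f'{H * W * 88 / 1e9:.1f} GB')  # 79.2 GB, cupy host residency
```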

Proposed fix

Match the pattern from #1351 (flow_length_mfd) and #1337 (stream_link_mfd); a sketch of the helpers follows the list:

  • Per-module _BYTES_PER_PIXEL and _GPU_BYTES_PER_PIXEL constants with a comment breaking down the count.
  • _available_memory_bytes() reads /proc/meminfo then falls back to psutil, then a 2 GiB default.
  • _available_gpu_memory_bytes() queries cupy.cuda.runtime.memGetInfo, returns 0 if unavailable.
  • _check_memory(H, W) raises MemoryError if projected use exceeds 50% of available host RAM.
  • _check_gpu_memory(H, W) mirrors the host check on the GPU; skips silently when CUDA is unavailable.
  • Wire both checks into the public flow_path_mfd dispatch before the eager allocations. Dask and dask+cupy paths skip the guard.
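A minimal sketch of the guard, assuming the helper names listed above and the thresholds from the earlier issues in the series. The issue does not show flow_path_mfd's dispatch internals, so the integration point and error wording are illustrative:

```python
_BYTES_PER_PIXEL = 80      # 64 fractions copy + 8 start-points copy + 8 output
_GPU_BYTES_PER_PIXEL = 88  # conservative cupy-path budget (see accounting above)


def _available_memory_bytes():
    """Available host RAM: /proc/meminfo, then psutil, then a 2 GiB default."""
    try:
        with open('/proc/meminfo') as f:
            for line in f:
                if line.startswith('MemAvailable:'):
                    return int(line.split()[1]) * 1024  # reported in kB
    except OSError:
        pass
    try:
        import psutil
        return psutil.virtual_memory().available
    except ImportError:
        return 2 * 1024 ** 3


def _available_gpu_memory_bytes():
    """Free device memory via cupy.cuda.runtime.memGetInfo; 0 if unavailable."""
    try:
        import cupy
        free_bytes, _total = cupy.cuda.runtime.memGetInfo()
        return free_bytes
    except Exception:
        return 0


def _check_memory(height, width):
    required = height * width * _BYTES_PER_PIXEL
    available = _available_memory_bytes()
    if required > 0.5 * available:
        raise MemoryError(
            f'flow_path_mfd on a {height}x{width} raster needs '
            f'~{required / 2**30:.1f} GiB of working memory, but only '
            f'{available / 2**30:.1f} GiB of host RAM is available. '
            f'Use a dask-backed raster to process the grid in chunks.'
        )


def _check_gpu_memory(height, width):
    available = _available_gpu_memory_bytes()
    if available == 0:
        return  # CUDA unavailable: skip silently
    required = height * width * _GPU_BYTES_PER_PIXEL
    if required > 0.5 * available:
        raise MemoryError(
            f'flow_path_mfd on a {height}x{width} raster needs '
            f'~{required / 2**30:.1f} GiB of GPU working memory, but only '
            f'{available / 2**30:.1f} GiB is free. '
            f'Use a dask+cupy raster to process the grid in chunks.'
        )
```

In the public dispatch, the numpy branch would call _check_memory(H, W) and the cupy branch _check_gpu_memory(H, W) before the eager float64 copies; the dask and dask+cupy branches are left untouched.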

Tests

  • numpy raster raises MemoryError when projected RAM exceeds budget (mock _available_memory_bytes; see the sketch after this list).
  • numpy raster of normal size succeeds.
  • dask path bypasses the guard even when memory is mocked low.
  • error message mentions grid dimensions and points to dask.
  • cupy raster raises MemoryError when projected GPU RAM exceeds budget (skipped without CUDA).
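A pytest sketch of the first case. The module path 'mypkg.hydro' is a placeholder, since the issue does not name the file hosting flow_path_mfd and the guard helpers:

```python
from unittest import mock

import numpy as np
import pytest
import xarray as xr

# 'mypkg.hydro' is a placeholder for the module that hosts flow_path_mfd
# and the memory-guard helpers; adjust to the real module path.
from mypkg.hydro import flow_path_mfd


def test_numpy_raises_when_memory_low():
    fractions = xr.DataArray(np.zeros((8, 100, 100), dtype=np.float64))
    start_points = np.array([[10.0, 10.0]])
    # 100 * 100 * 80 B = 800 kB projected use; with 1 MiB "available",
    # the 50% threshold (512 KiB) is exceeded and the guard must trip.
    with mock.patch('mypkg.hydro._available_memory_bytes',
                    return_value=1024 ** 2):
        with pytest.raises(MemoryError, match=r'100x100.*dask'):
            flow_path_mfd(fractions, start_points)
```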

Labels

bug (Something isn't working) · high-priority · oom (Out-of-memory risk with large datasets)
