Guard resample() against unbounded scale_factor / target_resolution #1295

@brendancol

Describe the bug

resample() does not bound the output dimensions derived from user-supplied scale_factor or target_resolution. The output shape is computed as:

out_h, out_w = max(1, round(in_h * scale_y)), max(1, round(in_w * scale_x))

A caller passing scale_factor=1e6 against a small raster (or target_resolution=1e-9 against a meter-scale raster) produces an enormous output shape. That shape is then allocated inside _run_numpy, _run_cupy, the _AGG_FUNCS numba kernels, and _nan_aware_interp_np (via np.empty, cupy.zeros, or scipy.ndimage.map_coordinates) with no memory check. The process runs out of memory or hangs before any resampling work begins.
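To make the scale of the problem concrete, here is a small sketch that applies the shape formula quoted above to a 100 x 100 input without allocating anything. The helper name projected_shape is hypothetical and is not part of xrspatial; only the formula itself comes from this issue.

```python
def projected_shape(in_h, in_w, scale_y, scale_x):
    """Apply the output-shape formula quoted in this issue (hypothetical helper)."""
    out_h = max(1, round(in_h * scale_y))
    out_w = max(1, round(in_w * scale_x))
    return out_h, out_w


# 100 x 100 raster with scale_factor=1e6 on both axes.
out_h, out_w = projected_shape(100, 100, 1e6, 1e6)

cells = out_h * out_w        # 1e8 * 1e8 = 1e16 output cells
float64_bytes = cells * 8    # 8e16 bytes, i.e. about 80 PB
```

Nothing in the formula limits out_h or out_w, so the requested size grows with the square of scale_factor.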

Expected behavior

When the projected output buffer would exceed available memory, the function should raise MemoryError cleanly before any large allocation is attempted, mirroring the guards added in #1284 (focal), #1287 (kde / line_density), #1283 (geodesic slope/aspect), #1262 (cost_distance), and #1267 (diffuse).

Reproducer

import numpy as np
import xarray as xr
from xrspatial import resample

agg = xr.DataArray(
    np.zeros((100, 100), dtype=np.float32),
    dims=('y', 'x'),
    coords={'y': np.arange(100), 'x': np.arange(100)},
)

# Triggers a 1e8 x 1e8 float64 allocation (~80 PB)
resample(agg, scale_factor=1e6, method='nearest')

Fix outline

Add _available_memory_bytes() and _check_resample_memory(out_h, out_w) in xrspatial/resample.py. Budget about 12 bytes per output cell (float64 working buffer plus float32 output) and raise MemoryError when the projected memory exceeds 50 percent of available RAM. Call the guard from resample() after the output shape is computed but before backend dispatch. Add a GPU-side equivalent for the cupy-eager branch using cupy.cuda.runtime.memGetInfo. Skip the guard for pure dask paths since per-chunk allocation is bounded by chunk size; that can be addressed separately if needed.

Additional context

Found during the resample security sweep audit (Cat 1: Unbounded Allocation / DoS).

Labels: bug, high-priority, input-validation, oom
