Describe the bug
kriging() takes an arbitrary point count N and template grid size with no upper bound. Three eager allocations scale with these inputs and run before any guard.
Concrete sites in xrspatial/interpolate/_kriging.py:
_experimental_variogram line 73: np.triu_indices(n, k=1) allocates two int64 arrays of length N*(N-1)/2, plus dists and semivar of the same length. At N=50_000 that's about 20 GB before any computation runs.
_build_kriging_matrix lines 137-142: builds the N×N pairwise distance matrix and the (N+1)×(N+1) kriging matrix, then calls np.linalg.inv(). O(N²) memory and O(N³) time.
_kriging_predict line 178: builds a (grid_pixels, N+1) k0 matrix. A 5000×5000 template with 1000 points works out to about 200 GB.
A caller with a moderately large point set or template hits one of these silently. The process either OOMs or the cupy allocator raises a raw error that says nothing about which input was too large.
Same pattern as the fixes in #1287 (kde), #1295 (resample), #1296 (sieve), #1299 (sky_view_factor), and #1302 (landforms).
The dask backends still call _kriging_predict per chunk, so chunk size bounds the prediction allocation. The variogram and kriging-matrix builds run once on the full point set and need a guard regardless of backend.
Expected behavior
kriging() raises MemoryError before allocating, with a message that names the offending input (point count N or grid pixel count) so the user knows what to reduce.
Reproduce
import numpy as np
import xarray as xr
from xrspatial.interpolate import kriging
n = 100_000
x = np.random.rand(n)
y = np.random.rand(n)
z = np.random.rand(n)
template = xr.DataArray(np.zeros((10, 10)), dims=['y', 'x'],
coords={'y': np.arange(10), 'x': np.arange(10)})
kriging(x, y, z, template)
Fix sketch
- Estimate worst-case memory at the top of
kriging(): max of variogram pair arrays (3 * N*(N-1)/2 * 8), kriging matrix (2 * (N+1)² * 8), and prediction k0 matrix (grid_pixels * (N+1) * 8).
- Compare against
0.8 * _available_memory_bytes() using the helper imported from xrspatial.zonal, same pattern as balanced_allocation.py.
- Raise
MemoryError naming N and grid_pixels when the estimate exceeds the threshold.
Describe the bug
kriging()takes an arbitrary point countNand template grid size with no upper bound. Three eager allocations scale with these inputs and run before any guard.Concrete sites in
xrspatial/interpolate/_kriging.py:_experimental_variogramline 73:np.triu_indices(n, k=1)allocates two int64 arrays of lengthN*(N-1)/2, plusdistsandsemivarof the same length. At N=50_000 that's about 20 GB before any computation runs._build_kriging_matrixlines 137-142: builds the N×N pairwise distance matrix and the (N+1)×(N+1) kriging matrix, then callsnp.linalg.inv(). O(N²) memory and O(N³) time._kriging_predictline 178: builds a(grid_pixels, N+1)k0matrix. A 5000×5000 template with 1000 points works out to about 200 GB.A caller with a moderately large point set or template hits one of these silently. The process either OOMs or the cupy allocator raises a raw error that says nothing about which input was too large.
Same pattern as the fixes in #1287 (kde), #1295 (resample), #1296 (sieve), #1299 (sky_view_factor), and #1302 (landforms).
The dask backends still call
_kriging_predictper chunk, so chunk size bounds the prediction allocation. The variogram and kriging-matrix builds run once on the full point set and need a guard regardless of backend.Expected behavior
kriging()raisesMemoryErrorbefore allocating, with a message that names the offending input (point countNor grid pixel count) so the user knows what to reduce.Reproduce
Fix sketch
kriging(): max of variogram pair arrays (3 * N*(N-1)/2 * 8), kriging matrix (2 * (N+1)² * 8), and prediction k0 matrix (grid_pixels * (N+1) * 8).0.8 * _available_memory_bytes()using the helper imported fromxrspatial.zonal, same pattern asbalanced_allocation.py.MemoryErrornaming N and grid_pixels when the estimate exceeds the threshold.