Describe the bug
Two related problems in `xrspatial/resample.py`'s dask aggregate paths (`_run_dask_numpy` and `_run_dask_cupy`):
- Boundary contamination. The aggregate dask path calls `dask.array.overlap.overlap` with `boundary='nearest'`. At the global edge of the input array, the overlap pad is filled with duplicated edge cells. Output pixels whose aggregate window straddles that edge then sample those duplicates, which biases min/max/median. Mean is less affected because the duplicates are real values, but they are still triple-counted near corners.
- Wasted/inconsistent cumulative bookkeeping. The aggregate path computes `global_in_h`, `cum_in_y`, `cum_in_x`, `out_y`, `out_x`, `cum_out_y`, `cum_out_x` once before `_ensure_min_chunksize` may rechunk to satisfy the depth requirement, then conditionally recomputes them when the rechunk changed the layout. The first computation is wasted, and the conditional recompute uses `data.chunks[0] != tuple(cum_in_y[1:] - cum_in_y[:-1])` as a roundabout chunk-equality check.
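The contamination can be reproduced standalone with plain dask (a toy illustration, not xrspatial code): with `boundary='nearest'`, the pad at the global edge repeats edge cells, and the median over a corner window shifts away from what an eager window would give.

```python
import numpy as np
import dask.array as da
from dask.array.overlap import overlap

# With boundary='nearest' the pad at the global edge repeats the edge
# cells, so the corner of the padded block contains duplicates.
x = da.arange(16, dtype="f8").reshape(4, 4).rechunk(2)
block = overlap(x, depth=1, boundary="nearest").blocks[0, 0].compute()

corner = block[:3, :3]             # padded window over the corner pixel
true_window = x[:2, :2].compute()  # cells the eager path would sample
# the duplicated zeros drag the median down: 1.0 instead of 2.5
assert np.median(corner) == 1.0
assert np.median(true_window) == 2.5
```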
Expected behavior
Aggregate dask results should match eager numpy bit-identically for min/max/median (same kernel, no boundary padding bias). The bookkeeping should compute once.
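The matching behavior can be sketched standalone (in the spirit of the NaN-skipping kernels described in the fix below, not xrspatial's actual kernel code): pad with NaN and reduce with a NaN-skipping median, so pad cells drop out and the corner window reduces to exactly the eager value.

```python
import numpy as np
import dask.array as da
from dask.array.overlap import overlap

# NaN-skipping reduction: pad cells are ignored, empty windows give NaN.
def nanskip_median(window):
    vals = [v for v in window.ravel() if not np.isnan(v)]
    return np.nan if not vals else float(np.median(vals))

x = da.arange(16, dtype="f8").reshape(4, 4).rechunk(2)
block = overlap(x, depth=1, boundary=np.nan).blocks[0, 0].compute()

assert np.isnan(block[0, 0])                 # pad is NaN, not a duplicate
assert nanskip_median(block[:3, :3]) == 2.5  # matches eager numpy on x[:2, :2]
```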
Fix
- Use `boundary=np.nan` on the aggregate overlap. The aggregate kernels already skip NaN via `if not np.isnan(v)` and return NaN for empty windows, so padded NaN cells are ignored naturally.
- Compute `min_size` from the scale-driven minimum and the depth-driven `max(2*depth_y+1, 2*depth_x+1)` up front, call `_ensure_min_chunksize` once, then build the cumulative arrays once.
- Leave the interp dask path on `boundary='nearest'` so it stays consistent with scipy's `mode='nearest'` semantics that the eager numpy interp path uses.
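The compute-once ordering can be sketched as follows. The values of `depth_y`, `depth_x`, and the scale-driven minimum are made-up stand-ins, and a plain `rechunk` stands in for `_ensure_min_chunksize` (the sizes below divide evenly, so no runt chunks appear).

```python
import numpy as np
import dask.array as da

# Derive min_size up front from both constraints, rechunk once, then
# build the cumulative chunk-offset arrays exactly once.
depth_y, depth_x, scale_min = 2, 3, 4
min_size = max(scale_min, 2 * depth_y + 1, 2 * depth_x + 1)

data = da.zeros((98, 105), chunks=5)
if min(min(data.chunks[0]), min(data.chunks[1])) < min_size:
    data = data.rechunk(min_size)  # stand-in for _ensure_min_chunksize

# cumulative row/column offsets, built once, after any rechunking
cum_in_y = np.cumsum((0,) + data.chunks[0])
cum_in_x = np.cumsum((0,) + data.chunks[1])
```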