Skip to content

resample: fix dask nearest/bilinear chunk-seam values on downsampling (#2610)#2627

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-resample-2026-05-29
May 29, 2026
Merged

resample: fix dask nearest/bilinear chunk-seam values on downsampling (#2610)#2627
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-resample-2026-05-29

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2610

What

  • The interp dask path used a fixed overlap depth of 1 input pixel for nearest and bilinear. Block-centered mapping sends an output pixel to input coordinate (o + 0.5) * (in/out) - 0.5, which on a downsample can land up to ~(in/out)/2 pixels past the chunk _output_chunks assigned it to. With depth 1 the overlapped block did not contain that source row/column, so map_coordinates(mode='nearest') clamped to the block edge and returned wrong values for whole chunk-seam rows.
  • Adds a per-axis downsample radius (_downsample_radius) to the interp overlap depth in _run_dask_numpy and _run_dask_cupy, clamped per axis exactly as before. The aggregate path already accounted for this; the interp path now matches.

Backends

  • dask+numpy and dask+cupy: fixed.
  • numpy and cupy (eager): unaffected, no chunking involved.

Test plan

  • New TestInterpDaskDownsampleSeam covers nearest/bilinear/cubic at ratios 0.33/0.2/0.1 with chunk layouts that do not divide the input evenly, plus an asymmetric-per-axis case. Fails on main, passes here.
  • Full test_resample.py suite: 217 passed (includes cupy and dask+cupy parity on a CUDA host).
  • Off-band sweep of 660 ratio/size/chunk/method combinations: dask matches eager numpy within float32 round-off.

The interp dask path used a fixed overlap depth of 1 input pixel for
nearest/bilinear. Block-centered mapping places an output pixel's source
coordinate up to ~(in/out)/2 pixels past the chunk _output_chunks assigned
it to, so on a downsample the overlapped block was missing the true source
row/column and map_coordinates clamped to the block edge, returning wrong
values for whole chunk-seam rows.

Add a per-axis downsample radius to the interp overlap depth in
_run_dask_numpy and _run_dask_cupy, clamped per axis as before. Eager
numpy/cupy are unaffected. Adds TestInterpDaskDownsampleSeam covering large
ratios with chunks that do not divide evenly.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: resample dask nearest/bilinear chunk-seam fix (#2610)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • _downsample_radius re-imports math at function scope. _ensure_min_chunksize and the aggregate branches already do the same local import, so this matches existing style. A module-level import math would be cleaner, but that pattern predates this PR, so leave it for a bug fix.

What looks good

  • Root cause is correctly identified, and the fix mirrors the depth logic the aggregate path already uses. _downsample_radius returns 0 for upsampling, so the upsample paths keep depth 1 and pay nothing.
  • The augmented depth feeds the min_size chunk-sizing, the _add_overlap call, and the block function's origin offset from the same value, so they cannot drift apart.
  • The per-axis clamp against global_in - 1 is preserved, so tiny inputs and skinny axes still work. Checked on 2x2, single-chunk, and skinny-column cases.
  • Both _run_dask_numpy and _run_dask_cupy are patched, so dask+numpy and dask+cupy stay in parity. Eager numpy/cupy are untouched.
  • Test coverage holds up: the new class fails on main and passes here, uses random data (a gradient would hide the bug), and uses chunk sizes that do not divide the input evenly. An off-band sweep of 660 ratio/size/chunk/method combinations all matched eager numpy within float32 round-off.

Checklist

  • Algorithm matches the block-centered mapping; depth now covers the downsample displacement
  • All four backends consistent (cupy and dask+cupy verified on a CUDA host)
  • NaN handling unchanged (fix only widens overlap)
  • Edge cases covered (small inputs, skinny axes, asymmetric per-axis ratios)
  • Dask chunk boundaries handled correctly (the point of the fix)
  • No premature materialization or extra copies
  • Benchmark not needed (bug fix, no new API)
  • README feature matrix already lists resample
  • Docstrings present; new helper documents the radius rationale

@brendancol brendancol merged commit 6609982 into main May 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

resample: dask nearest/bilinear gives wrong chunk-seam values when downsampling with uneven chunks

1 participant