reproject: fix cupy backend divergence from numpy on pyproj-fallback CRS pairs (#2620)#2629
Conversation
…arity (#2620) The cupy backend fell back to cupyx.scipy.ndimage.map_coordinates when no CUDA coordinate kernel covered the CRS pair (e.g. projected->projected like EPSG:32633 -> EPSG:3857). That resampler diverged from numpy: it bled the cval=0.0 constant into the half-pixel boundary band instead of renormalizing, and used a B-spline for cubic instead of Catmull-Rom. Both eager and dask+cupy fast paths now always use _resample_cupy_native, which matches the numpy kernels exactly and accepts CPU coordinate arrays. Adds TestCupyPyprojFallbackParity covering the boundary-band renormalization and numpy/cupy/dask+cupy agreement for nearest/bilinear/cubic on a projected->projected reprojection.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: reproject cupy backend divergence on pyproj-fallback CRS pairs (#2620)
Blockers
None.
Suggestions
None blocking. One thing worth noting for the record: this PR fixes the resampler divergence, but the prior inspection note (#2508) also flagged a separate deferred item. _reproject_dask(is_cupy=True) returns a dask array with cupy chunks but numpy meta, which could make _apply_vertical_shift_dask run the numpy shift on cupy blocks. That path is untouched here and stays deferred. It is out of scope for this PR; flagging only so it is not lost.
Nits
_resample_cupyis now unused by the reproject pipeline but stays in_interpolate.pybecauseTestCuPyResamplerClippingstill calls it directly. That is fine, but its docstring still says "CuPy equivalent of_resample_numpy", which is now misleading since the pipeline no longer treats it as equivalent. A one-line docstring note that it is kept only for the clipping unit tests would stop a future reader from wiring it back in.
What looks good
- Root cause is right and the fix is the minimal one. The native CUDA kernels already match the numpy numba kernels (existing
test_cubic_nan_fallback_matches_cpuproves it), so routing all cupy resampling through them removes the divergence without new kernel code. - Both affected call sites are covered: the per-chunk
_reproject_chunk_cupy(2D and the 3D band loop) and the eager_reproject_dask_cupyfast path. _resample_cupy_nativeaccepts numpy coordinate arrays and transfers them, so the pyproj-fallback case works without forcing coordinates onto the GPU first.- The double nodata-to-NaN conversion in the dask path (window converted at the call site, then again inside the native resampler) is harmless. NaN stays NaN.
- Tests assert both the finite/nodata mask equality and the value agreement, across nearest/bilinear/cubic plus the dask+cupy path. The standalone boundary-band test pins the specific failure mode (zero bleed at r=4.6 and r=-0.4).
Checklist
- Algorithm matches reference (native kernels match numpy numba kernels, verified by existing and new tests)
- All implemented backends produce consistent results (numpy/cupy/dask+cupy now agree to ~3e-13)
- NaN handling is correct (mask equality asserted; native kernel does its own nodata-to-NaN)
- Edge cases covered (boundary band, descending y, three resampling modes)
- Dask chunk boundaries handled (dask+cupy parity test passes with 32x32 chunks)
- No premature materialization or unnecessary copies
- Benchmark not needed (no perf-critical change; native kernel was already the GPU fast path)
- README feature matrix not applicable (no new function, backend support unchanged)
- Docstrings/comments present and accurate (inline comments explain the routing choice)
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review (after 5311c66)
The one nit from the first pass is resolved: _resample_cupy's docstring now states it is not numerically equivalent to _resample_numpy (B-spline cubic, cval boundary bleed) and that it is kept only for the clipping unit tests, with a pointer to #2620. The TestCuPyResamplerClipping tests still pass since the change is docstring-only.
The Suggestion item (the _reproject_dask(is_cupy=True) meta vs cupy-chunks mismatch noted in #2508) stays deferred. It is a distinct defect in a different code path and out of scope here.
No new findings. No blockers.
…eproject-2026-05-29
Closes #2620
What changed
_reproject_chunk_cupyand the eager_reproject_dask_cupyfast path now always resample through_resample_cupy_native(the hand-written CUDA kernels that match the numpy numba kernels exactly), instead of dropping tocupyx.scipy.ndimage.map_coordinateswhen the CUDA coordinate transform is unavailable._resample_cupyimport inreproject/__init__.py. The function itself stays in_interpolate.py(still referenced by existing unit tests).Why
On a projected-to-projected reprojection (e.g. EPSG:32633 -> EPSG:3857) neither CRS is geographic WGS84/NAD83, so no CUDA kernel applies and the cupy backend used
map_coordinates. That path diverged from numpy two ways: it bled thecval=0.0constant into the half-pixel boundary band instead of renormalizing (per-edge-pixel errors ~534), and its order=3 cubic is a B-spline rather than the Catmull-Rom the numpy/native kernels use (~0.45 interior). After the change, numpy and cupy agree to ~3e-13.Backend coverage
Test plan
TestCupyPyprojFallbackParity: boundary-band renormalization + numpy/cupy/dask+cupy agreement for the three resampling modes on a projected-to-projected reprojectiontest_reproject.pysuite (385 passed)test_reproject_coverage_*,test_reproject_cupy_gate_2564,test_interpolation(53 passed)