Fix dask reproject dtype and same-CRS dask merge #1450

Merged
brendancol merged 1 commit into main from dask-dtype-and-same-crs-merge
May 4, 2026

@brendancol
Contributor

Closes #1447

What changed

_reproject_dask

Picks the output template dtype from the source dtype rather than hard-coding np.float64. Integer sources stay integer, matching the per-chunk function, which already clamps back to the source dtype; float sources still produce float64. The dask graph's meta and chunk dtype now agree.
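
A minimal sketch of the dtype rule described above, not the actual xrspatial code. The helper name `pick_template_dtype` is hypothetical; only the rule (integer sources keep their dtype, everything else becomes float64) comes from this description.

```python
import numpy as np


def pick_template_dtype(src_dtype):
    """Hypothetical helper: choose the output template dtype from the source."""
    src_dtype = np.dtype(src_dtype)
    # Integer sources keep their dtype so the dask meta matches what the
    # per-chunk function returns after it clamps back.
    if np.issubdtype(src_dtype, np.integer):
        return src_dtype
    # All other sources (including float32) still produce float64.
    return np.dtype(np.float64)


# The template built from this dtype is what a lazy graph would advertise.
template = np.empty((4, 4), dtype=pick_template_dtype(np.uint16))
```

With this rule, a uint16 source gives a uint16 template, while a float32 source gives a float64 template, which is the behaviour the new dtype tests guard.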

_reproject_chunk_cupy

Same integer round-trip as the numpy chunk: cp.clip(cp.round(result), info.min, info.max).astype(orig_dtype) for integer inputs.
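
The round-trip can be sketched as follows. This uses NumPy as a stand-in so it runs without a GPU, on the assumption that the `cp.clip`, `cp.round` and `astype` calls quoted above behave like their NumPy counterparts. The function name `round_trip_to_source_dtype` is hypothetical.

```python
import numpy as np


def round_trip_to_source_dtype(result, orig_dtype):
    """Hypothetical helper: clamp a float result back to an integer source dtype."""
    orig_dtype = np.dtype(orig_dtype)
    # Non-integer sources are returned unchanged.
    if not np.issubdtype(orig_dtype, np.integer):
        return result
    info = np.iinfo(orig_dtype)
    # Round first, then clip to the representable range, then cast.
    return np.clip(np.round(result), info.min, info.max).astype(orig_dtype)


resampled = np.array([-200.4, 3.6, 127.5, 300.0])
as_int8 = round_trip_to_source_dtype(resampled, np.int8)
```

Clipping before the cast is what keeps out-of-range values such as 300.0 from wrapping around when converted to int8.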

_merge_block_adapter + _merge_dask

_merge_dask precomputes same_crs_list[i] = (src_crs_i == tgt_crs) and passes it through functools.partial. The per-block adapter tries _place_same_crs first when the flag is set; if that returns None (resolutions too far apart), it falls back to _reproject_chunk_numpy. This is the same shortcut the eager _merge_inmemory already used.
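
The control flow above can be illustrated with a toy sketch. Everything here is a hypothetical stand-in: the real `_place_same_crs`, `_reproject_chunk_numpy` and adapter signatures are not shown in this PR description, CRSs are compared as plain strings, and the resolution tolerance is an arbitrary assumption.

```python
from functools import partial


def place_same_crs(data, src_res, tgt_res, tol=1e-6):
    """Stand-in: direct placement, or None when resolutions are too far apart."""
    if abs(src_res - tgt_res) > tol * tgt_res:
        return None
    return ("placed", data)


def reproject_chunk(data):
    """Stand-in for the general per-chunk reprojection."""
    return ("reprojected", data)


def merge_block_adapter(sources, same_crs_list, tgt_res):
    """Try the same-CRS shortcut first, fall back to reprojection."""
    out = []
    for (data, _crs, res), same_crs in zip(sources, same_crs_list):
        placed = place_same_crs(data, res, tgt_res) if same_crs else None
        out.append(placed if placed is not None else reproject_chunk(data))
    return out


tgt_crs = "EPSG:4326"
sources = [("a", "EPSG:4326", 1.0), ("b", "EPSG:4326", 2.5), ("c", "EPSG:3857", 1.0)]
# Precompute the flags once, then bind them so every block sees the same list.
same_crs_list = [crs == tgt_crs for _, crs, _ in sources]
adapter = partial(merge_block_adapter, same_crs_list=same_crs_list, tgt_res=1.0)
paths = [tag for tag, _ in adapter(sources)]
```

Here the first source takes the shortcut, the second shares the CRS but falls back because its resolution differs, and the third is reprojected because its CRS differs.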

Tests

Added under TestDaskDtypeParity, TestMergeDaskParity, TestCupyReprojectParity:

  • test_dask_reproject_int8_preserves_dtype
  • test_dask_reproject_uint16_preserves_dtype
  • test_dask_reproject_float32_stays_float64
  • test_merge_dask_same_crs_matches_eager (bit-equal)
  • test_merge_dask_different_crs_matches_eager (rtol=1e-10)
  • test_cupy_reproject_matches_numpy (rtol=1e-5, skipped without cupy)
  • test_dask_cupy_reproject_matches_numpy (rtol=1e-5, skipped without cupy or dask)
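
The two comparison styles named in the list above (bit-equal versus a relative tolerance) can be sketched with NumPy's testing helpers. The arrays are made up for illustration and do not come from the test suite.

```python
import numpy as np


def bit_equal(a, b):
    """True only when shapes and every element match exactly."""
    return np.array_equal(a, b)


def close(a, b, rtol):
    """True when assert_allclose passes at the given relative tolerance."""
    try:
        np.testing.assert_allclose(a, b, rtol=rtol)
    except AssertionError:
        return False
    return True


eager = np.array([1.0, 2.0, 3.0])
same_path = eager.copy()          # identical code path: exact match expected
other_path = eager * (1 + 1e-12)  # tiny floating-point drift
```

A bit-equal check fits the same-CRS case, where both backends place pixels directly; a tolerance fits the cases where the arithmetic may differ slightly between backends.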

Test plan

  • pytest xrspatial/tests/test_reproject.py -> 143 passed locally (cupy and dask+cupy tests run; no skips on this machine)
  • CI green

The dask reproject path always built its output template with dtype
float64, but the per-chunk numpy function clamps integer sources back
to their original dtype. Dask then advertised float64 in its meta
while the chunks underneath returned int. Pick the template dtype
from the source instead, mirroring the eager path. Mirror the same
integer round-trip in the cupy chunk for backend parity.

The dask merge adapter always called the reproject chunk, even when
source CRS matched target CRS. The eager merge has used the direct
pixel placement shortcut (`_place_same_crs`) for that case, so dask
and eager produced numerically different pixels for same-CRS inputs.
Precompute `same_crs_list` in `_merge_dask` and pass it through, then
let the per-block adapter try `_place_same_crs` first and fall back
to the reproject chunk when resolutions are too far apart.

Add tests for:
- dask reproject preserving int8 / uint16 dtype
- dask reproject keeping float32 -> float64 (regression guard)
- same-CRS dask merge bit-equal to eager merge
- different-CRS dask merge matching eager within rtol=1e-10
- end-to-end cupy parity vs numpy
- end-to-end dask+cupy parity vs numpy

Closes #1447
@github-actions github-actions Bot added the performance label (PR touches performance-sensitive code) May 4, 2026
@brendancol brendancol merged commit abf77db into main May 4, 2026
10 of 11 checks passed
@brendancol brendancol deleted the dask-dtype-and-same-crs-merge branch May 4, 2026 19:49

Labels

performance: PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

Backend parity: dask reproject dtype, same-CRS dask merge

1 participant