Fix dask reproject dtype and same-CRS dask merge #1450
Merged
brendancol merged 1 commit into main on May 4, 2026
The dask reproject path always built its output template with dtype float64, but the per-chunk numpy function clamps integer sources back to their original dtype. Dask then advertised float64 in its meta while the chunks underneath returned int. Pick the template dtype from the source instead, mirroring the eager path. Mirror the same integer round-trip in the cupy chunk for backend parity.

The dask merge adapter always called the reproject chunk, even when source CRS matched target CRS. The eager merge has used the direct pixel placement shortcut (`_place_same_crs`) for that case, so dask and eager produced numerically different pixels for same-CRS inputs. Precompute `same_crs_list` in `_merge_dask` and pass it through, then let the per-block adapter try `_place_same_crs` first and fall back to the reproject chunk when resolutions are too far apart.

Add tests for:
- dask reproject preserving int8 / uint16 dtype
- dask reproject keeping float32 -> float64 (regression guard)
- same-CRS dask merge bit-equal to eager merge
- different-CRS dask merge matching eager within rtol=1e-10
- end-to-end cupy parity vs numpy
- end-to-end dask+cupy parity vs numpy

Closes #1447
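To make the dtype rule concrete, here is a minimal numpy-only sketch of the two pieces described above: choosing the template dtype from the source, and the integer round-trip applied to a float result. The helper names `template_dtype` and `round_trip_to_source_dtype` are hypothetical and are not the functions in this PR; nodata handling is left out for brevity.

```python
import numpy as np


def template_dtype(src_dtype):
    """Hypothetical helper: integer sources keep their dtype, all others use float64."""
    src_dtype = np.dtype(src_dtype)
    if np.issubdtype(src_dtype, np.integer):
        return src_dtype
    return np.dtype(np.float64)


def round_trip_to_source_dtype(result, src_dtype):
    """Hypothetical helper: round and clamp a float result back to an integer dtype."""
    src_dtype = np.dtype(src_dtype)
    if not np.issubdtype(src_dtype, np.integer):
        # Float sources are not clamped; they come out as float64.
        return result.astype(np.float64)
    info = np.iinfo(src_dtype)
    return np.clip(np.round(result), info.min, info.max).astype(src_dtype)


# A float64 intermediate with values outside the int8 range.
resampled = np.array([-200.4, -0.6, 126.5, 300.0])
as_int8 = round_trip_to_source_dtype(resampled, np.int8)

# The template dtype and the chunk dtype agree for integer sources.
assert as_int8.dtype == template_dtype(np.int8)
```

If the template were hard-coded to float64 while the chunk function returned `as_int8`, the declared dtype and the computed dtype would disagree, which is the mismatch this change removes.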
Closes #1447
What changed
`_reproject_dask`
Picks the output template dtype from the source dtype rather than hard-coding `np.float64`. Integer sources stay integer (matching the per-chunk function which already clamps back); floats keep float64. The dask graph's meta and chunk dtype now agree.

`_reproject_chunk_cupy`
Same integer round-trip as the numpy chunk: `cp.clip(cp.round(result), info.min, info.max).astype(orig_dtype)` for integer inputs.

`_merge_block_adapter` + `_merge_dask`
`_merge_dask` precomputes `same_crs_list[i] = (src_crs_i == tgt_crs)` and passes it through `functools.partial`. The per-block adapter tries `_place_same_crs` first when the flag is set; if it returns None (resolutions too far apart) it falls back to `_reproject_chunk_numpy`. Same shortcut the eager `_merge_inmemory` already used.

Tests
Added under `TestDaskDtypeParity`, `TestMergeDaskParity`, `TestCupyReprojectParity`:
- `test_dask_reproject_int8_preserves_dtype`
- `test_dask_reproject_uint16_preserves_dtype`
- `test_dask_reproject_float32_stays_float64`
- `test_merge_dask_same_crs_matches_eager` (bit-equal)
- `test_merge_dask_different_crs_matches_eager` (rtol=1e-10)
- `test_cupy_reproject_matches_numpy` (rtol=1e-5, skipped without cupy)
- `test_dask_cupy_reproject_matches_numpy` (rtol=1e-5, skipped without cupy or dask)

Test plan
- `pytest xrspatial/tests/test_reproject.py` -> 143 passed locally (cupy and dask+cupy tests run; no skips on this machine)
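For readers unfamiliar with the merge change, the control flow can be sketched with stand-in functions. Everything below is illustrative: the function bodies, the dict-based sources, the string CRS comparison, and the `max_res_ratio` threshold are assumptions for the example, not the library's actual code. Only the pattern (precompute a per-source flag, bind it with `functools.partial`, try the shortcut, fall back when it returns None) comes from the description above.

```python
from functools import partial


def place_same_crs(src, max_res_ratio=1.5):
    """Stand-in shortcut: returns None when resolutions are too far apart."""
    if src["res_ratio"] > max_res_ratio:  # threshold is an assumed value
        return None
    return ("placed", src["name"])


def reproject_chunk(src):
    """Stand-in for the general reprojection path."""
    return ("reprojected", src["name"])


def merge_block_adapter(block_index, sources, same_crs_list):
    """Per-block adapter: try the shortcut only for flagged sources."""
    results = []
    for src, same_crs in zip(sources, same_crs_list):
        out = place_same_crs(src) if same_crs else None
        if out is None:
            # Either the CRS differs or the shortcut declined.
            out = reproject_chunk(src)
        results.append(out)
    return results


tgt_crs = "EPSG:4326"
sources = [
    {"name": "a", "crs": "EPSG:4326", "res_ratio": 1.0},
    {"name": "b", "crs": "EPSG:4326", "res_ratio": 4.0},
    {"name": "c", "crs": "EPSG:3857", "res_ratio": 1.0},
]

# Precompute the flags once, outside the per-block function.
same_crs_list = [src["crs"] == tgt_crs for src in sources]
adapter = partial(merge_block_adapter, sources=sources, same_crs_list=same_crs_list)
paths = adapter(0)
```

Here source `a` takes the placement shortcut, `b` is same-CRS but falls back because the shortcut declines, and `c` goes straight to reprojection. Computing the flags once keeps the CRS comparison out of the per-block work.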