Fix write_geotiff_gpu NaN-to-sentinel substitution (#1599) #1600
Merged
The GPU writer (write_geotiff_gpu / to_geotiff with gpu=True) emitted raw NaN bytes for missing pixels even when nodata=<finite> was supplied, while the CPU writer substituted NaN with the sentinel before encoding. xrspatial-only round-trips were unaffected because the reader masks both NaN and the sentinel, but external readers that mask only on the GDAL_NODATA tag (rasterio, GDAL, QGIS) treated NaN pixels as valid data. rasterio reported 100% valid pixels on a GPU file with 25 NaN inputs vs the CPU file's 25-invalid count.
Mirror the CPU writer's NaN-to-sentinel rewrite on the CuPy array before compression. Gate on float dtype and finite nodata. Copy defensively before mutating so a caller-owned CuPy buffer is not modified, matching the CPU writer's arr.copy() at the equivalent step.
Add test_gpu_writer_nan_sentinel_1599.py: 7 regression tests covering sentinel substitution, CPU/GPU byte equivalence, caller buffer preservation, no-NaN no-op, NaN sentinel skip, rasterio-visible mask parity, and 3D multiband substitution.
Discovered during the 2026-05-11 geotiff accuracy sweep.
Pull request overview
Fixes a GeoTIFF GPU-writer behavioral divergence where write_geotiff_gpu/to_geotiff(..., gpu=True) could write raw NaN float bytes even when a finite nodata sentinel was requested, causing external GDAL-based readers (e.g., rasterio/QGIS) to treat those pixels as valid data.
Changes:
- Add a GPU-side NaN→sentinel substitution step (float dtype + finite nodata) before GPU compression to match the CPU writer’s behavior.
- Add a regression test module covering substitution correctness, CPU/GPU parity, caller-buffer non-mutation, and an external-reader mask check.
- Update the accuracy sweep tracking CSV entry for the GeoTIFF pass.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| xrspatial/geotiff/__init__.py | Adds NaN-to-sentinel rewrite logic in write_geotiff_gpu prior to compression. |
| xrspatial/geotiff/tests/test_gpu_writer_nan_sentinel_1599.py | Introduces regression tests for GPU writer NaN/sentinel behavior and CPU/GPU parity. |
| .claude/sweep-accuracy-state.csv | Records the pass-13 accuracy sweep status and notes for issue #1599. |
Comment on lines +2594 to +2609
    # paths must produce byte-equivalent files for the same input. The
    # rewrite is in-place on the GPU array; ``arr`` is either a fresh
    # ``cupy.asarray`` copy of caller data (numpy/dask inputs) or the
    # caller-owned CuPy array. In the latter case we copy once before
    # mutating to keep parity with the CPU writer's defensive copy
    # semantics around in-place sentinel writes on user-owned buffers.
    if (nodata is not None
            and np_dtype.kind == 'f'
            and not np.isnan(float(nodata))):
        nan_mask = cupy.isnan(arr)
        if bool(nan_mask.any()):
            # When ``arr`` is the caller's CuPy buffer (came in as
            # ``data.data`` on a DataArray that holds a CuPy array), an
            # in-place rewrite would mutate the user's array. Copy
            # first; the CPU writer takes the same defensive copy via
            # ``arr.copy()`` at the matching line.
Comment on lines +174 to +175
    to_geotiff(da_cpu, p_cpu, crs=4326, nodata=-9999)
    write_geotiff_gpu(da_gpu, p_gpu, crs=4326, nodata=-9999)
…r test
- Update the in-code comment around the GPU NaN-to-sentinel rewrite to reflect the actual unconditional-copy behavior. The previous comment implied a caller-owned/fresh-buffer split that the code did not enforce; spell out instead why we copy in every case rather than tracking provenance through the upstream branch tree.
- Pin compression='deflate' on both the CPU and GPU writers in the external-reader (rasterio) regression test. The default codec is ZSTD, and some rasterio/GDAL builds in the wild ship without ZSTD support, which would have failed the round-trip for environment reasons unrelated to the nodata mask under test.
Summary
Closes #1599.
write_geotiff_gpu (and to_geotiff(..., gpu=True)) emitted raw NaN bytes
for missing pixels even when nodata=<finite> was supplied, while the CPU
writer substituted NaN with the sentinel before encoding. xrspatial-only
round-trips were unaffected because the reader masks both NaN and the
sentinel, but external readers that mask only on the GDAL_NODATA tag
(rasterio, GDAL, QGIS) treated NaN pixels as valid data.
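The masking difference described above can be illustrated without writing any files. The following is a NumPy-only sketch, not the reproducer from the issue: `valid_count_tag_only` is a hypothetical stand-in for a reader that masks solely by comparing against the nodata tag, and the synthetic array only borrows the pixel counts quoted in this PR (3000 pixels, 25 of them NaN).

```python
import numpy as np

nodata = -9999.0

# Synthetic band: 3000 float32 pixels, 25 of them missing (NaN).
data = np.arange(3000, dtype=np.float32)
data[:25] = np.nan


def valid_count_tag_only(pixels, nodata):
    """Hypothetical tag-only reader: a pixel is valid unless it equals nodata."""
    return int((pixels != nodata).sum())


# Pre-fix GPU output: NaN values are written through unchanged.
gpu_before_fix = data.copy()

# CPU output (and GPU output after the fix): NaN replaced by the sentinel.
cpu_like = np.where(np.isnan(data), np.float32(nodata), data)

# NaN != -9999 evaluates to True, so every NaN pixel is counted as valid.
print(valid_count_tag_only(gpu_before_fix, nodata))
# Only the 25 sentinel pixels are excluded here.
print(valid_count_tag_only(cpu_like, nodata))
```

Because NaN never compares equal to a finite sentinel, an equality-based mask cannot catch it, which is why the substitution has to happen at write time.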
Reproducer from the issue:
After the fix both rows read valid=2975 / total=3000.
Fix
Mirror the CPU writer's NaN-to-sentinel rewrite on the CuPy array
before compression. Gate on float dtype + finite nodata. Copy
defensively before mutating so a caller-owned CuPy buffer is not
modified, matching the CPU writer's arr.copy() at the equivalent step.
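A minimal sketch of the gating and copy-before-mutate logic described above, under stated assumptions: `substitute_nan_with_sentinel` and its `xp` parameter are hypothetical names for illustration and do not appear in the PR, and the sketch runs on NumPy so it needs no GPU. Passing `xp=cupy` with a CuPy array is expected to behave the same way, but that path is not exercised here.

```python
import numpy as np


def substitute_nan_with_sentinel(arr, nodata, xp=np):
    """Return arr with NaN replaced by nodata, without mutating the input.

    Hypothetical helper. The substitution is skipped (arr returned as-is)
    when nodata is None, the dtype is not floating point, or nodata is NaN.
    """
    if nodata is None or arr.dtype.kind != 'f' or np.isnan(float(nodata)):
        return arr
    nan_mask = xp.isnan(arr)
    if not bool(nan_mask.any()):
        # Nothing to rewrite, so no copy is needed.
        return arr
    # Copy first so a caller-owned buffer is never modified.
    out = arr.copy()
    out[nan_mask] = nodata
    return out


band = np.array([1.0, np.nan, 3.0], dtype=np.float32)
written = substitute_nan_with_sentinel(band, -9999)
print(written)  # NaN slot now holds the sentinel
print(band)     # caller's array still contains NaN
```

Returning the input unchanged in the skipped cases keeps the no-NaN path free of extra allocation, while the copy in the rewrite path is what preserves the caller's buffer.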
Tests
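test_gpu_writer_nan_sentinel_1599.py adds 7 regression tests covering sentinel substitution, CPU/GPU byte equivalence, caller buffer preservation, no-NaN no-op, NaN sentinel skip, rasterio-visible mask parity, and 3D multiband substitution.
Test plan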
Discovered during the 2026-05-11 geotiff accuracy sweep (pass 13).