Skip to content

Fix read_geotiff_dask losing nodata mask in non-first chunks (#1597)#1601

Merged
brendancol merged 1 commit into
mainfrom
deep-sweep-metadata-geotiff-2026-05-11-1597
May 11, 2026
Merged

Fix read_geotiff_dask losing nodata mask in non-first chunks (#1597)#1601
brendancol merged 1 commit into
mainfrom
deep-sweep-metadata-geotiff-2026-05-11-1597

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #1597.

Summary

  • For integer rasters with an in-range nodata sentinel, read_geotiff_dask declared the dask array as float64 but only promoted individual chunks to float64 when the chunk actually contained a sentinel pixel. Chunks without a sentinel stayed at the file's integer dtype.
  • Dask concatenation preallocates the result buffer from the first chunk's actual dtype, so a sentinel landing in any non-first chunk caused the compute result to come back as integer with the would-be NaNs cast to 0 (and a RuntimeWarning about an invalid cast).
  • Threading the resolved target_dtype unconditionally through _delayed_read_window keeps every chunk's dtype consistent with the dask graph. Out-of-range sentinels still keep the file's native dtype because effective_dtype short-circuits that case earlier in read_geotiff_dask.

Test plan

  • New regression suite test_dask_int_nodata_chunks_1597.py covering the sentinel-in-corner case, per-chunk dtype uniformity, the out-of-range Reading uint TIFF with negative GDAL_NODATA sentinel raises OverflowError #1581 path, and float-raster parity.
  • Existing tests still pass: test_nodata_out_of_range_1581, test_attrs_parity_1548, test_vrt_int_nodata_1564, test_metadata_round_trip_1484, test_dtype_read, plus 657 non-GPU geotiff tests.

For an integer raster paired with an in-range nodata sentinel,
read_geotiff_dask declares the output dtype as float64 but per-chunk
_delayed_read_window only promoted a chunk to float64 when that
chunk's mask actually hit. With the sentinel in a non-first chunk the
top-left chunk stayed uint16, and dask concatenation preallocated the
result from the first chunk's dtype -- silently casting later float64
chunks back to uint16 and converting their NaNs to 0.

Always thread the resolved target_dtype through _delayed_read_window
so every chunk lands as float64. Out-of-range sentinels still keep
the file's native dtype because effective_dtype already short-circuits
that case earlier in read_geotiff_dask.

Tests cover the sentinel-in-corner regression, per-chunk dtype
uniformity, the out-of-range #1581 path, and float-raster parity.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 11, 2026
@brendancol brendancol requested a review from Copilot May 11, 2026 18:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a correctness issue in the GeoTIFF dask reader where nodata masking could be silently lost when an in-range integer nodata sentinel only appears in non-first chunks, causing dask concatenation to coerce later chunks and convert NaN to 0.

Changes:

  • Thread the resolved target_dtype through each delayed chunk read to keep per-chunk dtypes consistent with the dask array’s declared dtype.
  • Add a regression test suite covering the non-first-chunk sentinel case, per-chunk dtype uniformity, out-of-range sentinel behavior, and float-raster parity.
  • Update sweep metadata to track issue #1597 inspection state.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
xrspatial/geotiff/__init__.py Ensures dask-delayed chunk reads receive the resolved target_dtype to prevent per-chunk dtype divergence that breaks nodata masking.
xrspatial/geotiff/tests/test_dask_int_nodata_chunks_1597.py Adds regression coverage for the dtype/nodata issue when the sentinel is only present in non-first chunks.
.claude/sweep-metadata-state.csv Updates internal sweep metadata to reference issue #1597.
Comments suppressed due to low confidence (1)

xrspatial/geotiff/init.py:1665

  • Passing target_dtype unconditionally means _delayed_read_window will now always run arr.astype(target_dtype) for every chunk (because target_dtype is never None). Since astype copies even when the dtype is already correct, this can add a full extra copy per chunk for the common case where no dtype promotion is needed (e.g. no nodata masking, float rasters, or integer rasters without an in-range sentinel). Consider updating _delayed_read_window to only cast when arr.dtype != target_dtype (or use astype(..., copy=False)), so #1597 is fixed without introducing a per-chunk copy/perf regression.
            block = da.from_delayed(
                _delayed_read_window(source,
                                     r0 + win_r0, c0 + win_c0,
                                     r1 + win_r0, c1 + win_c0,
                                     overview_level, nodata,
                                     band_arg,
                                     target_dtype=target_dtype,
                                     http_meta_key=http_meta_key,
                                     max_pixels=max_pixels),
                shape=block_shape,
                dtype=target_dtype,
            )

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@brendancol brendancol merged commit bfc0be7 into main May 11, 2026
15 of 16 checks passed
brendancol added a commit that referenced this pull request May 11, 2026
* Skip no-op astype in _delayed_read_window (#1624)

After #1601 widened the call site to always pass target_dtype, every
dask chunk ran arr.astype(target_dtype) even when arr.dtype already
matched. numpy.astype defaults to copy=True and so allocated a same-
dtype chunk-sized buffer and memcpy on every read. Gate the cast on a
real dtype mismatch; the #1597 mask path still promotes uint to float64
inline so every chunk lands in the dask-declared dtype.

Regression test in test_dask_no_op_astype_1624.py wraps read_to_array
to capture an ndarray subclass that records astype calls, then asserts
no same-dtype call survives the per-chunk return path.

* Address review: drop stub test, exercise uint16 fixture array

Copilot review on #1626 flagged two unused-variable issues. Fixes:

- Delete test_float32_chunks_avoid_redundant_copy. It declared
  same_dtype_casts and wrapped _delayed_read_window without recording
  anything, so it only asserted output dtype -- which does not
  validate the #1624 regression. test_astype_skipped_when_dtypes_match
  already covers the same surface properly via an ndarray subclass
  that tracks astype calls; verified it fails on the pre-fix code
  with "Same-dtype astype still runs per chunk".

- In test_uint16_mask_path_still_promotes, exercise the previously-
  unused arr fixture: assert that pixels equal to the 65535 sentinel
  become NaN and every other pixel matches arr.astype(float64).
  Anchors the test to fixture values so regressions in the mask
  branch surface here, not just as dtype drift.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

read_geotiff_dask drops nodata mask when sentinel only appears in non-first chunks

2 participants