geotiff: align eager nodata masking with the lazy/dask path for int sources (#2990)#2994

Merged
brendancol merged 1 commit into main from issue-2990
Jun 6, 2026
@brendancol
Contributor

Closes #2990

Problem

open_geotiff(..., masked=True) on an integer source with a declared, in-range nodata sentinel returned different results from the eager path than from the lazy (dask) path when no pixel matched the sentinel:

  • Eager: uint16, masked_nodata=False
  • Lazy: float64, masked_nodata=True

The dask path declares float64 up front from the in-range sentinel gate and stamps masked_nodata=True before any chunk decodes. The eager helper only promoted to float64 after finding a matching pixel, so the same file came back with a different dtype and a different masked_nodata value depending on whether chunks= was passed. This matters because masked_nodata is read as semantic state: the writer's NaN-to-sentinel rewrite and the GPU writer both key off it, not just off the dtype.
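The divergence can be reproduced in miniature with plain NumPy. This is only an illustrative sketch: `eager_before_fix` and `lazy_declared` are hypothetical stand-ins written for this example, not the functions in xrspatial, and the in-range check is reduced to a bare range comparison.

```python
import numpy as np


def eager_before_fix(arr, nodata):
    """Hypothetical stand-in for the pre-fix eager helper.

    Promotion to float64 happens only if some pixel equals the sentinel.
    """
    mask = arr == arr.dtype.type(nodata)
    if not mask.any():
        return arr.dtype, False          # integer dtype, masked_nodata=False
    return np.dtype(np.float64), True    # float64, masked_nodata=True


def lazy_declared(src_dtype, nodata):
    """Hypothetical stand-in for the lazy path.

    The decision is made from metadata alone, without reading any pixel.
    """
    info = np.iinfo(src_dtype)
    if info.min <= nodata <= info.max:
        return np.dtype(np.float64), True
    return np.dtype(src_dtype), False


# A uint16 raster whose declared sentinel (65535) never occurs in the data.
no_hit = np.array([[1, 2], [3, 4]], dtype=np.uint16)

print(eager_before_fix(no_hit, 65535))     # uint16, masked_nodata=False
print(lazy_declared(no_hit.dtype, 65535))  # float64, masked_nodata=True
```

With a raster that does contain the sentinel, both stand-ins agree, which is why the mismatch only showed up in the no-hit case.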

Fix

_apply_eager_nodata_mask now promotes a maskable integer source to float64 whenever masking is on, independent of whether any pixel matches. This matches the dask path and rioxarray's open_rasterio(..., masked=True), both of which promote unconditionally. The promotion gate is unchanged for out-of-range, non-finite, and fractional sentinels, which still keep the source integer dtype and report masked_nodata=False on both paths.

nodata_pixels_present keeps its existing meaning (did any pixel match) and stays absent on the lazy path per the documented dask contract. No eager compute is forced on the dask path.
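A minimal sketch of the corrected behavior, assuming a simplified gate and a plain dict for attrs. The names `is_maskable_sentinel` and `eager_mask_sketch` are hypothetical and exist only for this example; the real `_apply_eager_nodata_mask` has a different signature and handles more cases. Leaving `nodata_pixels_present` out of the attrs when the gate rejects the sentinel is also an assumption of this sketch.

```python
import math

import numpy as np


def is_maskable_sentinel(dtype, nodata):
    """Gate: the sentinel must be finite, integer-valued, and in range for dtype."""
    if nodata is None or not math.isfinite(nodata):
        return False
    if float(nodata) != int(nodata):     # fractional sentinel
        return False
    info = np.iinfo(dtype)
    return info.min <= int(nodata) <= info.max


def eager_mask_sketch(arr, nodata, masked=True):
    """Promote a maskable integer array to float64 whether or not a pixel matches."""
    attrs = {"masked_nodata": False}
    if not masked or arr.dtype.kind not in "iu":
        return arr, attrs
    if not is_maskable_sentinel(arr.dtype, nodata):
        return arr, attrs                # keep the integer dtype, no masking

    out = arr.astype(np.float64)         # unconditional promotion
    mask = out == np.float64(int(nodata))
    out[mask] = np.nan                   # no-op when nothing matches
    attrs["masked_nodata"] = True
    attrs["nodata_pixels_present"] = bool(mask.any())
    return out, attrs


no_hit = np.array([[1, 2], [3, 4]], dtype=np.uint16)
out, attrs = eager_mask_sketch(no_hit, 65535)
print(out.dtype, attrs)   # float64, masked_nodata=True, nodata_pixels_present=False

out, attrs = eager_mask_sketch(no_hit, 70000)   # out of range for uint16
print(out.dtype, attrs)   # uint16, masked_nodata=False
```

The two attrs answer different questions here: masked_nodata records that masking ran, while nodata_pixels_present records whether any pixel was actually rewritten to NaN.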

Backend coverage

  • numpy (eager): behavior changed (the fix)
  • cupy (eager GPU): routes through the same helper via _finalize_eager_read, so it picks up the fix
  • dask+numpy / dask+cupy (lazy): already correct, used as the reference

Test plan

  • New TestEagerLazyMaskParity2990 asserts eager and dask agree on dtype and masked_nodata for the no-hit, hit, and out-of-range cases
  • Updated the eager no-hit test to the corrected contract (float64 + masked_nodata=True + nodata_pixels_present=False)
  • Full geotiff suite passes (6177 passed, 81 skipped, 1 xfailed)
  • Updated docstrings (_apply_eager_nodata_mask, open_geotiff, module contract) and attrs_contract.rst

Scope note

The VRT eager/chunked masking helpers (_vrt.py) carry their own per-band sentinel handling and a separate documented divergence; they are left unchanged here. Worth a follow-up look but out of scope for this fix.

…it (#2990)

The eager numpy/GPU mask helper only promoted an integer array to
float64 when at least one pixel matched the declared sentinel. The dask
path declares float64 up front from the same in-range gate and stamps
masked_nodata=True before any chunk decodes, so the two paths returned
different dtypes and masked_nodata values for the same file when no
sentinel pixel was present.

Promote unconditionally in _apply_eager_nodata_mask whenever masking is
on and the sentinel is maskable (finite, integer, in-range), matching
the dask path and rioxarray's masked=True. The promotion gate is
unchanged for out-of-range / non-finite / fractional sentinels, which
stay integer with masked_nodata=False on both paths.

nodata_pixels_present still records whether a pixel matched; it stays
absent on the lazy path per the documented dask contract (no eager
compute). Updates the eager no-hit test to the corrected contract and
adds cross-backend parity tests.
github-actions bot added the performance label (PR touches performance-sensitive code) on Jun 6, 2026
Contributor Author

@brendancol brendancol left a comment


PR Review: align eager nodata masking with the lazy/dask path for int sources (#2990)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None blocking. One thing worth confirming rather than changing: in _apply_eager_nodata_mask (xrspatial/geotiff/_attrs.py:1515-1517) the mask is now computed after the float64 cast (mask = arr == np.float64(nodata_int)), whereas before it ran on the integer buffer. For the in-range integer dtypes that TIFF actually uses (uint8/16, int8/16/32) the cast round-trips exactly through float64, so the equality is unchanged. A sentinel above 2^53 on an int64/uint64 source could in theory lose precision, but the original code had the same exact-cast assumption via arr.dtype.type(nodata_int), and 64-bit integer TIFF rasters with such sentinels are not a path this reader exercises. No action needed unless 64-bit integer sources become a supported tier.
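The precision argument can be checked with plain Python, whose `float` is an IEEE 754 double, the same format as NumPy's float64. This is a small illustration of the round-trip claim, not code from the PR; `roundtrips_exactly` is a hypothetical helper.

```python
def roundtrips_exactly(value: int) -> bool:
    """True if the integer survives a cast to float64 and back unchanged."""
    return int(float(value)) == value


# Extremes of the integer dtypes named above: all representable exactly.
extremes = {
    "uint8": (0, 2**8 - 1),
    "uint16": (0, 2**16 - 1),
    "int8": (-(2**7), 2**7 - 1),
    "int16": (-(2**15), 2**15 - 1),
    "int32": (-(2**31), 2**31 - 1),
}
print(all(roundtrips_exactly(lo) and roundtrips_exactly(hi)
          for lo, hi in extremes.values()))    # True

# Above 2^53 neighbouring integers start to share a float64 value, so an
# equality test after the cast could match a pixel that differs from the sentinel.
big = 2**53 + 1
print(roundtrips_exactly(big))                 # False
print(float(big) == float(2**53))              # True
```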

Nits (optional improvements)

None.

What looks good

  • The fix is the minimal change: drop the matching-pixel gate on the float promotion, keep the finite/integer/in-range gate that protects against unmatchable sentinels. The out-of-range and non-finite branches are untouched, so those still keep the integer dtype on both paths.
  • The direction is correct. rioxarray's open_rasterio(masked=True) promotes an integer source to float unconditionally, and the dask path already did the same, so aligning the eager path to the lazy path (rather than the reverse) matches the ecosystem and avoids forcing an eager compute on dask.
  • nodata_pixels_present keeps its separate meaning and is still computed eagerly, so the did-any-pixel-match signal is preserved while masked_nodata becomes the consistent did-masking-run flag.
  • The GPU eager path routes through the same helper via _finalize_eager_read, so cupy picks up the fix without a separate kernel.
  • Tests cover the three relevant cases (no-hit, hit, out-of-range) and assert eager-vs-dask agreement directly. The stale eager no-hit test was updated to the corrected contract rather than deleted.
  • Docstrings and attrs_contract.rst were updated so the contract text no longer claims promotion depends on a matching pixel.

Checklist

  • Algorithm matches reference (rioxarray masked=True semantics; dask path)
  • All implemented backends produce consistent results (eager numpy/GPU now match dask)
  • NaN handling is correct (sentinel pixels still rewritten to NaN when present)
  • Edge cases are covered by tests (no-hit, hit, out-of-range)
  • Dask chunk boundaries handled correctly (dask path unchanged, used as reference)
  • No premature materialization (lazy path untouched; no eager compute added)
  • Benchmark not needed (no new function, masking cost unchanged in the common hit case)
  • README feature matrix not applicable (no new function, no backend change)
  • Docstrings present and accurate (updated for the new contract)

Scope note

The PR explicitly leaves the VRT masking helpers (_vrt.py) alone. They carry per-band sentinel handling and their own documented divergence, and the VRT lazy graph dtype comes from _effective_dtype_for_bands rather than the in-range sentinel gate, so the VRT eager/lazy story is self-contained and different from the main reader. Folding it into this PR would expand scope into a separate subsystem. A follow-up issue for VRT parity would be reasonable.

Contributor Author

@brendancol brendancol left a comment


Follow-up review (#2990)

No code changes were needed after the first review. The first pass found no blockers, suggestions, or nits requiring edits.

Disposition of the one confirm-don't-change item (float-cast equality precision in _apply_eager_nodata_mask): dismissed. The mask now runs after the float64 cast, but for the integer dtypes TIFF actually uses (uint8/16, int8/16/32) the cast round-trips exactly, so the equality is unchanged. The prior code carried the same exact-cast assumption via arr.dtype.type(nodata_int), and 64-bit integer sentinels above 2^53 are not a path this reader exercises. No precision regression.

The full geotiff suite passes (6177 passed, 81 skipped, 1 xfailed), including the new cross-backend parity tests and the updated eager no-hit test.

@brendancol brendancol merged commit eb84c3d into main Jun 6, 2026
9 checks passed

Labels

performance PR touches performance-sensitive code

Development

Successfully merging this pull request may close these issues.

geotiff: eager vs lazy nodata masking diverge on dtype and masked_nodata for int sources with no sentinel pixels
