Describe the bug
open_geotiff(..., masked=True) on an integer-dtype source that declares a nodata sentinel returns different results on the eager path and on the lazy (dask) path when the read window contains no pixels matching the sentinel.
Same file, same masked=True, only chunks= differs:
- Eager:
dtype=uint16, masked_nodata=False, nodata_pixels_present=False
- Lazy:
dtype=float64, masked_nodata=True, nodata_pixels_present absent
The lazy path declares float64 and stamps masked_nodata=True from the graph dtype before any chunk is decoded (xrspatial/geotiff/_backends/dask.py near lines 424 and 521). The eager path only promotes the integer array to float64 when at least one sentinel pixel actually matches (_apply_eager_nodata_mask in xrspatial/geotiff/_attrs.py near line 1483); with no match it keeps the integer dtype and reports masked_nodata=False.
This matters because masked_nodata is read as semantic state, not just dtype decoration. The writer's _should_restore_nan_sentinel keys the NaN-to-sentinel rewrite off it, and the GPU writer reads it too. The two backends disagree on whether masking happened for the same input.
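The divergence described above can be modeled in a few lines of NumPy. This is a minimal sketch, not the library's code: `mask_gated` and `mask_unconditional` are hypothetical stand-ins for the eager and lazy behavior as this issue describes them, and the attrs are plain dicts rather than real DataArray attrs.

```python
import numpy as np

NODATA = 9999  # declared sentinel, as in the reproduction below


def mask_gated(data, nodata):
    """Hypothetical model of the eager path: promote only if a sentinel pixel matches."""
    hit = data == nodata
    if not hit.any():
        # No sentinel pixel in the window: the integer dtype is kept.
        return data, {"masked_nodata": False, "nodata_pixels_present": False}
    out = data.astype(np.float64)
    out[hit] = np.nan
    return out, {"masked_nodata": True, "nodata_pixels_present": True}


def mask_unconditional(data, nodata):
    """Hypothetical model of the lazy path: always promote to float64."""
    out = data.astype(np.float64)
    out[data == nodata] = np.nan
    # nodata_pixels_present is deliberately left out, mirroring the lazy path.
    return out, {"masked_nodata": True}


# A uint16 window whose pixels all lie in 1..50, so none match the sentinel.
window = (np.arange(12, dtype=np.uint16) % 50 + 1).reshape(3, 4)

eager, eager_attrs = mask_gated(window, NODATA)
lazy, lazy_attrs = mask_unconditional(window, NODATA)

print(eager.dtype, eager_attrs)
print(lazy.dtype, lazy_attrs)
```

The pixel values are identical in both results; only the dtype and the masked_nodata flag differ, which is exactly the state the writers consume.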
Expected behavior
The eager and lazy paths should report the same masked_nodata and the same dtype for the same input. rioxarray's open_rasterio(..., masked=True) always promotes an integer source to float regardless of whether a sentinel pixel is present, so the lazy path's behavior is the correct reference. The eager path should match: when masked=True on a maskable integer source with a declared sentinel, promote to float64 unconditionally and report masked_nodata=True, even when no sentinel pixel matches.
nodata_pixels_present is a separate signal and is allowed to stay absent on the lazy path, since a strict per-chunk reduction would force an eager compute. This issue does not propose forcing eager compute on the lazy path. It only proposes making the masked_nodata flag and the output dtype consistent.
Reproduction
Write a uint16 GeoTIFF declaring nodata=9999 whose pixels are all in the range 1..50, then open it with masked=True both eager and with chunks= set, and compare dtype and masked_nodata.
Additional context
Found during a code review of the geotiff backends. The divergent behavior is the eager path's gating of the float64 promotion on a matching sentinel pixel.