Fix OverflowError on uint TIFF with negative GDAL_NODATA sentinel (#1581) #1583
Reading a uint TIFF whose GDAL_NODATA tag is a negative sentinel (e.g. uint16 + -9999, common on legacy GDAL files) raised OverflowError on every backend because the nodata-mask code did arr.dtype.type(int(nodata)) with no range check.

Add _int_nodata_in_range helper in _reader.py and gate the four cast sites (numpy eager open_geotiff, _apply_nodata_mask_gpu cupy path, _delayed_read_window dask path, _resolve_masked_fill / _sparse_fill_value in _reader.py). Out-of-range sentinels are treated as a no-op for value matching since the file dtype cannot represent them; attrs['nodata'] still carries the original sentinel so write round-trips preserve the GDAL_NODATA tag.

read_geotiff_dask no longer promotes file dtype to float64 when the sentinel is unrepresentable, matching the materialized array.

Eight regression tests cover the helper, both eager and dask read paths, the in-range non-regression case, and the GPU helper (cupy-gated).
Pull request overview
Fixes GeoTIFF reads that crash with OverflowError when an unsigned integer raster declares a negative GDAL_NODATA sentinel (e.g. uint16 + -9999) by range-checking before performing dtype casts during nodata masking / fill resolution. This aligns behavior across eager NumPy, Dask-windowed reads, GPU nodata masking, and LERC/sparse fill handling.
Changes:
- Add `_int_nodata_in_range` and use it to guard integer nodata sentinel casts in `_reader.py` fill-resolution helpers.
- Gate nodata masking casts in `open_geotiff`, `_apply_nodata_mask_gpu`, and the dask `_delayed_read_window` path to avoid `OverflowError` for out-of-range sentinels.
- Add a regression test module covering helper behavior and eager/dask/GPU no-crash behavior for out-of-range integer sentinels.
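The range check in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `nodata_fits_dtype` is a hypothetical stand-in for `_int_nodata_in_range`, whose real signature and edge-case handling are not shown in this conversation.

```python
import math

import numpy as np


def nodata_fits_dtype(nodata, dtype):
    """Return True if `nodata` is an integer value that `dtype` can hold.

    Hypothetical stand-in for the PR's `_int_nodata_in_range`; the real
    helper may differ in name, signature, and edge cases.
    """
    dtype = np.dtype(dtype)
    if dtype.kind not in "iu":
        # Only signed/unsigned integer dtypes have np.iinfo bounds.
        return False
    value = float(nodata)
    if not math.isfinite(value) or not value.is_integer():
        # NaN, inf, or a fractional sentinel can never match an integer cell.
        return False
    info = np.iinfo(dtype)
    return info.min <= int(value) <= info.max
```

With a check like this, a caller only performs `arr.dtype.type(int(nodata))` when the function returns `True`, so the cast that used to raise is never reached for a uint16 raster tagged with -9999.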
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `xrspatial/geotiff/__init__.py` | Adds range checks before integer nodata casts across eager, dask window reads, and GPU nodata masking to prevent OverflowError on out-of-range sentinels. |
| `xrspatial/geotiff/_reader.py` | Introduces `_int_nodata_in_range` and uses it to guard dtype casts when resolving LERC masked fill and sparse-tile fill values. |
| `xrspatial/geotiff/tests/test_nodata_out_of_range_1581.py` | New regression tests validating out-of-range nodata behavior across helper/eager/dask and GPU helper behavior. |
| `.claude/sweep-accuracy-state.csv` | Updates internal audit tracking entry for the GeoTIFF accuracy sweep. |
    cupy = pytest.importorskip('cupy')
    from xrspatial.geotiff import _apply_nodata_mask_gpu

    arr_gpu = cupy.array([[1, 2, 3], [4, 5, 6]], dtype=cupy.uint16)
    out = _apply_nodata_mask_gpu(arr_gpu, -9999)
@copilot apply changes based on this feedback
Applied in 3eb2528. I updated the GPU regression test to use a cupy+CUDA availability gate (_gpu_available + @_gpu_only), so it now skips cleanly when CuPy imports but no CUDA device is available.
Agent-Logs-Url: https://github.com/xarray-contrib/xarray-spatial/sessions/4cc0b125-1f1a-43d0-8eb2-9130618be079
Co-authored-by: brendancol <433221+brendancol@users.noreply.github.com>
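For readers unfamiliar with this kind of gate, a CuPy-plus-CUDA availability check might look like the sketch below. The bodies of `_gpu_available` and `_gpu_only` are not shown in this conversation, so `gpu_available` here is a hypothetical equivalent written under that assumption.

```python
def gpu_available():
    """Return True only if CuPy imports and a CUDA device is visible.

    Hypothetical equivalent of the test module's `_gpu_available`;
    the actual implementation in commit 3eb2528 may differ.
    """
    try:
        import cupy

        # Importing CuPy can succeed on a machine without a usable GPU,
        # so also ask the CUDA runtime how many devices it sees.
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Covers a missing CuPy install and CUDA runtime/driver errors.
        return False


# A skip marker could then be built from it, for example:
#   gpu_only = pytest.mark.skipif(not gpu_available(), reason="needs CuPy + CUDA")
```

Unlike `pytest.importorskip('cupy')` alone, this also skips when the package is installed but no device is present.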
Summary
- Reading a uint TIFF whose `GDAL_NODATA` is a negative sentinel (e.g. uint16 + `-9999`, common on legacy GDAL files) raised `OverflowError` on every backend. The nodata-mask code did `arr.dtype.type(int(nodata))` with no range check.
- Add `_int_nodata_in_range` helper and gate the four cast sites: numpy eager, cupy `_apply_nodata_mask_gpu`, dask `_delayed_read_window`, and `_resolve_masked_fill` / `_sparse_fill_value` in `_reader.py`.
- `attrs['nodata']` still carries the original sentinel so write round-trips keep the GDAL_NODATA tag intact. This matches rasterio's behavior.

Fixes #1581.
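The behavior described above (skip value matching when the sentinel is unrepresentable, keep the original sentinel in the attributes) can be illustrated with plain NumPy. This is a simplified sketch, not the library's code path: `mask_nodata` is a hypothetical name, it handles integer arrays with integer-valued sentinels only, and the float64/NaN promotion for the in-range case is an assumption about how masking is applied.

```python
import numpy as np


def mask_nodata(arr, nodata):
    """Mask cells equal to `nodata` in an integer array.

    Hypothetical sketch: out-of-range sentinels are a no-op, in-range
    sentinels promote to float64 with NaN (assumed behavior).
    """
    info = np.iinfo(arr.dtype)
    sentinel = int(nodata)
    if not (info.min <= sentinel <= info.max):
        # The dtype cannot represent the sentinel, so no cell can match.
        # Return the data untouched, with its original dtype.
        return arr
    mask = arr == arr.dtype.type(sentinel)  # safe: sentinel is in range
    out = arr.astype(np.float64)
    out[mask] = np.nan
    return out


raw = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.uint16)
attrs = {"nodata": -9999}  # original sentinel is kept for write round-trips

unchanged = mask_nodata(raw, attrs["nodata"])  # out of range: no-op
masked = mask_nodata(raw, 3)                   # in range: cell becomes NaN
```

Keeping the dtype unchanged in the out-of-range case mirrors the commit's note that the dask path no longer promotes to float64 when the sentinel is unrepresentable.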
Test plan
- `test_nodata_out_of_range_1581.py`: helper, eager read, dask read, in-range non-regression, GPU helper (cupy-gated)
- `test_gpu_nodata_1542.py`, `test_nodata_no_extra_copy_1553.py`, `test_vrt_int_nodata_1564.py`, `test_lerc_valid_mask.py` still pass