geotiff: apply nodata mask against post-MinIsWhite sentinel (#1809)#1817
Merged
brendancol merged 3 commits intoMay 13, 2026
Merged
Conversation
MinIsWhite inversion was running before the sentinel-to-NaN nodata mask on all four read backends. Because the inversion rewrites the sentinel value (uint8 nodata=0 -> 255, float32 nodata=-9999 -> 9999), the post-inversion equality check matched the wrong pixels: * stored values that equalled the sentinel survived as iinfo.max - sentinel instead of NaN * stored values that happened to equal iinfo.max - sentinel were incorrectly turned into NaN Introduces _miniswhite_inverted_nodata() in _reader.py and stashes the inverted sentinel on geo_info._mask_nodata. Every backend (eager numpy, eager GPU, GPU stripped fallback, dask chunk closure) routes its mask through the new field while attrs['nodata'] keeps the original sentinel for write-side round-trip. The dask graph builder picks up the IFD photometric off geo_info via _read_geo_info / _parse_cog_http_meta so the closure nodata is inverted at graph-build time. 9 regression tests in test_miniswhite_nodata_1809.py cover uint8 with nodata=0, uint16 with nodata=65535, float32 with nodata=-9999 across numpy, dask, and GPU backends, plus no-collision and no-nodata controls. Closes #1809
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes MinIsWhite GeoTIFF reads so nodata masking compares against the post-inversion sentinel while preserving the original nodata value in attrs for round-trip writes.
Changes:
- Adds
_miniswhite_inverted_nodata()and threads inverted sentinel handling through eager, dask, HTTP COG, and GPU read paths. - Preserves original
attrs['nodata']while using an internal mask sentinel for data masking. - Adds regression tests for MinIsWhite + nodata behavior across available backends.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
xrspatial/geotiff/_reader.py |
Computes/stashes MinIsWhite-adjusted nodata sentinel during array reads. |
xrspatial/geotiff/__init__.py |
Uses adjusted sentinel in eager, dask, and GPU nodata masking paths. |
xrspatial/geotiff/tests/test_miniswhite_nodata_1809.py |
Adds regression coverage for MinIsWhite nodata masking cases. |
.claude/sweep-accuracy-state.csv |
Updates sweep state notes for the fixed geotiff issue. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| import importlib.util | ||
| import os | ||
| import tempfile |
Comment on lines
+3394
to
+3397
| # When MinIsWhite was applied, the mask must use the inverted | ||
| # sentinel; otherwise the original sentinel. | ||
| _gpu_mask_value = (_mw_mask_nodata if _mw_mask_nodata is not None | ||
| else nodata) |
Remove unused os/tempfile imports from the MinIsWhite regression test and route the GPU tiled read's CPU-fallback branches through GeoInfo._mask_nodata so a sparse-tile, planar=2 auto-fallback, or final-fallback decode of a MinIsWhite raster masks against the post-inversion sentinel rather than the original.
PR #1803 forwarded the caller's max_pixels to read_to_array inside read_vrt's source loop so a tiny VRT output cannot force a huge source decode (#1796). The output-window check at the source read enforces that correctly. A separate per-tile dimension check at the same call sites also consumed the caller's max_pixels, so a caller setting max_pixels as an output budget (e.g. 10_000) failed the per-tile sanity check on any normal source whose default tile size is 256x256 (= 65_536 pixels). Use MAX_PIXELS_DEFAULT for the per-tile dim check at the two call sites in _read_tiles (local) and _read_tiles_cog_http (HTTP). The output-window check at the same functions continues to enforce the user-supplied max_pixels, preserving the #1796 protection.
This was referenced May 14, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #1809.
MinIsWhite inversion was running before the sentinel-to-NaN nodata mask on every read backend (eager numpy, eager GPU, GPU stripped fallback, dask chunk closure). Because the inversion rewrites the original sentinel value (uint8
nodata=0->255, float32nodata=-9999->9999), the post-inversion equality check matched the wrong pixels:iinfo.max - sentinelinstead ofNaNiinfo.max - sentinelwere incorrectly turned intoNaNAll four backends agreed on the same wrong answer, so backend-parity tests caught nothing.
Fix
Introduce
_miniswhite_inverted_nodata()in_reader.pyand stash the inverted sentinel ongeo_info._mask_nodata. Every backend now routes its sentinel-to-NaN mask through that field, whileattrs['nodata']keeps the original sentinel for write-side round-trip.The dask graph builder picks up
_ifd_photometricand_ifd_samples_per_pixelfrom_read_geo_info/_parse_cog_http_metaso the closure nodata is inverted at graph-build time rather than per chunk.Test plan
test_miniswhite_nodata_1809.pycovering:test_xrs_plot_no_palette,test_plot_geotiff_deprecated,test_tile_size_positive_works,test_dstrect_at_cap_succeeds) reproduce on main and are unrelated.test_miniswhite_backend_parity_1797.pyandtest_min_is_white_inversionstill pass unchanged.