Fix read_vrt(band=None) leaving non-first-band integer sentinels unmasked#1612
Merged
Conversation
read_vrt(band=None) on multi-band integer VRTs with per-band <NoDataValue> tags only masks band 0's sentinel. Bands 1+ keep their integer sentinels as literal finite values in the returned float64 array. Same class as #1598 / PR #1602, which fixed the single-band band=N case but not band=None. Auto-mode classifier denied gh issue create, so the finding is documented in the state CSV pending user authorization to file.
read_vrt on a multi-band integer VRT with per-band <NoDataValue> tags applied vrt.bands[0].nodata across the full 3-D array. Bands 1+ kept their integer sentinels as literal finite values in the returned float64 array, breaking the convention that attrs['nodata'] is present iff sentinel pixels have already been NaN-masked. The float-VRT path masks per-band correctly inline in _vrt._read_data (lines 296-297 and 347-351). PR #1602 fixed the band=N single-band selection. This change extends the per-band masking to the band=None multi-band path: when ndim == 3 and band is None, walk vrt.bands and mask each arr[..., i] slice against its own <NoDataValue>. Masks are collected against the integer array first, then promotion to float64 happens once if any band actually hit. Gating reuses the same range check that PR #1583 added to other read paths (refactored into _sentinel_for_dtype): out-of-range, non-finite, and fractional sentinels become a no-op for value matching rather than raising OverflowError. attrs['nodata'] for band=None reads still carries band 0's sentinel - that's the documented "first band wins" contract for the canonical attr that cannot encode per-band values. 7 regression tests in test_vrt_multiband_int_nodata_1611.py cover the uint16 per-band case, int32 negative sentinels, mixed presence, no-sentinel-anywhere dtype preservation, out-of-range gating, the band=N non-regression, and the attrs contract. All 135 existing vrt/nodata geotiff tests still pass.
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes xrspatial.geotiff.read_vrt(band=None) for multi-band integer VRTs so each band’s <NoDataValue> sentinel is NaN-masked against its own per-band value (previously only band 0’s sentinel was applied, leaving bands 1+ with literal sentinel integers in the returned array).
Changes:
- Add per-band integer nodata masking for
band=None/ 3D VRT reads, aligning integer behavior with the existing float VRT masking behavior. - Introduce a small in-function helper to safely cast VRT nodata values to an integer dtype sentinel only when representable (finite, integral, and in-range).
- Add a dedicated regression test suite covering mixed per-band sentinels, signed/unsigned cases, and “no promotion when no sentinel hits” behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
xrspatial/geotiff/__init__.py |
Implements per-band integer nodata masking for read_vrt(band=None) on 3D outputs, with guarded sentinel casting. |
xrspatial/geotiff/tests/test_vrt_multiband_int_nodata_1611.py |
Adds regression tests for multi-band integer VRT per-band nodata masking (issue #1611). |
.claude/sweep-accuracy-state.csv |
Updates the accuracy sweep state entry to record the issue/fix status. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Two Copilot inline review comments on PR #1612: * test_vrt_multiband_int_nodata_1611.py:21 imported pytest but never used it. Drop the import so flake8 F401 stays clean. * __init__.py per-band integer-nodata path collected every band's boolean mask into a list before promoting and applying, holding O(bands * H * W) booleans simultaneously at peak. Refactor to walk vrt.bands once, promote arr to float64 on the first band with a hit, and apply each subsequent band's mask in-place. Peak boolean memory is now O(H * W). int_arr keeps the original integer view alive so masks compare against the integer sentinel dtype. All 7 #1611 regression tests still pass; 135 vrt/nodata geotiff tests still pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
read_vrt(band=None)so bands 1+ on a multi-band integer VRT with per-band<NoDataValue>tags get NaN-masked against their own sentinel. Before this change only band 0's sentinel was applied, leaving bands 1+ with literal integer sentinels (e.g. 65000.0) in the returned float64 array.band=Nsingle-band selection). The float-VRT path already masks per-band correctly in_vrt._read_data(lines 296-297 and 347-351); this brings the integer path into line.attrs['nodata']forband=Nonereads still carries band 0's sentinel (documented contract: one attr can't encode per-band values). Only the pixel-level mask changes.Closes #1611.
Implementation notes
_sentinel_for_dtype(nodata_val, dtype)helper insideread_vrtmirrors the gating from PR Fix OverflowError on uint TIFF with negative GDAL_NODATA sentinel (#1581) #1583's_int_nodata_in_range: returnsNonefor non-integer dtype, non-finite, fractional, or out-of-range sentinels (so they become a no-op for value matching rather than raisingOverflowError).band is None and ndim == 3: walkvrt.bands, compute each band's sentinel mask against the integer array first, then promote to float64 once if any band actually hit. Avoids needless promotion when no sentinel is present anywhere.band=Nor 2-D output) keeps its existing fast path.Test plan
test_multiband_uint16_per_band_sentinel_each_masked— the previously-broken case (both band sentinels now NaN)test_multiband_int32_negative_per_band_sentinel— signed dtype, negative sentinelstest_multiband_only_one_band_has_sentinel_present— non-hitting band keeps its value, no spurious NaNtest_multiband_no_sentinel_present_anywhere_keeps_int_dtype— skip promotion when no band hitstest_multiband_per_band_out_of_range_sentinel_is_no_op— uint16 +<NoDataValue>-9999</NoDataValue>band doesn't raise, masks the in-range bandtest_multiband_band_kwarg_still_per_band_post_pr1602—band=0/band=1still route through PR Fix read_vrt(band=N) using band 0's nodata sentinel (#1598) #1602's single-band pathtest_multiband_attrs_nodata_still_band0—attrs['nodata']contract unchangedvrt/nodatageotiff tests still pass (pytest -k "vrt or nodata")test_reader.py,test_writer.py,test_dask_cupy_combined.pystill passNotes
Discovered during the geotiff accuracy sweep (pass 14, 2026-05-11). State CSV updated in this PR.