Summary
read_vrt(path, band=N) for N > 0 unconditionally uses vrt.bands[0].nodata to populate attrs['nodata'] and to drive the integer-with-nodata float64/NaN promotion. When band N has a different nodata sentinel than band 0:
- attrs['nodata'] advertises band 0's value (wrong).
- The integer-to-float64 promotion mask is built from band 0's sentinel, so band N's actual sentinel pixels stay as literal integers instead of becoming NaN.
- The returned array dtype stays integer when it should have been promoted to float64.
The non-VRT readers (open_geotiff, read_geotiff_dask, read_geotiff_gpu) all read the per-band nodata from the file's IFD for the selected band; only the VRT path has this bug.
Repro
import numpy as np, tempfile, os
from xrspatial.geotiff import read_vrt
from xrspatial.geotiff._writer import write

with tempfile.TemporaryDirectory() as d:
    # Two single-band sources with different nodata sentinels.
    a = np.array([[1, 2], [3, 65535]], dtype=np.uint16)
    b = np.array([[7, 8], [9, 65000]], dtype=np.uint16)
    pa, pb = os.path.join(d, 'a.tif'), os.path.join(d, 'b.tif')
    write(a, pa, nodata=65535, compression='none', tiled=False)
    write(b, pb, nodata=65000, compression='none', tiled=False)

    # Two-band VRT: band 1 -> a.tif (65535), band 2 -> b.tif (65000).
    vrt = os.path.join(d, 'm.vrt')
    with open(vrt, 'w') as f:
        f.write(f'''<VRTDataset rasterXSize="2" rasterYSize="2">
<GeoTransform>0,1,0,0,0,-1</GeoTransform>
<VRTRasterBand dataType="UInt16" band="1">
<NoDataValue>65535</NoDataValue>
<SimpleSource><SourceFilename>{pa}</SourceFilename><SourceBand>1</SourceBand>
<SrcRect xOff="0" yOff="0" xSize="2" ySize="2"/>
<DstRect xOff="0" yOff="0" xSize="2" ySize="2"/></SimpleSource>
</VRTRasterBand>
<VRTRasterBand dataType="UInt16" band="2">
<NoDataValue>65000</NoDataValue>
<SimpleSource><SourceFilename>{pb}</SourceFilename><SourceBand>1</SourceBand>
<SrcRect xOff="0" yOff="0" xSize="2" ySize="2"/>
<DstRect xOff="0" yOff="0" xSize="2" ySize="2"/></SimpleSource>
</VRTRasterBand>
</VRTDataset>''')

    # band=1 is the zero-based index of the second VRT band.
    r = read_vrt(vrt, band=1)
    print(r.dtype, r.attrs.get('nodata'), r.values.tolist())
    # Currently: uint16, 65535.0, [[7,8],[9,65000]]
    # Expected: float64, 65000.0, [[7,8],[9,NaN]]
Root cause
In xrspatial/geotiff/__init__.py::read_vrt, around line 2735:
nodata = None
if vrt.bands:
    nodata = vrt.bands[0].nodata
This always reads bands[0] rather than bands[band if band is not None else 0]. The downstream integer-promotion block (lines 2749 onward) then uses the wrong sentinel.
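To make the effect of the wrong sentinel concrete, the sketch below mimics the promotion step with plain NumPy. The promote helper is hypothetical: it only approximates the behaviour described in this issue (mask the pixels equal to the sentinel, promote to float64 only when at least one matches) and is not the code in read_vrt.

```python
import numpy as np

def promote(arr, nodata):
    """Hypothetical stand-in for the post-decode integer promotion."""
    if nodata is None or not np.issubdtype(arr.dtype, np.integer):
        return arr
    mask = arr == nodata
    if not mask.any():
        # No pixel matches the sentinel, so the array stays integer.
        return arr
    out = arr.astype(np.float64)
    out[mask] = np.nan
    return out

# Pixel values of the second band from the repro.
band1 = np.array([[7, 8], [9, 65000]], dtype=np.uint16)

wrong = promote(band1, 65535)  # band 0's sentinel: nothing matches
right = promote(band1, 65000)  # band 1's own sentinel: one pixel masked

print(wrong.dtype, right.dtype)
```

With band 0's sentinel the mask is empty, so the array keeps its integer dtype and the 65000 pixel survives as data, which matches the "Currently" output in the repro.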
Proposed fix
When band is not None, source the nodata sentinel from vrt.bands[band].nodata. The internal _vrt.read_vrt already uses the per-band sentinel inside its source-read loop, so this only patches the public-layer attr emission and post-decode integer promotion.
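A minimal sketch of the sentinel selection follows. The Band dataclass and the select_nodata helper are hypothetical names used only for illustration; the real change would likely be an inline edit in read_vrt. The sketch assumes band is a zero-based index that has already been range-checked.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Band:
    """Hypothetical stand-in for a parsed VRT band; only nodata matters here."""
    nodata: Optional[float] = None

def select_nodata(bands: Sequence[Band], band: Optional[int]) -> Optional[float]:
    """Return the sentinel of the selected band, or band 0 when band is None."""
    if not bands:
        return None
    idx = band if band is not None else 0
    return bands[idx].nodata

bands = [Band(nodata=65535.0), Band(nodata=65000.0)]
print(select_nodata(bands, None))  # band 0's sentinel, as before the fix
print(select_nodata(bands, 1))     # band 1's own sentinel
```

Falling back to index 0 when band is None keeps the current behaviour for callers that do not select a band.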
Scope
Categories: 4 (dtype/nodata semantics).
Severity: MEDIUM -- requires multi-band VRT with per-band sentinels, but result is silently wrong.