Describe the bug
open_geotiff(path, overview_level=N) with N >= 1 returns a DataArray with
its geographic metadata silently dropped. The CRS, the transform attr, and
the georeferenced y/x coordinates are all replaced with defaults (unit
transform, integer pixel indices, crs=None). This happens even when the file
was written by xrspatial.geotiff.to_geotiff(..., cog=True) itself, because
to_geotiff writes the GeoKeys, ModelPixelScale, and ModelTiepoint only on the
level-0 IFD. Most GDAL-style COG writers do the same.
The reader passes a single selected IFD to
xrspatial.geotiff._geotags.extract_geo_info. When the overview IFD has no
geokeys, the helper returns a default GeoTransform() with has_georef=False
and no CRS, and _populate_attrs_from_geo_info + _geo_to_coords then build a
non-georeferenced DataArray.
Downstream spatial ops break: slope, aspect, distance, and anything that reads
attrs['res'] or derives pixel sizes from the coord arrays gets the wrong
answer (typically off by the overview's reduction factor) with no warning.
Repro
import numpy as np
import xarray as xr
import tempfile
import os
from xrspatial.geotiff import to_geotiff, open_geotiff
arr = np.arange(1024 * 1024, dtype=np.float32).reshape(1024, 1024)
y = np.arange(1024, dtype=np.float64) * (-0.5) + 200.0
x = np.arange(1024, dtype=np.float64) * 0.5 + 100.0
da = xr.DataArray(arr, dims=['y', 'x'],
coords={'y': y, 'x': x},
attrs={'crs': 4326})
with tempfile.TemporaryDirectory() as tmp:
path = os.path.join(tmp, 'cog.tif')
to_geotiff(da, path, cog=True, overview_levels=[1, 2, 3])
r0 = open_geotiff(path, overview_level=0)
r1 = open_geotiff(path, overview_level=1)
print('level 0 transform:', r0.attrs['transform'])
print('level 1 transform:', r1.attrs.get('transform'))
print('level 1 crs: ', r1.attrs.get('crs'))
Output:
level 0 transform: (0.5, 0.0, 99.75, 0.0, -0.5, 200.25)
level 1 transform: (1.0, 0.0, 0.0, 0.0, -1.0, 0.0) # default; should scale pixel size by 2 around the same origin
level 1 crs: None # CRS dropped
All four read backends are affected the same way (open_geotiff numpy,
open_geotiff(chunks=N) dask, open_geotiff(gpu=True) cupy, and
open_geotiff(gpu=True, chunks=N) dask+cupy).
Expected behavior
Reading any overview level should produce a DataArray with the same CRS as the
full-resolution read, a transform whose pixel size scales by the overview
reduction factor, and y/x coords covering the same geographic extent as level
0.
Fix
When an overview IFD lacks ModelPixelScale / ModelTiepoint /
ModelTransformation or its GeoKeyDirectory, fall back to the geo info from the
first full-resolution IFD in the file, then rescale the pixel size by the
integer ratio width_full / width_overview (and the same for height). Put the
fix in a shared helper so all four backends inherit it.
Additional context
select_overview_ifd already filters out mask and page IFDs before indexing.
extract_geo_info is shared by the numpy, dask, cupy, and dask+cupy read
paths.
- No existing tests assert on transform/crs/coords for
overview_level >= 1.
Describe the bug
open_geotiff(path, overview_level=N)withN >= 1returns a DataArray withits geographic metadata silently dropped. The CRS, the
transformattr, andthe georeferenced
y/xcoordinates are all replaced with defaults (unittransform, integer pixel indices,
crs=None). This happens even when the filewas written by
xrspatial.geotiff.to_geotiff(..., cog=True)itself, becauseto_geotiffwrites the GeoKeys, ModelPixelScale, and ModelTiepoint only on thelevel-0 IFD. Most GDAL-style COG writers do the same.
The reader passes a single selected IFD to
xrspatial.geotiff._geotags.extract_geo_info. When the overview IFD has nogeokeys, the helper returns a default
GeoTransform()withhas_georef=Falseand no CRS, and
_populate_attrs_from_geo_info+_geo_to_coordsthen build anon-georeferenced DataArray.
Downstream spatial ops break: slope, aspect, distance, and anything that reads
attrs['res']or derives pixel sizes from the coord arrays gets the wronganswer (typically off by the overview's reduction factor) with no warning.
Repro
Output:
All four read backends are affected the same way (
open_geotiffnumpy,open_geotiff(chunks=N)dask,open_geotiff(gpu=True)cupy, andopen_geotiff(gpu=True, chunks=N)dask+cupy).Expected behavior
Reading any overview level should produce a DataArray with the same CRS as the
full-resolution read, a
transformwhose pixel size scales by the overviewreduction factor, and y/x coords covering the same geographic extent as level
0.
Fix
When an overview IFD lacks ModelPixelScale / ModelTiepoint /
ModelTransformation or its GeoKeyDirectory, fall back to the geo info from the
first full-resolution IFD in the file, then rescale the pixel size by the
integer ratio
width_full / width_overview(and the same for height). Put thefix in a shared helper so all four backends inherit it.
Additional context
select_overview_ifdalready filters out mask and page IFDs before indexing.extract_geo_infois shared by the numpy, dask, cupy, and dask+cupy readpaths.
overview_level >= 1.