Context
The GeoTIFF module is now large enough that release readiness should be judged by an explicit support contract, not by the fact that many edge cases have tests. VRT support is useful for xarray-spatial users, but it should remain a conservative interoperability layer rather than an attempt to reimplement GDAL VRT completely.
Recommendation: keep VRT support, but document and enforce a narrow supported subset. Anything outside that subset should fail closed with actionable errors rather than partially working, silently dropping metadata, or returning pixels whose downstream geospatial meaning is ambiguous.
Recommended VRT support contract
Supported for release:
- Simple GDAL VRT mosaics backed by GeoTIFF sources.
- Sources with compatible CRS, transform orientation, pixel size, dtype, band count, and band layout.
- Windowed reads where source windows map cleanly to destination windows.
- Lazy/dask reads over the same supported subset.
- Explicit handling of nodata, including mixed-band nodata rejection by default.
- Missing source policy controlled by
missing_sources, with raise as the default.
Not promised for release unless explicitly added later:
- Warped VRTs / reprojection VRTs.
- Arbitrary resampling semantics beyond the implemented, tested subset.
- Mixed CRS, mixed resolution, mixed dtype, or mixed band metadata unless there is a documented explicit opt-in.
- Nested VRTs.
- Complex source/mask/alpha semantics not fully modeled by the attrs contract.
- Full GDAL VRT parity.
The docs should say this plainly: VRT is an advanced compatibility feature for simple mosaics, not a GDAL replacement.
Implementation plan
PR 1: Publish the VRT contract
- Add a VRT support matrix to the GeoTIFF reference docs.
- Mark VRT as
advanced in prose wherever GeoTIFF tiers are described.
- Document the supported subset and known non-goals above.
- Add examples of safe usage and examples that intentionally raise.
- Ensure public docstrings for
open_geotiff(... .vrt ...), read_vrt, and write_vrt match the same contract.
Acceptance criteria:
- Docs clearly distinguish supported simple mosaics from unsupported GDAL VRT features.
- No doc path implies full GDAL VRT compatibility.
PR 2: Centralize VRT capability validation
- Add a single validator for parsed VRT metadata before read execution.
- Validate CRS compatibility, dtype compatibility, band count, nodata policy, transform orientation, pixel-size compatibility, source rectangle/destination rectangle sanity, and supported resampling.
- Make direct
read_vrt calls and open_geotiff(... .vrt ...) use the same validator.
- Ensure unsupported features fail before any partial mosaic work starts where feasible.
Acceptance criteria:
- Unsupported VRT features raise typed/actionable errors at graph build or eager read setup.
- Direct and dispatcher entry points have equivalent failures.
PR 3: Add metadata parity tests for VRT, not just pixel tests
- Add parity tests asserting
attrs and coords for eager, dask, and GPU-enabled routes where applicable.
- Cover
transform, crs, crs_wkt, nodata, masked_nodata, georef_status, raster_type, gdal_metadata_xml, and extra_tags where the contract promises them.
- Add negative tests for mixed CRS, mixed nodata, unsupported resampling, bad source/destination rectangles, and missing sources.
Acceptance criteria:
- VRT tests fail if pixels match but metadata is wrong or missing.
- Mixed or ambiguous metadata cannot silently flatten to one output value.
PR 4: Lock backend parity for VRT and sidecar/overview interactions
- Add a focused backend parity matrix for VRT reads across eager and dask.
- Include external overview sidecar sources where VRT references GeoTIFFs with
.tif.ovr pyramids.
- Assert metadata parity, not only values.
- Confirm windowed reads shift coords and
attrs['transform'] consistently.
Acceptance criteria:
- The VRT path cannot pass by returning correct pixel values with wrong georeferencing attrs.
- Windowed eager and lazy VRT reads agree on shape, coords, attrs, and values.
PR 5: Harden URL/source routing used by GeoTIFF and VRT
- Centralize source scheme classification using
urlparse(...).scheme.lower().
- Make HTTP(S) detection case-insensitive everywhere.
- Ensure uppercase
HTTP:// / HTTPS:// still goes through _HTTPSource and SSRF/DNS-pinning checks, not fsspec.
- Add regression tests for uppercase schemes and private-host rejection.
Acceptance criteria:
- No code path uses case-sensitive
startswith(('http://', 'https://')) for security-relevant dispatch.
- HTTP(S) SSRF protections apply regardless of scheme casing.
PR 6: Release gate / audit checklist
- Add a release checklist for GeoTIFF promises:
- local GeoTIFF read/write
- COG read/write
- HTTP/fsspec reads
- nodata lifecycle
- attrs contract
- VRT supported subset
- GPU experimental paths
- Add or update tests so each promised tier has at least one explicit parity gate.
- Keep experimental/internal-only features documented as such.
Acceptance criteria:
- Release notes can accurately state what is stable, advanced, experimental, and unsupported.
- Every promised VRT behavior has a regression test.
Notes from review
The current module already contains a lot of defensive work. The remaining risk is mostly boundary drift between entry points and backends: eager vs dask vs GPU, local vs HTTP/fsspec, GeoTIFF vs VRT, and pixel parity vs metadata parity. For release readiness, the priority should be to reduce those drift surfaces and make unsupported behavior impossible to mistake for supported behavior.
Context
The GeoTIFF module is now large enough that release readiness should be judged by an explicit support contract, not by the fact that many edge cases have tests. VRT support is useful for xarray-spatial users, but it should remain a conservative interoperability layer rather than an attempt to reimplement GDAL VRT completely.
Recommendation: keep VRT support, but document and enforce a narrow supported subset. Anything outside that subset should fail closed with actionable errors rather than partially working, silently dropping metadata, or returning pixels whose downstream geospatial meaning is ambiguous.
Recommended VRT support contract
Supported for release:
missing_sources, withraiseas the default.Not promised for release unless explicitly added later:
The docs should say this plainly: VRT is an advanced compatibility feature for simple mosaics, not a GDAL replacement.
Implementation plan
PR 1: Publish the VRT contract
advancedin prose wherever GeoTIFF tiers are described.open_geotiff(... .vrt ...),read_vrt, andwrite_vrtmatch the same contract.Acceptance criteria:
PR 2: Centralize VRT capability validation
read_vrtcalls andopen_geotiff(... .vrt ...)use the same validator.Acceptance criteria:
PR 3: Add metadata parity tests for VRT, not just pixel tests
attrsand coords for eager, dask, and GPU-enabled routes where applicable.transform,crs,crs_wkt,nodata,masked_nodata,georef_status,raster_type,gdal_metadata_xml, andextra_tagswhere the contract promises them.Acceptance criteria:
PR 4: Lock backend parity for VRT and sidecar/overview interactions
.tif.ovrpyramids.attrs['transform']consistently.Acceptance criteria:
PR 5: Harden URL/source routing used by GeoTIFF and VRT
urlparse(...).scheme.lower().HTTP:///HTTPS://still goes through_HTTPSourceand SSRF/DNS-pinning checks, not fsspec.Acceptance criteria:
startswith(('http://', 'https://'))for security-relevant dispatch.PR 6: Release gate / audit checklist
Acceptance criteria:
Notes from review
The current module already contains a lot of defensive work. The remaining risk is mostly boundary drift between entry points and backends: eager vs dask vs GPU, local vs HTTP/fsspec, GeoTIFF vs VRT, and pixel parity vs metadata parity. For release readiness, the priority should be to reduce those drift surfaces and make unsupported behavior impossible to mistake for supported behavior.