Add VRT missing source policy#1806
Merged
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR adds an explicit, API-level policy to control how read_vrt handles unreadable/missing VRT source files, preserving the historical “warn and continue” behavior by default while enabling a fail-fast option for production pipelines (closes #1799).
Changes:
- Add
missing_sources={'warn','raise'}parameter to both internal VRT assembly and the publicxrspatial.geotiff.read_vrtAPI. - Validate
missing_sourcesvalues and implement fail-fast behavior when set to'raise'. - Add tests and reference documentation describing the new policy.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
xrspatial/geotiff/_vrt.py |
Adds missing_sources handling to the internal VRT reader and integrates it with the existing strict-mode behavior. |
xrspatial/geotiff/__init__.py |
Exposes missing_sources on the public read_vrt API and forwards it to the internal reader; documents the new parameter. |
xrspatial/geotiff/tests/test_vrt_missing_sources_policy_1799.py |
Adds regression tests for warn vs raise behavior and policy validation. |
docs/source/reference/geotiff.rst |
Documents the new missing-source policy option for VRT reads. |
Comments suppressed due to low confidence (1)
xrspatial/geotiff/_vrt.py:907
- The fallback warning text still suggests only
XRSPATIAL_GEOTIFF_STRICT=1as the way to raise on unreadable VRT sources. Now thatmissing_sources='raise'exists, the warning message should mention that option as well (otherwise users may miss the new API-level fail-fast control).
f"VRT source {src.filename!r} could not be read "
f"({type(e).__name__}: {e}); skipping. The output "
f"mosaic will have a hole at this tile. Inspect "
f"``DataArray.attrs['vrt_holes']`` or set "
f"XRSPATIAL_GEOTIFF_STRICT=1 to raise instead.",
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
611
to
615
| def read_vrt(vrt_path: str, *, window=None, | ||
| band: int | None = None, | ||
| max_pixels: int | None = None) -> tuple[np.ndarray, VRTDataset]: | ||
| max_pixels: int | None = None, | ||
| missing_sources: str = 'warn') -> tuple[np.ndarray, VRTDataset]: | ||
| """Read a VRT file by assembling pixel data from its source files. |
Comment on lines
+3795
to
+3796
| preserves the historical behavior: emit ``GeoTIFFFallbackWarning``, | ||
| record ``attrs['vrt_holes']``, and return a partial mosaic. |
This was referenced May 13, 2026
brendancol
added a commit
that referenced
this pull request
May 13, 2026
Resolves conflict in xrspatial/geotiff/__init__.py: keeps the `_read_vrt_dask` dispatch hook from the PR branch. All other geotiff changes from main (#1791, #1793, #1801, #1802, #1803, #1804, #1805, #1806) were already integrated into the working tree by the prior 7329dd9 commit; this merge just records the parent so git recognises the reconciliation.
brendancol
added a commit
that referenced
this pull request
May 14, 2026
#1811) * geotiff: forward missing_sources from open_geotiff to read_vrt (#1810) read_vrt gained a missing_sources='warn'|'raise' policy kwarg in #1806 but the documented dispatcher open_geotiff did not expose it. Callers who wanted strict failure on broken VRT sources had to call read_vrt directly, defeating the dispatcher contract. Same class of bug as #1561 / #1605 / #1685 / #1795: a backend kwarg reachable on the routed-to function is unreachable through the documented entry point. Fix mirrors the on_gpu_failure pattern that open_geotiff already uses for GPU-only kwargs: - Add missing_sources= to open_geotiff with a sentinel default so the dispatcher can tell "caller never set it" (skip forwarding, let the read_vrt default of 'warn' apply) from "caller set it" (forward verbatim). - Reject missing_sources on non-VRT sources with a clear ValueError so callers learn the policy is being ignored instead of getting a silent drop. - Document the kwarg in the open_geotiff docstring with the same values and semantics as read_vrt. Regression tests in test_open_geotiff_missing_sources_1810.py cover the signature, warn/raise/invalid policies through the dispatcher, non-VRT rejection, and the no-kwarg backward-compatible path. * sweep: update api-consistency state for geotiff (v5) Record the v5 api-consistency sweep against geotiff. One MEDIUM finding filed and fixed (#1810). Cross-sibling write-return-type drift remains deferred at LOW. * geotiff: per-tile dim check uses default cap, not caller budget (#1823) PR #1803 forwarded the caller's max_pixels to read_to_array inside read_vrt's source loop so a tiny VRT output cannot force a huge source decode (#1796). The output-window check at the source read enforces that correctly. A separate per-tile dimension check at the same call sites also consumed the caller's max_pixels, so a caller setting max_pixels as an output budget (e.g. 10_000) failed the per-tile sanity check on any normal source whose default tile size is 256x256 (= 65_536 pixels). Use MAX_PIXELS_DEFAULT for the per-tile dim check at the two call sites in _read_tiles (local) and _read_tiles_cog_http (HTTP). The output-window check at the same functions continues to enforce the user-supplied max_pixels, preserving the #1796 protection.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1799.\n\nAdds read_vrt(missing_sources=...) with warn and raise modes, preserving historical warn behavior while giving pipelines an explicit fail-fast option.\n\nTested: pytest xrspatial/geotiff/tests/test_vrt_missing_sources_policy_1799.py