Reject VRT SourceFilename paths outside the VRT directory (#1671)#1679
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens the GeoTIFF VRT reader against path traversal by enforcing that resolved SourceFilename paths remain within the VRT directory unless explicitly allowlisted, addressing security issue #1671.
Changes:
- Enforce VRT source-path containment in
parse_vrt, with an opt-in allowlist viaXRSPATIAL_VRT_ALLOWED_ROOTS. - Expand/adjust tests to cover traversal via
.., symlink escapes, absolute-path rejection, and allowlist behavior. - Document the new containment policy in the public
read_vrtdocstring.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
xrspatial/geotiff/_vrt.py |
Adds allowlist parsing + robust containment check using commonpath, raising ValueError on escapes. |
xrspatial/geotiff/__init__.py |
Documents the new VRT SourceFilename containment/allowlist policy in read_vrt. |
xrspatial/geotiff/tests/test_vrt_path_containment_1671.py |
New regression tests covering traversal, symlink escape, absolute-path policy, and allowlist behavior. |
xrspatial/geotiff/tests/test_security.py |
Updates existing security tests to reflect the new “reject on escape” behavior. |
xrspatial/geotiff/tests/test_features.py |
Adjusts a VRT parsing test to use tmp_path so the new containment policy permits the placeholder source path. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+235
to
+248
| def test_allowlist_empty_entries_ignored(tmp_path, monkeypatch): | ||
| """Empty entries in the allowlist (from stray colons) are skipped.""" | ||
| vrt_dir = _unique_dir(tmp_path, "empty_entry_vrt") | ||
| outside_dir = _unique_dir(tmp_path, "empty_entry_data") | ||
| outside_tif = os.path.join(outside_dir, 'data.tif') | ||
| _write_minimal_tif(outside_tif) | ||
|
|
||
| vrt_path = os.path.join(vrt_dir, 'mosaic.vrt') | ||
| _build_vrt(vrt_path, outside_tif, relative='0') | ||
|
|
||
| # Leading/trailing/embedded empty entries should not crash the parser | ||
| # or accidentally grant access to ``/``. | ||
| value = f":{outside_dir}::" | ||
| monkeypatch.setenv('XRSPATIAL_VRT_ALLOWED_ROOTS', value) |
Comment on lines
+192
to
+194
| def test_allowlist_supports_multiple_roots(tmp_path, monkeypatch): | ||
| """A colon-separated list permits sources under any listed root.""" | ||
| vrt_dir = _unique_dir(tmp_path, "multi_vrt") |
|
|
||
|
|
||
| # Environment variable name used to opt in to absolute source paths | ||
| # outside the VRT directory. Colon-separated (``os.pathsep``) list of |
|
|
||
| # Plant a symlink inside vrt_dir that points to the outside file. | ||
| sym = os.path.join(vrt_dir, 'inside.tif') | ||
| os.symlink(outside_target, sym) |
brendancol
added a commit
that referenced
this pull request
May 12, 2026
Reword "colon-separated" to "os.pathsep-separated" in the VRT module comment and test docstrings so Windows users (where the separator is ';') are not misled. Build the empty-entries env var with os.pathsep so the test stays cross-platform on the Windows CI matrix. Guard os.symlink with try/except OSError + pytest.skip since Windows requires Developer Mode or admin privileges for symlinks and some filesystems do not support them at all. No behaviour change on POSIX.
brendancol
added a commit
that referenced
this pull request
May 12, 2026
Reword "colon-separated" to "os.pathsep-separated" in the VRT module comment and test docstrings so Windows users (where the separator is ';') are not misled. Build the empty-entries env var with os.pathsep so the test stays cross-platform on the Windows CI matrix. Guard os.symlink with try/except OSError + pytest.skip since Windows requires Developer Mode or admin privileges for symlinks and some filesystems do not support them at all. No behaviour change on POSIX.
os.path.realpath only normalises ../ and symlinks; it does not enforce containment. A crafted VRT could ship SourceFilename values like ../../../../etc/passwd and read_to_array would open them. For trusted VRTs this is benign; for VRTs accepted from user input (uploads, untrusted mounts) it is a path-traversal primitive. parse_vrt now canonicalises every SourceFilename with os.path.realpath and then enforces: * relativeToVRT='1' sources must resolve under the VRT directory. * relativeToVRT='0' sources must resolve under the VRT directory or under a directory listed in XRSPATIAL_VRT_ALLOWED_ROOTS (colon- separated, modelled on XRSPATIAL_GEOTIFF_STRICT). Empty entries are ignored. Anything else raises ValueError naming the rejected path and the trusted roots. The relative case always rejects escapes even when the allowlist would otherwise cover the resolved path, so an attacker cannot chain allowlist entries into a relative-source traversal. The read_vrt docstring documents the new behaviour. Existing tests that asserted the old silent-canonicalisation behaviour are updated. test_vrt_path_containment_1671.py covers ../ traversal, symlink traversal, absolute rejection by default, allowlist opt-in (single root, multiple roots, empty entries), relative-source escape vs allowlist interaction, and a happy-path regression for legitimate relative sources.
Reword "colon-separated" to "os.pathsep-separated" in the VRT module comment and test docstrings so Windows users (where the separator is ';') are not misled. Build the empty-entries env var with os.pathsep so the test stays cross-platform on the Windows CI matrix. Guard os.symlink with try/except OSError + pytest.skip since Windows requires Developer Mode or admin privileges for symlinks and some filesystems do not support them at all. No behaviour change on POSIX.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #1671.
Problem
The previous path-traversal fix (#1185) called
os.path.realpathon everySourceFilenameand stopped there.realpathnormalises..segments and resolves symlinks; it does not enforce that the resulting path lives under any specific root. A VRT that declaredor a symlink inside
vrt_dirpointing outside it would happily resolve to the target and then get handed toread_to_array. For trusted VRTs this is benign. For VRTs accepted from user input (uploads, untrusted mounts) it is a path-traversal primitive.Fix
parse_vrtstill canonicalises withrealpath, then enforces containment:relativeToVRT='1'sources must resolve underrealpath(vrt_dir). Rejected otherwise, even if the resolved path happens to land under an allowlisted root - a relative source that escapes the VRT directory is always an attack primitive.relativeToVRT='0'sources are accepted when they resolve underrealpath(vrt_dir)(matches the writer'srelative=Falseround trip) or under any entry inXRSPATIAL_VRT_ALLOWED_ROOTS.XRSPATIAL_VRT_ALLOWED_ROOTSis a colon-separated (os.pathsep) list of directory paths, modelled on the existingXRSPATIAL_GEOTIFF_STRICTenv var. Empty entries are ignored so a stray separator does not silently grant access to/.Containment uses
os.path.commonpathrather thanstartswithso a/fooallowlist entry does not accidentally cover/foobar.Rejected paths raise
ValueErrorwith the resolved path, the VRT root, and the allowlist entries spelled out so operators can diagnose without re-reading the source XML.The
read_vrtdocstring documents the new behaviour.Tests
xrspatial/geotiff/tests/test_vrt_path_containment_1671.pyadds 10 tests:../traversal viarelativeToVRT='1'vrt_dirXRSPATIAL_VRT_ALLOWED_ROOTSsingle root / multiple roots / empty entriesTwo existing tests in
test_security.py(test_relative_path_canonicalized,test_absolute_path_also_canonicalized) asserted the old silent-canonicalisation behaviour and were updated to assert rejection. One field-extraction test intest_features.pywas using/data/tile.tifas a placeholder source path and was updated to usetmp_path/tile.tifso the containment check accepts it.Test plan
pytest xrspatial/geotiff/tests/test_vrt_path_containment_1671.py -v(10 passed)pytest xrspatial/geotiff/tests/test_security.py -v(passes)pytest xrspatial/geotiff/tests/ -k vrt(126 passed)deepcopyrecursion failures inTestPalettethat reproduce onmainand are unrelated to this change.