Skip to content

Add URL.joinpath_safe to prevent path traversal from untrusted segments#1673

Draft
aiolibsbot wants to merge 7 commits into
aio-libs:masterfrom
aiolibsbot:koan/joinpath-safe
Draft

Add URL.joinpath_safe to prevent path traversal from untrusted segments#1673
aiolibsbot wants to merge 7 commits into
aio-libs:masterfrom
aiolibsbot:koan/joinpath-safe

Conversation

@aiolibsbot
Copy link
Copy Markdown
Contributor

@aiolibsbot aiolibsbot commented May 16, 2026

What do these changes do?

Add URL.joinpath_safe(*segments) — like joinpath, but every argument is treated as a single opaque path segment. Each segment is quoted with a new PATH_SAFE_QUOTER (identical to PATH_QUOTER but does not protect /, so embedded slashes get percent-encoded). Literal "." / ".." inputs are replaced with "%2E" / "%2E%2E" so normalize_path_segments cannot interpret them as relative-path indicators. The call then delegates to _make_child(..., encoded=True).

Are there changes in behavior for the user?

Yes — new public method on URL. No change to existing methods.

>>> api = URL("https://api.example/api/v1/")
>>> api.joinpath("object/id", "../evil")
URL('https://api.example/api/v1/object/evil')         # traversed!
>>> api.joinpath_safe("object/id", "../evil")
URL('https://api.example/api/v1/object%2Fid/..%2Fevil')  # safe
>>> api.joinpath_safe("..", "etc/passwd")
URL('https://api.example/api/v1/%2E%2E/etc%2Fpasswd')

Untrusted input dropped into joinpath (or /) could previously traverse the base path. joinpath_safe is the explicit-opt-in API for that case. Non-str arguments raise TypeError.

Related issue number

Fixes #1631

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes

@psf-chronographer psf-chronographer Bot added the bot:chronographer:provided There is a change note present in this PR label May 16, 2026
Comment thread tests/test_url.py
Comment thread yarl/_url.py Outdated
@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 16, 2026

@aiolibsbot rr

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 16, 2026

Merging this PR will not alter performance

✅ 99 untouched benchmarks


Comparing aiolibsbot:koan/joinpath-safe (29e13a5) with master (3867a90)

Open in CodSpeed

@codecov
Copy link
Copy Markdown

codecov Bot commented May 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.80%. Comparing base (191ad29) to head (29e13a5).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1673      +/-   ##
==========================================
+ Coverage   99.47%   99.80%   +0.32%     
==========================================
  Files          30       21       -9     
  Lines        5960     4156    -1804     
  Branches      285      245      -40     
==========================================
- Hits         5929     4148    -1781     
+ Misses         22        5      -17     
+ Partials        9        3       -6     
Flag Coverage Δ
CI-GHA 99.73% <100.00%> (+0.25%) ⬆️
MyPy ?
OS-Linux 99.68% <100.00%> (-0.03%) ⬇️
OS-Windows 98.43% <100.00%> (+0.01%) ⬆️
OS-macOS 98.60% <100.00%> (+0.03%) ⬆️
Py-3.10.11 98.41% <100.00%> (+0.01%) ⬆️
Py-3.10.20 99.63% <100.00%> (+<0.01%) ⬆️
Py-3.11.15 99.63% <100.00%> (+<0.01%) ⬆️
Py-3.11.9 98.41% <100.00%> (+0.01%) ⬆️
Py-3.12.10 98.41% <100.00%> (+0.01%) ⬆️
Py-3.12.13 99.63% <100.00%> (+<0.01%) ⬆️
Py-3.13.13 99.68% <100.00%> (+<0.01%) ⬆️
Py-3.13.13t 99.68% <100.00%> (+<0.01%) ⬆️
Py-3.14.4 98.58% <100.00%> (+0.03%) ⬆️
Py-3.14.5 99.63% <100.00%> (-0.03%) ⬇️
Py-3.14.5t 99.68% <100.00%> (+<0.01%) ⬆️
Py-pypy3.11.15-7.3.22 99.30% <100.00%> (+<0.01%) ⬆️
VM-macos-latest 98.60% <100.00%> (+0.03%) ⬆️
VM-ubuntu-latest 99.68% <100.00%> (-0.03%) ⬇️
VM-windows-latest 98.43% <100.00%> (+0.01%) ⬆️
pytest 99.73% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@aiolibsbot
Copy link
Copy Markdown
Contributor Author

aiolibsbot commented May 16, 2026

PR Review — Add URL.joinpath_safe to prevent path traversal from untrusted segments

Solid security-hardening addition. The implementation is small, focused, and the input-side guard for . / .. (refactored from the earlier output-side check per @bdraco) reads clearly. Test coverage hits the issue-1631 traversal case, standalone-dot variants, slash/space encoding, percent-literal handling, and the no-op cases; type-error path and parts round-trip are also covered. Docs + CHANGES + spelling wordlist are all updated. Coverage is 100% on the new lines and CodSpeed shows no regression. Only minor polish suggestions (empty-segment behavior pinned by a test, a few more boundary cases, docstring mentioning the %2E encoded form); none are blocking. Note: the diff still carries the pyupgrade v3.20.0→v3.21.2 bump, which has since landed independently on master (c8d8f19) — a rebase would drop it from the diff. Process-side: please address @bdraco's request to use the aio-libs PR template (the "Is it a substantial burden for the maintainers to support this?" section is missing).


🟢 Suggestions

1. Consider handling empty-string segments explicitly (`yarl/_url.py`, L1485-1494)

Empty-string arguments fall through to PATH_SAFE_QUOTER("") and produce an empty segment that _make_child(..., encoded=True) will then concatenate. The behavior is inherited from joinpath, but since joinpath_safe is the security-hardened API, the test matrix should pin down what URL("http://x/p").joinpath_safe("") returns (extra slash? collapsed?) so it's not left as an undefined edge case for callers passing user input. A single parametrize case is enough.

safe_segments.append(PATH_SAFE_QUOTER(segment))
2. Add a few more boundary cases (`tests/test_url.py`, L1119-1175)

The matrix is good for the documented threats, but a couple of cheap additions would harden it:

  1. Empty string segment: ("foo", "", "bar") — pin the documented behavior.
  2. Unicode segment: ("caf\u00e9",) — verify UTF-8 percent-encoding.
  3. Segment containing ? or # — confirms they're percent-encoded and don't introduce a query/fragment (PATH_SAFE_QUOTER's safe="@:" already excludes them, but a regression test is cheap insurance).
  4. Three-dot segment "..." — verify it's not treated as a traversal indicator (only . and .. are).
3. Mention the encoded form in the docstring (`yarl/_url.py`, L1475-1494)

The docstring tells the reader that . / .. segments are percent-encoded but doesn't say to which form. Consider naming it explicitly (%2E / %2E%2E) so users grepping for %2E in their logs can find this method, and so the behavior is part of the contract rather than an implementation detail callers can't rely on.


Checklist

  • Path traversal prevented for untrusted segments
  • Type validation at API boundary
  • No hardcoded secrets / unsafe deserialization
  • Edge case coverage (boundary values, empty input) — suggestion #1, suggestion #2
  • Documentation reflects the change
  • CHANGES entry present
  • Tests validate observable behavior, not source text
  • Public method has a clear, complete docstring — suggestion #3

Summary

Solid security-hardening addition. The implementation is small, focused, and the input-side guard for . / .. (refactored from the earlier output-side check per @bdraco) reads clearly. Test coverage hits the issue-1631 traversal case, standalone-dot variants, slash/space encoding, percent-literal handling, and the no-op cases; type-error path and parts round-trip are also covered. Docs + CHANGES + spelling wordlist are all updated. Coverage is 100% on the new lines and CodSpeed shows no regression. Only minor polish suggestions (empty-segment behavior pinned by a test, a few more boundary cases, docstring mentioning the %2E encoded form); none are blocking. Note: the diff still carries the pyupgrade v3.20.0→v3.21.2 bump, which has since landed independently on master (c8d8f19) — a rebase would drop it from the diff. Process-side: please address @bdraco's request to use the aio-libs PR template (the "Is it a substantial burden for the maintainers to support this?" section is missing).


Automated review by Kōan1f88301
706fce3
29e63b5
22066e1

@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 16, 2026

You must use the aio-libs PR template when opening prs

@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 16, 2026

@aiolibsbot rr

aiolibsbot added a commit to aiolibsbot/yarl that referenced this pull request May 16, 2026
aiolibsbot added a commit to aiolibsbot/yarl that referenced this pull request May 16, 2026
@aiolibsbot
Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/joinpath-safe was rebased onto master and review feedback was applied.

Changes applied

  • Changes applied:
  • yarl/_url.py (joinpath_safe): Moved ./.. check from quoter output to raw input per @bdraco's "this seems a bit hacky" comment. Intent now reads as "encode the literal relative-path tokens" rather than poking at quoter output.
  • tests/test_url.py (parametrize): Changed signature from "base,to_join,expected" to ("base", "to_join,expected") per @bdraco's suggestion — groups to_join,expected as co-varying pair.
  • CI fix: Applied pre-push CI fix
  • CI fix: Pre-push CI fix applied

Stats

6 files changed, 142 insertions(+)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=No commit on master mentions joinpath_safe or path traversal protection for untrusted segments.)
  • Rebased koan/joinpath-safe onto upstream/master
  • Applied review feedback
  • Pre-push CI check: previous run #25966077765 failed
  • Applied pre-push CI fix
  • Pre-push CI fix applied
  • Force-pushed koan/joinpath-safe to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 16, 2026

@aiolibsbot gh_request

Update the PR to use the PR template, add to memory to always use the aio-libs PR template for this repo

aiolibsbot added a commit to aiolibsbot/yarl that referenced this pull request May 16, 2026
@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 16, 2026

@aiolibsbot rr

Koan Bot and others added 3 commits May 16, 2026 19:55
Closes aio-libs#1631.

Each argument is treated as a single opaque path segment: '/' is
percent-encoded so it cannot introduce additional segments, and
whole-segment '.' / '..' are percent-encoded so they cannot be
interpreted as relative-path indicators by normalize_path_segments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aiolibsbot
Copy link
Copy Markdown
Contributor Author

Rebase with requested adjustments

Branch koan/joinpath-safe was rebased onto master and review feedback was applied.

Stats

6 files changed, 142 insertions(+)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=No commit on master adds joinpath_safe or any path-traversal protection for untrusted segments.)
  • Rebased koan/joinpath-safe onto upstream/master
  • Pre-push CI check: previous run #25967055920 failed
  • Pre-push CI fix: no changes needed or Claude found nothing to fix
  • Force-pushed koan/joinpath-safe to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@aiolibsbot aiolibsbot force-pushed the koan/joinpath-safe branch from 22066e1 to 8ba9743 Compare May 16, 2026 19:56
@bdraco
Copy link
Copy Markdown
Member

bdraco commented May 16, 2026

@lexrobin-te Is that what you were hoping for ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bot:chronographer:provided There is a change note present in this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature request: joinpath_safe

2 participants