fix(downloads): sanitize attacker-controlled filenames and verify containment#4867
Merged
Conversation
…tainment GHSA-rv9j-wqjp-2fv4 (critical), GHSA-66xh-g88g-2h8j, GHSA-hpr4-fqgr-xhj9. `DownloadsWatchdog` joined attacker-controlled filenames from CDP (`Page.downloadWillBegin.suggestedFilename`) and `Content-Disposition` headers directly into the configured `downloads_path`. Strings like `../../escape.bin` or `/etc/shadow.bak` would `os.path.join` outside the downloads directory, writing the fetched bytes (also attacker-controlled — the response body is the exploit content) to an arbitrary location with the agent's process privileges. `download_file_from_url` triggers passively for any `Content-Disposition: attachment` response, so this is reachable from any visited site — `allowed_domains` does not mitigate it. Add two private helpers on DownloadsWatchdog: - `_sanitize_download_filename(name)`: keep only the basename, normalize Windows separators, strip null bytes, fall back to `'download'` for empty / pure-traversal inputs. - `_is_path_contained(path, dir)`: realpath containment check for the on-disk sinks. Wire the sanitizer at every attacker-controlled filename ingress: - `download_will_begin_handler` (CDP suggestedFilename → cache + events) - `_handle_cdp_download` (same field, separate path) - Network-monitor Content-Disposition parser - `download_file_from_url` (suggested_filename argument) - `_handle_download` (Playwright `download.suggested_filename`) Wire the containment check at every on-disk write site: - `download_file_from_url` write - `_handle_download` (Playwright save_as path) - `trigger_pdf_download` write (defense in depth — already basename'd)
Agent Task Evaluation Results: 2/2 (100%)View detailed results
Check the evaluate-tasks job for detailed task execution logs. |
Drop the per-call-site rationale text; the helper names already convey what's happening, and the WHY lives in the commit history. Shrink both helper docstrings to one line. No behavior change; 16/16 tests still pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes GHSA-rv9j-wqjp-2fv4 (critical,
triage) and the duplicate medium reports GHSA-66xh-g88g-2h8j / GHSA-hpr4-fqgr-xhj9.DownloadsWatchdogjoined attacker-controlled filenames from CDP (Page.downloadWillBegin.suggestedFilename) andContent-Dispositionheaders directly into the configureddownloads_path. Strings like../../escape.binor/etc/shadow.bakwouldos.path.joinoutside the downloads directory, writing the fetched bytes (also attacker-controlled — the response body IS the exploit content) to an arbitrary location with the agent's process privileges.download_file_from_urltriggers passively for anyContent-Disposition: attachmentresponse, so this is reachable from any visited site —allowed_domainsdoes not mitigate it.Changes
Two new private helpers on
DownloadsWatchdog:_sanitize_download_filename(name)— keep only basename, normalize Windows separators, strip null bytes, fall back to'download'for empty / pure-traversal inputs._is_path_contained(path, dir)—os.path.realpathcontainment check for the on-disk sinks.Sanitizer wired at every attacker-controlled filename ingress:
download_will_begin_handlersuggestedFilename(cache + events)_handle_cdp_downloadContent-Dispositionparserre.search(...).group(1)download_file_from_url(suggested_filename=...)_handle_downloaddownload.suggested_filenameContainment check wired at every on-disk write site:
download_file_from_urlwrite (line ~755)_handle_download(Playwrightsave_aspath)trigger_pdf_downloadwrite (defense in depth — already basename'd, but pinned)Test plan
uv run pytest -vxs tests/ci/security/test_download_filename_sanitization.py— 16 new tests covering: relative traversal, absolute Unix paths, Windows backslash paths, mixed separators, pure-traversal fallback, null-byte stripping, empty/None fallback, normal filename preservation, Unicode preservation; containment helper (inside / nested / escape / dir-itself / sibling-dir);_get_unique_filenamecollision handling on sanitized input.uv run pytest -vx tests/ci/security/— 80/80 pass (existing security suite unchanged).uv run pyright/ruff check/ruff format— clean.uv run pytest -vxs tests/cion CI.Notes
This is the most reachable of the post-0.12.6 critical advisories — no prompt injection or domain bypass needed, any visited site can trigger it. Recommend prioritizing this PR's review.
Do not auto-merge — please review.
Summary by cubic
Fixes a critical path traversal in download handling by sanitizing attacker-controlled filenames and enforcing realpath containment before writing files. Prevents arbitrary file writes via CDP
suggestedFilenameandContent-Dispositionfrom any visited site (fixes GHSA-rv9j-wqjp-2fv4 and duplicates); trims comments/docstrings with no behavior change._sanitize_download_filenameand_is_path_containedhelpers.download.suggested_filename,Content-Disposition, anddownload_file_from_url.download_file_from_url, Playwright save path, PDF export); refuse writes outsidedownloads_dir(covers symlink escapes)._get_unique_filenamecollision handling.Written for commit f0413fb. Summary will update on new commits. Review in cubic