fix(downloads): sanitize attacker-controlled filenames and verify containment #4867

Merged: sauravpanda merged 3 commits into main from fix/sec-download-path-traversal on May 19, 2026


@sauravpanda sauravpanda commented May 18, 2026

Summary

Closes GHSA-rv9j-wqjp-2fv4 (critical, triage) and the duplicate medium reports GHSA-66xh-g88g-2h8j / GHSA-hpr4-fqgr-xhj9.

DownloadsWatchdog joined attacker-controlled filenames from CDP (Page.downloadWillBegin.suggestedFilename) and Content-Disposition headers directly into the configured downloads_path. Strings like ../../escape.bin or /etc/shadow.bak, once passed through os.path.join, resolve outside the downloads directory. The fetched bytes are also attacker-controlled (the response body is the exploit content), so they would be written to an arbitrary location with the agent's process privileges.
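To make the failure mode concrete, here is a minimal sketch of how the two example strings behave when joined onto a directory. It uses `posixpath` so the result is the same on any platform; the `downloads_path` value is a hypothetical directory chosen for illustration, not one taken from the PR.

```python
import posixpath

# Hypothetical downloads directory, used only for this illustration.
downloads_path = "/home/agent/downloads"

# A relative traversal climbs out of the directory once the path is normalized.
relative = posixpath.normpath(posixpath.join(downloads_path, "../../escape.bin"))

# An absolute component makes join() discard everything that came before it.
absolute = posixpath.join(downloads_path, "/etc/shadow.bak")

print(relative)  # /home/escape.bin
print(absolute)  # /etc/shadow.bak
```

Neither result is under `downloads_path`, which is why joining an unsanitized filename is unsafe on its own.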

download_file_from_url triggers passively for any Content-Disposition: attachment response, so this is reachable from any visited site; allowed_domains does not mitigate it.

Changes

Two new private helpers on DownloadsWatchdog:

  • _sanitize_download_filename(name) — keep only basename, normalize Windows separators, strip null bytes, fall back to 'download' for empty / pure-traversal inputs.
  • _is_path_contained(path, dir): os.path.realpath containment check for the on-disk sinks.
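The two helpers described above could look roughly like the following. This is a sketch written from the bullet descriptions, not the code merged in this PR: the function names drop the leading underscore to mark them as stand-ins, and details such as treating the directory itself as "contained" are assumptions.

```python
import os


def sanitize_download_filename(name):
    """Reduce an untrusted filename to a bare basename (illustrative sketch)."""
    if not name:
        return "download"
    # Strip null bytes and treat Windows separators like POSIX ones.
    cleaned = name.replace("\x00", "").replace("\\", "/")
    # Keep only the final path component.
    base = cleaned.rsplit("/", 1)[-1]
    # Empty or pure-traversal results fall back to a fixed name.
    if base in ("", ".", ".."):
        return "download"
    return base


def is_path_contained(path, directory):
    """Return True if path resolves to directory or to something inside it."""
    real_dir = os.path.realpath(directory)
    real_path = os.path.realpath(path)
    try:
        return os.path.commonpath([real_path, real_dir]) == real_dir
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Comparing with `os.path.commonpath` rather than a string prefix avoids accepting a sibling such as `downloads-evil` when the directory is `downloads`.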

Sanitizer wired at every attacker-controlled filename ingress:

| Site | What it reads |
| --- | --- |
| download_will_begin_handler | CDP suggestedFilename (cache + events) |
| _handle_cdp_download | same field, separate code path |
| Network-monitor Content-Disposition parser | re.search(...).group(1) |
| download_file_from_url(suggested_filename=...) | upstream-passed filename |
| _handle_download | Playwright download.suggested_filename |

Containment check wired at every on-disk write site:

  • download_file_from_url write (line ~755)
  • _handle_download (Playwright save_as path)
  • trigger_pdf_download write (defense in depth — already basename'd, but pinned)
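As an illustration of why a containment check at the write site still matters after the name has been reduced to a basename, the sketch below refuses a write whose target resolves outside the directory through a symlink. `write_download` and `demo` are hypothetical names for this example, not functions from the PR, and the demo assumes a platform where `os.symlink` is permitted.

```python
import os
import tempfile


def write_download(downloads_dir, filename, data):
    """Write data under downloads_dir, refusing targets that resolve outside it."""
    real_dir = os.path.realpath(downloads_dir)
    # realpath follows symlinks, so a link inside the directory that points
    # elsewhere is caught here even though its name looks harmless.
    real_target = os.path.realpath(os.path.join(downloads_dir, filename))
    try:
        contained = os.path.commonpath([real_target, real_dir]) == real_dir
    except ValueError:
        # Different drives on Windows: treat as not contained.
        contained = False
    if not contained:
        raise PermissionError(f"refusing to write outside {downloads_dir!r}")
    with open(real_target, "wb") as fh:
        fh.write(data)
    return real_target


def demo():
    with tempfile.TemporaryDirectory() as root:
        downloads = os.path.join(root, "downloads")
        outside = os.path.join(root, "outside")
        os.mkdir(downloads)
        os.mkdir(outside)
        # A plain-looking basename that is really a link pointing outside.
        os.symlink(os.path.join(outside, "target.bin"),
                   os.path.join(downloads, "report.pdf"))
        written = write_download(downloads, "notes.txt", b"ok")
        try:
            write_download(downloads, "report.pdf", b"payload")
            blocked = False
        except PermissionError:
            blocked = True
        return os.path.basename(written), blocked, os.listdir(outside)
```

In `demo()`, the ordinary file is written, the symlinked name is rejected, and nothing lands in the `outside` directory.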

Test plan

  • uv run pytest -vxs tests/ci/security/test_download_filename_sanitization.py — 16 new tests covering: relative traversal, absolute Unix paths, Windows backslash paths, mixed separators, pure-traversal fallback, null-byte stripping, empty/None fallback, normal filename preservation, Unicode preservation; containment helper (inside / nested / escape / dir-itself / sibling-dir); _get_unique_filename collision handling on sanitized input.
  • uv run pytest -vx tests/ci/security/ — 80/80 pass (existing security suite unchanged).
  • uv run pyright / ruff check / ruff format — clean.
  • Full uv run pytest -vxs tests/ci on CI.

Notes

This is the most reachable of the post-0.12.6 critical advisories — no prompt injection or domain bypass needed, any visited site can trigger it. Recommend prioritizing this PR's review.

Do not auto-merge — please review.


Summary by cubic

Fixes a critical path traversal in download handling by sanitizing attacker-controlled filenames and enforcing realpath containment before writing files. Prevents arbitrary file writes via CDP suggestedFilename and Content-Disposition from any visited site (fixes GHSA-rv9j-wqjp-2fv4 and duplicates); trims comments/docstrings with no behavior change.

  • Bug Fixes
    • Added _sanitize_download_filename and _is_path_contained helpers.
    • Sanitized filenames from CDP events, Playwright download.suggested_filename, Content-Disposition, and download_file_from_url.
    • Enforced containment at all write sites (download_file_from_url, Playwright save path, PDF export); refuse writes outside downloads_dir (covers symlink escapes).
    • Added tests for traversal, absolute paths, Windows/mixed separators, null bytes, empty/None, unicode, containment behavior, and _get_unique_filename collision handling.

Written for commit f0413fb. Summary will update on new commits.

…tainment

GHSA-rv9j-wqjp-2fv4 (critical), GHSA-66xh-g88g-2h8j, GHSA-hpr4-fqgr-xhj9.

`DownloadsWatchdog` joined attacker-controlled filenames from CDP
(`Page.downloadWillBegin.suggestedFilename`) and `Content-Disposition`
headers directly into the configured `downloads_path`. Strings like
`../../escape.bin` or `/etc/shadow.bak` would `os.path.join` outside the
downloads directory, writing the fetched bytes (also attacker-controlled
— the response body is the exploit content) to an arbitrary location
with the agent's process privileges.

`download_file_from_url` triggers passively for any
`Content-Disposition: attachment` response, so this is reachable from any
visited site — `allowed_domains` does not mitigate it.

Add two private helpers on DownloadsWatchdog:

- `_sanitize_download_filename(name)`: keep only the basename, normalize
  Windows separators, strip null bytes, fall back to `'download'` for
  empty / pure-traversal inputs.
- `_is_path_contained(path, dir)`: realpath containment check for the
  on-disk sinks.

Wire the sanitizer at every attacker-controlled filename ingress:

- `download_will_begin_handler` (CDP suggestedFilename → cache + events)
- `_handle_cdp_download` (same field, separate path)
- Network-monitor Content-Disposition parser
- `download_file_from_url` (suggested_filename argument)
- `_handle_download` (Playwright `download.suggested_filename`)

Wire the containment check at every on-disk write site:

- `download_file_from_url` write
- `_handle_download` (Playwright save_as path)
- `trigger_pdf_download` write (defense in depth — already basename'd)
github-actions Bot commented May 18, 2026

Agent Task Evaluation Results: 2/2 (100%)

View detailed results
| Task | Result | Reason |
| --- | --- | --- |
| browser_use_pip | ✅ Pass | Skipped - API key not available (fork PR or missing secret) |
| amazon_laptop | ✅ Pass | Skipped - API key not available (fork PR or missing secret) |

Check the evaluate-tasks job for detailed task execution logs.


@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 2 files


Drop the per-call-site rationale text; the helper names already convey
what's happening, and the WHY lives in the commit history. Shrink both
helper docstrings to one line.

No behavior change; 16/16 tests still pass.
@sauravpanda sauravpanda merged commit 6f3d3ff into main May 19, 2026
94 checks passed
@sauravpanda sauravpanda deleted the fix/sec-download-path-traversal branch May 19, 2026 00:38