fix: add SSRF and path traversal protections #5315
Conversation
iris-clawd
left a comment
Review: SSRF & Path Traversal Protections
Good security hardening overall — centralized validators applied consistently across tools. A few things to flag:
🔴 DNS Rebinding (validate_url)
The DNS resolution at validation time does not pin the resolved IP for the subsequent HTTP request. An attacker can serve a safe IP on the first lookup (validation pass) and a private IP on the second lookup (when requests.get/post resolves again). This is the classic DNS rebinding bypass for SSRF checks.
Mitigation options:
- Return the resolved IP and force the HTTP client to connect to it directly (e.g., override the `Host` header or use a custom `urllib3` adapter)
- Pin the resolution with a very short-lived DNS cache per request
- At minimum, document this limitation so we know it's a conscious tradeoff
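A minimal sketch of the first mitigation option, resolving once and rewriting the URL to the pinned IP while preserving the original `Host` header. This is an illustration of the technique, not the PR's actual implementation; the function name and error handling are hypothetical:

```python
import ipaddress
import socket
from urllib.parse import urlsplit, urlunsplit

def pin_resolved_ip(url: str) -> tuple[str, dict]:
    """Resolve the host exactly once, validate the resolved address, and
    rewrite the URL to that IP so the HTTP client cannot re-resolve DNS
    (closing the rebinding window). Returns (pinned_url, extra_headers)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # First (and only) resolution; strip any IPv6 zone suffix like %eth0
    ip = socket.getaddrinfo(host, None)[0][4][0].split("%")[0]
    addr = ipaddress.ip_address(ip)
    if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
        raise ValueError(f"{host} resolves to disallowed address {ip}")
    # IPv6 literals need brackets in the authority component
    ip_part = f"[{ip}]" if ":" in ip else ip
    netloc = ip_part if parts.port is None else f"{ip_part}:{parts.port}"
    pinned = urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return pinned, {"Host": host}
```

Note the HTTPS caveat: connecting to a bare IP breaks SNI and certificate hostname verification, which is why production clients usually pin via a custom transport adapter rather than URL rewriting.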
🟡 is_private may be overly broad
ipaddress.is_private blocks RFC 1918, CGNAT (100.64.0.0/10), and Tailscale ranges (100.x.x.x). Users with crews that need to hit internal APIs (common in enterprise/Factory deployments) will get hard failures with no recourse. Consider:
- An opt-out flag or allowlist parameter on `validate_url` (e.g., `allowed_hosts`)
- Or at least document this as a known breaking change for internal-network use cases
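The allowlist idea could look something like the following. The `allowed_hosts` parameter is the hypothetical opt-out the review suggests; the signature and checks are a sketch, not the PR's code:

```python
import ipaddress
import socket
from urllib.parse import urlsplit

def validate_url(url: str, *, allowed_hosts: frozenset = frozenset()) -> str:
    """Reject non-http(s) schemes and hosts that resolve to private,
    loopback, link-local, or reserved ranges, unless the host is
    explicitly allowlisted by the deployment."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"disallowed scheme: {parts.scheme!r}")
    host = parts.hostname or ""
    if host in allowed_hosts:
        return url  # caller explicitly trusts this internal host
    for info in socket.getaddrinfo(host, None):
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
            raise ValueError(f"{host} resolves to blocked address {addr}")
    return url
```

An enterprise deployment could then pass `allowed_hosts=frozenset({"internal-api.corp"})` without weakening the default-deny behavior for everyone else.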
🟡 Path validation TOCTOU
`validate_path()` resolves and checks the path, but the actual file open happens later. A symlink swap between validation and open could bypass the check. Low severity for most use cases, but worth noting in a security-focused PR. The canonical fix is to `open()` the file first, then check the fd's real path via `os.fstat` or `/proc/self/fd`, but that's heavier.
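The open-then-check pattern can be sketched as follows. This is a portable approximation (comparing the open fd's inode against the freshly resolved path) rather than the Linux-only `/proc/self/fd` readlink; the helper name is hypothetical:

```python
import os
from pathlib import Path

def open_within(base_dir: str, user_path: str):
    """Open first, then verify: compare the already-open fd's inode with
    a fresh resolution of the path, so a symlink swapped in between
    validation and open() is detected instead of silently followed."""
    fd = os.open(user_path, os.O_RDONLY)
    try:
        resolved = Path(user_path).resolve()
        if not resolved.is_relative_to(Path(base_dir).resolve()):
            raise ValueError(f"{user_path!r} escapes {base_dir!r}")
        st_fd, st_path = os.fstat(fd), os.stat(resolved)
        if (st_fd.st_ino, st_fd.st_dev) != (st_path.st_ino, st_path.st_dev):
            raise ValueError("path changed between open and check")
        return os.fdopen(fd, "rb")
    except Exception:
        os.close(fd)
        raise
```

On Linux, `os.readlink(f"/proc/self/fd/{fd}")` gives the fd's canonical path directly, which avoids the second `stat` entirely.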
🟢 Minor nits
- `security/__init__.py` is empty — consider re-exporting `validate_url` and `validate_path` for cleaner imports
- `files_compressor_tool.py` validates `input_path` but not any output path — if the output is user-controlled, that's a write-to-arbitrary-location gap
- `brightdata_unlocker.py`: validates `url` (correct — it's the user-supplied target), but the actual HTTP call goes to `self.base_url`. The validation is on the right value; just noting the indirection.
✅ What looks good
- Clean separation into `security/safe_path.py` and `security/safe_url.py`
- Consistent application across all relevant tools (11 URL, 7 path)
- Dropping the unused `docker` dep is a nice cleanup
- Test plan covers the key cases

The DNS rebinding gap is the main one I'd want addressed or explicitly documented before merge. The rest are lower priority. Nice work on this, Greyson.
iris-clawd
left a comment
Approving per request. The DNS rebinding and is_private breadth notes from my earlier review still stand as suggestions for future iteration, but the overall security posture improvement is solid and worth landing. ✅
Force-pushed from 4ec363d to 6b49989
Rebased on main and unified the security utilities. Here's what changed:

Kept from your PR (the good stuff):

Unified with the already-merged #5310:

Result: Single source of truth for path/URL validation, both centralized (RAG chokepoint) and distributed (per-tool) protection layers. 37 tests passing, lint clean.
CVE-2026-2286: validate_url blocks non-http/https schemes, private IPs, loopback, link-local, reserved addresses. Applied to 11 web tools. CVE-2026-2285: validate_path confines file access to the working directory. Applied to 7 file and directory tools.
Rewrite validated URLs to use the resolved IP, preventing DNS rebinding between validation and request time. SDK-based tools use pin_ip=False since they manage their own HTTP clients. Add allow_private flag for deployments that need internal network access.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move safe_path.py to crewai_tools/security/; add safe_url.py re-export
- Keep utilities/safe_path.py as a backwards-compat shim
- Update all 21 import sites to use crewai_tools.security.safe_path
- files_compressor_tool: validate output_path (user-controlled)
- serper_scrape_website_tool: call validate_url() before building payload
- brightdata_unlocker: validate_url() already called without assignment (no-op fix)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…mpat shim

- security/safe_path.py is the canonical location for all validation
- utilities/safe_path.py re-exports for backward compatibility
- All tool imports already point to security.safe_path
- All review comments already addressed in prior commits
… validator Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Force-pushed from b60d547 to 2b39baa
…move unused safe_url.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 4f5a9f7. Configure here.
```python
    Returns:
        Base64-encoded image data
    """
    image_path = validate_file_path(image_path)
```
Validation errors swallowed by try/except in VisionTool
Medium Severity
The `validate_file_path` call within `_encode_image` is wrapped by `try/except Exception` blocks. This catches validation errors and converts them into a generic error string, preventing security exceptions from propagating. It differs from the approach taken for other tools in the same PR, where validation was moved outside these blocks.
Squash merge commit message (bodies elided; they repeat the commit messages above):

* fix: add SSRF and path traversal protections
* fix: drop unused assignment from validate_url call
* fix: DNS rebinding protection and allow_private flag
* fix: unify security utilities and restore RAG chokepoint validation
* refactor: move validation to security/ package + address review comments
* refactor: move validation to security/ package, keep utilities/ as compat shim
* fix: move validation outside try/except blocks, use correct directory validator
* fix: use resolved paths from validation to prevent symlink TOCTOU, remove unused safe_url.py

Co-authored-by: Alex <alex@crewai.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>


Summary
Addresses CVE-2026-2285 and CVE-2026-2286.
- `validate_url` blocks non-http/https schemes, private IPs, loopback, link-local, reserved addresses via DNS resolution. Applied to 11 web/scraping tools.
- `validate_path` confines file access to the working directory via `Path.resolve()` + `is_relative_to()`. Applied to 7 file/directory tools.

Test plan

- `validate_url` blocks `file://`, `http://127.0.0.1`, `http://169.254.169.254`
- `validate_url` allows `https://example.com`
- `validate_path` blocks `/etc/passwd`, `../../etc/passwd`
- `validate_path` allows relative paths within CWD

Note
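The path-confinement behavior exercised by the test plan can be sketched as follows. This is a simplified stand-in using the same `Path.resolve()` + `is_relative_to()` mechanism the summary describes, not the PR's exact code:

```python
from pathlib import Path

def validate_path(user_path: str, base_dir: str = ".") -> Path:
    """Resolve user_path (following symlinks and ".." segments) and
    require the result to stay under base_dir, which defaults to CWD."""
    base = Path(base_dir).resolve()
    resolved = Path(user_path).resolve()
    if not resolved.is_relative_to(base):
        raise ValueError(f"{user_path!r} escapes {base}")
    return resolved
```

Because `resolve()` canonicalizes before the containment check, both absolute escapes (`/etc/passwd`) and relative ones (`../../etc/passwd`) are rejected by the same comparison.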
Medium Risk
Touches many tools’ runtime inputs and adds DNS-based URL blocking, which can change behavior for previously-accepted paths/hosts and may fail in restricted-network environments. Changes are security-focused and localized but affect common scraping/file-read flows.
Overview
Adds centralized input hardening for tool-supplied file paths and URLs. A new `crewai_tools.security.safe_path` module validates file/directory paths by resolving symlinks and enforcing a `base_dir` boundary (defaulting to CWD), and validates URLs by only allowing `http`/`https`, blocking `file://`, and rejecting hosts that resolve to private/reserved IP ranges (SSRF protection).

These validations are now applied across multiple file, directory, RAG ingestion, and web-scraping tools (e.g., `FileReadTool`, `DirectoryReadTool`, `RagTool.add`, Firecrawl/Hyperbrowser/Jina/Scrapfly/Serper/Serply, OCR/Vision), with `RagTool.add` also rewriting validated path inputs to their resolved form to reduce symlink TOCTOU risk. The prior `utilities.safe_path` implementation is replaced with a backward-compatible re-export, and tests are updated to target the new module.

Reviewed by Cursor Bugbot for commit 4f5a9f7. Bugbot is set up for automated code reviews on this repo. Configure here.