
🔒 Fix path traversal vulnerability in local_file source #77

Merged: bashandbone merged 5 commits into main from jules-security-fix-local-file-traversal-17074725847278981384 on Mar 13, 2026

@bashandbone
Contributor

🎯 What: A path traversal vulnerability in crates/recoco-core/src/ops/sources/local_file.rs allows reading arbitrary files outside the root_path.
⚠️ Risk: An attacker with control over the key parameter could craft paths (e.g., ../../etc/passwd or /etc/passwd) that escape the intended root directory, potentially leading to unauthorized disclosure of sensitive system files.
🛡️ Solution: Modified get_value to validate the path by iterating through its components using std::path::Path::new(path).components(). The code now explicitly checks and rejects paths containing ParentDir (..), RootDir (/), or Prefix (e.g., C:) components before attempting to join the path with the root_path. If an unsafe path is provided, it gracefully returns NonExistence.
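
As a rough illustration of the component check described above, here is a minimal standalone sketch. The helper name `is_safe_key` is hypothetical and does not appear in the PR; only the `std::path::Component` variants come from the description.

```rust
use std::path::{Component, Path};

/// Returns true only if `key` is a relative path with no `..`, root, or
/// drive-prefix components. `is_safe_key` is a hypothetical helper name
/// used for illustration; it is not part of the PR.
fn is_safe_key(key: &str) -> bool {
    !Path::new(key).components().any(|c| {
        matches!(
            c,
            Component::ParentDir | Component::RootDir | Component::Prefix(_)
        )
    })
}
```

Under this sketch, `is_safe_key("docs/readme.md")` is accepted, while `is_safe_key("../../etc/passwd")` and `is_safe_key("/etc/passwd")` are rejected before any join with the root directory takes place.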


PR created automatically by Jules for task 17074725847278981384 started by @bashandbone

Added validation in `get_value` to ensure `path` components do not
contain `ParentDir`, `RootDir`, or `Prefix` elements before joining
them with `self.root_path`. This prevents attackers from accessing files
outside the specified root path.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 13, 2026 02:52
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Mar 13, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | recoco-docs | 1ae455b | Commit Preview URL, Branch Preview URL | Mar 13 2026, 04:18 AM |

Contributor

Copilot AI left a comment

Pull request overview

This PR mitigates a path traversal issue in the local_file source by rejecting unsafe key paths before reading from disk.

Changes:

  • Adds component-based validation to reject .., absolute paths (/), and Windows drive/prefix paths (C:).
  • Keeps existing file-include pattern checks, returning NonExistence for rejected paths.

Comment on lines +137 to +148
let path_obj = Path::new(path);

// Prevent path traversal vulnerabilities by verifying the path
// doesn't contain parent directory or absolute components.
if path_obj.components().any(|c| {
    matches!(
        c,
        std::path::Component::ParentDir
            | std::path::Component::RootDir
            | std::path::Component::Prefix(_)
    )
}) || !self.pattern_matcher.is_file_included(path) {
@bashandbone
Contributor Author

@copilot open a new pull request to apply changes based on the comments in this thread

Contributor

Copilot AI commented Mar 13, 2026

@bashandbone I've opened a new pull request, #83, to work on those changes. Once the pull request is ready, I'll request review from you.

@github-actions
Contributor

🤖 Hi @bashandbone, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

@google-labs-jules
Contributor

> @copilot open a new pull request to apply changes based on the comments in this thread

I have updated the PR to also canonicalize both the joined path and the root_path to ensure the resolved target file resides within the allowed root directory, preventing symlink-based traversals. If the file does not exist, it correctly returns NonExistence as before.
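
The boundary check described here could look roughly like the following sketch. It uses the synchronous `std::fs::canonicalize` for simplicity, and `resolve_within_root` is a hypothetical helper name, not the function in the PR. Returning `None` stands in for the `NonExistence` result mentioned above.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolves `key` under `root` and returns the canonical path only if it
/// stays inside the canonical root. `resolve_within_root` is a hypothetical
/// helper for illustration; `None` plays the role of `NonExistence`.
fn resolve_within_root(root: &Path, key: &str) -> io::Result<Option<PathBuf>> {
    let root_canon = std::fs::canonicalize(root)?;
    let target_canon = match std::fs::canonicalize(root.join(key)) {
        Ok(p) => p,
        // A missing file is reported as "does not exist", not as an error.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(None),
        Err(e) => return Err(e),
    };
    // `Path::starts_with` compares whole components, so a sibling such as
    // `/data/root-evil` does not count as being inside `/data/root`.
    if target_canon.starts_with(&root_canon) {
        Ok(Some(target_canon))
    } else {
        Ok(None)
    }
}
```

Because canonicalization follows symlinks, a link inside the root that points outside it resolves to a path that fails the `starts_with` test, which is the symlink case the comment refers to.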

Added validation in `get_value` to ensure `path` components do not
contain `ParentDir`, `RootDir`, or `Prefix` elements before joining
them with `self.root_path`. This prevents attackers from accessing files
outside the specified root path.

Also mitigates symlink-based path traversal by canonicalizing and checking
boundaries to ensure the canonicalized target path starts with the
canonicalized root path.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Contributor

@github-actions github-actions Bot left a comment

## 📋 Review Summary

This pull request addresses a path traversal vulnerability in the local_file source by implementing component-based path validation and boundary checks using canonicalization. These changes improve the security of local file ingestion by preventing access to files outside the specified root directory.

🔍 General Feedback

  • Security: The combination of denylisting unsafe path components (.., absolute paths) and verifying canonical paths against the root is a robust defense-in-depth approach.
  • Performance: The frequent use of synchronous std::fs::canonicalize on the root path in the get_value method introduces unnecessary I/O overhead. Caching this value would be a beneficial optimization.
  • Async Hygiene: Moving towards tokio::fs for all I/O operations in this async executor will help ensure that the runtime threads are never blocked by file system operations.


// Prevent path traversal vulnerabilities by verifying the path
// doesn't contain parent directory or absolute components.
if path_obj.components().any(|c| {

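🟢 You could check for RootDir and Prefix with path_obj.has_root() or path_obj.is_absolute(), but neither is an exact replacement: on Windows, is_absolute() is true only when a path has both a prefix and a root, so a drive-relative path such as C:foo would slip through, and has_root() does not detect a bare prefix. The current approach with components().any(...) is correct and also handles ParentDir in the same pass, so this is only a minor stylistic note.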
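
A small sketch of why `is_absolute()` cannot replace the component scan on its own: a purely relative traversal path is not absolute, so only the scan rejects it. Both function names below are hypothetical and exist only for this comparison.

```rust
use std::path::{Component, Path};

/// Component scan in the style of the snippet under review.
fn rejected_by_components(p: &Path) -> bool {
    p.components().any(|c| {
        matches!(
            c,
            Component::ParentDir | Component::RootDir | Component::Prefix(_)
        )
    })
}

/// Hypothetical alternative that relies on `is_absolute()` alone.
fn rejected_by_is_absolute(p: &Path) -> bool {
    p.is_absolute()
}
```

For `Path::new("../../etc/passwd")`, `rejected_by_components` returns true while `rejected_by_is_absolute` returns false, so an `is_absolute()` check would still need a separate test for `ParentDir`.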

let path = self.root_path.join(path);

// Mitigate symlink-based path traversal by canonicalizing and checking boundaries
let root_canon = match std::fs::canonicalize(&self.root_path) {

🟡 Calculating the canonical root path on every call to get_value is inefficient as it involves redundant I/O. Since root_path is constant for the lifetime of the Executor, it should be pre-calculated and stored in the Executor struct during construction in build_executor.

Additionally, std::fs::canonicalize is a synchronous operation that can block the async runtime thread. Consider using tokio::fs::canonicalize(...).await instead.

    }
    Err(e) => Err(e)?,
};

🟡 std::fs::canonicalize is a synchronous I/O operation that can block the async executor thread. Since this function is async, it is recommended to use tokio::fs::canonicalize(&path).await to maintain responsiveness.

google-labs-jules Bot and others added 2 commits March 13, 2026 03:46
Added validation in `get_value` to ensure `path` components do not
contain `ParentDir`, `RootDir`, or `Prefix` elements before joining
them with `self.root_path`. This prevents attackers from accessing files
outside the specified root path.

Also mitigates symlink-based path traversal by canonicalizing and checking
boundaries to ensure the canonicalized target path starts with the
canonicalized root path.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
* Initial plan

* fix: add symlink-safe path validation using canonicalize in local_file source

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>

---------

Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
@bashandbone
Contributor Author

@jules please address the comments/suggestions in this thread

@google-labs-jules
Contributor

> @jules please address the comments/suggestions in this thread

I have addressed all the PR comments. The canonicalization of root_path has been moved to build_executor and cached on the Executor struct to avoid redundant I/O. Additionally, I have replaced the synchronous std::fs::canonicalize with tokio::fs::canonicalize(...).await inside the async get_value method to prevent blocking the async runtime thread.
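
The caching part of this change can be sketched as follows. This is a simplified, std-only stand-in: the real `Executor` and `build_executor` signatures are not shown in this thread, so the struct fields and the `contains` method here are assumptions. In the async `get_value`, the per-request path would be resolved with `tokio::fs::canonicalize(...).await` as described above, and only that result would be compared against the cached root.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, simplified stand-in for the real `Executor`.
struct Executor {
    root_path: PathBuf,
    /// Canonical form of `root_path`, resolved once at construction.
    root_canon: PathBuf,
}

/// Sketch of a `build_executor`-style constructor (assumed signature).
/// The root is canonicalized here, once, instead of on every lookup.
fn build_executor(root_path: &Path) -> io::Result<Executor> {
    Ok(Executor {
        root_path: root_path.to_path_buf(),
        root_canon: std::fs::canonicalize(root_path)?,
    })
}

impl Executor {
    /// Returns true if an already-canonicalized target lies inside the
    /// cached canonical root; no file system access happens here.
    fn contains(&self, target_canon: &Path) -> bool {
        target_canon.starts_with(&self.root_canon)
    }
}
```

With the root resolved up front, each lookup pays for only one canonicalization (of the target), and a root directory that does not exist is reported when the executor is built rather than on the first read.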

Added validation in `get_value` to ensure `path` components do not
contain `ParentDir`, `RootDir`, or `Prefix` elements before joining
them with `self.root_path`. This prevents attackers from accessing files
outside the specified root path.

Also mitigates symlink-based path traversal by canonicalizing and checking
boundaries to ensure the canonicalized target path starts with the
canonicalized root path. Cached root path canonicalization in Executor and
swapped to tokio async canonicalize.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
@bashandbone bashandbone merged commit 074eb3e into main Mar 13, 2026
8 of 10 checks passed
@bashandbone bashandbone deleted the jules-security-fix-local-file-traversal-17074725847278981384 branch March 13, 2026 04:18
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Recoco v1.0.0 Mar 13, 2026
3 participants