Rework path resolvers to use workspace-rooted semantics#98
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #98 +/- ##
==========================================
+ Coverage 81.08% 81.37% +0.29%
==========================================
Files 107 108 +1
Lines 4573 4479 -94
==========================================
- Hits 3708 3645 -63
+ Misses 865 834 -31
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 4 minutes and 15 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughRefactors path-resolution to a workspace-root model: adds a new Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/llm-coding-tools-core/src/workspace.rs (1)
71-85: Test modifies process-wide CWD, which can interfere with parallel test execution.
std::env::set_current_diraffects all threads in the process. Ifcargo testruns tests in parallel, this could cause flaky failures in other tests that depend on the current directory.Consider using
#[serial]from theserial_testcrate, or restructuring the test to avoid CWD manipulation (e.g., spawning a subprocess or accepting this as an integration test limitation).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-core/src/workspace.rs` around lines 71 - 85, The test resolve_workspace_root_should_fall_back_to_cwd_when_not_in_repo mutates process-wide CWD via std::env::set_current_dir (using TempDir) which can break parallel tests; fix it by either marking the test to run serially (add #[serial] from the serial_test crate to the test function) or avoid changing the process CWD altogether by running resolve_workspace_root in a subprocess/helper child process (spawn a Command that sets its own cwd to TempDir and invokes a small test helper that calls resolve_workspace_root and prints the result) so the main test harness CWD is not mutated.src/llm-coding-tools-core/README.md (1)
90-104: Consider usingbuilder_with_basein the documentation example for consistency.The example creates
AllowedGlobResolver::new(&root)but then usesGlobPolicy::builder()without a base path. This means the policy patterns (src/**,*.rs,target/**) are matched verbatim rather than being joined with the workspace root.While this works because
AllowedGlobResolvernormalizes paths before checking policy, the pattern semantics differ from what's documented in the "Permission glob semantics" section (lines 107-116), which states "relative patterns are implicitly joined with the workspace root."For documentation consistency, consider either:
- Using
GlobPolicy::builder_with_base(&root)?in the example, or- Adding a note clarifying that
builder()vsbuilder_with_base()have different pattern semantics.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-core/README.md` around lines 90 - 104, Update the README example so the policy patterns are joined with the workspace root by replacing GlobPolicy::builder() with GlobPolicy::builder_with_base(&root)? (referencing AllowedGlobResolver::new and GlobPolicy::builder_with_base) so the example matches the "Permission glob semantics" description; alternatively, if you prefer not to change the example, add a short clarifying note explaining the difference between GlobPolicy::builder() (verbatim patterns) and GlobPolicy::builder_with_base(&root)? (workspace-root-joined relative patterns) and why one is used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llm-coding-tools-core/src/workspace.rs`:
- Around line 20-33: The function resolve_workspace_root must not fall back to a
relative PathBuf; change its signature to return Result<PathBuf, std::io::Error>
(or Option<PathBuf>) and propagate std::env::current_dir()’s error instead of
using unwrap_or_else; inside resolve_workspace_root call
std::env::current_dir()? to obtain an absolute cwd (returning Err on failure),
then run the existing upward .git search on that cwd and return
Ok(candidate.to_path_buf() or Ok(cwd) as before so the function’s callers
receive a guaranteed absolute PathBuf or an error they can handle.
---
Nitpick comments:
In `@src/llm-coding-tools-core/README.md`:
- Around line 90-104: Update the README example so the policy patterns are
joined with the workspace root by replacing GlobPolicy::builder() with
GlobPolicy::builder_with_base(&root)? (referencing AllowedGlobResolver::new and
GlobPolicy::builder_with_base) so the example matches the "Permission glob
semantics" description; alternatively, if you prefer not to change the example,
add a short clarifying note explaining the difference between
GlobPolicy::builder() (verbatim patterns) and
GlobPolicy::builder_with_base(&root)? (workspace-root-joined relative patterns)
and why one is used.
In `@src/llm-coding-tools-core/src/workspace.rs`:
- Around line 71-85: The test
resolve_workspace_root_should_fall_back_to_cwd_when_not_in_repo mutates
process-wide CWD via std::env::set_current_dir (using TempDir) which can break
parallel tests; fix it by either marking the test to run serially (add #[serial]
from the serial_test crate to the test function) or avoid changing the process
CWD altogether by running resolve_workspace_root in a subprocess/helper child
process (spawn a Command that sets its own cwd to TempDir and invokes a small
test helper that calls resolve_workspace_root and prints the result) so the main
test harness CWD is not mutated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 82f4643e-2e81-4c61-a26c-61ec41ec59ae
📒 Files selected for processing (9)
src/llm-coding-tools-agents/README.mdsrc/llm-coding-tools-core/README.mdsrc/llm-coding-tools-core/benches/path_resolvers.rssrc/llm-coding-tools-core/src/lib.rssrc/llm-coding-tools-core/src/path/allowed.rssrc/llm-coding-tools-core/src/path/allowed_glob/mod.rssrc/llm-coding-tools-core/src/path/allowed_glob/policy.rssrc/llm-coding-tools-core/src/path/mod.rssrc/llm-coding-tools-core/src/workspace.rs
💤 Files with no reviewable changes (1)
- src/llm-coding-tools-core/src/path/mod.rs
📜 Review details
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-03-28T02:14:04.465Z
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.
Applied to files:
src/llm-coding-tools-core/src/lib.rssrc/llm-coding-tools-core/src/workspace.rssrc/llm-coding-tools-core/README.mdsrc/llm-coding-tools-core/benches/path_resolvers.rssrc/llm-coding-tools-core/src/path/allowed.rssrc/llm-coding-tools-core/src/path/allowed_glob/policy.rssrc/llm-coding-tools-core/src/path/allowed_glob/mod.rs
🔇 Additional comments (12)
src/llm-coding-tools-core/src/path/allowed.rs (2)
28-28: LGTM! Unified rejection path with consistent error messaging.The removal of external permission fallback simplifies the resolution logic. All rejection paths now flow through the
not_allowed()helper, returning a consistentToolError::InvalidPathmessage. The documentation accurately reflects the new behavior.Also applies to: 199-199, 232-238
334-345: LGTM! Test updated to match new semantics.The renamed test
rejects_absolute_path_outside_all_basescorrectly verifies that absolute paths outside configured directories are now rejected withInvalidPathinstead of being handled via external permissions.src/llm-coding-tools-agents/README.md (1)
113-115: LGTM! Clear documentation of glob semantics.The added documentation succinctly explains the glob pattern behavior (
*vs**), that bareallowmaps to**, and that patterns are workspace-relative. This aligns with the implementation inGlobPolicy::builder_with_base().src/llm-coding-tools-core/src/lib.rs (1)
23-23: LGTM! Clean module addition and re-export.The new
workspacemodule andresolve_workspace_rootre-export follow the crate's existing pattern of exposing key functionality at the root level for ergonomic imports.Also applies to: 33-33
src/llm-coding-tools-core/benches/path_resolvers.rs (1)
100-102: LGTM! Benchmarks updated to use new single-root API.The
build_policyhelper now correctly canonicalizes the current directory before passing tobuilder_with_base(), andAllowedGlobResolver::new()calls reflect the new single workspace root signature. The benchmark structure remains sound.Also applies to: 154-159
src/llm-coding-tools-core/README.md (1)
107-116: LGTM! Clear documentation of glob semantics.The new "Permission glob semantics" section clearly explains the
*vs**behavior, bareallowexpansion, relative/absolute pattern handling, and last-match-wins ordering. This is valuable user-facing documentation.src/llm-coding-tools-core/src/path/allowed_glob/policy.rs (3)
104-110: LGTM! Clean implementation ofbuilder_with_base.The method correctly shell-expands the base path before storing it, and the builder structure properly handles both relative (joined with base) and absolute (pass-through) patterns in
resolve_pattern().
197-212: LGTM! Pattern resolution logic is correct.The
resolve_patternmethod correctly:
- Expands shell variables in the pattern
- Preserves absolute patterns unchanged
- Joins relative patterns with the base path when present
- Normalizes to forward slashes for cross-platform glob matching
430-455: LGTM! Comprehensive test coverage for shell expansion edge cases.The tests properly use
temp_env::with_varto isolate environment variable changes, covering both successful expansion and failure cases for unset variables.Also applies to: 531-545
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (3)
57-62: LGTM! Clean refactor to single workspace root.The
AllowedGlobResolvernow stores a singleworkspace_root: Arc<Path>instead of multiple base directories. The constructor properly expands shell patterns, validates the directory exists, and canonicalizes the path. This simplifies the mental model and aligns with workspace-rooted semantics.Also applies to: 83-106
200-249: LGTM! Resolution tiers correctly enforce containment and policy.Each resolution tier (canonicalize, new-file fast path, soft_canonicalize) properly:
- Checks that the resolved path stays within
workspace_rootviastarts_with()- Normalizes the path and checks glob policy against the absolute path
- Returns early on rejection with consistent error messaging
The pattern of checking policy against
normalize_path(&resolved)ensures glob patterns match the same path format used during policy construction viabuilder_with_base().
276-285: LGTM! Test helper correctly demonstrates intended usage pattern.The
resolver_with_policyhelper shows the recommended pattern: create policy withbuilder_with_base(&root)so relative patterns are joined with the workspace root, matching howAllowedGlobResolverevaluates paths against policy.
The multi-base + external_permission fallback model introduced complexity (canonicalization edge cases, permission bypass surface, cross-platform path comparison issues) without clear benefit over workspace-rooted globs. - Add resolve_workspace_root() (git root > cwd) in new workspace module - Rework AllowedGlobResolver from multi-base-directory to single workspace root - Remove external_permission fallback from AllowedPathResolver and AllowedGlobResolver - Add GlobPolicy::builder_with_base() for workspace-rooted pattern resolution - Update benchmarks and documentation for new workspace-rooted semantics
4b11d4f to
227a56d
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/llm-coding-tools-core/benches/path_resolvers.rs (1)
142-150:⚠️ Potential issue | 🟡 MinorComplex-policy benchmark became materially more permissive.
Switching from
src/**/*.rstosrc/**changes what is matched (and likely match cost), which can make benchmark comparisons less meaningful.Suggested benchmark policy adjustment
let complex_policy = build_policy(|b| { - b.allow("src/**")? + b.allow("src/**/*.rs")? .deny("target/**")? .allow("*.toml")? .deny("*.log")? .allow("benches/**")? .deny("**/test_data/**")? .allow("tests/**/*.rs")? .deny("node_modules/**")? .allow("examples/**") })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-core/benches/path_resolvers.rs` around lines 142 - 150, The benchmark policy became more permissive by changing the pattern from "src/**/*.rs" to "src/**"; revert the matcher on the builder chain (the calls on variable b: allow("src/**")? ... .allow("examples/**")) back to the original more specific Rust-only pattern (e.g., allow("src/**/*.rs")?) so the benchmark matches and cost remain comparable, ensuring the same sequence of allow/deny calls (on b) preserves previous specificity and ordering.
🧹 Nitpick comments (2)
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (2)
207-245: Extract repeated containment/policy validation into a helper.The same validation block is duplicated in all three tiers, which makes future edits error-prone.
Refactor sketch
+fn validate_resolved( + workspace_root: &Path, + policy: Option<&GlobPolicy>, + path: &str, + resolved: PathBuf, +) -> ToolResult<PathBuf> { + if !resolved.starts_with(workspace_root) { + return Err(reject(path)); + } + if let Some(policy) = policy { + let abs_str = normalize_path(&resolved); + if !policy.is_allowed(abs_str.as_ref()) { + return Err(reject(path)); + } + } + Ok(resolved) +} + fn resolve_candidate( workspace_root: &Path, policy: Option<&GlobPolicy>, path: &str, candidate: &Path, ) -> ToolResult<PathBuf> { - if let Ok(resolved) = candidate.canonicalize() { - if !resolved.starts_with(workspace_root) { - return Err(reject(path)); - } - if let Some(policy) = policy { - let abs_str = normalize_path(&resolved); - if !policy.is_allowed(abs_str.as_ref()) { - return Err(reject(path)); - } - } - return Ok(resolved); - } + if let Ok(resolved) = candidate.canonicalize() { + return validate_resolved(workspace_root, policy, path, resolved); + } - if let Some(resolved) = resolve_new_file_fast(candidate) { - ... - } + if let Some(resolved) = resolve_new_file_fast(candidate) { + return validate_resolved(workspace_root, policy, path, resolved); + } - if let Ok(resolved) = soft_canonicalize(candidate) { - ... - } + if let Ok(resolved) = soft_canonicalize(candidate) { + return validate_resolved(workspace_root, policy, path, resolved); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-core/src/path/allowed_glob/mod.rs` around lines 207 - 245, The three branches that handle candidate.canonicalize(), resolve_new_file_fast(), and soft_canonicalize() duplicate the same containment and policy checks; extract that logic into a single helper (e.g., validate_resolved(resolved: PathBuf, path: &Path, workspace_root: &Path, policy: Option<&Policy>) -> Result<PathBuf, Error>) and call it from each branch instead of repeating normalize_path(), starts_with(workspace_root) and policy.is_allowed(...) and the reject(path) error construction; update the branches to return the helper's Result so behavior remains identical.
120-124: Consider a debug assertion infrom_canonicalfor absolute input.Since this constructor intentionally skips validation, a debug-time guard helps catch accidental misuse early.
Small defensive tweak
pub fn from_canonical(workspace_root: impl AsRef<Path>) -> Self { + debug_assert!( + workspace_root.as_ref().is_absolute(), + "AllowedGlobResolver::from_canonical expects an absolute canonical path" + ); Self { workspace_root: Arc::from(workspace_root.as_ref()), policy: None, } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-core/src/path/allowed_glob/mod.rs` around lines 120 - 124, Add a debug-time guard in AllowedGlob::from_canonical to ensure the provided workspace_root is absolute: in the from_canonical function add a debug_assert that workspace_root.as_ref().is_absolute() with a short message like "from_canonical expects an absolute workspace_root"; this keeps the constructor fast but catches accidental misuse during development while leaving release behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/llm-coding-tools-core/benches/path_resolvers.rs`:
- Around line 142-150: The benchmark policy became more permissive by changing
the pattern from "src/**/*.rs" to "src/**"; revert the matcher on the builder
chain (the calls on variable b: allow("src/**")? ... .allow("examples/**")) back
to the original more specific Rust-only pattern (e.g., allow("src/**/*.rs")?) so
the benchmark matches and cost remain comparable, ensuring the same sequence of
allow/deny calls (on b) preserves previous specificity and ordering.
---
Nitpick comments:
In `@src/llm-coding-tools-core/src/path/allowed_glob/mod.rs`:
- Around line 207-245: The three branches that handle candidate.canonicalize(),
resolve_new_file_fast(), and soft_canonicalize() duplicate the same containment
and policy checks; extract that logic into a single helper (e.g.,
validate_resolved(resolved: PathBuf, path: &Path, workspace_root: &Path, policy:
Option<&Policy>) -> Result<PathBuf, Error>) and call it from each branch instead
of repeating normalize_path(), starts_with(workspace_root) and
policy.is_allowed(...) and the reject(path) error construction; update the
branches to return the helper's Result so behavior remains identical.
- Around line 120-124: Add a debug-time guard in AllowedGlob::from_canonical to
ensure the provided workspace_root is absolute: in the from_canonical function
add a debug_assert that workspace_root.as_ref().is_absolute() with a short
message like "from_canonical expects an absolute workspace_root"; this keeps the
constructor fast but catches accidental misuse during development while leaving
release behavior unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 30581cea-be01-4400-9fd6-8331a8aa2715
📒 Files selected for processing (9)
src/llm-coding-tools-agents/README.mdsrc/llm-coding-tools-core/README.mdsrc/llm-coding-tools-core/benches/path_resolvers.rssrc/llm-coding-tools-core/src/lib.rssrc/llm-coding-tools-core/src/path/allowed.rssrc/llm-coding-tools-core/src/path/allowed_glob/mod.rssrc/llm-coding-tools-core/src/path/allowed_glob/policy.rssrc/llm-coding-tools-core/src/path/mod.rssrc/llm-coding-tools-core/src/workspace.rs
💤 Files with no reviewable changes (1)
- src/llm-coding-tools-core/src/path/mod.rs
✅ Files skipped from review due to trivial changes (2)
- src/llm-coding-tools-agents/README.md
- src/llm-coding-tools-core/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
- src/llm-coding-tools-core/src/lib.rs
- src/llm-coding-tools-core/src/workspace.rs
- src/llm-coding-tools-core/src/path/allowed_glob/policy.rs
- src/llm-coding-tools-core/src/path/allowed.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Blocking Windows
- GitHub Check: Blocking macOS
- GitHub Check: Async Windows
- GitHub Check: Blocking Linux
- GitHub Check: Async Linux
- GitHub Check: Async macOS
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-03-28T02:14:04.465Z
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.
Applied to files:
src/llm-coding-tools-core/benches/path_resolvers.rssrc/llm-coding-tools-core/src/path/allowed_glob/mod.rs
🔇 Additional comments (2)
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (1)
176-248: Workspace-root enforcement is consistently applied across all resolution tiers.Nice work here: lexical escape rejection + per-tier containment checks + policy checks make the new workspace-root model robust.
src/llm-coding-tools-core/benches/path_resolvers.rs (1)
100-101:builder_with_baseand single-root resolver wiring looks correct.These updates align cleanly with the new workspace-rooted resolver API.
Also applies to: 154-159
Consolidate identical containment and policy checks from three resolution branches (canonicalize, resolve_new_file_fast, soft_canonicalize) into a single validate_resolved helper function.
Unix-style /some/path is not absolute on Windows (no drive letter), causing three tests to fail. Use cfg!(windows) to pick appropriate paths and normalize test paths through the same pipeline as production code.
f4bd68f to
1e9e0c6
Compare
Summary
resolve_workspace_root()(git root > cwd)external_permissionruleset fallback from bothAllowedPathResolverandAllowedGlobResolver, and reworkAllowedGlobResolverto accept a single workspace rootGlobPolicy::builder_with_base()so relative glob patterns are resolved against the workspace root at construction time, while absolute patterns pass through unchangedMotivation
The
external_directorypermission was a global ruleset shared across all resolvers, preventing per-tool concrete access configuration.Moving to workspace-rooted glob patterns gives each resolver its own policy, making fine-grained access control straightforward.
Changes
AllowedGlobResolverbasesVec<PathBuf>&Pathworkspace rootwith_external_permission(Ruleset)**under workspace rootAllowedPathResolverwith_external_permission()APIresolve_workspace_root()(git root, then cwd)Stats