
Optimize external directory path resolution and add benchmarks #95

Merged
Sewer56 merged 4 commits into main from optimize-external-directory-resolver on Apr 8, 2026

Sewer56 commented Apr 7, 2026

Summary

Optimize the absolute-path resolution path in AllowedPathResolver and AllowedGlobResolver, add external-directory benchmark coverage, and refresh all reference numbers.

Changes

  • Canonicalize once for absolute paths: base.join(absolute_path) == absolute_path, so canonicalize once then check all bases instead of per-base join + canonicalize
  • Eliminate PathBuf::clone on the matching-base fast path by using .any()
  • Avoid Cow allocation in permission evaluation via direct OsStr→str conversion
  • Add external_directory benchmark group with 5 test cases covering the external permission fallback
  • Refresh all reference numbers from a controlled A/B comparison (100-sample Criterion) on the same hardware
  • Document previously-missing complex_policy and multiple_bases reference numbers
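
A minimal sketch of the canonicalize-once idea from the first two bullets. The function name and signature are hypothetical rather than the crate's API, and the sketch covers only the existing-file case. It relies on the documented behaviour of `Path::join`: an absolute argument replaces the base, so every per-base join would yield the same path.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not the crate's API): resolve an absolute input
/// against several bases with a single `canonicalize` call.
fn resolve_absolute_once(bases: &[PathBuf], input: &Path) -> io::Result<Option<PathBuf>> {
    debug_assert!(input.is_absolute());
    // `base.join(input) == input` for every base when `input` is absolute,
    // so one filesystem round trip is enough.
    let resolved = input.canonicalize()?;
    // `.any()` only borrows each base, so no `PathBuf` is cloned on a match.
    let inside = bases.iter().any(|base| resolved.starts_with(base));
    Ok(if inside { Some(resolved) } else { None })
}
```

The per-base variant would call `canonicalize` once per base directory; here the cost is one call regardless of how many bases are configured.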

Benchmark Results

Controlled A/B comparison on same hardware (pre-optimization vs this branch):

Pre-existing benchmarks: no regressions

No regressions: results stay within ±2% (noise), apart from slight improvements of up to 4%:

| Benchmark | Before | After | Change |
|---|---|---|---|
| GlobResolver_simple/traversal_reject | 21.9 ns | 21.0 ns | −4% |
| GlobResolver_complex/traversal_reject | 22.0 ns | 21.1 ns | −4% |
| GlobResolver_complex/policy_reject | 105.7 ns | 104.7 ns | −2% |

External directory: significant improvements

| Benchmark | Before | After | Change |
|---|---|---|---|
| AllowedPathResolver/external_existing_file | 2.04 µs | 548 ns | −73% |
| AllowedGlobResolver/external_existing_file | 2.02 µs | 535 ns | −74% |
| AllowedPathResolver/external_new_file | 5.95 µs | 3.30 µs | −44% |
| AllowedGlobResolver/external_new_file | 5.90 µs | 3.30 µs | −44% |
| AllowedPathResolver/external_rejected | 4.02 µs | 2.35 µs | −42% |
| AllowedGlobResolver/external_rejected | 4.03 µs | 2.34 µs | −42% |
| AllowedPathResolver/external_no_ruleset | 2.25 µs | 2.31 µs | +3% (noise) |
| AllowedGlobResolver/external_no_ruleset | 2.27 µs | 2.28 µs | ~0% |
| AllowedPathResolver/relative_still_fails | 9.78 µs | 9.86 µs | +1% (noise) |
| AllowedGlobResolver/relative_still_fails | 9.82 µs | 9.81 µs | ~0% |

codecov Bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 72.61905% with 46 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.17%. Comparing base (3efad1f) to head (b791a61).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...llm-coding-tools-core/src/path/allowed_glob/mod.rs | 67.96% | 33 Missing ⚠️ |
| src/llm-coding-tools-core/src/path/allowed.rs | 79.62% | 11 Missing ⚠️ |
| src/llm-coding-tools-core/src/path/mod.rs | 81.81% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #95      +/-   ##
==========================================
- Coverage   81.56%   81.17%   -0.40%     
==========================================
  Files         106      106              
  Lines        4389     4472      +83     
==========================================
+ Hits         3580     3630      +50     
- Misses        809      842      +33     

| Flag | Coverage Δ |
|---|---|
| async | 80.41% <72.61%> (-0.40%) ⬇️ |
| blocking | 58.92% <72.02%> (+0.15%) ⬆️ |

| Files with missing lines | Coverage Δ |
|---|---|
| ...rc/llm-coding-tools-core/benches/path_resolvers.rs | 0.00% <ø> (ø) |
| src/llm-coding-tools-core/src/path/mod.rs | 87.50% <81.81%> (+2.65%) ⬆️ |
| src/llm-coding-tools-core/src/path/allowed.rs | 77.10% <79.62%> (-0.67%) ⬇️ |
| ...llm-coding-tools-core/src/path/allowed_glob/mod.rs | 72.02% <67.96%> (-10.93%) ⬇️ |

... and 1 file with indirect coverage changes

coderabbitai Bot commented Apr 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b457513e-a844-4fbe-972e-2f7b226eca70

📥 Commits

Reviewing files that changed from the base of the PR and between 193c5e2 and b791a61.

📒 Files selected for processing (3)
  • src/llm-coding-tools-core/benches/path_resolvers.rs
  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
  • src/llm-coding-tools-core/src/path/mod.rs
✅ Files skipped from review due to trivial changes (1)
  • src/llm-coding-tools-core/benches/path_resolvers.rs
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Blocking Windows
  • GitHub Check: Async Windows
  • GitHub Check: Blocking macOS
  • GitHub Check: Async Linux
  • GitHub Check: Async macOS
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-03-28T02:14:04.465Z
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.

Applied to files:

  • src/llm-coding-tools-core/src/path/mod.rs
  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
🔇 Additional comments (9)
src/llm-coding-tools-core/src/path/mod.rs (3)

55-59: LGTM!

Clean simplification of PathAnalysis to focus solely on escape detection, aligning with the removal of the fast-path policy probe mentioned in the PR objectives.


61-90: LGTM!

The escape detection logic is correct: absolute paths cannot escape (early return), and relative paths are tracked via depth with proper handling of .. at depth 0.
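
The depth-tracking rule described above can be sketched as follows. This is an illustration only: `escapes_base` is a hypothetical name, and the crate's `PathAnalysis` may be structured differently.

```rust
use std::path::{Component, Path};

/// Returns true when a relative path would climb above its starting directory.
/// Illustrative only; not the crate's `PathAnalysis`.
fn escapes_base(path: &Path) -> bool {
    if path.is_absolute() {
        // Absolute paths are never joined onto a base, so they cannot escape it.
        return false;
    }
    let mut depth: usize = 0;
    for component in path.components() {
        match component {
            Component::Normal(_) => depth += 1,
            Component::ParentDir => {
                if depth == 0 {
                    // `..` with nothing left to pop leaves the base directory.
                    return true;
                }
                depth -= 1;
            }
            // `.` changes nothing; root and prefix components are ignored here.
            _ => {}
        }
    }
    false
}
```

For example, `a/../b` stays inside the base (depth goes 1, 0, 1), while `a/../../b` hits `..` at depth 0 and is flagged.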


131-146: ⚠️ Potential issue | 🟡 Minor

The empty-string fallback is redundant and creates a potential permission bypass for non-UTF8 paths.

On Unix, the fallback in path_as_str (line 138) to path.to_str().unwrap_or("") is logically redundant: path.to_str() internally calls from_utf8, so it will fail whenever the explicit from_utf8 check fails. More importantly, if a canonicalized path somehow contains non-UTF8 bytes (rare but possible on Unix), both checks fail and return "". This empty string is then passed to perm.evaluate("external_directory", ...) where a wildcard pattern like * will match it, potentially allowing unintended access.

Consider detecting non-UTF8 paths explicitly and rejecting them rather than silently converting them to an empty string that may match permission patterns.
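
One possible shape for the explicit rejection suggested above, as a sketch. Both function names are hypothetical, and the `evaluate` closure stands in for whatever `perm.evaluate("external_directory", ...)` does in the crate.

```rust
use std::path::Path;

/// Hypothetical stricter variant of `path_as_str`: returns `None` for
/// non-UTF-8 paths instead of silently mapping them to "".
fn path_as_str_checked(path: &Path) -> Option<&str> {
    // `to_str` borrows the underlying bytes; no allocation takes place.
    path.to_str()
}

/// Caller side: a non-UTF-8 path is denied outright, so a wildcard
/// pattern never gets the chance to match an empty string.
fn external_allowed(path: &Path, evaluate: impl Fn(&str) -> bool) -> bool {
    match path_as_str_checked(path) {
        Some(s) => evaluate(s),
        None => false,
    }
}
```

With this shape, even a ruleset that allows `*` cannot admit a path whose bytes are not valid UTF-8.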

⛔ Skipped due to learnings
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (6)

1-38: LGTM!

The module documentation clearly explains the resolution algorithm, including the relative vs absolute dispatch, three-tier resolution strategy, and external fallback behavior. This will help future maintainers understand the security model.


203-224: LGTM!

Clean dispatch logic: escape detection first, then route to specialized handlers. Notably, external_permission is only passed to resolve_absolute, correctly enforcing that relative paths cannot use external directory fallback.


237-302: LGTM!

The three-tier resolution logic correctly handles each scenario (existing file, new file in existing dir, missing parent dirs) while consistently re-validating both containment and glob policy after any resolution that might change the effective path.


311-400: Security fix for external_permission bypass is correctly implemented.

The inside_any_base tracking ensures that paths contained within a base directory but denied by glob policy cannot fall through to try_external. The separation of "containment passed" from "policy allowed" in the .any() closure is correct: the closure sets inside_any_base = true before checking policy, and returns false if policy denies—allowing iteration to continue to other bases while preserving the containment state.

The three-tier repetition (canonicalize, new-file-fast, soft_canonicalize) each with identical containment/policy/external logic could be extracted into a helper, but the current explicit structure is readable and the duplication is acceptable.
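
The containment-versus-policy separation described above might look roughly like this. The `Decision` enum, `classify`, and the `policy_allows` closure are illustrative stand-ins, not names from the crate.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum Decision {
    Accept,
    /// Inside a base but denied by policy: must NOT fall through to the external check.
    DeniedInsideBase,
    /// Outside every base: the only case where the external fallback may run.
    OutsideAllBases,
}

fn classify(
    bases: &[PathBuf],
    resolved: &Path,
    policy_allows: impl Fn(&Path) -> bool,
) -> Decision {
    let mut inside_any_base = false;
    let accepted = bases.iter().any(|base| {
        let relative = match resolved.strip_prefix(base) {
            Ok(r) => r,
            Err(_) => return false, // not contained in this base
        };
        // Record containment before consulting the policy.
        inside_any_base = true;
        // Returning false on denial lets iteration continue to other bases.
        policy_allows(relative)
    });
    if accepted {
        Decision::Accept
    } else if inside_any_base {
        Decision::DeniedInsideBase
    } else {
        Decision::OutsideAllBases
    }
}
```

Keeping the two facts in separate variables is what prevents a permissive external ruleset from overriding an in-base policy denial.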


411-436: LGTM — consistent with allowed.rs implementation.

The try_external helper correctly implements the three-step external permission check (ruleset presence, canonical path acquisition, permission evaluation). The structure mirrors allowed.rs:282-313, ensuring consistent behavior across both resolvers.

Note: The path_as_str usage at line 431 shares the empty-string fallback concern flagged in mod.rs.
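
A rough sketch of the three-step check described above (ruleset presence, canonical path acquisition, permission evaluation). The `Ruleset` type here is a hypothetical stand-in modelled as a list of allowed prefixes, and errors are plain strings; the crate's real types and matching rules are not shown in this conversation.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the crate's ruleset: a list of allowed prefixes.
struct Ruleset {
    allowed_prefixes: Vec<PathBuf>,
}

fn try_external(
    ruleset: Option<&Ruleset>,
    input: &Path,
    resolved: impl Into<Option<PathBuf>>,
) -> Result<PathBuf, String> {
    // Step 1: with no ruleset (or an empty one), reject before any filesystem work.
    let rules = match ruleset {
        Some(r) if !r.allowed_prefixes.is_empty() => r,
        _ => return Err("path is outside all base directories".to_string()),
    };
    // Step 2: reuse the caller's canonical path if it has one, else canonicalize now.
    let resolved: Option<PathBuf> = resolved.into();
    let canonical = match resolved {
        Some(p) => p,
        None => input.canonicalize().map_err(|e| e.to_string())?,
    };
    // Step 3: evaluate the permission against the canonical path.
    if rules.allowed_prefixes.iter().any(|p| canonical.starts_with(p)) {
        Ok(canonical)
    } else {
        Err("external directory permission denied".to_string())
    }
}
```

Callers that already hold a canonical path pass it directly; callers without one can pass `None::<PathBuf>` and let step 2 do the work.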


824-853: Good regression test for the security fix.

This test directly verifies that external_permission cannot override a glob-policy denial for paths inside a base directory—the exact scenario flagged in past reviews. The test setup clearly demonstrates the edge case: path is inside base, denied by src/** policy, but external_permission allows *.


Walkthrough

Refactors path resolution by splitting absolute vs relative handling into resolve_absolute and resolve_relative, and centralizes external-directory permission checks in a new try_external helper used by both AllowedPathResolver and AllowedGlobResolver. Resolution now uses a consistent tiered canonicalization sequence (canonicalize → resolve_new_file_fast → soft_canonicalize) and avoids per-base filesystem work for absolute inputs. Adds a platform-aware path_as_str helper, removes dot-component tracking from PathAnalysis, and introduces a new bench_external_directory benchmark (registered under the external_directory group). No public API signatures changed.

Possibly related PRs

  • PR 94 — Implements the same external-directory permission fallback for AllowedPathResolver and AllowedGlobResolver, including with_external_permission and evaluation of the "external_directory" ruleset.
  • PR 89 — Modifies canonicalization and resolution behavior in the allowed-glob resolver, overlapping the extracted absolute/relative dispatch and policy revalidation logic.
  • PR 90 — Touches the same path resolution modules and benchmarks, including absolute vs relative handling and canonicalization changes.
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely summarizes the main changes: optimizing external directory path resolution and adding benchmarks. |
| Description check | ✅ Passed | The description is comprehensive with clear summary, detailed changes, benchmark results, and rationale; it exceeds the minimal template requirements. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai Bot left a comment

🧹 Nitpick comments (2)
src/llm-coding-tools-core/src/path/mod.rs (1)

152-161: Redundant fallback in Unix implementation.

On Unix, OsStr bytes are the same underlying data that path.to_str() validates. If from_utf8(os_str.as_bytes()) fails due to invalid UTF-8, path.to_str() will also return None (since it performs the same UTF-8 check on the same bytes). The fallback on line 159 will therefore always resolve to "".

This can be simplified:

♻️ Simplify by removing redundant fallback
 #[cfg(unix)]
 #[inline]
 pub(crate) fn path_as_str(path: &Path) -> &str {
     use std::os::unix::ffi::OsStrExt;
     let os_str = path.as_os_str();
-    match std::str::from_utf8(os_str.as_bytes()) {
-        Ok(s) => s,
-        Err(_) => path.to_str().unwrap_or(""),
-    }
+    std::str::from_utf8(os_str.as_bytes()).unwrap_or("")
 }
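
The redundancy claim can be checked with a small Unix-only sketch. `utf8_checks` is a hypothetical helper that runs both validations on the same bytes and reports whether each succeeded.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

/// Returns (from_utf8 succeeded, to_str succeeded) for the same path bytes.
fn utf8_checks(bytes: &[u8]) -> (bool, bool) {
    let path = Path::new(OsStr::from_bytes(bytes));
    let via_bytes = std::str::from_utf8(path.as_os_str().as_bytes()).is_ok();
    let via_to_str = path.to_str().is_some();
    (via_bytes, via_to_str)
}
```

For `b"/tmp/ok"` both checks succeed, and for `b"/tmp/\xff"` both fail, so the `Err` branch in the original code can only ever produce `""`.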
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/path/mod.rs` around lines 152 - 161, The Unix
implementation of path_as_str has a redundant fallback to
path.to_str().unwrap_or("") because from_utf8(os_str.as_bytes()) and
path.to_str() perform the same UTF-8 validation; remove the Err branch and
simply return the UTF-8 conversion result (e.g., use
std::str::from_utf8(os_str.as_bytes()).unwrap_or("")) in the path_as_str
function to eliminate the pointless fallback while keeping behavior consistent.
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (1)

287-351: Consider extracting repeated policy-check logic.

The policy verification block (lines 292-300, 314-322, 336-344) is duplicated three times. While this is understandable for a hot path where closure overhead might matter, it increases maintenance burden.

If you're confident the compiler inlines effectively, a helper closure could reduce duplication:

♻️ Optional: Extract policy check to closure
fn resolve_absolute(
    &self,
    path: &str,
    input_path: &Path,
    policy: Option<&GlobPolicy>,
    has_dots: bool,
) -> ToolResult<PathBuf> {
    let fast_policy_input = if !has_dots { Some(path) } else { None };

    let check_bases = |resolved: &Path| -> bool {
        self.base_directories.iter().any(|base_dir| {
            if !resolved.starts_with(base_dir) {
                return false;
            }
            if let Some(policy) = policy {
                let relative_path = resolved.strip_prefix(base_dir).unwrap_or(Path::new(""));
                let normalized_relative = normalize::normalize_path(relative_path);
                if fast_policy_input != Some(normalized_relative.as_ref())
                    && !policy.is_allowed(&normalized_relative)
                {
                    return false;
                }
            }
            true
        })
    };

    if let Ok(resolved) = input_path.canonicalize() {
        if check_bases(&resolved) {
            return Ok(resolved);
        }
        return self.try_external(path, resolved);
    }
    // ... similar for other branches
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/path/allowed_glob/mod.rs` around lines 287 -
351, The policy verification logic is duplicated three times in
resolve_absolute; extract it into a single helper (either a local closure like
check_bases or a private method) that accepts &Path and returns bool, reuse it
for the canonicalize, resolve_new_file_fast, and soft_canonicalize branches, and
replace each duplicated iterator/policy block with a call to that helper and
then call try_external when the helper returns false. Ensure the helper
references fast_policy_input and policy the same way as the inlined code so
behavior remains identical.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 343b219c-e555-49f9-b998-eb7bb61b28e5

📥 Commits

Reviewing files that changed from the base of the PR and between 3efad1f and 7a5e026.

📒 Files selected for processing (4)
  • src/llm-coding-tools-core/benches/path_resolvers.rs
  • src/llm-coding-tools-core/src/path/allowed.rs
  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
  • src/llm-coding-tools-core/src/path/mod.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Semver Checks (Serdesai Full)
  • GitHub Check: Semver Checks (Serdesai Full+Linux)
  • GitHub Check: Async Windows
  • GitHub Check: Blocking Windows
  • GitHub Check: Blocking macOS
  • GitHub Check: Async macOS
  • GitHub Check: Async Linux
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.
📚 Learning: 2026-03-28T02:14:04.465Z
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.

Applied to files:

  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
  • src/llm-coding-tools-core/benches/path_resolvers.rs
  • src/llm-coding-tools-core/src/path/allowed.rs
🔇 Additional comments (5)
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (1)

356-389: Clean helper implementation.

The try_external helper centralizes external permission handling well. The impl Into<Option<PathBuf>> parameter is ergonomic, allowing callers to pass either a PathBuf directly or None.

src/llm-coding-tools-core/src/path/allowed.rs (2)

182-220: LGTM - clean refactoring of absolute path resolution.

The optimization to canonicalize once and check all bases with .any() is correct and efficient. The flow through the three resolution strategies (canonicalize → resolve_new_file_fast → soft_canonicalize) maintains the same semantics as before while avoiding redundant filesystem calls.
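
A std-only sketch of the tiered sequence, assuming an absolute input. `resolve_tiered` is a hypothetical name, and the third tier is a hand-rolled stand-in for `soft_canonicalize`, which is assumed here to resolve the deepest existing ancestor and re-append the missing tail. In this sketch the second tier is a special case of the third and is kept only to mirror the described structure.

```rust
use std::ffi::OsStr;
use std::io;
use std::path::{Path, PathBuf};

fn resolve_tiered(input: &Path) -> io::Result<PathBuf> {
    // Tier 1: the path exists, so a plain canonicalize succeeds.
    if let Ok(resolved) = input.canonicalize() {
        return Ok(resolved);
    }
    // Tier 2: new file in an existing directory; canonicalize only the parent.
    if let (Some(parent), Some(name)) = (input.parent(), input.file_name()) {
        if let Ok(parent) = parent.canonicalize() {
            return Ok(parent.join(name));
        }
    }
    // Tier 3: walk up to the deepest existing ancestor, then re-append
    // the components that do not exist yet.
    let mut tail: Vec<&OsStr> = Vec::new();
    let mut current = input;
    loop {
        if let Ok(existing) = current.canonicalize() {
            let mut out = existing;
            for part in tail.iter().rev() {
                out.push(part);
            }
            return Ok(out);
        }
        match (current.parent(), current.file_name()) {
            (Some(parent), Some(name)) => {
                tail.push(name);
                current = parent;
            }
            _ => return Err(io::Error::new(io::ErrorKind::NotFound, "no existing ancestor")),
        }
    }
}
```

Whichever tier produces the path, the caller would still need to re-run the containment and policy checks on the result, as the review notes.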


222-255: Consistent implementation with glob resolver.

The try_external helper mirrors the implementation in AllowedGlobResolver, maintaining consistency across both resolvers.

src/llm-coding-tools-core/benches/path_resolvers.rs (2)

309-412: Good benchmark coverage for external directory scenarios.

The benchmark function is well-structured with clear documentation and covers the key scenarios:

  • Fast path (existing file with permission)
  • Slow path (new file requiring soft_canonicalize)
  • Rejection paths (denied and no-ruleset)
  • Edge case (relative path with external permission)

The test case matrix with resolver selection logic is clean and maintainable.


332-335: Glob pattern concern is unfounded—* does match nested paths in this implementation.

The wildcard_match function does not use shell glob semantics. The pattern * is treated as a universal wildcard that matches any sequence of bytes, including path separators (/). The fast path at line 347–349 returns true for pattern == "*" regardless of input, and the backtracking implementation allows * to consume any characters without restriction.

With pattern /tmp/xyz/* and input /tmp/xyz/subdir/new_file.txt, the * will successfully match subdir/new_file.txt (including the /). The benchmark correctly tests the intended "soft_canonicalize + permission allow" path.
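
To illustrate the non-shell semantics described above, here is a small backtracking matcher in which `*` consumes any bytes, `/` included. It is an illustrative re-implementation under that assumption, not the crate's `wildcard_match`.

```rust
/// Illustrative matcher: only `*` is special, and it may match `/`.
fn wildcard_match(pattern: &str, input: &str) -> bool {
    if pattern == "*" {
        return true; // fast path: a lone `*` matches everything
    }
    let (p, s) = (pattern.as_bytes(), input.as_bytes());
    let (mut pi, mut si) = (0usize, 0usize);
    // Pattern index of the last `*` seen, and the input index it was tried at.
    let mut star: Option<(usize, usize)> = None;
    while si < s.len() {
        if pi < p.len() && p[pi] == b'*' {
            star = Some((pi, si));
            pi += 1;
        } else if pi < p.len() && p[pi] == s[si] {
            pi += 1;
            si += 1;
        } else if let Some((star_pi, star_si)) = star {
            // Backtrack: let the last `*` absorb one more input byte.
            pi = star_pi + 1;
            si = star_si + 1;
            star = Some((star_pi, star_si + 1));
        } else {
            return false;
        }
    }
    // Whatever is left of the pattern must consist solely of `*`.
    p[pi..].iter().all(|&b| b == b'*')
}
```

Under shell glob rules `/tmp/xyz/*` would stop at the next `/`; with this matcher it accepts `/tmp/xyz/subdir/new_file.txt`, which is the behaviour the benchmark depends on.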

For absolute paths, base.join(input) == input regardless of base, so
canonicalize once and check all bases instead of per-base FS calls.
Extract resolve_absolute and try_external helpers in both resolvers.
Add external directory and multiple-bases benchmarks.
Sewer56 force-pushed the optimize-external-directory-resolver branch from 7a5e026 to 9b1fcb0 on April 8, 2026 00:18
coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (1)

355-377: fast_policy_input optimization is ineffective for absolute paths.

The fast_policy_input is set to the original absolute path (e.g., /home/user/project/src/lib.rs), but it's compared against normalized_relative which is a relative path after stripping the base prefix (e.g., src/lib.rs). These will never match, so the condition at line 367 (fast_policy_input != Some(normalized_relative.as_ref())) is always true for absolute paths, meaning policy.is_allowed() is always called.

This doesn't affect correctness—the policy check still happens—but the optimization is dead code. Consider either:

  1. Removing fast_policy_input from this function since it can't trigger
  2. Computing a relative form to enable the optimization
♻️ Option 1: Simplify by removing the ineffective optimization
 fn resolve_absolute(
     base_directories: &[Arc<Path>],
     external_permission: Option<&Ruleset>,
     policy: Option<&GlobPolicy>,
     path: &str,
     input_path: &Path,
-    has_dots: bool,
+    _has_dots: bool,
 ) -> ToolResult<PathBuf> {
-    let fast_policy_input = if !has_dots { Some(path) } else { None };
-
     // Step 1: canonicalize for existing files.
     if let Ok(resolved) = input_path.canonicalize() {
         // Check if any base claims this path and policy approves.
         let accepted = base_directories.iter().any(|base_dir| {
             if !resolved.starts_with(base_dir) {
                 return false;
             }
             if let Some(policy) = policy {
                 let relative_path = resolved.strip_prefix(base_dir).unwrap_or(Path::new(""));
                 let normalized_relative = normalize::normalize_path(relative_path);
-                if fast_policy_input != Some(normalized_relative.as_ref())
-                    && !policy.is_allowed(&normalized_relative)
-                {
+                if !policy.is_allowed(&normalized_relative) {
                     return false;
                 }
             }
             true
         });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm-coding-tools-core/src/path/allowed_glob/mod.rs` around lines 355 -
377, The fast_policy_input optimization is ineffective because fast_policy_input
holds the absolute input_path while the code compares it to normalized_relative
(a path stripped of base_dir), so the shortcut never triggers; remove the
fast_policy_input logic (its declaration and the special-case comparison in the
base_directories.any closure) and always compute normalized_relative and call
policy.is_allowed as currently done, or alternatively compute a relative form of
input_path (using input_path.strip_prefix(base_dir) when canonicalization
succeeds) and compare that to normalized_relative before calling
policy.is_allowed; update references around fast_policy_input,
input_path.canonicalize, normalize::normalize_path, and policy.is_allowed
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fb6fad74-02a0-48c7-84b6-8bc8d7dad431

📥 Commits

Reviewing files that changed from the base of the PR and between 7a5e026 and 9b1fcb0.

📒 Files selected for processing (4)
  • src/llm-coding-tools-core/benches/path_resolvers.rs
  • src/llm-coding-tools-core/src/path/allowed.rs
  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
  • src/llm-coding-tools-core/src/path/mod.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/llm-coding-tools-core/src/path/mod.rs
  • src/llm-coding-tools-core/benches/path_resolvers.rs
  • src/llm-coding-tools-core/src/path/allowed.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Async Windows
  • GitHub Check: Blocking macOS
  • GitHub Check: Blocking Windows
  • GitHub Check: Async macOS
  • GitHub Check: Async Linux
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-03-28T02:14:04.465Z
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.

Applied to files:

  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
🔇 Additional comments (5)
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (5)

5-40: Clear and comprehensive documentation of the resolution algorithm.

The documentation accurately describes the dispatch logic and resolution tiers. Good addition for maintainability.


217-239: Clean dispatch pattern for absolute vs relative paths.

The separation into dedicated helpers improves readability and allows for optimization strategies specific to each path type.


254-336: Well-structured relative path resolution with proper policy re-checking.

The tiered approach (canonicalize → new-file-fast → soft_canonicalize) correctly handles existing files, new files in existing directories, and paths with missing parents. The policy re-check after normalization/symlink resolution is essential for security.


439-472: Clean consolidation of external permission evaluation.

The try_external helper nicely encapsulates the external-directory fallback logic. The impl Into<Option<PathBuf>> signature is ergonomic, allowing callers to pass resolved directly or None. The early empty-ruleset check at line 446 avoids unnecessary canonicalization work.


840-852: Test comment clarifies the intent.

Good clarification that external permission only applies to absolute paths.

Sewer56 marked this pull request as ready for review on April 8, 2026 00:36
The fast_policy_input shortcut compared an absolute input path against
a normalized relative path, so it could never match and the code always
fell through to policy.is_allowed. Remove the dead branch, the unused
has_dots parameter, and simplify all three resolution tiers.
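
A small reproduction of why the shortcut was dead, under the assumption that it compared the raw absolute input against the base-relative path. `shortcut_would_fire` is a hypothetical name for that comparison.

```rust
use std::path::Path;

/// Hypothetical reproduction of the removed comparison: does the raw
/// absolute input equal the path left after stripping the base prefix?
fn shortcut_would_fire(base: &Path, absolute_input: &str) -> bool {
    match Path::new(absolute_input).strip_prefix(base) {
        // `relative` never starts with a root component, so it cannot
        // equal an absolute input string.
        Ok(relative) => relative.to_str() == Some(absolute_input),
        Err(_) => false,
    }
}
```

For base `/home/user/project` and input `/home/user/project/src/lib.rs`, the stripped path is `src/lib.rs`, so the comparison is false and `policy.is_allowed` always ran anyway.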
coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/llm-coding-tools-core/src/path/allowed_glob/mod.rs`:
- Around line 352-370: The current canonicalization block uses a single
`accepted` check that conflates "inside a base directory" and "allowed by the
base's glob policy", then calls `try_external(external_permission, path,
resolved)` whenever `accepted` is false; change this so you separately track
whether `resolved` is contained in any `base_directories` (e.g.,
`inside_any_base`) and whether any base's
`policy.is_allowed(&normalize::normalize_path(...))` returned true (e.g.,
`policy_allowed`), and only call `try_external` when `inside_any_base` is false
(i.e., the path is outside all bases). Update the logic around
`input_path.canonicalize()`, the `base_directories.iter().any(...)` closure, and
the eventual decision that returns `Ok(resolved)` or falls back to
`try_external(external_permission, path, resolved)` so that an in-base denial by
`policy.is_allowed` is not bypassed by `external_permission`.
- Around line 227-237: The current fast-path using fast_policy_input (set when
analysis.has_dots is false) treats a policy denial on the raw input as final,
which can incorrectly reject valid paths that normalize or follow symlinks;
update the logic around fast_policy_input and resolve_relative so that a policy
denial on the fast probe is not a hard reject: use the fast probe only as a
cached allow (if the policy allows, return allow immediately), but if the fast
probe denies, continue and run the canonicalized/normalized check (via
resolve_relative using self.base_directories, policy, path, input_path) to make
the final decision; ensure functions like resolve_relative or its caller handle
a "probe denied" outcome by falling back to full resolution rather than
returning deny.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8daf20d6-c19b-492b-b475-f1b004cb9395

📥 Commits

Reviewing files that changed from the base of the PR and between 9b1fcb0 and 193c5e2.

📒 Files selected for processing (1)
  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Async Windows
  • GitHub Check: Async Linux
  • GitHub Check: Blocking macOS
  • GitHub Check: Async macOS
  • GitHub Check: Blocking Windows
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-03-28T02:14:04.465Z
Learnt from: Sewer56
Repo: Sewer56/llm-coding-tools PR: 69
File: src/llm-coding-tools-bubblewrap/src/profile/validation.rs:57-67
Timestamp: 2026-03-28T02:14:04.465Z
Learning: In `src/llm-coding-tools-bubblewrap/src/profile/` (Rust, llm-coding-tools-bubblewrap crate), the `Builder` API paths (workspace, synthetic_home, cache_root, mount lists, overlays, etc.) are always set by trusted application/operator code — the library consumer is the trusted party. Path normalization and `..`-component hardening in validators like `validate_absolute_path` is therefore NOT required to defend against traversal attacks. Untrusted input (LLM-generated shell commands) only enters through `wrap_command`/`execute_command_with_mode`, not through the `Builder`.

Applied to files:

  • src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
🔇 Additional comments (1)
src/llm-coding-tools-core/src/path/allowed_glob/mod.rs (1)

417-459: Nice extraction of the external fallback.

Centralizing canonicalization reuse, permission evaluation, and the shared rejection path in try_external makes the absolute-path tiers much easier to keep aligned.

Comment thread src/llm-coding-tools-core/src/path/allowed_glob/mod.rs Outdated
Comment thread src/llm-coding-tools-core/src/path/allowed_glob/mod.rs
Sewer56 added 2 commits April 8, 2026 02:13
Paths inside a base directory that were denied by glob policy could
incorrectly fall through to try_external() and be approved via
external_permission. Now we track inside_any_base separately from
accepted so that in-base denials are rejected immediately.

- Add not_allowed_error() helper to unify error messages
- Add test rejects_in_base_path_denied_by_policy_even_with_external_permission
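
The distinction between "inside a base" and "accepted by policy" can be sketched as a three-way decision. This is an illustrative model only: `Decision`, `decide`, and the closure-based policy are hypothetical names, and the real code works with its own policy type and error values.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum Decision {
    /// Inside a base and allowed by that base's policy.
    Allow,
    /// Inside a base but denied by policy: final, no external fallback.
    Deny,
    /// Outside every base: only now is the external permission consulted.
    TryExternal,
}

/// `resolved` is assumed to be canonical already; `policy_allows` receives
/// the path relative to the matching base.
fn decide(
    bases: &[PathBuf],
    resolved: &Path,
    policy_allows: impl Fn(&Path) -> bool,
) -> Decision {
    let mut inside_any_base = false;
    let accepted = bases.iter().any(|base| match resolved.strip_prefix(base) {
        Ok(relative) => {
            inside_any_base = true;
            policy_allows(relative)
        }
        Err(_) => false,
    });

    if accepted {
        Decision::Allow
    } else if inside_any_base {
        Decision::Deny
    } else {
        Decision::TryExternal
    }
}
```

With a single `accepted` flag the second and third outcomes collapse into one, which is how an in-base denial could reach the external fallback.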
…mlinked paths in resolve_relative

The fast-path checked glob policy on the raw input string before
canonicalization, causing it to skip bases where the raw path was denied
but the symlink-resolved path would have been allowed. This also removes
the now-unused has_dots field from PathAnalysis.
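
A minimal sketch of the corrected ordering, assuming the policy is evaluated only on the resolved path. The names are hypothetical, and canonicalization is injected as a closure so the symlink case can be modelled without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical relative-path resolution: for each base, resolve first,
/// then apply the policy to the resolved path relative to that base.
fn resolve_relative_sketch(
    bases: &[PathBuf],
    input: &Path,
    canonicalize: impl Fn(&Path) -> Option<PathBuf>,
    policy_allows: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    bases.iter().find_map(|base| {
        // Resolve symlinks and `..` before any policy decision is made.
        let resolved = canonicalize(&base.join(input))?;
        // A path that resolves outside this base does not match it.
        let allowed = policy_allows(resolved.strip_prefix(base).ok()?);
        allowed.then_some(resolved)
    })
}
```

For example, if `link/a.rs` is a symlink to `src/a.rs` and the policy only allows `src`, a probe on the raw input denies the path, while the resolved path is allowed. Checking after resolution avoids skipping that base.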
@Sewer56 Sewer56 merged commit 31a20e9 into main Apr 8, 2026
21 of 22 checks passed
@Sewer56 Sewer56 deleted the optimize-external-directory-resolver branch April 8, 2026 01:58