Skip to content

fix(security): always canonicalize paths before policy check#2111

Merged
senamakel merged 6 commits into
tinyhumansai:mainfrom
M3gA-Mind:fix/symlink-bypass-path-validation
May 20, 2026
Merged

fix(security): always canonicalize paths before policy check#2111
senamakel merged 6 commits into
tinyhumansai:mainfrom
M3gA-Mind:fix/symlink-bypass-path-validation

Conversation

@M3gA-Mind
Copy link
Copy Markdown
Contributor

@M3gA-Mind M3gA-Mind commented May 18, 2026

Summary

  • Renamed is_path_allowedis_path_string_allowed (string-only, no symlink resolution) and made the split intent explicit via a doc-comment warning.
  • Added validate_path() — async, canonicalizes via tokio::fs::canonicalize, checks workspace containment, and re-checks forbidden paths against the resolved path.
  • Added validate_parent_path() — same as above but canonicalizes the parent directory for write operations where the target file may not yet exist.
  • Migrated all 8 path-checking call sites (7 filesystem tools + image_info) to the unified API, removing the manual two-step pairing that callers could and did forget.
  • Fixed active exfiltration path: image_info.rs was calling is_path_allowed with no resolved check — an agent could read any file via a symlink inside the workspace.

Problem

is_path_allowed() validated path strings without resolving symlinks. A symlink created inside the allowed workspace pointing to /etc/shadow (or any forbidden file) passed all checks. A separate is_resolved_path_allowed() existed but had to be called manually — two callers (image_info.rs, cron/scheduler.rs) omitted it entirely, leaving active exfiltration paths open.

Solution

  • Unified API: validate_path() is the single entry point for any path used in file I/O. It fast-rejects at the string level, then calls canonicalize to resolve all symlinks, then checks workspace containment, then re-checks forbidden paths against the resolved canonical path. Returning Ok(PathBuf) on success gives callers the canonical path they need for I/O with no extra work.
  • Write-path variant: validate_parent_path() canonicalizes the parent directory for callers that write new files (the target may not exist yet).
  • Token-scanner preserved: is_path_string_allowed() (the renamed string-only check) is kept pub for cron/scheduler.rs::detect_forbidden_path_arguments, which scans shell command tokens without doing file I/O — the string-only check is intentionally correct there.
  • Forced audit: Removing the old is_path_allowed name produced compile errors at every call site, ensuring each was reviewed and migrated.
  • Design tradeoff: validate_path is async because tokio::fs::canonicalize is async. All existing callers already run in async contexts, so this is zero friction.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case)
  • Diff coverage ≥ 80%pnpm test:coverage + cargo test -p openhuman run locally; 6 new async tests cover all symlink attack vectors on changed lines
  • Coverage matrix updated — N/A: security hardening of existing behaviour, no new feature rows
  • All affected feature IDs from the matrix listed under ## Related — N/A: no new feature IDs
  • No new external network dependencies introduced — N/A: pure in-process path validation
  • Manual smoke checklist updated if this touches release-cut surfaces — N/A: internal security policy, no UI change
  • Linked issue closed via Closes #NNN in ## Related

Impact

  • All platforms (desktop macOS/Windows/Linux): path validation now always resolves symlinks before policy checks.
  • Security: eliminates the symlink-bypass class of attacks on the agent filesystem tools and CEF image reader.
  • No user-visible change: the policy enforcement tightens but legitimate agent file access is unaffected.
  • Performance: one additional canonicalize syscall per file operation — negligible for agent tool use cadence.

Related


AI Authored PR Metadata

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/symlink-bypass-path-validation
  • Commit SHA: 0f076cf9da1024d79de7236ef5365879da062e09

Validation Run

  • pnpm --filter openhuman-app format:check — clean
  • pnpm --filter openhuman-app compile — no errors
  • Focused tests: cargo test -p openhuman -- policy — 6 new tests pass, all existing pass
  • Rust fmt/check: cargo fmt --check + cargo clippy -p openhuman — clean
  • Tauri fmt/check: N/A — no Tauri shell changes

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: symlinks are now resolved before path policy checks; previously a symlink inside the workspace to a forbidden file was allowed
  • User-visible effect: none under normal use; agents attempting to read forbidden files via symlinks will now receive an error

Parity Contract

  • Legacy behavior preserved: all string-level checks (.. traversal, null bytes, URL-encoded traversal, forbidden prefix matching) unchanged in is_path_string_allowed
  • Guard/fallback/dispatch parity checks: symlink_metadata guards in edit_file.rs and apply_patch.rs preserved — they enforce a stricter "no symlinks even within workspace" policy for those tools

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: N/A
  • Resolution: N/A

Summary by CodeRabbit

  • New Features

    • Added async path- and parent-path validation APIs to resolve and validate targets before file ops.
  • Bug Fixes

    • Stronger path validation preventing symlink escapes, traversal, null-byte and encoded-traversal attacks.
    • Tilde (~) expansion and forbidden-directory checks enforced consistently; writes into forbidden locations blocked.
  • Refactor

    • Centralized, string-based path checks used across file tools; parent-path validation now runs before directory creation.
  • Tests

    • Expanded coverage for resolved-path/parent-path validation and edge cases.

Review Change Stack

Merges the split is_path_allowed/is_resolved_path_allowed API into a
single validate_path() that resolves symlinks before checking workspace
boundaries and forbidden paths. Adds validate_parent_path() for write
operations where the target file may not yet exist.

Two callers (image_info.rs, cron/scheduler.rs) were missing the
resolved check entirely — image_info.rs could be used to exfiltrate
files via a symlink inside the workspace.

Closes tinyhumansai#1927
@M3gA-Mind M3gA-Mind requested a review from a team May 18, 2026 13:22
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

SecurityPolicy now separates string checks (is_path_string_allowed) from resolved-path validation and adds async APIs validate_path and validate_parent_path that canonicalize targets, enforce workspace containment, and reject forbidden directories; filesystem tools and tests were updated to use these APIs.

Changes

Security Refactor: Unified Path Validation with Symlink Resolution

Layer / File(s) Summary
Security Policy Core Refactoring
src/openhuman/security/policy.rs
Adds expand_tilde; renames is_path_allowedis_path_string_allowed; adds validate_path and validate_parent_path which canonicalize, check workspace containment, and match against forbidden directories.
Security Policy Test Coverage
src/openhuman/security/policy_tests.rs
Updates synchronous tests to use is_path_string_allowed and adds async Unix tests for validate_path and validate_parent_path, covering symlink resolution and forbidden-path cases.
Cron Scheduler Integration
src/openhuman/cron/scheduler.rs
Uses is_path_string_allowed for forbidden path-argument checks.
Filesystem Read & Scan Tools
src/openhuman/tools/impl/browser/image_info.rs, src/openhuman/tools/impl/filesystem/file_read.rs, src/openhuman/tools/impl/filesystem/list_files.rs, src/openhuman/tools/impl/filesystem/grep.rs
Tools now call validate_path(path).await (rate-limit recording moved before validation where applied) and use returned resolved PathBufs for metadata/read/scan operations.
Filesystem Write-Path Tools
src/openhuman/tools/impl/filesystem/file_write.rs, src/openhuman/tools/impl/filesystem/csv_export.rs
Tools now call validate_parent_path(path).await to obtain a resolved target before creating directories and writing files.
Filesystem Modify-Path Tools
src/openhuman/tools/impl/filesystem/apply_patch.rs, src/openhuman/tools/impl/filesystem/edit_file.rs
Apply Patch and Edit File now use is_path_string_allowed for string checks and validate_path(...).await for resolved-path validation, returning early on validation failures.

Sequence Diagram(s)

sequenceDiagram
  participant Tool as FileTool (e.g. ImageInfoTool)
  participant Policy as SecurityPolicy
  participant FS as Filesystem
  Tool->>Policy: validate_path(path).await / validate_parent_path(path).await
  Policy->>FS: canonicalize / read deepest existing ancestor
  FS-->>Policy: resolved PathBuf or error
  Policy->>Policy: is_resolved_path_allowed + forbidden checks
  Policy-->>Tool: Ok(resolved PathBuf) or Err(message)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • senamakel

Poem

A rabbit hops along the trail,
Expands the tilde without fail,
Canonical paths it traces neat,
Symlinks halted where they meet,
Safe gardens kept beneath its feet—🐇✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: centralizing path canonicalization into the security policy layer before policy checks to fix a symlink bypass vulnerability.
Linked Issues check ✅ Passed The PR fully addresses #1927's coding objectives: canonicalization is enforced before policy checks, the fragmented API is unified, symlink validation is added across all call sites, and forbidden-path checks now operate on resolved paths.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing the symlink bypass vulnerability: renaming is_path_allowed to is_path_string_allowed, adding validate_path and validate_parent_path async APIs, migrating 8 call sites, and adding regression tests.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. label May 18, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/security/policy.rs`:
- Around line 792-800: The current checks in validate_path and
validate_parent_path only prefix-match canonicalized targets against raw
forbidden strings, which fails for relative forbidden entries and file-specific
forbids; fix by resolving each forbidden entry against the workspace root (after
expand_tilde) and canonicalizing that forbidden_path, then compare the canonical
target (resolved_str or result) against the canonical forbidden_path using
equality or starts_with(forbidden_path + "/") so both exact-file and subtree
forbids are caught; update both the loop over self.forbidden_paths in
validate_path and the corresponding loop in validate_parent_path to perform this
canonicalized resolution and re-check the final target (resolved_str/result)
rather than matching against the raw forbidden string.

In `@src/openhuman/tools/impl/filesystem/csv_export.rs`:
- Around line 193-200: The current flow calls tokio::fs::create_dir_all(parent)
on the raw relative_path before calling
self.security.validate_parent_path(&relative_path), which can create dirs
outside the workspace; change the order so you first call
self.security.validate_parent_path(&relative_path).await, obtain the
resolved_target (the validated safe parent), and only then call
tokio::fs::create_dir_all(&resolved_target) (or create under the resolved
parent) so directory creation happens under the validated path (mirror the
validate-then-create pattern used in file_write); update references to
parent/relative_path accordingly and remove the pre-validation create_dir_all
call.

In `@src/openhuman/tools/impl/filesystem/file_write.rs`:
- Around line 72-79: The code currently calls
tokio::fs::create_dir_all(parent).await before validating the resolved parent
path, which can create directories outside the sandbox; change the flow to first
call self.security.validate_parent_path(path).await and use its Ok(p) (e.g.,
resolved_target) as the canonical parent path, then call
tokio::fs::create_dir_all(resolved_target).await (or equivalent) and proceed
with the rest of file_write logic; update references to
parent/workspace_dir.join(path) to use the validated resolved_target so
directory creation is anchored to the resolved, safe path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d908f73d-079b-4f7f-9a8f-fa70e551377a

📥 Commits

Reviewing files that changed from the base of the PR and between 70fdedc and 0f076cf.

📒 Files selected for processing (11)
  • src/openhuman/cron/scheduler.rs
  • src/openhuman/security/policy.rs
  • src/openhuman/security/policy_tests.rs
  • src/openhuman/tools/impl/browser/image_info.rs
  • src/openhuman/tools/impl/filesystem/apply_patch.rs
  • src/openhuman/tools/impl/filesystem/csv_export.rs
  • src/openhuman/tools/impl/filesystem/edit_file.rs
  • src/openhuman/tools/impl/filesystem/file_read.rs
  • src/openhuman/tools/impl/filesystem/file_write.rs
  • src/openhuman/tools/impl/filesystem/grep.rs
  • src/openhuman/tools/impl/filesystem/list_files.rs

Comment thread src/openhuman/security/policy.rs Outdated
Comment thread src/openhuman/tools/impl/filesystem/csv_export.rs Outdated
Comment thread src/openhuman/tools/impl/filesystem/file_write.rs Outdated
…te before create_dir_all

- In validate_path/validate_parent_path, switch from string starts_with to
  path-component-aware comparison for forbidden_paths.
- Resolve relative forbidden entries against the workspace root so entries
  like "secrets" correctly block workspace/secrets/ even after canonicalization.
- Skip absolute forbidden entries that are ancestors of the workspace root
  (e.g. /tmp when workspace is /tmp/…) — the workspace containment check
  already guarantees the resolved path is safe.
- validate_parent_path now walks up to the deepest existing ancestor before
  canonicalizing, so it works without requiring the parent directory to exist.
- file_write and csv_export now call validate_parent_path BEFORE create_dir_all,
  then create directories at the validated canonical location. This prevents a
  symlinked path component from causing directory creation outside the workspace
  before the security check fires.

Fixes 25 failing filesystem tests (false-positive forbidden-path rejections
when workspace is under /tmp) and closes the pre-create-dir_all attack surface.
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 18, 2026
…forbidden-entry symlink regression test

- image_info.rs: remove redundant "Path not allowed: " prefix — validate_path
  already returns a complete user-facing error string.
- policy_tests.rs: add validate_path_blocks_symlink_to_relative_forbidden_entry
  to lock in the be29669 fix where relative forbidden entries (e.g. "secrets")
  were not resolved against the workspace root and could be bypassed via a
  symlink pointing into the forbidden directory.
@M3gA-Mind
Copy link
Copy Markdown
Contributor Author

Addressed the two actionable items from the self-review in commit 61343cff:

  • image_info.rs: Removed the redundant "Path not allowed: " prefix on the validate_path error — validate_path already returns a complete user-facing message, so the outer wrapper was producing doubled text like "Path not allowed: Path not allowed by security policy: …".
  • policy_tests.rs: Added validate_path_blocks_symlink_to_relative_forbidden_entry — a regression test for the be296695 fix where relative forbidden entries (e.g. "secrets") were not resolved against the workspace root. The test asserts that both a direct path (secrets/token.txt) and a symlink pointing into the forbidden directory (link/token.txt) are correctly blocked.

(The workspace_dir.canonicalize() loop note from the review was incorrect — it was already computed outside the loop; no change needed there.)

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/openhuman/security/policy_tests.rs (1)

1239-1253: 💤 Low value

Consider using /etc/hosts for better cross-platform coverage.

/etc/hostname doesn't exist on macOS, so the symlink will dangle. The test still passes because canonicalize() fails, but it's not actually exercising the forbidden-path logic on macOS. Using /etc/hosts (which exists on both Linux and macOS) would ensure the test verifies the intended behavior across all Unix platforms.

♻️ Suggested change
 #[cfg(unix)]
 #[tokio::test]
 async fn validate_path_blocks_symlink_to_forbidden_path() {
     let workspace = tempfile::tempdir().unwrap();
-    // /etc/hostname is readable on most Unix systems
+    // /etc/hosts exists on both Linux and macOS
     let link = workspace.path().join("link");
-    std::os::unix::fs::symlink("/etc/hostname", &link).unwrap();
+    std::os::unix::fs::symlink("/etc/hosts", &link).unwrap();
     let policy = SecurityPolicy {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/security/policy_tests.rs` around lines 1239 - 1253, The test
validate_path_blocks_symlink_to_forbidden_path creates a symlink to
/etc/hostname which may not exist on macOS causing a dangling symlink and not
exercising the forbidden-path logic; change the target to /etc/hosts (which
exists on both Linux and macOS) when creating the symlink in that test so
canonicalize() resolves and the policy.validate_path("link").await assertion
truly checks forbidden path behavior; update the inline comment accordingly and
keep the rest of the test unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/security/policy_tests.rs`:
- Around line 1239-1253: The test validate_path_blocks_symlink_to_forbidden_path
creates a symlink to /etc/hostname which may not exist on macOS causing a
dangling symlink and not exercising the forbidden-path logic; change the target
to /etc/hosts (which exists on both Linux and macOS) when creating the symlink
in that test so canonicalize() resolves and the
policy.validate_path("link").await assertion truly checks forbidden path
behavior; update the inline comment accordingly and keep the rest of the test
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d8590f56-bd74-480f-81d2-836922fe642f

📥 Commits

Reviewing files that changed from the base of the PR and between be29669 and 61343cf.

📒 Files selected for processing (2)
  • src/openhuman/security/policy_tests.rs
  • src/openhuman/tools/impl/browser/image_info.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/tools/impl/browser/image_info.rs

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 19, 2026
Lines 888-896 of policy.rs were uncovered — the forbidden_paths loop
inside validate_parent_path had no test. Add
validate_parent_path_blocks_forbidden_path to assert that writing a new
file into a relative-forbidden directory is rejected.
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 19, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walkthrough

Solid security fix that unifies the symlink-bypass-prone two-step path validation (is_path_allowed + manual canonicalize + is_resolved_path_allowed) into a single validate_path() / validate_parent_path() API that always canonicalizes before checking. All 8 call sites are migrated, the old function name is retired to force compile-time audit, and 8 new async tests cover the key attack vectors. The approach is sound and the critical bugs CodeRabbit flagged are all resolved.

One major concern: the forbidden-path checking logic is duplicated across two security-critical functions, which creates a maintenance risk where a future fix to one could miss the other.

Change Summary

File Change type Description
security/policy.rs Core change Renamed is_path_allowedis_path_string_allowed, extracted expand_tilde, added validate_path() and validate_parent_path()
security/policy_tests.rs Tests Renamed assertions, added 8 new async tests covering symlink attacks
cron/scheduler.rs Migration 1-line rename to is_path_string_allowed (string-only context, correct)
browser/image_info.rs Migration Replaced manual check + existence test with validate_path()
filesystem/apply_patch.rs Migration Early is_path_string_allowed + later validate_path() — correct two-phase approach
filesystem/csv_export.rs Migration validate_parent_path() before create_dir_all — fixes the pre-validation dir creation bug
filesystem/edit_file.rs Migration Replaced 3-step check with validate_path()
filesystem/file_read.rs Migration Replaced 3-step check with validate_path()
filesystem/file_write.rs Migration validate_parent_path() before create_dir_all — same fix as csv_export
filesystem/grep.rs Migration Replaced 3-step check with validate_path()
filesystem/list_files.rs Migration Replaced 3-step check with validate_path()

Comment thread src/openhuman/security/policy.rs Outdated
Comment thread src/openhuman/security/policy.rs
Comment thread src/openhuman/security/policy.rs
…; expand tilde on input

- Extract `check_resolved_against_forbidden` to de-duplicate ~20 lines of
  security-critical forbidden-path loop shared between `validate_path` and
  `validate_parent_path`. A single fix site eliminates the risk of one
  function diverging from the other.

- Replace sync `self.workspace_dir.canonicalize()` with
  `tokio::fs::canonicalize(...).await` inside both async functions so the
  tokio runtime is never blocked on a slow or networked filesystem.

- Expand tilde on the input path before joining with `workspace_dir`.
  Previously `workspace_dir.join("~/foo.txt")` produced a literal `~/`
  component; now the path is expanded first so `~/foo.txt` correctly
  resolves to `$HOME/foo.txt` and the workspace-escape check fires.

- Add two regression tests covering the tilde-expansion fix.

Addresses review comments from graycyrus on PR tinyhumansai#2111.
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 19, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review — all prior changes addressed

All three findings from my previous review have been resolved in d2720ec9:

Finding Status
[major] Duplicated forbidden-path loop in validate_path/validate_parent_path Fixed — extracted check_resolved_against_forbidden shared helper
[minor] Sync std::fs::canonicalize in async functions Fixed — replaced with tokio::fs::canonicalize
[minor] Tilde not expanded before workspace_dir.join() Fixed — both functions now call expand_tilde on input; two new tests verify the behavior

The refactored code is clean. The helper centralizes security-critical forbidden-path logic so future changes only need to touch one place. Test coverage is thorough — 10 async tests covering all symlink attack vectors, tilde edge cases, and forbidden-path resolution.

No new issues found. Nice work on this security hardening, @M3gA-Mind.

# Conflicts:
#	src/openhuman/security/policy.rs
@senamakel senamakel merged commit a8eac1a into tinyhumansai:main May 20, 2026
26 checks passed
mtkik pushed a commit to mtkik/openhuman-meet that referenced this pull request May 21, 2026
CodeGhost21 pushed a commit to CodeGhost21/openhuman that referenced this pull request May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Security: Symlink bypass in is_path_allowed() — no canonicalization

3 participants