fix(security): always canonicalize paths before policy check#2111
Conversation
Merges the split is_path_allowed/is_resolved_path_allowed API into a single validate_path() that resolves symlinks before checking workspace boundaries and forbidden paths. Adds validate_parent_path() for write operations where the target file may not yet exist. Two callers (image_info.rs, cron/scheduler.rs) were missing the resolved check entirely — image_info.rs could be used to exfiltrate files via a symlink inside the workspace. Closes tinyhumansai#1927
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughSecurityPolicy now separates string checks ( ChangesSecurity Refactor: Unified Path Validation with Symlink Resolution
Sequence Diagram(s)sequenceDiagram
participant Tool as FileTool (e.g. ImageInfoTool)
participant Policy as SecurityPolicy
participant FS as Filesystem
Tool->>Policy: validate_path(path).await / validate_parent_path(path).await
Policy->>FS: canonicalize / read deepest existing ancestor
FS-->>Policy: resolved PathBuf or error
Policy->>Policy: is_resolved_path_allowed + forbidden checks
Policy-->>Tool: Ok(resolved PathBuf) or Err(message)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/security/policy.rs`:
- Around line 792-800: The current checks in validate_path and
validate_parent_path only prefix-match canonicalized targets against raw
forbidden strings, which fails for relative forbidden entries and file-specific
forbids; fix by resolving each forbidden entry against the workspace root (after
expand_tilde) and canonicalizing that forbidden_path, then compare the canonical
target (resolved_str or result) against the canonical forbidden_path using
equality or starts_with(forbidden_path + "/") so both exact-file and subtree
forbids are caught; update both the loop over self.forbidden_paths in
validate_path and the corresponding loop in validate_parent_path to perform this
canonicalized resolution and re-check the final target (resolved_str/result)
rather than matching against the raw forbidden string.
In `@src/openhuman/tools/impl/filesystem/csv_export.rs`:
- Around line 193-200: The current flow calls tokio::fs::create_dir_all(parent)
on the raw relative_path before calling
self.security.validate_parent_path(&relative_path), which can create dirs
outside the workspace; change the order so you first call
self.security.validate_parent_path(&relative_path).await, obtain the
resolved_target (the validated safe parent), and only then call
tokio::fs::create_dir_all(&resolved_target) (or create under the resolved
parent) so directory creation happens under the validated path (mirror the
validate-then-create pattern used in file_write); update references to
parent/relative_path accordingly and remove the pre-validation create_dir_all
call.
In `@src/openhuman/tools/impl/filesystem/file_write.rs`:
- Around line 72-79: The code currently calls
tokio::fs::create_dir_all(parent).await before validating the resolved parent
path, which can create directories outside the sandbox; change the flow to first
call self.security.validate_parent_path(path).await and use its Ok(p) (e.g.,
resolved_target) as the canonical parent path, then call
tokio::fs::create_dir_all(resolved_target).await (or equivalent) and proceed
with the rest of file_write logic; update references to
parent/workspace_dir.join(path) to use the validated resolved_target so
directory creation is anchored to the resolved, safe path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d908f73d-079b-4f7f-9a8f-fa70e551377a
📒 Files selected for processing (11)
src/openhuman/cron/scheduler.rssrc/openhuman/security/policy.rssrc/openhuman/security/policy_tests.rssrc/openhuman/tools/impl/browser/image_info.rssrc/openhuman/tools/impl/filesystem/apply_patch.rssrc/openhuman/tools/impl/filesystem/csv_export.rssrc/openhuman/tools/impl/filesystem/edit_file.rssrc/openhuman/tools/impl/filesystem/file_read.rssrc/openhuman/tools/impl/filesystem/file_write.rssrc/openhuman/tools/impl/filesystem/grep.rssrc/openhuman/tools/impl/filesystem/list_files.rs
…te before create_dir_all - In validate_path/validate_parent_path, switch from string starts_with to path-component-aware comparison for forbidden_paths. - Resolve relative forbidden entries against the workspace root so entries like "secrets" correctly block workspace/secrets/ even after canonicalization. - Skip absolute forbidden entries that are ancestors of the workspace root (e.g. /tmp when workspace is /tmp/…) — the workspace containment check already guarantees the resolved path is safe. - validate_parent_path now walks up to the deepest existing ancestor before canonicalizing, so it works without requiring the parent directory to exist. - file_write and csv_export now call validate_parent_path BEFORE create_dir_all, then create directories at the validated canonical location. This prevents a symlinked path component from causing directory creation outside the workspace before the security check fires. Fixes 25 failing filesystem tests (false-positive forbidden-path rejections when workspace is under /tmp) and closes the pre-create-dir_all attack surface.
…forbidden-entry symlink regression test - image_info.rs: remove redundant "Path not allowed: " prefix — validate_path already returns a complete user-facing error string. - policy_tests.rs: add validate_path_blocks_symlink_to_relative_forbidden_entry to lock in the be29669 fix where relative forbidden entries (e.g. "secrets") were not resolved against the workspace root and could be bypassed via a symlink pointing into the forbidden directory.
|
Addressed the two actionable items from the self-review in commit
(The |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/openhuman/security/policy_tests.rs (1)
1239-1253: 💤 Low valueConsider using
/etc/hostsfor better cross-platform coverage.
/etc/hostnamedoesn't exist on macOS, so the symlink will dangle. The test still passes becausecanonicalize()fails, but it's not actually exercising the forbidden-path logic on macOS. Using/etc/hosts(which exists on both Linux and macOS) would ensure the test verifies the intended behavior across all Unix platforms.♻️ Suggested change
#[cfg(unix)] #[tokio::test] async fn validate_path_blocks_symlink_to_forbidden_path() { let workspace = tempfile::tempdir().unwrap(); - // /etc/hostname is readable on most Unix systems + // /etc/hosts exists on both Linux and macOS let link = workspace.path().join("link"); - std::os::unix::fs::symlink("/etc/hostname", &link).unwrap(); + std::os::unix::fs::symlink("/etc/hosts", &link).unwrap(); let policy = SecurityPolicy {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/security/policy_tests.rs` around lines 1239 - 1253, The test validate_path_blocks_symlink_to_forbidden_path creates a symlink to /etc/hostname which may not exist on macOS causing a dangling symlink and not exercising the forbidden-path logic; change the target to /etc/hosts (which exists on both Linux and macOS) when creating the symlink in that test so canonicalize() resolves and the policy.validate_path("link").await assertion truly checks forbidden path behavior; update the inline comment accordingly and keep the rest of the test unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/security/policy_tests.rs`:
- Around line 1239-1253: The test validate_path_blocks_symlink_to_forbidden_path
creates a symlink to /etc/hostname which may not exist on macOS causing a
dangling symlink and not exercising the forbidden-path logic; change the target
to /etc/hosts (which exists on both Linux and macOS) when creating the symlink
in that test so canonicalize() resolves and the
policy.validate_path("link").await assertion truly checks forbidden path
behavior; update the inline comment accordingly and keep the rest of the test
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d8590f56-bd74-480f-81d2-836922fe642f
📒 Files selected for processing (2)
src/openhuman/security/policy_tests.rssrc/openhuman/tools/impl/browser/image_info.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src/openhuman/tools/impl/browser/image_info.rs
Lines 888-896 of policy.rs were uncovered — the forbidden_paths loop inside validate_parent_path had no test. Add validate_parent_path_blocks_forbidden_path to assert that writing a new file into a relative-forbidden directory is rejected.
graycyrus
left a comment
There was a problem hiding this comment.
Walkthrough
Solid security fix that unifies the symlink-bypass-prone two-step path validation (is_path_allowed + manual canonicalize + is_resolved_path_allowed) into a single validate_path() / validate_parent_path() API that always canonicalizes before checking. All 8 call sites are migrated, the old function name is retired to force compile-time audit, and 8 new async tests cover the key attack vectors. The approach is sound and the critical bugs CodeRabbit flagged are all resolved.
One major concern: the forbidden-path checking logic is duplicated across two security-critical functions, which creates a maintenance risk where a future fix to one could miss the other.
Change Summary
| File | Change type | Description |
|---|---|---|
security/policy.rs |
Core change | Renamed is_path_allowed → is_path_string_allowed, extracted expand_tilde, added validate_path() and validate_parent_path() |
security/policy_tests.rs |
Tests | Renamed assertions, added 8 new async tests covering symlink attacks |
cron/scheduler.rs |
Migration | 1-line rename to is_path_string_allowed (string-only context, correct) |
browser/image_info.rs |
Migration | Replaced manual check + existence test with validate_path() |
filesystem/apply_patch.rs |
Migration | Early is_path_string_allowed + later validate_path() — correct two-phase approach |
filesystem/csv_export.rs |
Migration | validate_parent_path() before create_dir_all — fixes the pre-validation dir creation bug |
filesystem/edit_file.rs |
Migration | Replaced 3-step check with validate_path() |
filesystem/file_read.rs |
Migration | Replaced 3-step check with validate_path() |
filesystem/file_write.rs |
Migration | validate_parent_path() before create_dir_all — same fix as csv_export |
filesystem/grep.rs |
Migration | Replaced 3-step check with validate_path() |
filesystem/list_files.rs |
Migration | Replaced 3-step check with validate_path() |
…; expand tilde on input
- Extract `check_resolved_against_forbidden` to de-duplicate ~20 lines of
security-critical forbidden-path loop shared between `validate_path` and
`validate_parent_path`. A single fix site eliminates the risk of one
function diverging from the other.
- Replace sync `self.workspace_dir.canonicalize()` with
`tokio::fs::canonicalize(...).await` inside both async functions so the
tokio runtime is never blocked on a slow or networked filesystem.
- Expand tilde on the input path before joining with `workspace_dir`.
Previously `workspace_dir.join("~/foo.txt")` produced a literal `~/`
component; now the path is expanded first so `~/foo.txt` correctly
resolves to `$HOME/foo.txt` and the workspace-escape check fires.
- Add two regression tests covering the tilde-expansion fix.
Addresses review comments from graycyrus on PR tinyhumansai#2111.
graycyrus
left a comment
There was a problem hiding this comment.
Re-review — all prior changes addressed
All three findings from my previous review have been resolved in d2720ec9:
| Finding | Status |
|---|---|
[major] Duplicated forbidden-path loop in validate_path/validate_parent_path |
Fixed — extracted check_resolved_against_forbidden shared helper |
[minor] Sync std::fs::canonicalize in async functions |
Fixed — replaced with tokio::fs::canonicalize |
[minor] Tilde not expanded before workspace_dir.join() |
Fixed — both functions now call expand_tilde on input; two new tests verify the behavior |
The refactored code is clean. The helper centralizes security-critical forbidden-path logic so future changes only need to touch one place. Test coverage is thorough — 10 async tests covering all symlink attack vectors, tilde edge cases, and forbidden-path resolution.
No new issues found. Nice work on this security hardening, @M3gA-Mind.
# Conflicts: # src/openhuman/security/policy.rs
…ansai#2111) Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
…ansai#2111) Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
Summary
is_path_allowed→is_path_string_allowed(string-only, no symlink resolution) and made the split intent explicit via a doc-comment warning.validate_path()— async, canonicalizes viatokio::fs::canonicalize, checks workspace containment, and re-checks forbidden paths against the resolved path.validate_parent_path()— same as above but canonicalizes the parent directory for write operations where the target file may not yet exist.image_info) to the unified API, removing the manual two-step pairing that callers could and did forget.image_info.rswas callingis_path_allowedwith no resolved check — an agent could read any file via a symlink inside the workspace.Problem
is_path_allowed()validated path strings without resolving symlinks. A symlink created inside the allowed workspace pointing to/etc/shadow(or any forbidden file) passed all checks. A separateis_resolved_path_allowed()existed but had to be called manually — two callers (image_info.rs,cron/scheduler.rs) omitted it entirely, leaving active exfiltration paths open.Solution
validate_path()is the single entry point for any path used in file I/O. It fast-rejects at the string level, then callscanonicalizeto resolve all symlinks, then checks workspace containment, then re-checks forbidden paths against the resolved canonical path. ReturningOk(PathBuf)on success gives callers the canonical path they need for I/O with no extra work.validate_parent_path()canonicalizes the parent directory for callers that write new files (the target may not exist yet).is_path_string_allowed()(the renamed string-only check) is keptpubforcron/scheduler.rs::detect_forbidden_path_arguments, which scans shell command tokens without doing file I/O — the string-only check is intentionally correct there.is_path_allowedname produced compile errors at every call site, ensuring each was reviewed and migrated.validate_pathisasyncbecausetokio::fs::canonicalizeis async. All existing callers already run in async contexts, so this is zero friction.Submission Checklist
pnpm test:coverage+cargo test -p openhumanrun locally; 6 new async tests cover all symlink attack vectors on changed lines## Related— N/A: no new feature IDsCloses #NNNin## RelatedImpact
canonicalizesyscall per file operation — negligible for agent tool use cadence.Related
O_NOFOLLOW//proc/self/fdmitigations for TOCTOU (noted in implementation as out of scope for this issue)AI Authored PR Metadata
Linear Issue
Commit & Branch
fix/symlink-bypass-path-validation0f076cf9da1024d79de7236ef5365879da062e09Validation Run
pnpm --filter openhuman-app format:check— cleanpnpm --filter openhuman-app compile— no errorscargo test -p openhuman -- policy— 6 new tests pass, all existing passcargo fmt --check+cargo clippy -p openhuman— cleanValidation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
Parity Contract
..traversal, null bytes, URL-encoded traversal, forbidden prefix matching) unchanged inis_path_string_allowedsymlink_metadataguards inedit_file.rsandapply_patch.rspreserved — they enforce a stricter "no symlinks even within workspace" policy for those toolsDuplicate / Superseded PR Handling
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests