
fix(cache-memory): reject symlinks in agent memory to prevent Stage 3 credential theft #524

Merged
jamesadevine merged 4 commits into main from copilot/fix-symlink-following-issue
May 13, 2026

Contributor

Copilot AI commented May 13, 2026

Stage 3's process_agent_memory followed symlinks unconditionally, allowing a Stage 1 agent to plant a symlink (e.g. env.txt -> /proc/self/environ) in the memory staging directory that survives the artifact round-trip (ADO pipeline artifacts are tar-based and preserve symlinks). On Stage 3, the symlink would be read and copied into the output artifact, exfiltrating SC_WRITE_TOKEN and other write-capable credentials into the agent's next-run memory — breaking the core security invariant of the three-stage model.

A second vector was also closed: if the agent_memory/ entry in the artifact is itself a symlink to an outside directory, the old is_dir() check would follow it (returning true), and canonicalize() would then resolve the base to the symlink target. Every collected file would therefore pass the starts_with containment guard, and sensitive files would be copied to the output.

Summary

collect_files — stop following symlinks during enumeration

  • Replace path.is_dir() (follows symlinks) with entry.file_type().await? (uses raw readdir result, never dereferences). Both file symlinks and directory symlinks are skipped with a warn! and never added to the collected file list.
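As a rough illustration of the enumeration change, here is a minimal synchronous sketch using std::fs rather than the tokio calls in the PR. The function name collect_files_no_follow and its signature are hypothetical and are not taken from execute.rs; the sketch only relies on the documented behavior that DirEntry::file_type does not traverse symlinks.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect regular files under `dir`, skipping every symlink.
/// Hypothetical helper for illustration; not the PR's collect_files.
fn collect_files_no_follow(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // DirEntry::file_type reports the entry's own type and does not
        // traverse symlinks, unlike Path::is_dir or Path::is_file.
        let file_type = entry.file_type()?;
        let path = entry.path();
        if file_type.is_symlink() {
            // File and directory symlinks are both dropped here.
            eprintln!("skipping symlink: {}", path.display());
        } else if file_type.is_dir() {
            // Only reached for real (non-symlink) directories.
            collect_files_no_follow(&path, out)?;
        } else if file_type.is_file() {
            out.push(path);
        }
    }
    Ok(())
}
```

Because the symlink branch is tested first, a directory symlink that points back at an ancestor cannot cause unbounded recursion in this sketch.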

process_agent_memory — reject symlink at base directory level

  • Replace exists()/is_dir() with tokio::fs::symlink_metadata() (lstat) on memory_source. A symlink at the agent_memory/ level is now detected and rejected before any further processing, closing the bypass where canonicalize() would resolve to the symlink target and make all collected files pass the containment check.
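The base-directory check can be sketched with std::fs::symlink_metadata, the synchronous counterpart of the tokio call named above. The BaseCheck enum and check_memory_base function are hypothetical names introduced for this example. Treating every lstat error as "missing" is an assumption that mirrors the reviewer's suggested snippet later in this thread, not a confirmed detail of the merged code.

```rust
use std::fs;
use std::path::Path;

/// Outcome of inspecting the memory base without following symlinks.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum BaseCheck {
    Missing,
    Symlink,
    NotADirectory,
    RealDirectory,
}

/// Classify `path` using lstat semantics (symlink_metadata).
fn check_memory_base(path: &Path) -> BaseCheck {
    match fs::symlink_metadata(path) {
        // Assumption: any error is treated the same as "not present".
        Err(_) => BaseCheck::Missing,
        // The entry itself is a symlink, whatever it points at.
        Ok(m) if m.file_type().is_symlink() => BaseCheck::Symlink,
        Ok(m) if !m.is_dir() => BaseCheck::NotADirectory,
        // A real directory: safe to canonicalize and enumerate.
        Ok(_) => BaseCheck::RealDirectory,
    }
}
```

A caller would proceed only on BaseCheck::RealDirectory and log a warning on BaseCheck::Symlink, which matches the "reject before any further processing" behavior described in the bullet.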

process_agent_memory — TOCTOU canonicalization guard

  • Canonicalize the base directory once before the loop; for each source file, canonicalize its path and assert it starts_with the canonical base. Catches any symlink that races past the collection phase before any read or copy occurs.
// Before the loop
let canonical_base = tokio::fs::canonicalize(&memory_source).await?;

// Per-file guard
let canonical_source = tokio::fs::canonicalize(&source_file).await?;
ensure!(canonical_source.starts_with(&canonical_base), "...");
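For readers who want to try the containment guard in isolation, the following is a self-contained sketch using std::fs::canonicalize instead of tokio and a plain io::Error instead of ensure!. The helper name resolve_within and the choice of ErrorKind::PermissionDenied are assumptions made for this example.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `candidate` and return it only if it lies under `canonical_base`.
/// `canonical_base` is expected to be canonicalized already (once, before the loop).
/// Hypothetical helper for illustration only.
fn resolve_within(canonical_base: &Path, candidate: &Path) -> io::Result<PathBuf> {
    // canonicalize follows every symlink in the path and removes `..` segments.
    let resolved = fs::canonicalize(candidate)?;
    // Path::starts_with compares whole components, so a sibling such as
    // `/a/base2` does not count as being inside `/a/base`.
    if resolved.starts_with(canonical_base) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!(
                "{} resolves outside {}",
                candidate.display(),
                canonical_base.display()
            ),
        ))
    }
}
```

A check of this shape narrows the race window but probably does not remove it on its own, since the path could still change between the check and the later read. That is one reason it makes sense as a second layer behind the symlink-skipping enumeration rather than as the only defense.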

Tests (Unix-only, 6 new #[cfg(unix)])

  • test_collect_files_skips_file_symlinks — absolute-target file symlink not collected
  • test_collect_files_skips_relative_symlinks — relative-target symlink (../secret.txt) not collected
  • test_collect_files_skips_directory_symlinks — directory symlink not recursed
  • test_process_memory_skips_file_symlinks — sensitive data from symlink target never copied end-to-end
  • test_process_memory_skips_directory_symlinks — directory symlink contents never copied end-to-end
  • test_process_memory_rejects_base_directory_symlink — top-level agent_memory -> /sensitive/dir symlink rejected, nothing copied

Test plan

cargo test --bin ado-aw cache_memory — all 31 tests pass (6 new).

Copilot AI and others added 2 commits May 13, 2026 09:21
… credential theft

Three-part defense against the symlink-following attack described in the
security audit:

1. collect_files: use entry.file_type() (does NOT follow symlinks) instead
   of path.is_dir() when deciding whether to recurse.  Both file symlinks
   and directory symlinks are now skipped with a warning, so they are
   never added to the file list at all.

2. process_agent_memory: canonicalize the memory source base directory
   once before the per-file loop, then canonicalize each source file path
   and verify it starts_with the canonical base.  This TOCTOU guard catches
   any symlink that slips through the collection phase (e.g., via a race
   condition) before any read or copy takes place.

3. Tests (Unix-only): added four new #[cfg(unix)] async tests that create
   real symlinks on disk and confirm:
   - collect_files skips file symlinks
   - collect_files skips directory symlinks (no recursion)
   - process_agent_memory skips file symlinks pointing outside the base
   - process_agent_memory skips directory symlinks pointing outside the base

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/efdc2afe-1050-4068-8433-cd9a60762981

Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
- Add clarifying comment on else-if branch in collect_files explaining it
  is only reached for real (non-symlink) directories
- Improve canonicalize error message to explain the security purpose
- Add test_collect_files_skips_relative_symlinks covering the relative
  symlink attack vector (e.g. ../secret.txt)
- Fix assertion style: 'must NOT' -> 'must not'

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/efdc2afe-1050-4068-8433-cd9a60762981

Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix symlink following vulnerability in cache memory module fix(cache-memory): reject symlinks in agent memory to prevent Stage 3 credential theft May 13, 2026
Copilot AI requested a review from jamesadevine May 13, 2026 09:27
@jamesadevine
Collaborator

/rust-review

@github-actions
Contributor

github-actions Bot commented May 13, 2026

Rust PR Reviewer completed successfully!

@github-actions
Contributor

🔍 Rust PR Review

Summary: Good fix for a real security vulnerability — the in-directory symlink defense is sound — but there is one remaining gap that could allow a bypass of the entire containment check.


Findings

🐛 Bugs / Logic Issues

execute.rs:164: is_dir() follows symlinks on the base directory itself, defeating the containment check

if !memory_source.exists() || !memory_source.is_dir() {

Both exists() and is_dir() follow symlinks. If a Stage 1 agent plants a symlink at the agent_memory/ directory level (i.e., the entire agent_memory entry in the artifact is a symlink to an outside directory), this check passes, and the whole containment guard collapses:

  1. is_dir() returns true (follows the symlink to a real dir) → processing continues
  2. canonical_base = canonicalize(&memory_source) resolves to the symlink target, not the expected artifact path
  3. collect_files enumerates files from the symlink target
  4. For each collected file, canonical_source.starts_with(&canonical_base) is true (all files are within the target), so nothing is blocked
  5. Files from the outside directory are copied into the output artifact

The PR description notes that ADO artifacts are tar-based and preserve symlinks — so a top-level agent_memory -> /proc/self/fd/ (or any sensitive directory) symlink in the artifact is a viable attack vector.
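The bypass in the steps above can be reproduced with a small probe, sketched here with std::fs under the assumption that a synchronous analogue behaves like the tokio calls for this purpose. The function probe_base is hypothetical. It reports three observations for a base path: whether is_dir() accepts it, whether lstat sees a symlink, and whether a file reached through it passes a starts_with check against the canonicalized base.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns (is_dir_following_symlinks, lstat_says_symlink, passes_containment).
/// Hypothetical probe for illustration only.
fn probe_base(base: &Path, file_name: &str) -> io::Result<(bool, bool, bool)> {
    // Step 1: is_dir() follows symlinks, so a link to a real directory is accepted.
    let follows_is_dir = base.is_dir();
    // The lstat-style check inspects the entry itself instead.
    let is_symlink = fs::symlink_metadata(base)?.file_type().is_symlink();
    // Step 2: canonicalizing the base yields the symlink target.
    let canonical_base = fs::canonicalize(base)?;
    // Step 4: a file reached through the link resolves inside that target,
    // so the starts_with guard cannot tell it apart from a legitimate file.
    let canonical_file = fs::canonicalize(base.join(file_name))?;
    Ok((
        follows_is_dir,
        is_symlink,
        canonical_file.starts_with(&canonical_base),
    ))
}
```

For a base that is a symlink to an outside directory containing the named file, all three values come back true. The containment check alone is therefore not enough, and the symlink has to be rejected at the base level.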

Suggested fix — check symlink status before the is_dir() guard:

// Use lstat-equivalent to avoid following symlinks for the base directory.
let meta = tokio::fs::symlink_metadata(&memory_source).await;
match meta {
    Err(_) => {
        info!("No agent_memory directory found, skipping memory processing");
        return Ok(ExecutionResult::success("No agent memory to process"));
    }
    Ok(m) if m.is_symlink() => {
        warn!("agent_memory is a symlink — skipping to prevent directory escape");
        return Ok(ExecutionResult::success("No agent memory to process"));
    }
    Ok(m) if !m.is_dir() => {
        info!("No agent_memory directory found, skipping memory processing");
        return Ok(ExecutionResult::success("No agent memory to process"));
    }
    Ok(_) => {} // real directory, proceed
}

✅ What Looks Good

  • collect_files fix is correct: using entry.file_type().await? (lstat-based, no symlink following) instead of path.is_dir() is exactly the right approach.
  • TOCTOU guard is sound: computing canonical_base once and checking each file's resolved path against it correctly handles the race between collection and copy.
  • Error handling is idiomatic: .with_context() on the canonicalize call, warn! logging for all skips, and proper continue flow.
  • Test coverage is thorough: the five #[cfg(unix)] tests cover file symlinks, relative symlinks, directory symlinks, and end-to-end copy verification. The negative assertions (checking that symlink targets don't appear in output) are particularly good.

Generated by Rust PR Reviewer for issue #524

@jamesadevine
Collaborator

@copilot address feedback

The existing fix guarded against symlinks *inside* agent_memory, but
if the agent_memory entry itself is a directory symlink the old
`is_dir()` check follows it (returning true), and canonicalize()
resolves relative to the symlink target — so every collected file
passes the starts_with guard and sensitive files are copied.

Fix: replace `exists()/is_dir()` with `symlink_metadata()` (lstat) on
memory_source and explicitly reject the symlink case before any further
processing.

Also adds test_process_memory_rejects_base_directory_symlink which
plants a `agent_memory -> /sensitive/dir` symlink and asserts nothing
is copied to the output.

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/971a1892-4711-4142-98ad-d85574576689

Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
Contributor Author

Copilot AI commented May 13, 2026

@copilot address feedback

Fixed in 91fae19. The Rust reviewer correctly identified that is_dir() follows symlinks, so a top-level agent_memory -> /sensitive/dir symlink in the artifact would pass the old guard, causing canonicalize() to resolve relative to the symlink target — making all collected files pass the starts_with check.

The fix replaces exists()/is_dir() with symlink_metadata() (lstat) so the agent_memory entry itself is checked without following symlinks, and a symlink at that level is rejected immediately. A new test (test_process_memory_rejects_base_directory_symlink) verifies this end-to-end.

@jamesadevine jamesadevine marked this pull request as ready for review May 13, 2026 13:07
@jamesadevine jamesadevine merged commit f311d36 into main May 13, 2026


Development

Successfully merging this pull request may close these issues.

🔴 Red Team Audit — High: Symlink following in cache memory allows Stage 3 credential theft via /proc/self/environ

2 participants