feat: harden prompt guard with symlink rejection and new-file detection#40
Task #37: Two security improvements for running against untrusted repos:

1. read_repo_instructions() now rejects symlinks before reading. It checks is_symlink() before is_file() to catch both valid and dangling symlinks. This prevents path-traversal attacks where CLAUDE.md -> /etc/passwd.
2. The prompt self-modification guard now detects new files in docs/prompt/ created during a cycle. It snapshots directory listings before the cycle and compares them after. This catches agents creating prompt files that bypass the fixed PROMPT_GUARD_FILES list.

+9 tests (5 Python symlink, 4 bash guard), 616 total passing.
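The ordering of the two checks in item 1 can be illustrated with a minimal Python sketch. It is not the code from this PR: the function signature, the `INSTRUCTION_FILES` tuple, and the `None` return value are assumptions made for illustration; only the `is_symlink()`-before-`is_file()` order comes from the description above.

```python
from pathlib import Path
from typing import Optional

# Hypothetical list of instruction file names; the PR only mentions CLAUDE.md.
INSTRUCTION_FILES = ("CLAUDE.md",)


def read_repo_instructions(repo_root: Path) -> Optional[str]:
    """Return the text of the first regular instruction file, skipping symlinks."""
    for name in INSTRUCTION_FILES:
        candidate = repo_root / name
        # is_symlink() inspects the link itself, so it is True for both valid
        # and dangling links. is_file() follows links and returns False for a
        # dangling one, which is why the symlink check has to come first.
        if candidate.is_symlink():
            continue
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    return None
```

With this ordering, a repo that ships CLAUDE.md as a link to /etc/passwd yields no instructions instead of the target's contents. The sketch only inspects the final path component; a symlinked parent directory would need a separate check.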
Code review fixes:

1. Add LC_ALL=C to sort calls feeding comm, ensuring consistent collation across macOS/Linux locales.
2. Replace echo with printf in comm input to avoid a spurious blank-line entry when the directory listing is empty.
3. Add test for empty directory with no changes (false-positive prevention).

617 tests passing.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b8dd2768f1
    current_listing=$(ls -1 "$current_dir" 2>/dev/null | sort)
    local new_files
    new_files=$(comm -13 "$listing" <(echo "$current_listing"))
Detect nested prompt files when checking for new files
The new-file guard compares only top-level entries of each watched directory (ls -1), so adding a file inside an existing subdirectory is not detected. In this repo, docs/prompt/feedback/ already exists; creating docs/prompt/feedback/evil.md after snapshot leaves the top-level listing unchanged, comm reports no additions, and check_prompt_integrity returns success. This creates a bypass for the self-modification guard in exactly the scenario this change is meant to block.
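The gap described above comes from comparing only one directory level. As a hedged illustration (written in Python, not the bash used by the actual guard), a recursive snapshot closes it; `snapshot_tree` and `new_entries` are hypothetical names, not functions from this repository.

```python
from pathlib import Path
from typing import List, Set


def snapshot_tree(directory: Path) -> Set[str]:
    """Return the relative path of every entry under directory, at any depth."""
    if not directory.is_dir():
        return set()
    # rglob("*") walks subdirectories, unlike a single-level listing.
    return {p.relative_to(directory).as_posix() for p in directory.rglob("*")}


def new_entries(before: Set[str], directory: Path) -> List[str]:
    """List entries that exist now but were absent from the earlier snapshot."""
    return sorted(snapshot_tree(directory) - before)
```

Taking the snapshot before a cycle and calling `new_entries` afterwards would report `feedback/evil.md` in the scenario above, while an unchanged or empty directory yields an empty list. In the bash guard, the analogous change would be replacing the single-level listing with a recursive one.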
5-agent audit identified task selection as mesa-optimization: agent optimizes session success over project progress. 5 more agents audited the fix and found 7 issues, all resolved.

Phase 1: Prompt fixes
- Remove "smaller in scope" incentive from evolve-auto.md
- Queue order authoritative, handoff advisory
- "Tasks I Did NOT Pick and Why" in every handoff
- Tracker delta required in session reports
- Staleness multiplier (5+ sessions = 2x priority)
- All-integration-tasks edge case handled

Phase 2: Task queue
- environment: internal | integration tags
- blocked_reason subtypes: environment, dependency, design
- .next-id for atomic task ID allocation
- archive/ for done tasks (daemon auto-archives)
- Tagged #12, #28, #29 blocked-environment
- Fixed #43 broken frontmatter (duplicate of #40)

Phase 3: Overseer avoidance detection
- 6 new checks: stale tasks, cherry-picking, stuck integration, weak blocks, max attempts, skip accountability

Learnings index:
- INDEX.md with categorized one-line summaries (31/31 matched)
- Agent reads index, opens files only when relevant
- Updated CLAUDE.md, evolve.md Step 1 and Step 6l

Meta layer vision (8 new tasks #46-#53):
- #46 Healer: between-session trend observer
- #47 Multi-agent PR review panel
- #48 Human escalation (gh issue create + webhook)
- #49 Self-evaluation loop against real repos
- #50 Prompt self-refinement via strategist
- #51 Cross-session cost intelligence
- #52 Codebase world model (MODULE_MAP.md)
- #53 Agent generates its own tasks across all dimensions

Also: OPERATIONS.md stale task queue + --squash example fixed
Summary
- read_repo_instructions() now checks is_symlink() before reading instruction files. Prevents path-traversal attacks where a malicious repo symlinks CLAUDE.md to sensitive files like /etc/passwd.
- The prompt self-modification guard now snapshots docs/prompt/ directory listings and detects new files created during a cycle. Catches agents creating prompt files that bypass the fixed PROMPT_GUARD_FILES list.

Test plan
make check: 616 passing, all checks green