
security: close prompt-alert pre-poisoning path (pentest W1) #173

Merged
fazxes merged 1 commit into main from fix/pentest-prompt-alert-pre-poison-w1
Apr 6, 2026

fazxes commented Apr 6, 2026

Summary

  • rm -f "$PROMPT_ALERT" added immediately before check_prompt_integrity in the pentest preflight block in daemon.sh
  • Closes W1: pentest agent could pre-write docs/sessions/prompt-alert.md (gitignored, survives git clean -fd) to inject arbitrary content into the builder prompt on the next cycle
  • 2 regression tests in TestPentestAgentPromptAlertPoisoningGuard
  • W2 (webhook SSRF via non-merge PR filter) tracked as low-priority task #186; no code fix needed now
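The "gitignored, survives git clean -fd" point can be reproduced in a throwaway repository. This is a minimal sketch, not project code: the temp repo and scratch.txt are made up for illustration, and only the docs/sessions/prompt-alert.md path comes from this PR. It assumes git is installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository (hypothetical; not the project's repo).
repo="$(mktemp -d)"
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# Ignore docs/sessions/ as the PR describes, and stage .gitignore so that
# git clean does not treat it as an untracked file.
printf 'docs/sessions/\n' > .gitignore
git add .gitignore

# One ignored file and one ordinary untracked file.
mkdir -p docs/sessions
printf 'injected instructions\n' > docs/sessions/prompt-alert.md
printf 'scratch\n' > scratch.txt

# Without -x, git clean leaves ignored paths alone.
git clean -fd -q

if [ -f docs/sessions/prompt-alert.md ]; then echo "ignored file survived"; fi
if [ ! -e scratch.txt ]; then echo "untracked file removed"; fi
```

Removing ignored files as well would need git clean -fdx. The fix in this PR instead deletes the one file explicitly.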

Test plan

  • make check passes (1104 tests)
  • test_rm_prompt_alert_present_before_check_prompt_integrity: verifies rm -f precedes check_prompt_integrity
  • test_rm_prompt_alert_is_adjacent_to_check_prompt_integrity: verifies they stay within 10 lines
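The two regression tests are not shown in this conversation. As a rough shell sketch of the same two properties (ordering and the 10-line adjacency window), something like the following could be run against a script. The function name check_guard_order and the sample files are hypothetical, and matching the first call by a line-start pattern is an assumption about how daemon.sh is laid out.

```shell
#!/usr/bin/env bash
set -euo pipefail

# check_guard_order FILE [MAX_GAP]
# Succeeds when the first `rm -f "$PROMPT_ALERT"` line precedes the first
# check_prompt_integrity call by at most MAX_GAP lines (default 10).
# Lines containing "()" are skipped so a function definition is not
# mistaken for a call.
check_guard_order() {
  local file="$1" max_gap="${2:-10}"
  awk -v max_gap="$max_gap" '
    !rm_line && index($0, "rm -f \"$PROMPT_ALERT\"") { rm_line = NR }
    !check_line && $0 ~ /^[ \t]*check_prompt_integrity/ && $0 !~ /\(\)/ { check_line = NR }
    END {
      ok = rm_line && check_line && rm_line < check_line && (check_line - rm_line) <= max_gap
      exit (ok ? 0 : 1)
    }
  ' "$file"
}

# Two tiny sample scripts: one with the guard in place, one with it reversed.
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
printf '%s\n' 'rm -f "$PROMPT_ALERT"' 'check_prompt_integrity' > "$tmp/good.sh"
printf '%s\n' 'check_prompt_integrity' 'rm -f "$PROMPT_ALERT"' > "$tmp/bad.sh"

check_guard_order "$tmp/good.sh" && echo "good.sh: guard in place"
check_guard_order "$tmp/bad.sh" || echo "bad.sh: guard missing or misplaced"
```

The adjacency window is there so that later edits cannot slip code that writes the alert file between the rm -f and the integrity check without a test failing.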

daemon.sh now runs rm -f "$PROMPT_ALERT" immediately before
check_prompt_integrity in the pentest preflight block.

docs/sessions/ is gitignored so reset_repo_state (git clean -fd) does NOT
remove prompt-alert.md. A pentest agent that wrote to the file during its
120-turn window would have had that content injected into the builder prompt
on the next cycle as a legitimate alert. check_prompt_integrity is now the
only writer: any pentest-agent-written content is cleared first.
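The ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the code from daemon.sh: the check_prompt_integrity stub, its "tampered" argument, the pentest_preflight wrapper, and the temp directory are all hypothetical stand-ins. Only rm -f "$PROMPT_ALERT" followed by check_prompt_integrity reflects the PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical location; the real PROMPT_ALERT is defined by the project.
WORKDIR="$(mktemp -d)"
trap 'rm -rf "$WORKDIR"' EXIT
PROMPT_ALERT="$WORKDIR/docs/sessions/prompt-alert.md"
mkdir -p "$(dirname "$PROMPT_ALERT")"

# Stub for illustration only: writes an alert when told that tampering
# was detected. The real function's logic is not shown in this PR.
check_prompt_integrity() {
  local tampered="${1:-no}"
  if [ "$tampered" = "yes" ]; then
    printf 'prompt guard file changed\n' > "$PROMPT_ALERT"
  fi
}

# Hypothetical wrapper around the two lines the PR talks about.
pentest_preflight() {
  # Drop any leftover alert first, so whatever exists afterwards was
  # written by check_prompt_integrity and nothing else.
  rm -f "$PROMPT_ALERT"
  check_prompt_integrity "$@"
}

# Simulate an agent pre-writing the alert file during an earlier session.
printf 'injected instructions\n' > "$PROMPT_ALERT"

pentest_preflight no
if [ -e "$PROMPT_ALERT" ]; then
  echo "alert present"
else
  echo "alert cleared"
fi
```

rm -f is a deliberate fit here: it succeeds whether or not the file exists, so the preflight does not need a separate existence check.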

2 regression tests. W2 (webhook SSRF via PR non-merge filter) tracked as
low-priority task #186 -- no fix urgency per pentest report.
fazxes merged commit 312c928 into main Apr 6, 2026
7 checks passed
fazxes deleted the fix/pentest-prompt-alert-pre-poison-w1 branch April 6, 2026 23:48
fazxes added a commit that referenced this pull request Apr 9, 2026
…one)

Queue before: 72 pending + 9 wontfix-in-active-dir
Queue after: 65 pending + 0 wontfix (all converted to done for archiving)

Merged into primary tasks (5 closures):
- #175 -> #174: both add tests to TestAuthFailureDetection, same PR
- #163 -> #162: both are scoring module tests from PR #158 review, same PR
- #124 -> #122: both validate doc snapshot consistency, same PR scope
- #196 -> #173: both add entries to PROMPT_GUARD_FILES in lib-agent.sh
- #180 -> #179: both touch _is_valid_eval_file() in pick-role.py, same PR

Closed as obsolete (1):
- #78: references non-existent "evolve.md Step 8" and the multi-agent
  review panel replaced by unified review in PR #107

Closed as low-value (1):
- #230: _DELEGATION_ROLE_MAP covers all 8 current agent types; new agent
  types require major framework work making the map update obvious

Converted wontfix -> done for archiving (9):
- #77, #80, #107, #111, #115, #119, #127, #129, #134
  All had wontfix status with rationale already documented; changed to
  done so daemon's archive_done_tasks() housekeeping removes them