security: close two pentest findings -- origin guard blind spot + code-block bypass#181

Merged
fazxes merged 1 commit into main from fix/pentest-0194-origin-guard-and-0188-codeblock
Apr 7, 2026

fazxes commented Apr 7, 2026

Summary

Test plan

  • make check — 1136 passed, 0 failed

…e-block bypass

Finding 1 (new, 2026-04-07): check_origin_integrity did not detect modifications
to EXISTING eval/autonomy files pushed directly to origin/main.  Only new files
and named PROMPT_GUARD_FILES were covered.  An agent could falsify an existing
eval report to defeat the EVAL SCORE GATE.

Fix: add PROMPT_GUARD_CONTENT_DIRS (docs/evaluations, docs/autonomy).
save_prompt_snapshots now saves per-file content snapshots of all existing
[0-9]*.md files in those dirs.  check_origin_integrity now has a Loop 3 that
compares each snapshot against current origin/main, applying the same
non-merge-commit filter used in Loop 1 to avoid false positives.
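
The snapshot-and-compare idea behind Loop 3 can be sketched roughly as follows. This is an illustration only, not the project's implementation: the function names, the use of SHA-256 digests, and the in-memory `tree` mapping (path to file bytes) are all assumptions, and the non-merge-commit filter is represented simply by a set of paths that the caller supplies.

```python
import fnmatch
import hashlib
import posixpath

# Directories named in the fix; everything else here is hypothetical.
GUARDED_DIRS = ("docs/evaluations", "docs/autonomy")


def is_guarded(path: str) -> bool:
    """True for files matching [0-9]*.md directly inside a guarded dir."""
    directory, name = posixpath.split(path)
    return directory in GUARDED_DIRS and fnmatch.fnmatchcase(name, "[0-9]*.md")


def take_snapshot(tree: dict[str, bytes]) -> dict[str, str]:
    """Record one content digest per existing guarded file."""
    return {
        path: hashlib.sha256(content).hexdigest()
        for path, content in tree.items()
        if is_guarded(path)
    }


def find_tampered(
    snapshot: dict[str, str],
    tree: dict[str, bytes],
    non_merge_paths: set[str],
) -> list[str]:
    """Return snapshotted files whose content changed or disappeared.

    Only paths touched by non-merge commits are considered, which stands
    in for the filter the commit message says is shared with Loop 1.
    """
    tampered = []
    for path, digest in sorted(snapshot.items()):
        if path not in non_merge_paths:
            continue  # change arrived via a merge commit; not flagged
        content = tree.get(path)
        if content is None or hashlib.sha256(content).hexdigest() != digest:
            tampered.append(path)
    return tampered
```

In this sketch a falsified existing report is flagged because its digest no longer matches the snapshot, while files outside the guarded directories are never snapshotted at all.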

Finding 2 (#188, elevated): _is_valid_eval_file and _is_valid_autonomy_file
accepted **Date**: and score fields embedded inside fenced code blocks.

Fix: add _strip_fenced_code_blocks() helper.  Both validators now check for
**Date**: in the prose-only (stripped) text.  TOTAL: in _is_valid_autonomy_file
is still checked on the full text because the canonical report format embeds
the score table in a code block.
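
A minimal sketch of what a fence-stripping helper and the split between prose-only and full-text checks might look like. The real `_strip_fenced_code_blocks()` is not shown in this PR text, so the line-based approach, the handling of `~~~` fences, and the two `has_*` helper names below are assumptions made for illustration.

```python
def strip_fenced_code_blocks(text: str) -> str:
    """Return text with fenced code blocks (``` or ~~~) removed.

    An unclosed fence swallows the rest of the document, so content
    hidden after a dangling fence is not treated as prose either.
    """
    kept = []
    fence = None  # marker of the currently open fence, if any
    for line in text.splitlines():
        stripped = line.lstrip()
        if fence is None:
            if stripped.startswith(("```", "~~~")):
                fence = stripped[:3]
                continue
            kept.append(line)
        elif stripped.startswith(fence):
            fence = None  # closing fence; drop it and resume keeping lines
    return "\n".join(kept)


def has_prose_date(text: str) -> bool:
    """The Date field must appear outside any fenced block."""
    return "**Date**:" in strip_fenced_code_blocks(text)


def has_total(text: str) -> bool:
    """TOTAL: is checked on the full text, since the canonical report
    format keeps the score table inside a code block."""
    return "TOTAL:" in text
```

With these helpers, a report whose only `**Date**:` line sits inside a fence fails the date check, while a `TOTAL:` line inside the score-table fence still passes.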

Tests: +2 rejection tests (TestIsValidEvalFile, TestIsValidAutonomyFile).
1136 passed.  make check clean.
Task #188 closed (done).  Task #194 created + closed (done, fixed this session).
fazxes merged commit 5a9394d into main Apr 7, 2026
3 checks passed
fazxes deleted the fix/pentest-0194-origin-guard-and-0188-codeblock branch April 7, 2026 12:52
fazxes added a commit that referenced this pull request Apr 7, 2026
Closed 20 tasks with evidence:
- DONE (2): #73 (AGENTS.md created), #181 (docs/prompt/ deleted)
- WONTFIX-OBSOLETE (5): #78, #89, #128, #141, #157
  (reference docs/prompt/ or docs/ops/ paths deleted in session #103)
- WONTFIX-DUPLICATE (1): #88 (subset of #69)
- WONTFIX-NEVER-PICKED (12): #66, #69, #90, #96, #112,
  #114, #120, #123, #132, #133, #138, #145
  (low priority, 20-80+ sessions without being picked, speculative)

Priority fix: #103 downgraded from urgent to normal (umbrella epic,
not an actionable urgent fix).
fazxes added a commit that referenced this pull request Apr 7, 2026
…st tasks

Queue cleanup after session #103 major restructuring:

Closed (8):
- #73: AGENTS.md exists (commit 38e1fe5)
- #88: duplicate of #69 (auto-changelog)
- #141: obsolete (docs/prompt/evolve.md deleted)
- #157: obsolete (docs/prompt/feedback/ deleted)
- #159: consolidated into #190
- #161: consolidated into #190
- #181: obsolete (docs/prompt/unified.md deleted)
- #184: done (fixed by PR #179)

Path updates (30+ tasks):
- docs/ -> .recursive/ or Recursive/ops/
- scripts/daemon.sh -> Recursive/engine/daemon.sh
- scripts/lib-agent.sh -> Recursive/engine/lib-agent.sh
- .nightshift.json -> .recursive.json
- nightshift/*.py -> nightshift/{core,owl,raven,infra}/*.py

Pentest tasks created (4):
- #194: budget limiter triple-failure (CONFIRMED)
- #195: python3 -c path interpolation (CONFIRMED)
- #196: .recursive.json prompt guard (THEORETICAL)
- #197: costs.json negative value validation (THEORETICAL)