fix(change): enforce Step 9 scope guard against Step 5/6 allowlist (#1123) by Serhan-Asad · Pull Request #1133 · promptdriven/pdd

Serhan-Asad · 2026-05-21T22:41:00Z

Summary

Adds an orchestrator-enforced scope guard to pdd change Step 9 that reverts any worktree edits outside an allowlist derived from Step 5 (docs to update / create / associated documents) and Step 6 (dev-unit prompt paths, dependencies, integration points, frontend prompts, direct-edit candidates). Code, example, and test files from Step 6 tables are intentionally not allowed — Step 9's contract is prompts + docs + scoped direct edits.
Uses a content-aware pre-snapshot of the worktree captured right after Step 8.5 preflight drift-heal. Step 8.5's mutations (.pdd/meta/<basename>_<lang>.json, _run.json, architecture.json) are recognized via (porcelain_status, sha256(content)) and excluded from the Step 9 delta. Step 9 modifying a Step 8.5 file on top, or reverting Step 8.5's mod back to HEAD, is detected and flagged.
On scope violation, the workflow stops with a clear SCOPE_VIOLATION: block (file list + reason) posted to the issue and saved into state. Preflight gates are cleared so resume re-runs Step 8.5 before re-attempting Step 9.

Closes #1123.

Why the existing helper wasn't enough

The existing _revert_out_of_scope_changes in agentic_common.py runs with -uno and never sees untracked files — exactly the shape of the drift in pdd_cloud#1602 (new examples/, tests/, unrelated prompts). This PR uses revert_out_of_scope_changes_with_dirs from agentic_common_worktree.py (which runs with -u) and extends it with two new optional parameters:

pre_snapshot: Mapping[str, Tuple[str, Optional[str]]] — skip entries that match the snapshot's (status, content_hash); existing two-arg callers (agentic_e2e_fix_orchestrator) are unchanged.
strict: bool = False — Step 9 caller passes strict=True so internal git/OSError failures propagate and stop the workflow rather than fail open. Other callers keep the silent-fail default.

Review loop

This change went through four iterations with codex as the reviewer:

Round 1 — initial TDD scaffolding + implementation (24 tests).
Round 2 — snapshot-based delta enforcement, .rst/.txt in fallback detection, fail-closed orchestrator handler. Addressed codex's first three blockers (under-allowlisted preflight metadata, fallback missing doc extensions, silent fail-open).
Round 3 — content-aware snapshot via sha256, deletion detection, strict mode. Addressed codex's next three blockers (status-only delta was content-blind; deletions were invisible; helper still failed open internally).
Round 4 — extended deletion pass to detect "dirty → clean" (Step 9 reverted Step 8.5's mod to HEAD); clear preflight_drift_healed on violation so resume re-runs Step 8.5. Addressed codex's final two blockers.

Final codex verdict: merge-ready.

Test plan

LC_ALL=C pytest tests/test_agentic_change_orchestrator.py tests/test_agentic_common_worktree.py tests/test_issue_1080_porcelain_scope_guards.py -m "not integration and not e2e and not real" --timeout=60 — 286 passed
pylint pdd/agentic_change_orchestrator.py pdd/agentic_common_worktree.py --errors-only — clean
git diff --check — clean
Manual smoke: re-walk the pdd_cloud#1600 / PR #1602 drift case mentally with the new guard:
- 4 cloudbuild YAMLs (legitimate per Split Contract) — would survive only if Step 6 lists them as Direct Edit Candidates.
- examples/gemini_triage_example.py, test_gemini_triage.py, extensions/.../gemini_triage.py, unrelated prompts/backend/{crm,hackathon}_* — all reverted (untracked → removed; tracked → git checkout HEAD --).
- architecture.json Step 9 edits — reverted (Step 10 owns).
Verify the new SCOPE_VIOLATION GitHub comment renders correctly on an actual pdd change run (will land after merge; can't test against production without merging).

🤖 Generated with Claude Code

Add tests for Step 9 scope guard before implementing. Tests fail at import: _parse_step6_devunit_prompts, _parse_step6_dependency_prompts, _parse_step6_integration_prompts, _parse_step6_frontend_prompts, _parse_step5_doc_paths, _build_step9_allowlist, and _enforce_step9_scope don't exist yet. Covers: - Parser tests: prompt-only column extraction from Step 6 tables, placeholder-row skipping, dependency/integration/frontend selectivity, Step 5 section selectivity (Update + Create + Associated, no Conflicts / No Changes Needed). - Allowlist builder tests: union of Step 5/6 paths + direct edits, exclusion of Step 6 code/example/test columns, preflight-healed prompts + companion .pdd/meta/<name>.json files, Step 5 Conflicts exclusion, architecture.json exclusion. - Scope-guard tests over a real git worktree: reverts untracked examples/, tests/, unrelated prompts; reverts tracked architecture.json modifications; preserves in-scope prompts, direct edit candidates, associated docs, healed prompt + meta; no-op on clean worktree; tolerates missing worktree. - Orchestrator integration: scope violation stops workflow with SCOPE_VIOLATION state, last_completed_step not advanced; clean Step 9 proceeds to Step 10.

Step 9 of `pdd change` is supposed to modify only prompts, docs, and explicitly-listed Direct Edit Candidates. The LLM does not reliably honor the scope from the prompt alone; in pdd_cloud#1600 it wrote ~30 unrelated files (CRM/hackathon prompts, new examples and tests, extensions/* code) despite a precise Allowed write set in the issue body. This commit adds a deterministic post-Step-9 scope guard: * New parsers (regex-based, in the same shape as the existing `_parse_direct_edit_candidates`): - `_parse_step6_devunit_prompts` — first column of MODIFY/CREATE tables, prompts only (code/example/test columns never leak in); placeholder rows containing `{...}` are skipped. - `_parse_step6_dependency_prompts` — `.prompt` bullets under Dependencies. - `_parse_step6_integration_prompts` — 3rd column of Integration Points; skips empty cells and `manual update needed`. - `_parse_step6_frontend_prompts` — 2nd column of Frontend Dev Units; skips `None`. - `_parse_step5_doc_paths` — `#### path` headers under Files to Update + Associated Documents, and `- path - purpose` bullets under Files to Create. Excludes Conflicts and No Changes Needed. * `_build_step9_allowlist` unions the Step 5/6 paths, Direct Edit Candidates, preflight-healed prompts, and the `.pdd/meta/<basename>.json` companion files only for healed prompts. `architecture.json` (root and nested) is explicitly excluded — Step 10 owns architecture. No directory prefixes — exact paths only. * `_enforce_step9_scope` is the orchestrator hook. It builds the allowlist from `context["step5_output"]`, `context["step6_output"]`, and `state["preflight_healed_prompt_paths"]`, delegates to the existing `revert_out_of_scope_changes_with_dirs` worktree helper (issue #1080's structured porcelain parser), and returns the list of reverted relative paths. Returns `[]` on missing worktree or non-`.git` paths so the guard is a no-op in non-worktree contexts. * Orchestrator wiring (inside `if step_num == 9:`, before `_parse_changed_files`): when the guard returns a non-empty list, we append a `SCOPE_VIOLATION:` block to `step_output`, print a red console warning with the reverted paths, post a GitHub comment, mark `state["step_outputs"]["9"] = FAILED: SCOPE_VIOLATION...`, do NOT advance `last_completed_step` (matches the "no file changes" fallback at the same site), save state, and return False with `Stopped at step 9: Scope violation — ...`. This is `pdd change` only; other agentic_* orchestrators are not modified.

The Step 6 prompt template emits the Direct Edit Candidates table after an italic description line plus a blank line: ### Direct Edit Candidates (No Prompt) *Files that need scoped direct edits because no corresponding prompt exists* | File | Edit Type | Markers Found | The previous `_parse_direct_edit_candidates` regex required `|` to follow the heading on the very next line, so it silently returned `[]` on real outputs. Before #1123 that was a harmless no-op (Step 9's LLM also emits DIRECT_EDITS markers separately), but with the new scope guard a parser miss means Direct Edit Candidates are NOT in the allowlist and get reverted — breaking the guard's own contract. Loosen the regex to skip optional non-pipe, non-heading lines between the section heading and the table. Add a regression test using the real template format, and update the Step 5/6 fixtures to mirror what the prompt templates actually emit.

…ement Address codex round-2 review: - Snapshot worktree status before Step 9 LLM; only enforce allowlist against paths Step 9 actually changed (status differs from pre-snapshot). Fixes false SCOPE_VIOLATION when Step 8.5 drift-heal mutates .pdd/meta/*_run.json and architecture.json. - Add `.rst`/`.txt` to the worktree-fallback allowed extensions so Step 5 Associated Docs of those types don't break the "no changes" path. - Guard now propagates enforcement errors; orchestrator catches and treats them as a workflow-stopping scope violation (no more silent fail-open). - Case-insensitive section heading match; strip markdown emphasis in path parsers; cleanup test import positioning. #1123 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rict mode Address codex round-3 review: - Snapshot now captures (porcelain_status, sha256_content_hex). Skip rule in revert_out_of_scope_changes_with_dirs requires BOTH to match — a status-only match isn't enough when Step 8.5 left a tracked file modified and Step 9 modified it again. - Detect deletions of pre-snapshot files: iterate pre paths not in post status. Tracked deletions restored via `git checkout HEAD --`; untracked deletions can't auto-restore (Step 8.5 is idempotent) and surface as unrecoverable violations. - Add `strict=True` mode to revert_out_of_scope_changes_with_dirs. The Step 9 caller uses it; other callers (e2e_fix) keep the silent-fail default to preserve their semantics. #1123

…on violation - Pre-snapshot deletion pass now also flags files that exist on disk but are absent from post-Step-9 git status (state went dirty -> clean). Step 9 reverting Step 8.5's tracked-modification or removing an untracked meta file by overwriting it to HEAD content used to be invisible; we now mark these as unrecoverable scope violations. - Scope-violation save path clears state["preflight_drift_healed"] / preflight_healed_worktree so resume re-runs Step 8.5. Without this, resume after a Step-9 revert of Step-8.5's mutations would leave the worktree permanently missing the heal. - state["preflight_healed_prompt_paths"] is intentionally preserved so Step 10's doc-discovery sweep on retry still sees the originally healed prompts; the re-heal will regenerate them identically. - Skipped the optional OSError double-log cleanup. The pattern only fires in strict mode and changing the raise/log flow would risk altering observable behavior tests already assert on; the duplicate is cosmetic per round-3 review notes. #1123

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Serhan-Asad · 2026-05-21T23:22:27Z

Step 7/8: Verification & Final Report (Iteration 1)

Test Results After Fixes

Total: 258 scope-guard tests run (9073 collected, 8815 deselected by -k filter)
Passed: 258
Failed: 0
Previously failing, now passing: 0 (Step 5 was a provider timeout, not a real failure)
New failures: 0

Build Status

Pass — all PR-scope modules import cleanly (agentic_change_orchestrator, agentic_common_worktree, git_porcelain, sync_order, architecture_sync).

Overall Status

All clear.

Issue Alignment

issue_aligned: true — the branch's round 1–4 commits add an orchestrator-enforced post-Step 9 allowlist (snapshot-based scope guard, Step 5/6 dev-unit + Direct Edit Candidate parsing, deletion + content-revert detection, fail-closed enforcement) exactly as #1123 requests.

Issues Summary

Severity	Category	Module	Description	Fixed
—	—	—	No issues found in PR scope	—

Checkup complete.

All Issues Fixed

{
  "success": true,
  "message": "PR #1133 implements the post-Step 9 write-scope guard requested by issue #1123: snapshot-based diff against a Step 5/6-derived allowlist with deletion and content-revert detection and fail-closed enforcement. All 258 targeted scope-guard/allowlist/direct-edit/snapshot/step9/porcelain tests pass; package imports clean. No critical findings.",
  "tech_stack": ["python", "click", "pydantic-v2", "litellm", "pytest"],
  "issue_aligned": true,
  "issues": [],
  "changed_files": []
}

PR Push Status

Pushed fixes to PR branch.

…1123) - Rewrite _parse_direct_edit_candidates without regex backtracking (CodeQL py/redos alert at pdd/agentic_change_orchestrator.py:741). The previous regex had overlapping alternatives that caused exponential backtracking on N blank lines after the heading. - _parse_step5_doc_paths now reconciles Associated Documents against Step 6's confirmed dev-unit prompt set: only docs whose "Discovered via" originating prompt is in Step 6 are allowlisted. Matches the reconciliation rule the Step 9 prompt already enforces; closes the defense-in-depth gap where a misbehaving Step 9 LLM could edit a stale doc and slip past the guard. - Update prompts/agentic_common_worktree_python.prompt to describe the new revert_out_of_scope_changes_with_dirs signature (+ pre_snapshot, + strict) and the new _hash_file_content helper, so `pdd sync` regeneration doesn't quietly lose the fix. - Delete stray empty new_file.txt the checkup commit introduced.

…t, README - Parse the documented include-graph format (`prompts/foo.prompt` → `<include>docs/bar.md</include>`) in Step 5 Discovered via lines. Previous regex anchored to $ and rejected backticks, so on real Step 5 output it never matched and the reconciliation gate became a no-op. Two-step extraction now: pull the Discovered via clause, then match the first .prompt path within it. - Update prompts/agentic_change_orchestrator_python.prompt with a Step 9 Scope Guard section: pre-snapshot capture, _enforce_step9_scope call site, SCOPE_VIOLATION stop path, preflight-clear-on-violation, architecture.json exclusion. Closes the regression risk where `pdd sync` / auto-heal could regenerate the orchestrator and silently drop the new guard. - README Workflow Resumption now lists Step 9 as the third pause point alongside Steps 4 and 7, with the recovery action (refine contract, rerun pdd change; preflight re-runs automatically). #1123

…ment body - agentic_change_orchestrator imports pdd.agentic_common_worktree; add agentic_common_worktree_python.prompt to the prompt's <pdd-dependency> block and to architecture.json's dependencies array so `pdd sync` / auto-heal regeneration sees the dependency. - SCOPE_VIOLATION GitHub comment now passes an explicit `body` argument with the reverted-paths list at the top, bypassing the post_step_comment fallback that truncates output to the first 1000 chars. README claim that the comment lists reverted paths was a false promise for long Step 9 outputs; now it holds. - Apply the same explicit-body pattern to the scope_guard_error branch so the strict-mode failure surface is consistent. Out of scope: auto-heal Cloud Build still fails on backend/generated (prompts/generated_python.prompt set but missing on disk) — this is a pre-existing CI infra issue unrelated to PR #1133 (no such prompt or arch entry exists in this repo). #1123 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

git checkout HEAD -- <path> fails for paths not in HEAD, so staged additions (status X=A: A , AM) survived the scope guard and could land in the PR via Step 13's git add -A. Handle them like copy destinations: git reset HEAD -- <path> then unlink, same pattern already used for C-status entries. strict=True propagates failures. Regression tests use a real git repo: stage a new file (A ) and a staged+modified file (AM); assert the guard removes both and leaves no trace in post-status. #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ns and copies git reset HEAD -- <path> could fail silently (non-zero exit, ignored return code), leaving the staged blob in the index. The code then unlinked the file and reported it as reverted. Step 13's git add -A would then reference a missing or still-staged path. Fix: capture the CompletedProcess from reset; on non-zero return, log a warning and raise under strict=True WITHOUT proceeding to unlink. Applied to both the copy-destination branch and the new staged-addition branch (status X=A). Matching change to pdd/prompts/agentic_common_worktree_python.prompt so the fix survives pdd sync regeneration. Tests: staged_addition_reset_failure_strict asserts OSError is raised and unlink is not attempted; copy_reset_failure_strict covers the copy branch. Both use a fake subprocess.run that returns rc=1 for reset. #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The rename branch called git reset HEAD -- <new> <old> without capturing the return code, then unconditionally proceeded to git checkout HEAD -- <old> and unlink. A failed reset left the staged rename in the index while the code reported it as reverted; Step 13's git add -A could then include the leaked paths. Fix: capture the CompletedProcess; on non-zero return, log a warning and raise under strict=True WITHOUT proceeding to checkout/unlink. Consistent with the same fix applied to copy destinations and staged additions in round-9. Test: rename_reset_failure_strict uses a fake subprocess.run that returns rc=1 for reset; asserts OSError is raised. #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…t sync Three issues found in review: 1. Symlink bypass: _path_in_scope called .resolve() on symlinks, so a symlink at an out-of-scope path whose target was allowlisted was treated as in-scope. Fix: check is_symlink() first; symlinks are always out-of-scope regardless of target (the guard enforces scope on the path Step 9 created, not the content it points at). 2. Checkout-before-unlink ordering in rename revert: unlink(new) ran before checking checkout(old) return code, leaving both sides of the rename damaged on checkout failure. Fix: only unlink AFTER checkout return code is confirmed 0. 3. Source prompt (agentic_common_worktree_python.prompt) did not document the rename reset return-code invariant, checkout-before- unlink ordering, or symlink scope rule. Updated so pdd sync regeneration preserves all three behaviors. Nit: fix two mock tests that used diff-style C100/R100 porcelain records instead of the real git status --porcelain=v1 -z format (C src\0dst\0 and R old\0new\0). New tests: symlink_not_bypassed_by_target (real git repo + symlink), rename_checkout_failure_does_not_unlink (tracks unlink() calls via patch.object, asserts none occur when checkout fails). #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Serhan-Asad and others added 6 commits May 21, 2026 14:03

greptile-apps Bot reviewed May 21, 2026

View reviewed changes

github-advanced-security AI found potential problems May 21, 2026

View reviewed changes

Comment thread pdd/agentic_change_orchestrator.py Fixed

fix: apply checkup fixes for #1123

c709fa0

Serhan-Asad mentioned this pull request May 22, 2026

fix(sync): schedule cyclic dependency SCCs instead of deadlocking (#1125) #1130

Merged

5 tasks

Serhan-Asad and others added 6 commits May 21, 2026 17:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(change): enforce Step 9 scope guard against Step 5/6 allowlist (#1123)#1133

fix(change): enforce Step 9 scope guard against Step 5/6 allowlist (#1123)#1133
Serhan-Asad wants to merge 14 commits into
mainfrom
fix/issue-1123-scope-guard

Serhan-Asad commented May 21, 2026

Uh oh!

greptile-apps Bot left a comment

Uh oh!

Uh oh!

Serhan-Asad commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Serhan-Asad commented May 21, 2026

Summary

Why the existing helper wasn't enough

Review loop

Test plan

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Serhan-Asad commented May 21, 2026

Step 7/8: Verification & Final Report (Iteration 1)

Test Results After Fixes

Build Status

Overall Status

Issue Alignment

Issues Summary

PR Push Status

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants