Skip to content

fix(change): enforce Step 9 scope guard against Step 5/6 allowlist (#1123)#1133

Open
Serhan-Asad wants to merge 14 commits into
mainfrom
fix/issue-1123-scope-guard
Open

fix(change): enforce Step 9 scope guard against Step 5/6 allowlist (#1123)#1133
Serhan-Asad wants to merge 14 commits into
mainfrom
fix/issue-1123-scope-guard

Conversation

@Serhan-Asad
Copy link
Copy Markdown
Collaborator

Summary

  • Adds an orchestrator-enforced scope guard to pdd change Step 9 that reverts any worktree edits outside an allowlist derived from Step 5 (docs to update / create / associated documents) and Step 6 (dev-unit prompt paths, dependencies, integration points, frontend prompts, direct-edit candidates). Code, example, and test files from Step 6 tables are intentionally not allowed — Step 9's contract is prompts + docs + scoped direct edits.
  • Uses a content-aware pre-snapshot of the worktree captured right after Step 8.5 preflight drift-heal. Step 8.5's mutations (.pdd/meta/<basename>_<lang>.json, _run.json, architecture.json) are recognized via (porcelain_status, sha256(content)) and excluded from the Step 9 delta. Step 9 modifying a Step 8.5 file on top, or reverting Step 8.5's mod back to HEAD, is detected and flagged.
  • On scope violation, the workflow stops with a clear SCOPE_VIOLATION: block (file list + reason) posted to the issue and saved into state. Preflight gates are cleared so resume re-runs Step 8.5 before re-attempting Step 9.

Closes #1123.

Why the existing helper wasn't enough

The existing _revert_out_of_scope_changes in agentic_common.py runs with -uno and never sees untracked files — exactly the shape of the drift in pdd_cloud#1602 (new examples/, tests/, unrelated prompts). This PR uses revert_out_of_scope_changes_with_dirs from agentic_common_worktree.py (which runs with -u) and extends it with two new optional parameters:

  • pre_snapshot: Mapping[str, Tuple[str, Optional[str]]] — skip entries that match the snapshot's (status, content_hash); existing two-arg callers (agentic_e2e_fix_orchestrator) are unchanged.
  • strict: bool = False — Step 9 caller passes strict=True so internal git/OSError failures propagate and stop the workflow rather than fail open. Other callers keep the silent-fail default.

Review loop

This change went through four iterations with codex as the reviewer:

  1. Round 1 — initial TDD scaffolding + implementation (24 tests).
  2. Round 2 — snapshot-based delta enforcement, .rst/.txt in fallback detection, fail-closed orchestrator handler. Addressed codex's first three blockers (under-allowlisted preflight metadata, fallback missing doc extensions, silent fail-open).
  3. Round 3 — content-aware snapshot via sha256, deletion detection, strict mode. Addressed codex's next three blockers (status-only delta was content-blind; deletions were invisible; helper still failed open internally).
  4. Round 4 — extended deletion pass to detect "dirty → clean" (Step 9 reverted Step 8.5's mod to HEAD); clear preflight_drift_healed on violation so resume re-runs Step 8.5. Addressed codex's final two blockers.

Final codex verdict: merge-ready.

Test plan

  • LC_ALL=C pytest tests/test_agentic_change_orchestrator.py tests/test_agentic_common_worktree.py tests/test_issue_1080_porcelain_scope_guards.py -m "not integration and not e2e and not real" --timeout=60 — 286 passed
  • pylint pdd/agentic_change_orchestrator.py pdd/agentic_common_worktree.py --errors-only — clean
  • git diff --check — clean
  • Manual smoke: re-walk the pdd_cloud#1600 / PR #1602 drift case mentally with the new guard:
    • 4 cloudbuild YAMLs (legitimate per Split Contract) — would survive only if Step 6 lists them as Direct Edit Candidates.
    • examples/gemini_triage_example.py, test_gemini_triage.py, extensions/.../gemini_triage.py, unrelated prompts/backend/{crm,hackathon}_* — all reverted (untracked → removed; tracked → git checkout HEAD --).
    • architecture.json Step 9 edits — reverted (Step 10 owns).
  • Verify the new SCOPE_VIOLATION GitHub comment renders correctly on an actual pdd change run (will land after merge; can't test against production without merging).

🤖 Generated with Claude Code

Serhan-Asad and others added 6 commits May 21, 2026 14:03
Add tests for Step 9 scope guard before implementing. Tests fail at
import: _parse_step6_devunit_prompts, _parse_step6_dependency_prompts,
_parse_step6_integration_prompts, _parse_step6_frontend_prompts,
_parse_step5_doc_paths, _build_step9_allowlist, and _enforce_step9_scope
don't exist yet.

Covers:
- Parser tests: prompt-only column extraction from Step 6 tables,
  placeholder-row skipping, dependency/integration/frontend selectivity,
  Step 5 section selectivity (Update + Create + Associated, no
  Conflicts / No Changes Needed).
- Allowlist builder tests: union of Step 5/6 paths + direct edits,
  exclusion of Step 6 code/example/test columns, preflight-healed
  prompts + companion .pdd/meta/<name>.json files, Step 5 Conflicts
  exclusion, architecture.json exclusion.
- Scope-guard tests over a real git worktree: reverts untracked
  examples/, tests/, unrelated prompts; reverts tracked
  architecture.json modifications; preserves in-scope prompts,
  direct edit candidates, associated docs, healed prompt + meta;
  no-op on clean worktree; tolerates missing worktree.
- Orchestrator integration: scope violation stops workflow with
  SCOPE_VIOLATION state, last_completed_step not advanced; clean
  Step 9 proceeds to Step 10.
Step 9 of `pdd change` is supposed to modify only prompts, docs, and
explicitly-listed Direct Edit Candidates. The LLM does not reliably
honor the scope from the prompt alone; in pdd_cloud#1600 it wrote ~30
unrelated files (CRM/hackathon prompts, new examples and tests,
extensions/* code) despite a precise Allowed write set in the issue
body.

This commit adds a deterministic post-Step-9 scope guard:

* New parsers (regex-based, in the same shape as the existing
  `_parse_direct_edit_candidates`):
    - `_parse_step6_devunit_prompts` — first column of MODIFY/CREATE
      tables, prompts only (code/example/test columns never leak in);
      placeholder rows containing `{...}` are skipped.
    - `_parse_step6_dependency_prompts` — `.prompt` bullets under
      Dependencies.
    - `_parse_step6_integration_prompts` — 3rd column of Integration
      Points; skips empty cells and `manual update needed`.
    - `_parse_step6_frontend_prompts` — 2nd column of Frontend Dev
      Units; skips `None`.
    - `_parse_step5_doc_paths` — `#### path` headers under Files to
      Update + Associated Documents, and `- path - purpose` bullets
      under Files to Create. Excludes Conflicts and No Changes Needed.

* `_build_step9_allowlist` unions the Step 5/6 paths, Direct Edit
  Candidates, preflight-healed prompts, and the `.pdd/meta/<basename>.json`
  companion files only for healed prompts. `architecture.json` (root
  and nested) is explicitly excluded — Step 10 owns architecture.
  No directory prefixes — exact paths only.

* `_enforce_step9_scope` is the orchestrator hook. It builds the
  allowlist from `context["step5_output"]`, `context["step6_output"]`,
  and `state["preflight_healed_prompt_paths"]`, delegates to the
  existing `revert_out_of_scope_changes_with_dirs` worktree helper
  (issue #1080's structured porcelain parser), and returns the list
  of reverted relative paths. Returns `[]` on missing worktree or
  non-`.git` paths so the guard is a no-op in non-worktree contexts.

* Orchestrator wiring (inside `if step_num == 9:`, before
  `_parse_changed_files`): when the guard returns a non-empty list,
  we append a `SCOPE_VIOLATION:` block to `step_output`, print a
  red console warning with the reverted paths, post a GitHub comment,
  mark `state["step_outputs"]["9"] = FAILED: SCOPE_VIOLATION...`,
  do NOT advance `last_completed_step` (matches the "no file changes"
  fallback at the same site), save state, and return False with
  `Stopped at step 9: Scope violation — ...`.

This is `pdd change` only; other agentic_* orchestrators are not
modified.
The Step 6 prompt template emits the Direct Edit Candidates table after
an italic description line plus a blank line:

    ### Direct Edit Candidates (No Prompt)
    *Files that need scoped direct edits because no corresponding prompt exists*

    | File | Edit Type | Markers Found |

The previous `_parse_direct_edit_candidates` regex required `|` to follow
the heading on the very next line, so it silently returned `[]` on real
outputs. Before #1123 that was a harmless no-op (Step 9's LLM also emits
DIRECT_EDITS markers separately), but with the new scope guard a parser
miss means Direct Edit Candidates are NOT in the allowlist and get
reverted — breaking the guard's own contract.

Loosen the regex to skip optional non-pipe, non-heading lines between
the section heading and the table. Add a regression test using the
real template format, and update the Step 5/6 fixtures to mirror what
the prompt templates actually emit.
…ement

Address codex round-2 review:
- Snapshot worktree status before Step 9 LLM; only enforce allowlist
  against paths Step 9 actually changed (status differs from pre-snapshot).
  Fixes false SCOPE_VIOLATION when Step 8.5 drift-heal mutates
  .pdd/meta/*_run.json and architecture.json.
- Add `.rst`/`.txt` to the worktree-fallback allowed extensions so
  Step 5 Associated Docs of those types don't break the "no changes" path.
- Guard now propagates enforcement errors; orchestrator catches and treats
  them as a workflow-stopping scope violation (no more silent fail-open).
- Case-insensitive section heading match; strip markdown emphasis in path
  parsers; cleanup test import positioning.

#1123

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rict mode

Address codex round-3 review:
- Snapshot now captures (porcelain_status, sha256_content_hex). Skip rule
  in revert_out_of_scope_changes_with_dirs requires BOTH to match — a
  status-only match isn't enough when Step 8.5 left a tracked file
  modified and Step 9 modified it again.
- Detect deletions of pre-snapshot files: iterate pre paths not in post
  status. Tracked deletions restored via `git checkout HEAD --`; untracked
  deletions can't auto-restore (Step 8.5 is idempotent) and surface as
  unrecoverable violations.
- Add `strict=True` mode to revert_out_of_scope_changes_with_dirs. The
  Step 9 caller uses it; other callers (e2e_fix) keep the silent-fail
  default to preserve their semantics.

#1123
…on violation

- Pre-snapshot deletion pass now also flags files that exist on disk but
  are absent from post-Step-9 git status (state went dirty -> clean). Step
  9 reverting Step 8.5's tracked-modification or removing an untracked
  meta file by overwriting it to HEAD content used to be invisible; we
  now mark these as unrecoverable scope violations.
- Scope-violation save path clears state["preflight_drift_healed"] /
  preflight_healed_worktree so resume re-runs Step 8.5. Without this,
  resume after a Step-9 revert of Step-8.5's mutations would leave the
  worktree permanently missing the heal.
- state["preflight_healed_prompt_paths"] is intentionally preserved so
  Step 10's doc-discovery sweep on retry still sees the originally healed
  prompts; the re-heal will regenerate them identically.
- Skipped the optional OSError double-log cleanup. The pattern only
  fires in strict mode and changing the raise/log flow would risk
  altering observable behavior tests already assert on; the duplicate
  is cosmetic per round-3 review notes.

#1123
Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Comment thread pdd/agentic_change_orchestrator.py Fixed
@Serhan-Asad
Copy link
Copy Markdown
Collaborator Author

Step 7/8: Verification & Final Report (Iteration 1)

Test Results After Fixes

  • Total: 258 scope-guard tests run (9073 collected, 8815 deselected by -k filter)
  • Passed: 258
  • Failed: 0
  • Previously failing, now passing: 0 (Step 5 was a provider timeout, not a real failure)
  • New failures: 0

Build Status

Pass — all PR-scope modules import cleanly (agentic_change_orchestrator, agentic_common_worktree, git_porcelain, sync_order, architecture_sync).

Overall Status

All clear.

Issue Alignment

issue_aligned: true — the branch's round 1–4 commits add an orchestrator-enforced post-Step 9 allowlist (snapshot-based scope guard, Step 5/6 dev-unit + Direct Edit Candidate parsing, deletion + content-revert detection, fail-closed enforcement) exactly as #1123 requests.

Issues Summary

Severity Category Module Description Fixed
No issues found in PR scope

Checkup complete.

All Issues Fixed

{
  "success": true,
  "message": "PR #1133 implements the post-Step 9 write-scope guard requested by issue #1123: snapshot-based diff against a Step 5/6-derived allowlist with deletion and content-revert detection and fail-closed enforcement. All 258 targeted scope-guard/allowlist/direct-edit/snapshot/step9/porcelain tests pass; package imports clean. No critical findings.",
  "tech_stack": ["python", "click", "pydantic-v2", "litellm", "pytest"],
  "issue_aligned": true,
  "issues": [],
  "changed_files": []
}

PR Push Status

Pushed fixes to PR branch.

…1123)

- Rewrite _parse_direct_edit_candidates without regex backtracking
  (CodeQL py/redos alert at pdd/agentic_change_orchestrator.py:741).
  The previous regex had overlapping alternatives that caused
  exponential backtracking on N blank lines after the heading.
- _parse_step5_doc_paths now reconciles Associated Documents against
  Step 6's confirmed dev-unit prompt set: only docs whose
  "Discovered via" originating prompt is in Step 6 are allowlisted.
  Matches the reconciliation rule the Step 9 prompt already enforces;
  closes the defense-in-depth gap where a misbehaving Step 9 LLM
  could edit a stale doc and slip past the guard.
- Update prompts/agentic_common_worktree_python.prompt to describe
  the new revert_out_of_scope_changes_with_dirs signature
  (+ pre_snapshot, + strict) and the new _hash_file_content helper,
  so `pdd sync` regeneration doesn't quietly lose the fix.
- Delete stray empty new_file.txt the checkup commit introduced.
Serhan-Asad and others added 6 commits May 21, 2026 17:19
…t, README

- Parse the documented include-graph format (`prompts/foo.prompt` →
  `<include>docs/bar.md</include>`) in Step 5 Discovered via lines.
  Previous regex anchored to $ and rejected backticks, so on real
  Step 5 output it never matched and the reconciliation gate became a
  no-op. Two-step extraction now: pull the Discovered via clause, then
  match the first .prompt path within it.
- Update prompts/agentic_change_orchestrator_python.prompt with a Step
  9 Scope Guard section: pre-snapshot capture, _enforce_step9_scope
  call site, SCOPE_VIOLATION stop path, preflight-clear-on-violation,
  architecture.json exclusion. Closes the regression risk where `pdd
  sync` / auto-heal could regenerate the orchestrator and silently
  drop the new guard.
- README Workflow Resumption now lists Step 9 as the third pause
  point alongside Steps 4 and 7, with the recovery action (refine
  contract, rerun pdd change; preflight re-runs automatically).

#1123
…ment body

- agentic_change_orchestrator imports pdd.agentic_common_worktree;
  add agentic_common_worktree_python.prompt to the prompt's
  <pdd-dependency> block and to architecture.json's dependencies
  array so `pdd sync` / auto-heal regeneration sees the dependency.
- SCOPE_VIOLATION GitHub comment now passes an explicit `body`
  argument with the reverted-paths list at the top, bypassing the
  post_step_comment fallback that truncates output to the first
  1000 chars. README claim that the comment lists reverted paths
  was a false promise for long Step 9 outputs; now it holds.
- Apply the same explicit-body pattern to the scope_guard_error
  branch so the strict-mode failure surface is consistent.

Out of scope: auto-heal Cloud Build still fails on backend/generated
(prompts/generated_python.prompt set but missing on disk) — this is
a pre-existing CI infra issue unrelated to PR #1133 (no such prompt
or arch entry exists in this repo).

#1123

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git checkout HEAD -- <path> fails for paths not in HEAD, so staged
additions (status X=A: A , AM) survived the scope guard and could
land in the PR via Step 13's git add -A. Handle them like copy
destinations: git reset HEAD -- <path> then unlink, same pattern
already used for C-status entries. strict=True propagates failures.

Regression tests use a real git repo: stage a new file (A ) and a
staged+modified file (AM); assert the guard removes both and leaves
no trace in post-status.

#1123
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ns and copies

git reset HEAD -- <path> could fail silently (non-zero exit, ignored
return code), leaving the staged blob in the index. The code then
unlinked the file and reported it as reverted. Step 13's git add -A
would then reference a missing or still-staged path.

Fix: capture the CompletedProcess from reset; on non-zero return,
log a warning and raise under strict=True WITHOUT proceeding to
unlink. Applied to both the copy-destination branch and the new
staged-addition branch (status X=A). Matching change to
pdd/prompts/agentic_common_worktree_python.prompt so the fix survives
pdd sync regeneration.

Tests: staged_addition_reset_failure_strict asserts OSError is raised
and unlink is not attempted; copy_reset_failure_strict covers the
copy branch. Both use a fake subprocess.run that returns rc=1 for
reset.

#1123
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The rename branch called git reset HEAD -- <new> <old> without
capturing the return code, then unconditionally proceeded to
git checkout HEAD -- <old> and unlink. A failed reset left the
staged rename in the index while the code reported it as reverted;
Step 13's git add -A could then include the leaked paths.

Fix: capture the CompletedProcess; on non-zero return, log a warning
and raise under strict=True WITHOUT proceeding to checkout/unlink.
Consistent with the same fix applied to copy destinations and staged
additions in round-9.

Test: rename_reset_failure_strict uses a fake subprocess.run that
returns rc=1 for reset; asserts OSError is raised.

#1123
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t sync

Three issues found in review:

1. Symlink bypass: _path_in_scope called .resolve() on symlinks, so a
   symlink at an out-of-scope path whose target was allowlisted was
   treated as in-scope. Fix: check is_symlink() first; symlinks are
   always out-of-scope regardless of target (the guard enforces scope
   on the path Step 9 created, not the content it points at).

2. Checkout-before-unlink ordering in rename revert: unlink(new) ran
   before checking checkout(old) return code, leaving both sides of
   the rename damaged on checkout failure. Fix: only unlink AFTER
   checkout return code is confirmed 0.

3. Source prompt (agentic_common_worktree_python.prompt) did not
   document the rename reset return-code invariant, checkout-before-
   unlink ordering, or symlink scope rule. Updated so pdd sync
   regeneration preserves all three behaviors.

Nit: fix two mock tests that used diff-style C100/R100 porcelain
records instead of the real git status --porcelain=v1 -z format
(C  src\0dst\0 and R  old\0new\0).

New tests: symlink_not_bypassed_by_target (real git repo + symlink),
rename_checkout_failure_does_not_unlink (tracks unlink() calls via
patch.object, asserts none occur when checkout fails).

#1123
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pdd change: enforce subissue write scope after Step 9 direct edits

3 participants