-
Notifications
You must be signed in to change notification settings - Fork 0
Pipeline Plan 417
I'll provide the comprehensive implementation plan directly below:
When the build loop completes, the pipeline immediately enters stage_review() function. However, there is no verification that the code changes actually match what was planned. A file listed in the plan could be completely untouched, or a file could be modified that was never planned—but neither is caught before burning review/audit/CQ quota.
-
Missing detection:
stage_review()(line 6 inpipeline-stages-review.sh) runs immediately after build with zero consistency checking - Silent drift: Plan-to-build divergence goes undetected until full review cycle
- Class of failure: Repeatable pipeline failure mode: planned work is incomplete, or out-of-scope changes introduce unplanned complexity
- Wasted AI review cycles on unplanned changes
- Missed detection of incomplete builds (planned files untouched)
- No early warning before heavy processing
-
scripts/lib/pipeline-stages-review.sh— Add drift detection helper + integration (~38 lines total)
-
Add
detect_plan_drift()function beforestage_review()- Input:
$1=artifacts_dir,$2=project_root - Output: String of drift warnings (or empty if no drift)
- Logic:
- Read
$artifacts_dir/plan.md - Extract filenames from
## Files to Modifysection (parse markdown bullets) - Get
git diff --name-only HEAD(actual changed files) - Find planned files NOT in actual changes
- Format as
[DRIFT-WARNING] Planned file not modified: <filepath>for each untouched file
- Read
- Error handling: Return empty string (fail-open) if plan.md missing, parse fails, or git fails
- Input:
-
Handle edge cases
-
plan.mdmissing → return "" (no warnings, continue review) -
## Files to Modifynot found → return "" (no warnings) - git diff fails → return "" with warning logged
- All planned files modified → return "" (no drift)
-
-
Call
detect_plan_drift()fromstage_review()- Location: Between ruflo review completion (line 79) and native review prompt construction (line 82)
- Code:
local drift_warnings; drift_warnings=$(detect_plan_drift "$ARTIFACTS_DIR" "$PROJECT_ROOT" 2>/dev/null || true)
-
Inject warnings into
review_prompt- If
[[ -n "$drift_warnings" ]], append section before diff:review_prompt+=" ## Cross-Stage Drift Detected The following files were planned but not modified by the build: ${drift_warnings} Reviewer: Verify whether these files were intentionally skipped or represent incomplete implementation. "
- If
-
Add event logging
- When drift detected:
emit_event "review.drift_detected" "issue=${ISSUE_NUMBER:-0}" - Allows tracking of plan-build divergence frequency
- When drift detected:
-
Create unit test file
scripts/sw-lib-pipeline-stages-review-test.sh- Test case 1: plan.md lists 2 files, git diff shows 1 → expect 1 drift warning
- Test case 2: plan.md lists 2 files, git diff shows 2 → expect no warnings
- Test case 3: plan.md missing → expect no warnings (fail-open)
- Test case 4: git diff fails → expect no warnings (fail-open)
- Test case 5: plan.md has no
## Files to Modify→ expect no warnings
-
Create integration test
- Setup real git repo with commits
- Create plan.md with 3 files
- Create diff showing only 2 files modified
- Run
stage_review() - Verify
review.mdcontains[DRIFT-WARNING]markers
-
Run
npm test- Verify no regressions in existing tests
- Ensure new tests discovered and pass
- Task 1: Implement
detect_plan_drift()helper function with error handling - Task 2: Unit test — 5 test cases for
detect_plan_drift()covering happy path and edge cases - Task 3: Call
detect_plan_drift()fromstage_review()around line 60 - Task 4: Inject drift warnings into review_prompt variable
- Task 5: Add event logging for drift detection (
emit_event "review.drift_detected") - Task 6: Create integration test with full review stage
- Task 7: Run
npm testand verify all tests pass - Task 8: Add inline documentation to helper function
- Task 9: Verify fail-open behavior with corrupted plan.md (manual test)
- Task 10: Final validation — test suite passes, no regressions
-
Unit tests (70%): 5 tests for
detect_plan_drift()function- Correct filename extraction from plan.md
- Accurate git diff comparison
- Missing plan.md handling
- Git command failures
- Parse error handling
-
Integration tests (20%): 2 tests with full review stage
- Drift warnings appear in final review prompt
- No warnings when all planned files changed
-
E2E tests (10%): 1 test for complete pipeline
- Full pipeline: plan → build with partial changes → review detects drift
Happy Path:
- plan.md:
src/a.js,src/b.js,src/c.js - git diff:
src/a.js,src/b.js(changed) - Expected:
[DRIFT-WARNING] Planned file not modified: src/c.jsin review prompt
Error Case 1: Missing plan.md
- No plan.md in artifacts
- Expected: Empty drift warnings, review continues unaffected
Error Case 2: Git failure
- git diff command fails
- Expected: Fails safely, review stage doesn't break
Edge Case: All planned files modified
- plan.md: 2 files
- git diff: Both files + extras modified
- Expected: No drift warnings (all planned work done)
✓ detect_plan_drift() function implemented in pipeline-stages-review.sh
✓ Filenames extracted from ## Files to Modify markdown section
✓ Comparison against git diff --name-only HEAD works correctly
✓ [DRIFT-WARNING] injected for each unmodified planned file
✓ Fail-open: returns empty string if plan.md missing or git fails
✓ Integrated into stage_review() before native review prompt
✓ Drift warnings visible in final review output
✓ Event logging: emit_event "review.drift_detected"
✓ Unit tests pass (5 test cases)
✓ Integration tests pass (2 test cases)
✓ npm test passes (no regressions)
✓ Inline documentation added
✓ Manual fail-open verification completed
✓ Issue #417 resolved, Issue #128 superseded
Approach: Parse markdown section for filenames, compare against git diff
Pros:
- Simple: straightforward regex/grep pattern matching
- Reusable: Works with standardized plan.md format
- Low risk: ~20 lines of code
- Already validated: plan.md format is stable
Cons:
- Fragile to format changes (but low risk)
- Doesn't detect deleted planned files
Chosen because: Simplicity + low risk + proven plan.md structure
Approach: Extract from git log/blame to infer planned vs actual
Pros:
- Format-independent
- Automatically adapts
Cons:
- Complex implementation
- Slow (git operations)
- Uncertain inference
- Higher failure modes
Rejected: Too complex without proportional benefit
Approach: Save planned file list in pipeline-state.md or JSON
Pros:
- Decoupled from plan.md parsing
Cons:
- Requires state management changes
- More infrastructure
- No simplicity gain
Rejected: More infrastructure without benefit
| Risk | Impact | Mitigation |
|---|---|---|
| plan.md parsing breaks on unexpected format | Drift detection fails silently | Fail-open design: return empty string, log warning, review continues |
| git diff performance on large repos | Slow review stage | Already called in review (line 37), no additional overhead |
| False positives: flagging legitimate skips | Reviewer confusion |
[DRIFT-WARNING] provides context, not automatic failure |
| Review prompt becomes too large | Prompt truncation hides warnings |
guard_prompt_size() already handles prompt inflation |
| Git failures (network, permissions) | Review stage breaks | Catch errors, return empty string, don't break review |
- Plan-to-build drift is now detectable before review/audit/CQ burn time
- Drift warnings help reviewers understand scope completeness
- Zero breaking changes to existing review behavior
- Issue #417 closes, Issue #128 superseded
- Test suite passes without regressions