Skip to content

Pipeline Design 232

ezigus edited this page Mar 30, 2026 · 3 revisions

ADR written to .claude/pipeline-artifacts/design.md.

Key decisions in the ADR:

  • Defense-in-depth with 3 independent validation gates — initialize_state (unconditional rm), resume_state (issue-match check), stage_build (injection gate). Each layer independently prevents stale task injection.
  • Test-only approach — production fixes are already committed across 5 PRs. This phase adds a dedicated regression test suite (scripts/sw-pipeline-tasks-md-test.sh) with 9 tests.
  • Conservative error handling — malformed task files (missing - Issue: header) are treated as stale and removed rather than injected.
  • Rejected alternatives: single cleanup point (doesn't protect resume path), centralized manager (over-engineering), completion-time cleanup (fragile to crashes). the issue number from the task file's metadata header.

Constraints: Production fixes are already committed across 5 PRs (1a64c9d0, 4a2f315d, 8bfeb411, 70a8aaeb, a730e469). This design focuses exclusively on verification and test coverage — no production code changes.

Root Cause Hypothesis

  1. Most likely — Missing validation on resume path (CONFIRMED FIXED): resume_state() previously did not validate the task file's issue against the current GITHUB_ISSUE. Stale files survived resume and were injected by stage_build(). Fixed in commits referenced above.
  2. Issue number format inconsistency: GITHUB_ISSUE could contain #232 or 232, while the task file might store either format. If normalization differed across paths, comparisons would fail silently. Fixed by consistent tr -d '#' | xargs normalization in all three paths.
  3. Malformed metadata header: A task file missing the - Issue: line would bypass all issue-based checks and get injected regardless. Fixed by treating extraction failure as grounds for file removal.

Evidence Gathered

All three cleanup paths are confirmed present with correct logic:

  • pipeline-state.sh:463: [[ -n "${TASKS_FILE:-}" ]] && rm -f "$TASKS_FILE" — unconditional cleanup on fresh start
  • pipeline-state.sh:607-620: Issue mismatch detection via extract_issue_from_tasks_file(), with rm -f on mismatch or malformed file
  • pipeline-stages-build.sh:172-189: Defense-in-depth gate — only injects tasks when tasks_issue == current_issue, removes file otherwise
  • helpers.sh:429-435: extract_issue_from_tasks_file() uses grep -m1 -i "^-\{0,1\} *Issue:", strips #, trims whitespace, returns exit 1 on failure
  • Normalization: All three paths use identical echo "${GITHUB_ISSUE:-}" | tr -d '#' | xargs
  • Existing tests: sw-lib-pipeline-stages-test.sh:385-655 already covers initialize cleanup, resume mismatch, stage_build injection gate, and normalization variants

Decision

Test-only verification approach — validate the existing three-layer defense without modifying production code.

The fix uses a defense-in-depth pattern with three independent validation gates:

initialize_state()     -> unconditional rm (layer 1: fresh start)
resume_state()         -> issue-match check + rm on mismatch (layer 2: resume)
stage_build()          -> issue-match check + rm on mismatch (layer 3: injection point)

Each layer independently prevents stale task injection. Even if one layer is bypassed (e.g., resume skipped), the next layer catches the mismatch.

Issue normalization is centralized in extract_issue_from_tasks_file() for the file side, and uses tr -d '#' | xargs inline for GITHUB_ISSUE at each comparison point. This is intentionally not DRY'd into a helper — the inline normalization is 6 characters and keeping it visible at each comparison point makes the validation logic self-documenting.

Error handling: Malformed files (missing - Issue: header) are treated as stale and removed. This is the correct conservative choice — a task file that can't be attributed to an issue should never be injected.

Fix Strategy

No new production fix needed. The existing commits implement the correct three-layer defense. The strategy is:

  1. Extend test coverage in sw-lib-pipeline-stages-test.sh (tests already exist at lines 385-655)
  2. Add a dedicated test file scripts/sw-pipeline-tasks-md-test.sh for focused regression coverage
  3. Cover the exact reproduction scenario (issue #154 tasks leaking into #232 pipeline)
  4. Cover edge cases: unset TASKS_FILE, missing file, 5+ metadata format variations

This differs from a "rewrite" approach because the fixes are already correct — we just need confidence through comprehensive test coverage.

Alternatives Considered

  1. Single cleanup point (only in initialize_state) — Pros: simpler, one place to maintain / Cons: doesn't protect resume path, which is the exact bug scenario. Resume skips initialize_state entirely, so stale files would survive.

  2. Centralized task file manager class/module — Pros: DRY, single responsibility / Cons: over-engineering for a flat file with 3 consumers. The current inline checks are 5-8 lines each and self-documenting. Adding an abstraction layer would obscure the validation logic at the point where it matters most.

  3. Delete task file at pipeline completion instead of checking on resume — Pros: prevents accumulation / Cons: doesn't handle crashes, kill -9, or context exhaustion where cleanup code never runs. The current approach (validate on read) is resilient to incomplete shutdown.

Implementation Plan

  • Files to create: scripts/sw-pipeline-tasks-md-test.sh — dedicated regression test suite with 9 tests covering the exact bug scenario, metadata format variations, edge cases (unset variable, concurrent access, corrupted file, missing file)
  • Files to modify: None (production code already fixed)
  • Dependencies: None new
  • Risk areas:
    • Test isolation: each test must use $TMPDIR-based temp files, never live .claude/ directory
    • Sourcing production functions in test context requires careful mock setup (write_state, warn, etc.)
    • extract_issue_from_tasks_file regex relies on grep -m1 -i — test must verify case-insensitive matching

Validation Criteria

  • Bug reproduction test passes: issue #154 tasks file is deleted when resuming for issue #232
  • All 3 cleanup paths verified: initialize_state (unconditional rm), resume_state (issue mismatch rm), stage_build (injection gate)
  • extract_issue_from_tasks_file handles 5+ metadata formats: - Issue: #232, - Issue: 232, Issue: #232, - Issue: #232, and returns error for missing header
  • Issue normalization consistent: all paths produce bare number (no #, no whitespace) before comparison
  • Edge cases pass: unset TASKS_FILE, missing file, malformed file, concurrent initialization
  • Existing test suite (sw-lib-pipeline-stages-test.sh) passes with zero regressions
  • No production code changes in the final diff

Verification Plan

  1. Run existing tests: bash scripts/sw-lib-pipeline-stages-test.sh — confirms no regressions
  2. Run new test suite: bash scripts/sw-pipeline-tasks-md-test.sh — confirms all 9 scenarios pass
  3. Run full suite: npm test — confirms no cross-contamination
  4. Manual verification: inspect git diff to confirm zero production file changes

Clone this wiki locally