Pipeline Design 189
ADR written to .claude/pipeline-artifacts/design.md.
Summary of key decisions:
- Composable middleware (Option B) over 14 per-stage wrappers: most stages share identical orchestration, only build/test are special
- 4 new functions in `pipeline-stages.sh`: `check_human_directives()`, `select_stage_model()`, `broadcast_stage_discovery()`, `run_stage()`
- Self-healing build+test stays inline in `run_pipeline()`: counter coupling makes extraction risky for no testability gain
- Error boundaries: cross-cutting concerns never cause stage failure (fail-open/fire-and-forget), only `run_stage()` propagates actual stage failures
- ~200 lines removed from `run_pipeline()`, ~180 added to `pipeline-stages.sh` as testable functions
- 1 new test file (`sw-lib-pipeline-execution-test.sh`) + ~20 new tests in existing `sw-lib-pipeline-stages-test.sh`
Constraints:
- Functions share state via global variables (`ARTIFACTS_DIR`, `ISSUE_NUMBER`, `CLAUDE_MODEL`, etc.)
- Optional modules (`audit_emit`, `gh_checks_stage_update`, `ucb1_select_model`) may or may not be loaded, so all calls must use `type funcname >/dev/null 2>&1` guards
- The self-healing build+test loop tightly couples two stages with counter management; it cannot be cleanly extracted without breaking the `completed` counter
Extract 4 composable middleware functions into scripts/lib/pipeline-stages.sh, then simplify run_pipeline() to call them. This is Option B from the plan — composable functions rather than 14 per-stage wrappers.
1. check_human_directives(stage_id) — returns 0 (proceed) or 1 (skipped)
Extracts lines 542-569 from run_pipeline(). Handles two file-based intervention mechanisms:
- `skip-stage.txt`: grep for stage ID, remove from file if found, emit `stage.skipped` event
- `human-message.txt`: display message, emit `pipeline.human_message` event, delete file
Fail-safe: all file reads guarded with 2>/dev/null || true. If files are missing or malformed, stage proceeds normally.
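The two mechanisms above might look roughly like the following. This is a sketch rather than the extracted code: the location of both files under `ARTIFACTS_DIR`, the order in which they are checked, and the `emit_event` helper are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch only. Assumed for illustration: both directive files live under
# ARTIFACTS_DIR, and emit_event stands in for the real event emitter.

emit_event() { echo "event: $*"; }  # hypothetical stub

check_human_directives() {
  local stage_id="${1:-}"
  local dir="${ARTIFACTS_DIR:-.}"
  local skip_file="$dir/skip-stage.txt"
  local msg_file="$dir/human-message.txt"
  local msg

  [ -n "$stage_id" ] || return 0

  # skip-stage.txt: if the stage ID is listed, consume it and report "skipped".
  if grep -qxF "$stage_id" "$skip_file" 2>/dev/null; then
    grep -vxF "$stage_id" "$skip_file" > "$skip_file.tmp" 2>/dev/null || true
    mv -f "$skip_file.tmp" "$skip_file" 2>/dev/null || true
    emit_event "stage.skipped" "$stage_id"
    return 1
  fi

  # human-message.txt: display once, emit an event, then delete the file.
  if [ -f "$msg_file" ]; then
    msg="$(cat "$msg_file" 2>/dev/null || true)"
    [ -n "$msg" ] && echo "Human message: $msg"
    emit_event "pipeline.human_message" "$stage_id"
    rm -f "$msg_file" 2>/dev/null || true
  fi

  return 0
}
```

A caller in the stage loop could then write `check_human_directives "$stage_id" || continue`. A missing or unreadable file falls through to `return 0`, which is the fail-open behavior described above.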
2. select_stage_model(stage_id) — returns 0 always (best-effort)
Extracts lines 694-763 from run_pipeline(). Three-tier model selection:
- UCB1 (when `ucb1_select_model` is available and has data): Direct model recommendation from multi-armed bandit
- A/B testing (when `intelligence_recommend_model` is available): Randomized experiment/control split with configurable ratio from `daemon-config.json`
- Graduated (when routing file shows >=50 samples): Bypass A/B, use recommended model directly
Side effect: exports CLAUDE_MODEL, emits intelligence.model_ucb1 or intelligence.model_ab event.
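One way to read the three tiers as control flow is sketched below. Several things here are assumptions, not facts about the real code: that `ucb1_select_model` and `intelligence_recommend_model` print a model name on stdout, the `routing_sample_count` helper (a stand-in for reading the routing file), the `AB_EXPERIMENT_PERCENT` variable (a stand-in for the ratio in `daemon-config.json`), and the event arguments.

```bash
#!/usr/bin/env bash
# Sketch only. Assumptions: both optional selectors print a model name on
# stdout; routing_sample_count and AB_EXPERIMENT_PERCENT are hypothetical
# stand-ins for the routing file and the daemon-config.json ratio.

emit_event() { :; }                                      # hypothetical stub
routing_sample_count() { echo "${ROUTING_SAMPLES:-0}"; } # hypothetical stub

select_stage_model() {
  local stage_id="${1:-}" model="" samples=0

  # Tier 1: UCB1, used only when the module is loaded and returns a model.
  if type ucb1_select_model >/dev/null 2>&1; then
    model="$(ucb1_select_model "$stage_id" 2>/dev/null || true)"
    if [ -n "$model" ]; then
      export CLAUDE_MODEL="$model"
      emit_event "intelligence.model_ucb1" "$stage_id" "$model"
      return 0
    fi
  fi

  # The remaining tiers need a recommendation; without one, keep the current model.
  type intelligence_recommend_model >/dev/null 2>&1 || return 0
  model="$(intelligence_recommend_model "$stage_id" 2>/dev/null || true)"
  [ -n "$model" ] || return 0

  samples="$(routing_sample_count "$stage_id" 2>/dev/null || echo 0)"
  if [ "${samples:-0}" -ge 50 ] 2>/dev/null; then
    # Graduated: enough samples, bypass A/B and use the recommendation directly.
    export CLAUDE_MODEL="$model"
  elif [ $((RANDOM % 100)) -lt "${AB_EXPERIMENT_PERCENT:-20}" ]; then
    # A/B experiment arm: try the recommended model.
    export CLAUDE_MODEL="$model"
    emit_event "intelligence.model_ab" "$stage_id" "experiment" "$model"
  else
    # A/B control arm: leave CLAUDE_MODEL unchanged.
    emit_event "intelligence.model_ab" "$stage_id" "control" "${CLAUDE_MODEL:-}"
  fi
  return 0
}
```

Every path ends in `return 0`, so a missing module or an empty recommendation leaves `CLAUDE_MODEL` untouched instead of failing the stage.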
3. broadcast_stage_discovery(stage_id) — returns 0 always (fire-and-forget)
Extracts lines 809-821 from run_pipeline(). Maps stage ID to discovery category and file patterns:
- `plan` -> `*.md`
- `design` -> `*.md,*.ts,*.tsx,*.js`
- `build` -> `src/*,*.ts,*.tsx,*.js`
- `test` -> `*.test.*,*_test.*`
- `review` -> `*.md,*.ts,*.tsx`
- default -> `*`
Calls sw-discovery.sh broadcast as a subprocess. All errors suppressed.
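The mapping lends itself to a `case` statement. The sketch below covers only the file patterns, not the discovery category. The helper name `stage_discovery_patterns`, the `SCRIPT_DIR` variable, and the arguments passed to `sw-discovery.sh broadcast` are assumptions, since the source does not show the script's interface.

```bash
#!/usr/bin/env bash
# Sketch only. The pattern table comes from the mapping above; the helper name,
# SCRIPT_DIR, and the arguments passed to sw-discovery.sh are assumptions.

stage_discovery_patterns() {
  case "${1:-}" in
    plan)   echo '*.md' ;;
    design) echo '*.md,*.ts,*.tsx,*.js' ;;
    build)  echo 'src/*,*.ts,*.tsx,*.js' ;;
    test)   echo '*.test.*,*_test.*' ;;
    review) echo '*.md,*.ts,*.tsx' ;;
    *)      echo '*' ;;
  esac
}

broadcast_stage_discovery() {
  local stage_id="${1:-}" patterns
  patterns="$(stage_discovery_patterns "$stage_id")"
  # Fire-and-forget: a missing script or a failing broadcast never fails the stage.
  "${SCRIPT_DIR:-.}/sw-discovery.sh" broadcast "$stage_id" "$patterns" \
    >/dev/null 2>&1 || true
  return 0
}
```

Keeping the table in its own function makes the mapping testable without ever launching the subprocess.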
4. run_stage(stage_id, enabled_count, completed_count) — returns 0 (success) or 1 (failure)
Wraps lines 766-852 from run_pipeline() into a composite function that orchestrates:
- Progress display (Stage: id [n/total])
- Status update to `running`
- Start time recording + event emission
- GitHub Check Run `in_progress` update
- Audit trail `stage.start` emission
- Delegate to `run_stage_with_retry(stage_id)`
- On success: mark complete, capture patterns (intake), timing, events, audit, vitals, UCB1 outcome, discovery broadcast, model routing log
- On failure: mark failed, error events, audit, vitals, UCB1 outcome, cancel remaining check runs
Sets LAST_STAGE_ERROR and LAST_STAGE_ERROR_CLASS on failure for caller consumption.
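A trimmed-down sketch of the composite is shown below, covering progress display, status, timing, the guarded optional calls, delegation, and the failure globals. It omits vitals, UCB1 outcome, discovery broadcast, and routing log. `update_status` and `emit_event` are hypothetical stubs, the error text and class are placeholders, the argument shapes of the optional modules are assumed, and `run_stage_with_retry` is expected to come from the existing code.

```bash
#!/usr/bin/env bash
# Sketch only. update_status and emit_event are hypothetical stubs; the error
# text and class are placeholders; several steps from the list are omitted.

update_status() { :; }  # hypothetical stub
emit_event()    { :; }  # hypothetical stub

run_stage() {
  local stage_id="${1:-}" enabled_count="${2:-0}" completed_count="${3:-0}"
  local start_epoch duration

  echo "Stage: $stage_id [$((completed_count + 1))/$enabled_count]"
  update_status "$stage_id" "running"
  start_epoch="$(date +%s)"

  # Optional modules: guarded, and never allowed to fail the stage.
  if type gh_checks_stage_update >/dev/null 2>&1; then
    gh_checks_stage_update "$stage_id" "in_progress" 2>/dev/null || true
  fi
  if type audit_emit >/dev/null 2>&1; then
    audit_emit "stage.start" "$stage_id" 2>/dev/null || true
  fi

  # Called directly (no pipe, no $()), so its status and globals reach us.
  if run_stage_with_retry "$stage_id"; then
    duration=$(( $(date +%s) - start_epoch ))
    update_status "$stage_id" "complete"
    emit_event "$stage_id" "completed in ${duration}s"
    return 0
  fi

  update_status "$stage_id" "failed"
  LAST_STAGE_ERROR="stage $stage_id failed"  # placeholder text
  LAST_STAGE_ERROR_CLASS="unknown"           # placeholder class
  return 1
}
```

The only non-zero return comes from the delegated stage itself, which matches the fail-propagate boundary: everything around it is best-effort.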
Kept inline in `run_pipeline()`:
- Self-healing build+test loop (lines 612-648): Tightly couples two stages with counter management (`completed += 2`). Extracting this would require passing mutable counter state through function boundaries, adding complexity for no testability gain.
- Gate checks (lines 664-679): Interactive `read` prompt that controls pipeline pause/resume flow. Must stay in the loop to `return 0` from `run_pipeline()`.
- Budget enforcement (lines 681-692): Similar flow control; it needs to pause the pipeline, not just skip a stage.
- Intelligence skip evaluation (lines 577-586): Already a clean single function call; wrapping it adds nothing.
- CI resume logic (lines 596-609): Artifact verification that may fall through to stage execution.
| Function | Error Boundary | Failure Mode |
|---|---|---|
| `check_human_directives` | Fail-open | File errors suppressed, stage proceeds |
| `select_stage_model` | Fail-open | All paths guarded, falls back to `_smart_model` default `sonnet` |
| `broadcast_stage_discovery` | Fire-and-forget | Subprocess errors suppressed with `2>/dev/null` |
| `run_stage` | Fail-propagate | Returns 1 on stage failure, caller handles pipeline-level response |
All new functions operate on globals already set by sw-pipeline.sh and pipeline-execution.sh. Each function uses ${VAR:-default} for every global reference, matching the existing convention in pipeline-stages.sh (lines 31-48). No new globals introduced.
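As a small illustration of that convention, the sketch below reads three of the globals with fallbacks and runs under `set -u` with all of them unset. The function name and the fallback values are illustrative, not the defaults used in `pipeline-stages.sh`.

```bash
#!/usr/bin/env bash
# Sketch only: describe_context and the fallback values are illustrative.

describe_context() {
  local artifacts="${ARTIFACTS_DIR:-.claude/pipeline-artifacts}"
  local issue="${ISSUE_NUMBER:-none}"
  local model="${CLAUDE_MODEL:-sonnet}"
  echo "artifacts=$artifacts issue=$issue model=$model"
}

# Run under set -u in a subshell with the globals unset, as a bare test might.
(
  set -u
  unset ARTIFACTS_DIR ISSUE_NUMBER CLAUDE_MODEL
  describe_context
)
```

With a plain `$ISSUE_NUMBER` reference the subshell would abort with an "unbound variable" error. The `:-` form substitutes the fallback instead, so the function behaves the same in a test harness that sets none of the globals.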
- Per-Stage Wrappers (`run_intake_stage()`, etc.). Pros: Matches issue acceptance criteria literally; each stage wrapper independently testable. Cons: 14 wrapper functions where 12 are identical boilerplate (only build/test have special logic); violates DRY; ~300 lines of duplicated orchestration. Rejected because the orchestration is stage-agnostic.
- Do Nothing. Pros: Zero regression risk; stages are already in separate files. Cons: `run_pipeline()` remains 390 lines mixing concerns; cross-cutting logic untestable in isolation. Rejected because the opportunity to improve testability is worth the moderate risk.
- Event-Driven Stage Lifecycle. Pros: Maximum decoupling via event emitters/listeners. Cons: Bash has no native event system; implementing one adds significant complexity for only 4 cross-cutting concerns. Rejected as overkill.
- `scripts/sw-lib-pipeline-execution-test.sh` (new): Unit tests for `run_stage_with_retry`, `self_healing_build_test`, and orchestration integration
- `scripts/lib/pipeline-stages.sh`: Add 4 new functions (~180 lines)
- `scripts/lib/pipeline-execution.sh`: Simplify `run_pipeline()` (~200 lines removed, ~40 added)
- `scripts/sw-lib-pipeline-stages-test.sh`: Add ~20 unit tests (~250 lines)
- `package.json`: Register `sw-lib-pipeline-execution-test.sh`
- No new external dependencies
- New functions depend on existing loaded modules: `pipeline-state.sh`, `helpers.sh`, `compat.sh`
- All dependencies already sourced before `pipeline-stages.sh` in the load chain
1. Variable Scope Breakage (HIGH likelihood, MEDIUM impact)
Extracted functions reference 15+ globals. If unset in test contexts, functions fail under set -u.
Mitigation: Every global uses ${VAR:-default}. Test setup mirrors existing pipeline-stages-test.sh.
2. Self-Healing Counter Coupling (MEDIUM likelihood, HIGH impact)
The build+test self-healing loop increments completed by 2. If run_stage() is accidentally called during self-healing, counts break.
Mitigation: Self-healing block stays inline. The existing continue on line 648 prevents run_stage() from being reached.
3. Return vs Exit in Subshells (LOW likelihood, HIGH impact)
If a new function is called in a pipeline (`|`) or `$()`, it runs in a subshell: `return` exits only that subshell, and any globals the function sets are lost to the caller.
Mitigation: All new functions called directly (no pipes). Code review enforces this.
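The subshell behavior behind this risk can be reproduced in a few lines. `fail_stage` is a hypothetical function that sets a global and fails, loosely mirroring how `run_stage()` reports errors through `LAST_STAGE_ERROR`.

```bash
#!/usr/bin/env bash
# Sketch only: fail_stage is a hypothetical function that sets a global and fails.

fail_stage() {
  LAST_STAGE_ERROR="boom"
  return 1
}

# Direct call: the caller sees both the status and the global.
LAST_STAGE_ERROR=""
fail_stage || DIRECT_RC=$?
DIRECT_ERR="$LAST_STAGE_ERROR"

# Command substitution: the status survives, the global does not.
LAST_STAGE_ERROR=""
captured="$(fail_stage)" || SUBST_RC=$?
SUBST_ERR="$LAST_STAGE_ERROR"

# Pipeline: the function runs in a subshell, so the global is lost here too.
LAST_STAGE_ERROR=""
fail_stage | cat || true
PIPE_ERR="$LAST_STAGE_ERROR"

echo "direct: rc=${DIRECT_RC:-0} error='$DIRECT_ERR'"
echo "substitution: rc=${SUBST_RC:-0} error='$SUBST_ERR'"
echo "pipeline: error='$PIPE_ERR'"
```

Only the direct call leaves `LAST_STAGE_ERROR` populated for the caller. In the pipeline case the exit status is also that of `cat` unless `pipefail` is set, so a stage failure could go unnoticed entirely.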
4. Module Load Order (LOW likelihood, MEDIUM impact)
New functions may call `audit_emit` and `gh_checks_stage_update`, which load conditionally.
Mitigation: All optional calls use type funcname >/dev/null 2>&1 && guards.
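A guarded call might be wrapped as in the sketch below. The wrapper name `notify_stage_start` and the arguments passed to `audit_emit` are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: notify_stage_start and the audit_emit arguments are assumptions.

notify_stage_start() {
  local stage_id="${1:-}"
  # Call the optional module only if it is loaded; never let it fail the caller.
  if type audit_emit >/dev/null 2>&1; then
    audit_emit "stage.start" "$stage_id" 2>/dev/null || true
  fi
  return 0
}
```

The `if` form is used here instead of `type audit_emit >/dev/null 2>&1 && audit_emit ...` because the `&&` form evaluates to a non-zero status when the module is missing. That matters if it happens to be the last command in a function.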
- `check_human_directives()`, `select_stage_model()`, `broadcast_stage_discovery()`, `run_stage()` exist in `scripts/lib/pipeline-stages.sh`
- `run_pipeline()` reduced by ~200 lines (from ~390 to ~190 in the stage loop)
- `run_pipeline()` calls all 4 new functions (verified by grep)
- All functions use `${VAR:-default}` for every global variable reference
- All optional module calls use `type funcname >/dev/null 2>&1` guards
- Self-healing build+test loop remains inline (not extracted)
- No new subshells introduced for extracted functions
- Unit tests cover happy path, error path, and missing dependencies for each function
- `sw-lib-pipeline-stages-test.sh` passes with ~20 new tests
- `sw-lib-pipeline-execution-test.sh` registered in `package.json`
- `sw-pipeline-test.sh` (58 tests) passes without modification
- `sw-e2e-smoke-test.sh` (19 tests) passes without modification
- No Bash 3.2 incompatibilities introduced
- CLAUDE.md Shared Libraries table updated