Pipeline Design 179

ezigus edited this page Mar 17, 2026 · 2 revisions


Design: Success pattern learning and template recommendation engine

Context

Shipwright pipelines use templates (fast, standard, full, hotfix, autonomous, cost-aware) that determine which stages run and how gates are handled. Today, template selection is manual or daemon-configured — there's no learning from historical outcomes.

The core success pattern engine (scripts/sw-success-patterns.sh, 586 lines) and 26 tests already exist. It captures patterns on pipeline completion, computes TF-IDF keyword similarity, and recommends templates. However, four gaps prevent it from being a closed-loop learning system:

  1. Acceptance tracking is dead code: sw-pipeline.sh:2894 reads recommended_template: from the state file, but nothing writes it after intake.
  2. Cost is always 0: success_capture_pattern() hardcodes cost_usd: 0 (line 242) even though the pipeline computes total_cost at lines 2680/2921.
  3. No success correlation: the engine tracks acceptance/rejection but never records whether accepted recommendations led to successful pipelines.
  4. No issue_type field: patterns have labels and complexity but lack an explicit issue type (bug/feature/etc).

Constraints: Bash 3.2 compatibility required. All JSON manipulation via jq --arg (no string interpolation). Atomic writes via tmp+mv. The success-patterns.json schema must remain backwards compatible (additive fields only, jq // 0 defaults).
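
The JSON constraints above can be illustrated with a minimal sketch. The helper name sp_update_json, the demo file, and the reduced pattern object are hypothetical and not taken from sw-success-patterns.sh. Only jq --arg, the tmp+mv write, and the // 0 default come from the constraints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply a jq filter to a JSON file atomically.
# Usage: sp_update_json <file> <jq-filter> [extra jq options...]
sp_update_json() {
  local file="$1" filter="$2"
  shift 2
  local tmp="${file}.tmp.$$"
  # Write to a temp file first; replace the original only if jq succeeds.
  if jq "$@" "$filter" "$file" > "$tmp" 2>/dev/null; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Demo on a throwaway file, not the real success-patterns.json.
demo_file="$(mktemp)"
printf '%s\n' '{"version":1,"patterns":[],"stats":{}}' > "$demo_file"

# Values enter through --arg, never by interpolating into the filter string,
# so quotes in the value cannot break the JSON.
sp_update_json "$demo_file" \
  '.patterns += [{template: $tpl}]
   | .stats.recommendations_succeeded = ((.stats.recommendations_succeeded // 0) + 1)' \
  --arg tpl 'fast "quoted"'

jq -c . "$demo_file"
```

Because the original file is replaced only after jq exits successfully, a malformed filter or a full disk leaves the previous contents in place.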

Decision

Minimal targeted fixes (~80 lines across the four files listed in the Implementation Plan). All changes are additive JSON fields: no schema migration, no new dependencies.

Component Diagram

┌──────────────────────────────────────────────────────────────────┐
│                        sw-pipeline.sh                            │
│                                                                  │
│  ┌─────────┐    ┌──────────────┐    ┌────────────────────────┐  │
│  │  Intake  │───▶│ Display Rec  │───▶│ Persist recommended_   │  │
│  │  Stage   │    │ (existing)   │    │ template to state file │  │
│  └─────────┘    └──────────────┘    └────────────────────────┘  │
│                                              │                   │
│                                              ▼                   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                  Pipeline Completion                      │   │
│  │  1. Compute total_cost → export PIPELINE_COST_USD        │   │
│  │  2. memory_finalize_pipeline() → capture_pattern()       │   │
│  │  3. success_track_acceptance(recommended, actual)         │   │
│  │  4. success_track_correlation(recommended, actual, pass?) │   │
│  └──────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│                  sw-success-patterns.sh                           │
│                                                                  │
│  ┌──────────────────┐  ┌─────────────────────┐                  │
│  │ capture_pattern() │  │ recommend_template() │                  │
│  │ +issue_type field │  │ (TF-IDF similarity)  │                  │
│  │ +PIPELINE_COST_USD│  └─────────────────────┘                  │
│  └──────────────────┘                                            │
│  ┌──────────────────────┐  ┌──────────────────┐                 │
│  │track_correlation()   │  │ show_stats()      │                 │
│  │ NEW: accepted+pass → │  │ +correlation_rate │                 │
│  │ increment succeeded  │  └──────────────────┘                 │
│  └──────────────────────┘                                        │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│          ~/.shipwright/memory/<repo-hash>/                        │
│          success-patterns.json                                    │
│                                                                  │
│  { "version": 1,                                                 │
│    "patterns": [ { ..., "cost_usd": 5.23,                       │
│                        "issue_type": "feature" } ],              │
│    "stats": { ..., "recommendations_succeeded": 12 } }          │
└──────────────────────────────────────────────────────────────────┘

Interface Contracts

// Gap 1: State file persistence (in sw-pipeline.sh, after intake stage)
// Write "recommended_template: <template>" to STATE_FILE
// Read by existing grep at line 2894

// Gap 2: Cost capture (sw-success-patterns.sh)
success_capture_pattern(state_file: string, artifacts_dir: string): void
// Reads $PIPELINE_COST_USD env var (set by sw-pipeline.sh before calling)
// Falls back to 0 if unset — backwards compatible

// Gap 3: Correlation tracking (NEW function)
success_track_correlation(
  recommended_template: string,  // what was recommended
  actual_template: string,       // what was used  
  outcome: "success" | "failure" // pipeline result
): void
// If recommended == actual AND outcome == "success": 
//   stats.recommendations_succeeded += 1
// Emits success.correlation event

// Gap 4: Issue type extraction (within existing capture_pattern)
// Reads .issue_type from intake-metadata.json
// Adds "issue_type": "feature"|"bug"|... to pattern JSON
// Defaults to null if not present

// Updated stats display
success_show_stats(): void
// Now includes:
//   Correlation rate: succeeded/accepted * 100%
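
The Gap 3 contract could be realized roughly as follows. This is a sketch under assumptions: SP_FILE and the seed document are placeholders, and the success.correlation event is left out because the event emitter's interface is not described in this design.

```shell
#!/usr/bin/env bash
# Placeholder store for the demo; the real file lives under ~/.shipwright/memory/.
SP_FILE="$(mktemp)"
printf '%s\n' '{"version":1,"patterns":[],"stats":{}}' > "$SP_FILE"

# success_track_correlation <recommended> <actual> <outcome>
success_track_correlation() {
  local recommended="$1" actual="$2" outcome="$3"
  # Nothing to correlate when no recommendation was recorded.
  [ -n "$recommended" ] || return 0
  # Only an accepted recommendation (recommended == actual) that succeeded counts.
  [ "$recommended" = "$actual" ] || return 0
  [ "$outcome" = "success" ] || return 0

  local tmp="${SP_FILE}.tmp.$$"
  # "// 0" keeps files written before this field existed readable.
  if jq '.stats.recommendations_succeeded = ((.stats.recommendations_succeeded // 0) + 1)' \
       "$SP_FILE" > "$tmp" 2>/dev/null; then
    mv "$tmp" "$SP_FILE"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Three cases: accepted+success, accepted+failure, rejected+success.
success_track_correlation standard standard success
success_track_correlation standard standard failure
success_track_correlation standard fast     success

jq -r '.stats.recommendations_succeeded // 0' "$SP_FILE"
```

Only the first call changes the counter. At the call site the design wraps the function in 2>/dev/null || true so a failure here never blocks pipeline completion.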

Data Flow

Pipeline Start:
  intake completes
    → success_recommend_template(goal, labels, complexity)
    → returns {template, confidence, rationale} (or empty)
    → display recommendation box (existing)
    → NEW: write "recommended_template: <template>" to state file

Pipeline Completion:
  compute total_cost (line ~2680 or ~2921)
    → NEW: export PIPELINE_COST_USD=$total_cost
  memory_finalize_pipeline()
    → success_capture_pattern(state, artifacts)
      → reads PIPELINE_COST_USD (replaces hardcoded 0)
      → reads issue_type from intake-metadata.json
  success_track_acceptance(recommended, actual) — existing, now works
    → NEW: success_track_correlation(recommended, actual, outcome)
      → if accepted AND succeeded: stats.recommendations_succeeded += 1
      → emit success.correlation event
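
The state-file step in the flow above can be sketched as a write/read pair. The function names and the demo state file are hypothetical. The design appends the line with sed, whereas this sketch swaps in a grep -v plus printf rewrite so that re-runs leave exactly one entry.

```shell
#!/usr/bin/env bash
# Throwaway state file for the demo; the pipeline uses its own STATE_FILE.
STATE_FILE="$(mktemp)"
printf '%s\n' 'status: running' 'template: standard' > "$STATE_FILE"

# Write "recommended_template: <template>", replacing any earlier entry.
persist_recommended_template() {
  local template="$1"
  [ -n "$template" ] || return 0          # no recommendation, nothing to write
  local tmp="${STATE_FILE}.tmp.$$"
  grep -v '^recommended_template:' "$STATE_FILE" > "$tmp" 2>/dev/null || true
  printf 'recommended_template: %s\n' "$template" >> "$tmp"
  mv "$tmp" "$STATE_FILE"
}

# Read the value back, stripping the key and any surrounding whitespace.
read_recommended_template() {
  sed -n 's/^recommended_template:[[:space:]]*//p' "$STATE_FILE" \
    | sed 's/[[:space:]]*$//' \
    | head -n 1
}

persist_recommended_template standard
persist_recommended_template fast      # a second write replaces the first
read_recommended_template
```

Stripping trailing whitespace on read guards against the quoting and whitespace issues called out as a risk for the grep-based format.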

Error Boundaries

  • State file write failure (Gap 1): sed append wrapped in || true — recommendation display still works, acceptance tracking degrades gracefully to no-op (same as current behavior).
  • Missing PIPELINE_COST_USD (Gap 2): ${PIPELINE_COST_USD:-0} — falls back to current behavior.
  • Correlation function failure (Gap 3): Called with 2>/dev/null || true — pipeline completion is never blocked.
  • Missing intake-metadata.json (Gap 4): jq -r '.issue_type // ""' with 2>/dev/null || true — field remains null.

All error handling follows the existing pattern: non-critical operations never block the pipeline.
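
As an illustration of the Gap 2 and Gap 4 fallbacks, the sketch below builds the two new pattern fields. The function names are hypothetical, and converting an empty lookup result to JSON null is an assumption about how "field remains null" would be achieved.

```shell
#!/usr/bin/env bash
# Gap 4: read .issue_type, tolerating a missing or malformed metadata file.
read_issue_type() {
  local meta="$1" value=""
  if [ -f "$meta" ]; then
    value="$(jq -r '.issue_type // ""' "$meta" 2>/dev/null || true)"
  fi
  printf '%s' "$value"
}

# Build the new fields; both inputs reach jq through --arg.
build_pattern_fields() {
  local meta="$1" issue_type
  issue_type="$(read_issue_type "$meta")"
  # Gap 2: ${PIPELINE_COST_USD:-0} falls back to 0 when the variable is unset.
  # "tonumber? // 0" also falls back to 0 if the value is not numeric.
  jq -cn --arg cost "${PIPELINE_COST_USD:-0}" --arg type "$issue_type" \
    '{cost_usd: ($cost | tonumber? // 0),
      issue_type: (if $type == "" then null else $type end)}'
}

# Case 1: no cost exported, no metadata file.
unset PIPELINE_COST_USD
fields_missing="$(build_pattern_fields /nonexistent/intake-metadata.json)"

# Case 2: cost exported and metadata present.
meta_file="$(mktemp)"
printf '%s\n' '{"issue_type":"feature"}' > "$meta_file"
export PIPELINE_COST_USD=5.23
fields_full="$(build_pattern_fields "$meta_file")"

printf '%s\n%s\n' "$fields_missing" "$fields_full"
```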

Alternatives Considered

  1. Embedding-based similarity matching — Pros: better semantic understanding of issue descriptions / Cons: adds model dependency (need an embedding model at query time), overkill for categorical+keyword matching, violates the <100ms query constraint for local-only operation. Shell implementation would require an external service call.

  2. SQLite storage instead of JSON — Pros: proper indexing, better query performance at scale / Cons: adds binary dependency, current 200-pattern FIFO cap keeps JSON fast enough, jq queries are <50ms on 200 records. Would revisit if cap increases to 1000+.

  3. Full rewrite in Node/TypeScript — Pros: matches project's Node toolchain, better testability / Cons: shell integration is the natural fit for pipeline scripts, existing 26 tests pass, rewrite risk for no functional gain. The pattern engine is a pipeline-internal concern, not a user-facing API.

Implementation Plan

  • Files to create: None
  • Files to modify:
    • scripts/sw-success-patterns.sh — Add issue_type extraction in success_capture_pattern(), read $PIPELINE_COST_USD instead of hardcoded 0, add success_track_correlation() function, update success_show_stats() with correlation rate
    • scripts/sw-pipeline.sh — Persist recommended_template: to state file after intake, export PIPELINE_COST_USD before finalize, call success_track_correlation() at completion
    • scripts/sw-success-patterns-test.sh — 5 new tests (correlation accepted+success, accepted+failure, rejected+success, cost capture, issue_type extraction)
    • config/event-schema.json — Add success.correlation event type
  • Dependencies: None (all existing: jq, sed, grep)
  • Risk areas:
    • Cost timing: memory_finalize_pipeline (line 2888) currently runs before total_cost is computed (line 2913+). Because success_capture_pattern reads PIPELINE_COST_USD from inside memory_finalize_pipeline, the export must happen first: either compute and export the cost before the finalize call, or move the finalize call after the cost computation.
    • State file format: Adding recommended_template: is additive and grep-based — low risk, but must ensure no trailing whitespace or quoting issues.
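
The cost-timing risk can be demonstrated with stubs. Everything here is a stand-in: the three functions only mimic the call chain described above, and compute_total_cost with its fixed value is hypothetical.

```shell
#!/usr/bin/env bash
# Stubs that mimic the call chain; the real functions do far more.
success_capture_pattern()  { captured_cost="${PIPELINE_COST_USD:-0}"; }
memory_finalize_pipeline() { success_capture_pattern; }
compute_total_cost()       { printf '%s' '5.23'; }   # hypothetical fixed value

# Current order: finalize runs before the cost is exported, so 0 is captured.
unset PIPELINE_COST_USD
memory_finalize_pipeline
total_cost="$(compute_total_cost)"
export PIPELINE_COST_USD="$total_cost"
cost_when_finalize_first="$captured_cost"

# Required order: export first, then finalize.
unset PIPELINE_COST_USD
total_cost="$(compute_total_cost)"
export PIPELINE_COST_USD="$total_cost"
memory_finalize_pipeline
cost_when_export_first="$captured_cost"

printf 'finalize first: %s\nexport first:   %s\n' \
  "$cost_when_finalize_first" "$cost_when_export_first"
```

With the wrong order the ${PIPELINE_COST_USD:-0} fallback hides the problem: nothing fails, and every pattern silently records a cost of 0.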

Validation Criteria

  • After intake, grep 'recommended_template:' .claude/pipeline-state.md returns the recommended template (when patterns exist)
  • success_track_acceptance is called with a non-empty _rec_template (existing dead code path now executes)
  • Captured patterns have cost_usd > 0 when PIPELINE_COST_USD is set
  • Captured patterns have issue_type populated when intake-metadata.json contains it
  • success_track_correlation increments recommendations_succeeded only when accepted AND outcome is success
  • success_show_stats displays correlation rate (succeeded/accepted)
  • success.correlation event emitted at pipeline completion
  • All 26 existing tests still pass
  • 5 new tests pass (correlation x3, cost, issue_type)
  • npm test shows no regressions

Schema Changes

Forward (additive only):

// success-patterns.json — stats object gains:
"recommendations_succeeded": 0

// success-patterns.json — each pattern object gains:
"issue_type": "feature"  // nullable, from intake-metadata.json

// success-patterns.json — each pattern object changes:
"cost_usd": 5.23  // was always 0, now reads PIPELINE_COST_USD

Rollback: git revert. Old code never reads the new fields, and new code reads files that lack them through jq // 0 and // null defaults. No data migration is needed in either direction.
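
A sketch of how the correlation rate could be read from both old and new files follows. Only recommendations_succeeded is named in this design. The recommendations_accepted field name, the function name, and the rounding down to a whole percent are assumptions.

```shell
#!/usr/bin/env bash
# Print succeeded/accepted as a whole percentage, or 0 when nothing was accepted.
correlation_rate() {
  local file="$1"
  jq -r '
    (.stats.recommendations_accepted  // 0) as $accepted
    | (.stats.recommendations_succeeded // 0) as $succeeded
    | if $accepted > 0
      then ($succeeded * 100 / $accepted | floor)
      else 0
      end
  ' "$file"
}

# File written before the new counter existed: the default keeps it readable.
old_file="$(mktemp)"
printf '%s\n' '{"version":1,"patterns":[],"stats":{"recommendations_accepted":4}}' > "$old_file"

# File written after the change.
new_file="$(mktemp)"
printf '%s\n' '{"version":1,"patterns":[],"stats":{"recommendations_accepted":16,"recommendations_succeeded":12}}' > "$new_file"

rate_old="$(correlation_rate "$old_file")"
rate_new="$(correlation_rate "$new_file")"
printf 'old: %s%%\nnew: %s%%\n' "$rate_old" "$rate_new"
```

The explicit check for zero accepted recommendations avoids a division error on a fresh install.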

Idempotency Strategy

  • All JSON writes use atomic tmp+mv (${sp_file}.tmp.$$ + mv)
  • Stats counters use += 1 via jq. Each invocation increments exactly once, so the update is not idempotent across repeated calls; this is acceptable because pipeline completion runs exactly once.
  • Pattern IDs are sha256(goal+template+timestamp) — duplicates from re-runs produce distinct IDs, FIFO cap prevents unbounded growth
  • Correlation tracking is stateless per call — repeated calls with same args increment counters (pipeline completion should only call once)
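
The FIFO cap mentioned above might look like the sketch below. It assumes new patterns are appended at the end of the array so the oldest entries sit at the front. The function name is hypothetical, and the demo uses a cap of 2 rather than 200 to keep the data small.

```shell
#!/usr/bin/env bash
# Keep only the newest <cap> patterns, written atomically via tmp+mv.
apply_fifo_cap() {
  local file="$1" cap="$2"
  local tmp="${file}.tmp.$$"
  # --argjson passes the cap as a number; the slice drops the oldest entries.
  if jq --argjson cap "$cap" \
       '.patterns |= (if length > $cap then .[(length - $cap):] else . end)' \
       "$file" > "$tmp" 2>/dev/null; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}

cap_file="$(mktemp)"
printf '%s\n' '{"version":1,"patterns":[{"id":"a"},{"id":"b"},{"id":"c"}],"stats":{}}' > "$cap_file"

apply_fifo_cap "$cap_file" 2
jq -c '[.patterns[].id]' "$cap_file"
```

Applying the cap is itself idempotent: running it again on an array already at or below the limit leaves the file unchanged.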

Rollback Plan

  1. git revert <commit> — removes all changes
  2. Existing success-patterns.json files remain valid — new fields ignored by old code
  3. recommendations_succeeded counter becomes orphaned but harmless (jq // 0 in old stats display)
  4. No external state to clean up
