Pipeline Design 178


Design: Pipeline cost forecast and budget gate with early warning

Context

Shipwright pipelines consume budget via LLM token usage across 12 stages. Operators need visibility into expected costs before committing to a run, and guardrails to prevent budget exhaustion. The core forecasting, budget gate, variance tracking, and dashboard integration were implemented in commit abd44a8. Four gaps remain:

- No early warning: the gate is binary (block or pass), so nothing warns when a run will consume 50–100% of the remaining budget.
- No confidence range: the forecast shows a point estimate ($X.XX) but no low/high bounds reflecting data quality.
- No estimated_cost on queued dashboard items: the QueueItem type has the field, but daemon dispatch doesn't populate it.
- No data_points on the cost.forecast event: the missing field reduces observability.

Constraints: Bash 3.2 compatibility, set -euo pipefail, awk for floating-point, no new dependencies. Shell-first architecture — dashboard is read-only over events.jsonl.

Decision

Enhance the existing implementation in-place across 3 shell scripts and 1 TypeScript file. No new files for core logic.

Confidence Range

Add low_usd and high_usd to the cost_forecast() JSON output, computed from total_usd and a confidence-dependent spread:

Confidence	Spread	Example ($50 point estimate)
high	±15%	$42.50 – $57.50
medium	±30%	$35.00 – $65.00
low	±50%	$25.00 – $75.00

cost_forecast_display() shows the range for medium and low confidence, for example: $35.00 - $65.00 (medium confidence, 8 runs). For high confidence it keeps the point estimate, since a ±15% range is narrow enough that showing it would add noise rather than information.
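The spread table can be sketched in Bash 3.2 with awk doing the floating-point work. The helper name forecast_range is hypothetical, and treating any unrecognized confidence label as "low" is an assumption rather than documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "low high" for a point estimate and a confidence label.
# Spreads follow the table above: high 15%, medium 30%, low 50%.
forecast_range() {
    local total_usd="$1" confidence="$2" spread
    case "$confidence" in
        high)   spread="0.15" ;;
        medium) spread="0.30" ;;
        *)      spread="0.50" ;;   # assumption: unknown labels fall back to the widest spread
    esac
    awk -v t="$total_usd" -v s="$spread" 'BEGIN {
        low = t * (1 - s)
        if (low < 0) low = 0       # a cost bound should never go negative
        printf "%.2f %.2f\n", low, t * (1 + s)
    }'
}

forecast_range "50.00" medium   # prints: 35.00 65.00
```

The two values would then be merged into the forecast JSON as low_usd and high_usd. They are kept as strings, in line with total_usd.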

Early Warning

After the existing "exceeds budget" block in pipeline_start(), add a consumption-percentage check:

if forecast > 50% of remaining (but ≤ remaining):
    warn "Forecast $X will consume Y% of remaining budget $Z"
    emit_event "cost.budget_high_usage" ...

Single warning per pipeline start — no repeated noise. Does not block, only informs.
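A minimal sketch of the consumption-percentage check follows. The helper names budget_warning_pct and check_early_warning are hypothetical, and warn and emit_event are stubbed here with assumed key=value signatures so the snippet runs on its own; the real helpers in sw-pipeline.sh may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the consumption percentage when the forecast falls
# in the warning band (more than 50% of remaining, but not above it).
# Prints nothing otherwise, including when remaining is non-numeric ("unlimited").
budget_warning_pct() {
    local forecast_usd="$1" remaining_usd="$2"
    awk -v f="$forecast_usd" -v r="$remaining_usd" 'BEGIN {
        if (r + 0 <= 0) exit 0     # zero, negative, or non-numeric budget: nothing to warn about
        pct = f / r * 100
        if (pct > 50 && pct <= 100) printf "%.1f\n", pct
    }'
}

# Stand-ins for the pipeline helpers (assumed signatures, for illustration only).
warn() { echo "WARN: $*" >&2; }
emit_event() { echo "event: $*"; }

check_early_warning() {
    local forecast_usd="$1" remaining_usd="$2" pct
    # An awk failure leaves pct empty, so the check stays non-fatal.
    pct=$(budget_warning_pct "$forecast_usd" "$remaining_usd") || pct=""
    if [ -n "$pct" ]; then
        warn "Forecast \$${forecast_usd} will consume ${pct}% of remaining budget \$${remaining_usd}"
        emit_event "cost.budget_high_usage" \
            "forecast_usd=${forecast_usd}" \
            "remaining_usd=${remaining_usd}" \
            "consumption_pct=${pct}"
    fi
}

check_early_warning "36.25" "50.00"   # warns at 72.5% and emits one event
```

Because the band is bounded above at 100%, a forecast that exceeds the remaining budget produces no warning here and is left to the blocking gate.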

Dashboard Queue Enrichment

In daemon-dispatch.sh, after the existing pre-spawn budget check, write forecast_usd to the job's metadata file. server.ts reads this when building queue items for /api/status.
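The write side of this step might look like the sketch below. The helper name write_forecast_metadata is hypothetical, the .forecast filename is taken from the component diagram, and both the one-value-per-file format and the swallow-all-errors behavior are assumptions chosen to keep the change additive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: records the forecast next to the job so the dashboard
# can read it later. Always returns 0 so a write failure never blocks dispatch.
write_forecast_metadata() {
    local worktree_dir="$1" forecast_usd="$2"
    local target="${worktree_dir}/.forecast" tmp
    [ -d "$worktree_dir" ] || return 0          # no worktree yet: nothing to annotate
    tmp="${target}.tmp.$$"
    # Write to a temp file and rename, so a reader never sees a partial value.
    if { printf '%s\n' "$forecast_usd" > "$tmp"; } 2>/dev/null \
        && mv "$tmp" "$target" 2>/dev/null; then
        return 0
    fi
    rm -f "$tmp" 2>/dev/null || true
    return 0
}
```

On the reading side, server.ts would treat a missing or unparsable file as "no estimate" and leave estimated_cost unset, which matches the field being optional on QueueItem.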

Event Completeness

Add data_points=$(echo "$FORECAST_JSON" | jq -r '.data_points') to the existing emit_event "cost.forecast" call.

Component Diagram

┌─────────────────────────────────────────────────────┐
│                   CLI / Pipeline                     │
│  sw-pipeline.sh                                      │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────┐ │
│  │ Budget Gate   │  │ Early Warning│  │ Variance   │ │
│  │ (block/force) │  │ (warn >50%) │  │ Recording  │ │
│  └──────┬───────┘  └──────┬───────┘  └─────┬──────┘ │
│         │                 │                │         │
└─────────┼─────────────────┼────────────────┼─────────┘
          │ calls           │ calls          │ calls
          ▼                 ▼                ▼
┌─────────────────────────────────────────────────────┐
│                 Cost Engine                           │
│  sw-cost.sh                                          │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────┐ │
│  │cost_forecast()│  │ _display()   │  │_variance() │ │
│  │ +low/high_usd│  │ +range fmt   │  │            │ │
│  └──────┬───────┘  └──────────────┘  └────────────┘ │
│         │ reads                                      │
│         ▼                                            │
│  ┌────────────────┐  ┌───────────────────────┐       │
│  │ Pricing rates  │  │ events.jsonl (history) │       │
│  └────────────────┘  └───────────────────────┘       │
└─────────────────────────────────────────────────────┘

┌──────────────────┐          ┌──────────────────────┐
│ daemon-dispatch  │──writes──▶ job metadata file     │
│ (forecast_usd)   │          │ (worktree/.forecast)  │
└──────────────────┘          └──────────┬───────────┘
                                         │ reads
                              ┌──────────▼───────────┐
                              │ dashboard/server.ts   │
                              │ /api/costs/forecast   │
                              │ /api/status (queue)   │
                              └──────────────────────┘

Interface Contracts

// cost_forecast() JSON output (bash → stdout)
interface CostForecast {
  total_usd: string;        // "50.00"
  low_usd: string;          // "35.00" — NEW
  high_usd: string;         // "65.00" — NEW
  stages: Array<{
    id: string;
    model: string;
    est_duration_s: number;
    est_tokens_in: number;
    est_tokens_out: number;
    est_cost: string;
  }>;
  confidence: "high" | "medium" | "low";
  data_points: number;
  complexity_multiplier: string;
}

// emit_event "cost.forecast" fields
interface CostForecastEvent {
  type: "cost.forecast";
  forecast_usd: string;
  template: string;
  confidence: string;
  issue: string;
  data_points: string;     // NEW
}

// emit_event "cost.budget_high_usage" fields — NEW
interface BudgetHighUsageEvent {
  type: "cost.budget_high_usage";
  forecast_usd: string;
  remaining_usd: string;
  consumption_pct: string;  // e.g. "72.5"
  template: string;
  issue: string;
}

// GET /api/costs/forecast response (unchanged structure, data richer)
interface ForecastResponse {
  recent_forecasts: Array<{
    issue: string;
    template: string;
    forecast_usd: number;
    confidence: string;
    ts: string;
  }>;
  variance_history: Array<{
    forecast_usd: number;
    actual_usd: number;
    variance_pct: number;
    template: string;
    ts: string;
  }>;
}

// QueueItem in /api/status (existing optional field, now populated)
interface QueueItem {
  // ... existing fields ...
  estimated_cost?: number;  // populated from job metadata
}

Data Flow

Pre-start forecast: pipeline_start() → cost_forecast(template_config, complexity) → reads events.jsonl history + template stages → returns JSON with total_usd, low_usd, high_usd, confidence
Display: cost_forecast_display(json) → renders table with range to stdout
Budget gate: compares total_usd vs cost_remaining_budget() → blocks or warns
Early warning: if total_usd / remaining > 0.5 but ≤ 1.0 → warn() + emit_event "cost.budget_high_usage"
Variance: on pipeline completion → cost_record_variance(forecast, actual, template, issue) → emits cost.forecast_variance event
Dashboard: server.ts reads events.jsonl → filters cost.forecast and cost.forecast_variance → returns to frontend
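For the variance step, the percentage can be computed with the same awk pattern used elsewhere. The formula below (signed difference relative to the forecast) is an assumption about what cost_record_variance stores in variance_pct, and the helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: signed variance of actual cost against the forecast,
# as a percentage of the forecast. Positive means the run cost more than predicted.
variance_pct() {
    local forecast_usd="$1" actual_usd="$2"
    awk -v f="$forecast_usd" -v a="$actual_usd" 'BEGIN {
        if (f + 0 <= 0) { print "0.0"; exit 0 }   # avoid dividing by a zero forecast
        printf "%.1f\n", (a - f) / f * 100
    }'
}

variance_pct "50.00" "60.00"   # prints: 20.0
```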

Error Boundaries

Component	Error	Handling
`cost_forecast()`	Missing template file	Returns empty string to stderr, caller skips forecast (existing)
`cost_forecast()`	No events.jsonl	Uses default durations, confidence="low" (existing)
`cost_forecast()`	jq parse failure	`
Budget gate	`cost_remaining_budget` fails	Falls back to "unlimited", skips gate (existing)
Early warning	awk error	Non-fatal — wrapped in conditional, pipeline proceeds
Dashboard `/api/costs/forecast`	events.jsonl unreadable	Returns 500 (existing)
daemon-dispatch	Forecast write failure	`

Alternatives Considered

Statistical confidence intervals (std dev) — Pros: mathematically rigorous, self-calibrating / Cons: needs ≥20 data points per template to be meaningful, complex bash math, overkill for current data volumes. Deferred — can iterate when more history exists; the percentage-spread approach is a pragmatic first step.
Node.js forecast module — Pros: proper floating point, easier unit testing, natural dashboard integration / Cons: breaks shell-first convention, requires Node at pipeline start time, two systems to maintain. Rejected — inconsistent with codebase architecture.
Do nothing — Pros: zero risk / Cons: misses acceptance criteria for early warning and confidence range. Rejected — the gaps are small but material for operator experience.

Implementation Plan

Files to create: None
Files to modify:
- scripts/sw-cost.sh — add low_usd/high_usd to cost_forecast() output; update cost_forecast_display() for range format
- scripts/sw-pipeline.sh — add early warning check after budget gate; add data_points to cost.forecast event
- scripts/lib/daemon-dispatch.sh — write forecast_usd to job metadata
- dashboard/server.ts — include estimated_cost in queue item response
- scripts/sw-cost-test.sh — tests for confidence range and early warning
Dependencies: No new dependencies
Risk areas:
- daemon-dispatch.sh is critical path for daemon — change is additive (write one metadata field), low risk
- awk floating-point precision — acceptable at 4 decimal places for cost estimates
- Confidence spread percentages are heuristic, not statistical — clearly labeled as such

Endpoint Specification

GET /api/costs/forecast

Method: GET
Path: /api/costs/forecast
Query params: period (integer, days, default 30)
Response (200): { recent_forecasts: [...], variance_history: [...] } (see interface above)
Response (200, empty): { recent_forecasts: [], variance_history: [] } when no data exists
Response (500): Internal error reading events.jsonl
Rate limiting: N/A — local developer tool
Versioning: N/A — internal API, no external consumers
Auth: None required (local dashboard)
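A request against this endpoint could be built as sketched below. The helper name forecast_url and the port in the example are hypothetical; only the path and the period parameter with its default of 30 come from the specification above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the request URL, defaulting period to 30 days.
forecast_url() {
    local base_url="$1" period="${2:-30}"
    # Strip one trailing slash so the path is not doubled.
    printf '%s/api/costs/forecast?period=%s\n' "${base_url%/}" "$period"
}

# Example call against a locally running dashboard (the port is an assumption):
#   curl -fsS "$(forecast_url http://localhost:8767 7)" | jq '.recent_forecasts | length'
```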

Error Codes

Code	Condition
200	Success (may have empty arrays)
500	events.jsonl read failure

