
Pipeline Design 23

Seth Ford edited this page Feb 11, 2026 · 2 revisions

The pipeline artifacts directory is also restricted. Here's the complete ADR:


Design: Add webhook receiver for instant issue processing

Context

Shipwright's daemon (sw-daemon.sh) polls GitHub for issues every 30–60 seconds via daemon_poll_issues(). This introduces up to 60 seconds of latency between a user labeling an issue and the pipeline starting. GitHub supports webhook delivery for issues.labeled events, enabling sub-second notification.

Constraints:

  • The dashboard server (dashboard/server.ts, Bun) already serves HTTP at port 8767 with an existing /api/webhook/ci endpoint (line 4074) — this is the natural home for webhook reception.
  • The daemon is a bash process with no HTTP server. Communication between dashboard and daemon must use the filesystem (established pattern: daemon-pause.flag at sw-daemon.sh:3272, daemon.shutdown at line 188, daemon-state.json).
  • All bash must be Bash 3.2 compatible (set -euo pipefail, no associative arrays, no readarray).
  • Atomic file writes required (write to tmp + mv).
  • The daemon's poll loop already sleeps in 1-second intervals (sw-daemon.sh:4197–4202), providing a natural hook point for trigger file detection.

Decision

Approach: Filesystem trigger files bridging dashboard → daemon

The dashboard receives GitHub webhooks, validates HMAC-SHA256 signatures, and writes trigger files to ~/.shipwright/webhook-triggers/. The daemon detects these files during its 1-second sleep loop and runs an immediate poll cycle.

Data Flow

GitHub ──POST──▶ dashboard:8767/api/webhook/github
                  │
                  ├─ Validate X-Hub-Signature-256 (HMAC-SHA256)
                  ├─ Filter: only issues.labeled matching watch_label
                  ├─ Write trigger: ~/.shipwright/webhook-triggers/<issue>.json
                  ├─ Log delivery: ~/.shipwright/webhook-deliveries.jsonl
                  └─ Broadcast to WS clients
                  │
daemon poll loop ─┘
  (every 1s sleep tick)
    ├─ Check ~/.shipwright/webhook-triggers/ for files
    ├─ Remove trigger files (consume)
    └─ Call daemon_poll_issues (immediate cycle)

HMAC-SHA256 Validation (server.ts)

Validation uses the Web Crypto API built into Bun, with a constant-time comparison from Bun's built-in crypto and no external dependencies. The handler reads the X-Hub-Signature-256 header, computes the HMAC-SHA256 of the raw request body with the configured secret, and rejects the request with 401 on mismatch.
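A minimal sketch of the signature check described above. To keep the example short and synchronous it swaps in node:crypto's createHmac and timingSafeEqual (which Bun also provides) in place of the Web Crypto API this ADR names. The helper name verifyGithubSignature and the sample secret are hypothetical. GitHub sends the header value as sha256= followed by the hex digest of the body.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not taken from server.ts): returns true only when the
// X-Hub-Signature-256 header matches the HMAC-SHA256 of the raw request body.
function verifyGithubSignature(
  secret: string,
  rawBody: string,
  signatureHeader: string | null,
): boolean {
  const prefix = "sha256=";
  if (!signatureHeader || !signatureHeader.startsWith(prefix)) return false;

  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHeader.slice(prefix.length), "hex");

  // timingSafeEqual throws when lengths differ, so reject those up front.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

// Usage with a made-up secret and body.
const demoSecret = "example-secret";
const demoBody = '{"action":"labeled"}';
const demoHeader =
  "sha256=" + createHmac("sha256", demoSecret).update(demoBody, "utf8").digest("hex");

const demoValid = verifyGithubSignature(demoSecret, demoBody, demoHeader);
const demoTampered = verifyGithubSignature(demoSecret, demoBody + " ", demoHeader);
```

The check must run on the raw body bytes, before any JSON parsing, because re-serializing the payload can change whitespace and invalidate the digest.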

Trigger File Format (~/.shipwright/webhook-triggers/<issue>.json)

{
  "issue_number": 42,
  "repo": "owner/repo",
  "label": "ready-to-build",
  "timestamp": "2026-02-11T22:00:00Z",
  "delivery_id": "abc-123"
}

Atomic writes: write to .tmp then mv. One file per issue to deduplicate rapid re-labels.
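The write-then-rename pattern could look roughly like the sketch below. The function name writeTriggerFile, the temp-file naming, and the throwaway demo directory are assumptions for illustration; only the trigger fields and the one-file-per-issue rule come from this design.

```typescript
import {
  mkdirSync,
  mkdtempSync,
  readFileSync,
  readdirSync,
  renameSync,
  writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Fields mirror the trigger file format shown above.
interface WebhookTrigger {
  issue_number: number;
  repo: string;
  label: string;
  timestamp: string;
  delivery_id: string;
}

// Hypothetical helper: writes <issue>.json via a temp file plus rename.
// The temp file lives in the same directory so the rename stays on one
// filesystem, and its name does not end in .json, so a reader that only
// looks at *.json never sees a half-written trigger.
function writeTriggerFile(triggerDir: string, trigger: WebhookTrigger): string {
  mkdirSync(triggerDir, { recursive: true }); // lazy directory creation
  const finalPath = join(triggerDir, `${trigger.issue_number}.json`);
  const tmpPath = `${finalPath}.${process.pid}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(trigger) + "\n");
  renameSync(tmpPath, finalPath);
  return finalPath;
}

// Usage against a throwaway directory (not ~/.shipwright): two rapid
// re-labels of issue 42 leave one trigger file holding the latest delivery.
const demoDir = mkdtempSync(join(tmpdir(), "webhook-triggers-"));
const demoBase = {
  issue_number: 42,
  repo: "owner/repo",
  label: "ready-to-build",
  timestamp: "2026-02-11T22:00:00Z",
};
writeTriggerFile(demoDir, { ...demoBase, delivery_id: "abc-123" });
const triggerPath = writeTriggerFile(demoDir, { ...demoBase, delivery_id: "abc-124" });

const triggerFiles = readdirSync(demoDir);
const storedTrigger = JSON.parse(readFileSync(triggerPath, "utf8")) as WebhookTrigger;
```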

Daemon Trigger Detection (sw-daemon.sh)

New function daemon_check_webhook_triggers() called inside the 1-second sleep loop (between lines 4199–4201). Checks ~/.shipwright/webhook-triggers/ for JSON files, logs each trigger, removes the files, and calls daemon_poll_issues if any were found.

The existing sleep loop becomes:

local i=0
while [[ $i -lt $effective_interval ]] && [[ ! -f "$SHUTDOWN_FLAG" ]]; do
    sleep 1
    daemon_check_webhook_triggers || true
    i=$((i + 1))
done

Overhead is negligible on ticks with no triggers: a directory check followed by an early return.

Webhook Setup CLI (daemon init --webhook)

New daemon_init_webhook() function: generates secret via openssl rand -hex 32, writes to daemon-config.json under webhook.secret, creates GitHub webhook via gh api repos/{owner}/{repo}/hooks with issues event scope, writes webhook.dashboard_url to config.

Config Schema Addition

{
  "webhook": {
    "secret": null,
    "dashboard_url": null
  }
}

Both values are loaded in load_config() (~line 362) as WEBHOOK_SECRET and WEBHOOK_DASHBOARD_URL. When webhook.secret is null, the webhook endpoint returns 503; polling continues regardless.

Dashboard Endpoint Details

POST /api/webhook/github (public route — added to isPublicRoute() at server.ts:370):

  • Validate X-Hub-Signature-256 against configured secret
  • Parse X-GitHub-Event header; only process issues events with action: "labeled"
  • Check label matches watch_label from daemon-config.json
  • Write the trigger file to ~/.shipwright/webhook-triggers/<issue>.json and append a delivery log entry to ~/.shipwright/webhook-deliveries.jsonl
  • Return 200 on success, 401 on bad signature, 503 if not configured
  • Idempotent: same issue number overwrites the trigger file
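The event and label filtering in the POST handler can be sketched as a pure function. The payload field names (action, issue.number, label.name, repository.full_name) follow GitHub's issues webhook payload; the function name matchLabeledIssue and its return shape are hypothetical.

```typescript
// Subset of GitHub's "issues" webhook payload that the filter needs.
interface IssuesEventPayload {
  action?: string;
  issue?: { number?: number };
  label?: { name?: string };
  repository?: { full_name?: string };
}

interface TriggerMatch {
  issueNumber: number;
  repo: string;
}

// Hypothetical helper: returns the issue to trigger on, or null when the
// delivery should be ignored (wrong event type, wrong action, wrong label,
// or missing fields).
function matchLabeledIssue(
  eventHeader: string | null, // value of the X-GitHub-Event header
  payload: IssuesEventPayload,
  watchLabel: string,
): TriggerMatch | null {
  if (eventHeader !== "issues") return null;
  if (payload.action !== "labeled") return null;
  if (payload.label?.name !== watchLabel) return null;

  const issueNumber = payload.issue?.number;
  const repo = payload.repository?.full_name;
  if (typeof issueNumber !== "number" || typeof repo !== "string") return null;
  return { issueNumber, repo };
}

// Usage with a made-up payload.
const demoPayload: IssuesEventPayload = {
  action: "labeled",
  issue: { number: 42 },
  label: { name: "ready-to-build" },
  repository: { full_name: "owner/repo" },
};
const demoMatch = matchLabeledIssue("issues", demoPayload, "ready-to-build");
const demoWrongLabel = matchLabeledIssue("issues", demoPayload, "other-label");
const demoWrongEvent = matchLabeledIssue("push", demoPayload, "ready-to-build");
```

Keeping the filter free of I/O makes the "ignores wrong label, wrong action, or wrong event type" criterion straightforward to test without a running dashboard.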

GET /api/webhook/status (protected route):

  • Read ~/.shipwright/webhook-deliveries.jsonl
  • Return last N deliveries, total count, error count

FleetState Extension

Add webhook field to FleetState interface (server.ts:125):

webhook?: {
  configured: boolean;
  total_deliveries: number;
  recent_deliveries: number;
  last_delivery_at: string | null;
  errors_24h: number;
};

Populated from webhook-deliveries.jsonl in getFleetState() (server.ts:697).
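One way to derive the webhook field from the delivery log is sketched below. This ADR does not specify the log line format, so the timestamp and status fields on each JSONL entry are assumptions, as are the reading of recent_deliveries as "within the last 24 hours", the rule that a status of 400 or above counts as an error, and the function name summarizeDeliveries.

```typescript
// Assumed shape of one line in webhook-deliveries.jsonl (not specified by the ADR).
interface DeliveryLogEntry {
  timestamp: string; // ISO 8601
  status: number; // HTTP status returned to GitHub
}

interface WebhookFleetState {
  configured: boolean;
  total_deliveries: number;
  recent_deliveries: number;
  last_delivery_at: string | null;
  errors_24h: number;
}

// Hypothetical helper: folds the raw JSONL text into the FleetState field.
// Lines that are not valid JSON objects with a parseable timestamp are skipped.
function summarizeDeliveries(jsonl: string, configured: boolean, now: Date): WebhookFleetState {
  const cutoff = now.getTime() - 24 * 60 * 60 * 1000;
  const summary: WebhookFleetState = {
    configured,
    total_deliveries: 0,
    recent_deliveries: 0,
    last_delivery_at: null,
    errors_24h: 0,
  };
  let latest = -Infinity;

  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // malformed line
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const entry = parsed as Partial<DeliveryLogEntry>;
    const at = typeof entry.timestamp === "string" ? Date.parse(entry.timestamp) : NaN;
    if (Number.isNaN(at)) continue;

    summary.total_deliveries += 1;
    if (at >= cutoff) {
      summary.recent_deliveries += 1;
      if (typeof entry.status === "number" && entry.status >= 400) summary.errors_24h += 1;
    }
    if (at > latest) {
      latest = at;
      summary.last_delivery_at = entry.timestamp ?? null;
    }
  }
  return summary;
}

// Usage with a made-up log and a fixed clock.
const demoLog = [
  '{"timestamp":"2026-02-01T00:00:00Z","status":500}',
  '{"timestamp":"2026-02-11T22:00:00Z","status":200}',
  '{"timestamp":"2026-02-11T23:00:00Z","status":401}',
  "not json",
].join("\n");
const demoSummary = summarizeDeliveries(demoLog, true, new Date("2026-02-12T00:00:00Z"));
```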

Error Handling

| Failure | Behavior |
|---|---|
| Invalid HMAC signature | 401 response, logged, no trigger file written |
| Dashboard down | Delivery fails (GitHub does not redeliver failed deliveries automatically); polling continues as fallback |
| Daemon not running | Trigger files accumulate, consumed on next start |
| Trigger dir missing | Created lazily by dashboard; daemon early-returns safely |
| Malformed trigger JSON | Daemon logs warning, removes file, continues |
| Disk full | Atomic write fails, 500 to GitHub, polling continues |
| Rapid re-labels | Same issue number overwrites trigger file (dedup) |

Backward Compatibility

  • Webhook is opt-in. Without webhook.secret, endpoint returns 503.
  • Polling continues identically when webhook is not configured.
  • No changes to existing behavior. Trigger check is a no-op when trigger directory doesn't exist.

Alternatives Considered

  1. Standalone webhook server (separate process) — Pros: Independent lifecycle, simpler code isolation / Cons: Another port to expose, another process to manage, duplicates HTTP infrastructure in dashboard/server.ts, still requires filesystem IPC

  2. Named pipe / Unix socket between dashboard and daemon — Pros: Truly instant (~0ms vs ~1s), no filesystem polling / Cons: Complex lifecycle management, Bash 3.2 named pipe is fragile under set -e, doesn't survive daemon restarts (trigger files do), harder to debug

  3. Direct daemon HTTP server (bash + netcat/socat) — Pros: No dashboard dependency / Cons: Extremely fragile, no TLS, no HTTP parsing, blocks main loop, security risk

  4. GitHub Actions workflow as intermediary — Pros: No inbound network needed / Cons: 5–15s Actions startup latency, CI minutes cost, more moving parts

Implementation Plan

  • Files to create:

    • scripts/sw-webhook-test.sh — Test suite for webhook trigger handling, --webhook CLI flag, config loading
  • Files to modify:

    • dashboard/server.ts: POST /api/webhook/github, GET /api/webhook/status, HMAC validation, trigger file writing, delivery log, isPublicRoute(), FleetState.webhook, getFleetState()
    • scripts/sw-daemon.sh: webhook.* in load_config() (~line 362) and defaults (~line 207), daemon_check_webhook_triggers(), sleep loop (~line 4199), daemon_init_webhook(), --webhook CLI flag (~line 256), config template (~line 4521), help text
    • scripts/sw-daemon-test.sh — Webhook config loading + trigger file detection tests
    • .claude/CLAUDE.md — Document webhook config and runtime state paths
    • package.json — Register sw-webhook-test.sh
  • Dependencies: None. Bun's built-in Web Crypto API for HMAC. No new npm packages.

  • New runtime state:

    • ~/.shipwright/webhook-triggers/<issue>.json — Trigger files (transient, consumed by daemon)
    • ~/.shipwright/webhook-deliveries.jsonl — Delivery log (append-only)
  • Risk areas:

    1. Race condition on trigger files — Mitigated by atomic writes (tmp + mv) and daemon consuming files after read
    2. Trigger file accumulation when daemon stopped — On restart, all consumed at once but daemon_poll_issues() respects MAX_PARALLEL and queues excess
    3. HMAC timing attack — Constant-time comparison via Bun's built-in crypto
    4. Dashboard config read — Dashboard reads daemon-config.json for watch_label and webhook.secret; same pattern as existing daemon-state.json reads
    5. Test isolation — Mock trigger directory and file operations; no real dashboard or GitHub

Validation Criteria

  • POST /api/webhook/github returns 401 for invalid HMAC signatures
  • POST /api/webhook/github returns 503 when no webhook secret is configured
  • POST /api/webhook/github writes trigger file only for issues.labeled events matching watch_label
  • POST /api/webhook/github ignores wrong label, wrong action, or wrong event type
  • Trigger files use atomic writes (tmp + mv)
  • daemon_check_webhook_triggers() processes and removes trigger files within 1 second
  • daemon_check_webhook_triggers() is a no-op when trigger directory doesn't exist
  • daemon_check_webhook_triggers() handles malformed JSON without crashing
  • Polling continues identically when webhook is not configured
  • daemon init --webhook generates secret and creates GitHub webhook via gh api
  • GET /api/webhook/status returns delivery history with counts
  • All 22 existing test suites pass
  • New sw-webhook-test.sh suite passes
  • All new bash is Bash 3.2 compatible
  • HMAC comparison is constant-time

The ADR is ready. I wasn't able to write it to .claude/pipeline-artifacts/design.md due to file permissions — you'll need to save it there or grant write access to that directory. The design follows established Shipwright patterns (filesystem IPC, atomic writes, error-guarded poll loop calls) and requires zero new dependencies.
