
Pipeline Design 23

Seth Ford edited this page Feb 13, 2026 · 2 revisions

Design: Add webhook receiver for instant issue processing

Context

The Shipwright daemon (scripts/sw-daemon.sh) currently discovers new issues via a poll loop that runs every 30–60 seconds. For teams that want near-instant pipeline kickoff when an issue is labeled, this latency is unnecessarily high. GitHub supports webhook delivery for issues.labeled events, which can notify the system in under a second.

Constraints from the codebase:

  • The daemon is pure bash (Bash 3.2 compatible — no associative arrays, no readarray, no ${var,,})
  • The dashboard server (dashboard/server.ts) is Bun/TypeScript and already handles HTTP routes, WebSocket connections, and file-based state
  • Inter-component communication uses file-based protocols (JSONL event logs, heartbeat files, checkpoint files) — not sockets or IPC
  • All scripts use set -euo pipefail, atomic file writes (tmp + mv), and jq --arg for JSON
  • Polling must remain as fallback — webhooks are opt-in, not required

Decision

Use a file-based queue to bridge the webhook HTTP endpoint to the daemon's bash loop.

Data flow

GitHub ──POST──▶ dashboard/server.ts:/api/webhook/github
                   │
                   ├─ Verify HMAC-SHA256 (X-Hub-Signature-256)
                   ├─ Check event=issues, action=labeled, label matches watch_label
                   ├─ Append issue JSON to ~/.shipwright/webhook-queue.jsonl (one line per delivery, single append)
                   ├─ Log delivery to ~/.shipwright/webhook-deliveries.jsonl
                   └─ Return 202 Accepted
                   
daemon poll loop (each iteration, BEFORE polling GitHub API):
                   │
                   ├─ daemon_check_webhook_queue()
                   │    ├─ Atomic rename: webhook-queue.jsonl → webhook-queue.jsonl.processing
                   │    ├─ Read each line, extract issue number + metadata
                   │    ├─ Enqueue via existing daemon_enqueue_issue() path
                   │    ├─ emit_event for each processed webhook delivery
                   │    └─ Delete .processing file
                   └─ daemon_poll_issues() (existing — unchanged, serves as fallback)
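
The filtering step in the flow above (event=issues, action=labeled, label matches watch_label) could be sketched as a pure function. This is a minimal illustration, not the dashboard's actual code: the QueueEntry field names, the toQueueEntry helper, and the "shipwright" label used in the usage note are all assumptions. The payload fields it reads (action, label.name, issue.number, issue.title) are those GitHub sends for an issues event with action labeled.

```typescript
// Only the payload fields this sketch reads from a GitHub `issues` event.
interface IssuesEventPayload {
  action?: string;
  label?: { name?: string };
  issue?: { number?: number; title?: string };
}

// One line of webhook-queue.jsonl. These field names are assumptions,
// not Shipwright's actual schema.
interface QueueEntry {
  issue: number;
  title: string;
  label: string;
  received_at: string;
}

// Returns a queue entry for a matching issues.labeled delivery, or null when
// the delivery should be acknowledged but ignored.
function toQueueEntry(
  eventHeader: string | null, // value of the X-GitHub-Event header
  payload: IssuesEventPayload,
  watchLabel: string,
  now: Date = new Date(),
): QueueEntry | null {
  if (eventHeader !== "issues") return null;
  if (payload.action !== "labeled") return null;
  if (payload.label?.name !== watchLabel) return null;
  const issueNumber = payload.issue?.number;
  if (typeof issueNumber !== "number") return null;
  return {
    issue: issueNumber,
    title: payload.issue?.title ?? "",
    label: watchLabel,
    received_at: now.toISOString(),
  };
}
```

For example, toQueueEntry("issues", { action: "labeled", label: { name: "shipwright" }, issue: { number: 42 } }, "shipwright") yields an entry for issue 42, while a push event or a different label yields null, which the route would answer with 200 OK and no queue write.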

Key design choices

  1. JSONL queue file (~/.shipwright/webhook-queue.jsonl): One JSON object per line. The daemon atomically renames to .processing before reading, so concurrent webhook writes don't race with reads. This follows the same pattern used by events.jsonl and heartbeat files throughout the codebase.

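  2. HMAC-SHA256 verification: The webhook secret is stored in .claude/daemon-config.json under webhook.secret. The dashboard reads it at startup. Verification computes the HMAC with Bun's Web Crypto API (crypto.subtle.importKey + crypto.subtle.sign) and compares it against the X-Hub-Signature-256 header with timingSafeEqual, which comes from node:crypto rather than Web Crypto. Invalid or missing signatures get 401 Unauthorized with no detail in the response body, so nothing leaks about why verification failed.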

  3. Webhook auto-setup (daemon init --webhook --url <url>): Generates secret via openssl rand -hex 32, writes to config, creates the GitHub webhook via gh api repos/{owner}/{repo}/hooks with the issues event. This keeps setup to a single command.

  4. Polling remains unchanged: daemon_check_webhook_queue() is called at the top of daemon_poll_loop, before daemon_poll_issues. If the queue is empty or missing, it's a no-op with negligible overhead (a single file check). Polling still runs on schedule as fallback, handling cases where the webhook endpoint is unreachable.

  5. Delivery stats via GET /api/webhook/status: Reads ~/.shipwright/webhook-deliveries.jsonl to report total deliveries, last delivery timestamp, average processing latency, recent 10 entries, and 24h error count. Dashboard UI polls this endpoint periodically.

  6. Idempotency: The daemon already deduplicates issues by number (won't re-enqueue an issue that's in-progress or completed). Webhook + poll delivering the same issue simultaneously is safe.
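
The signature check from item 2 might look roughly like the following. This is a sketch under stated assumptions: verifySignature is a hypothetical helper name, and for brevity it computes the HMAC with the synchronous createHmac from node:crypto rather than the crypto.subtle calls named above. The comparison uses timingSafeEqual in both variants. GitHub sends the header value as "sha256=" followed by the hex digest of the raw request body.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: returns true only when signatureHeader is the
// HMAC-SHA256 of rawBody under secret, in GitHub's "sha256=<hex>" format.
function verifySignature(
  secret: string,
  rawBody: string,
  signatureHeader: string | null, // value of X-Hub-Signature-256
): boolean {
  // An unset secret or a missing header is treated as a failed verification.
  if (secret === "" || signatureHeader === null) return false;

  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");

  const expectedBytes = Buffer.from(expected, "utf8");
  const receivedBytes = Buffer.from(signatureHeader, "utf8");

  // timingSafeEqual throws if the lengths differ, so check that first.
  if (expectedBytes.length !== receivedBytes.length) return false;
  return timingSafeEqual(expectedBytes, receivedBytes);
}
```

The check has to run against the raw request body before any JSON parsing, because re-serializing a parsed payload is not guaranteed to reproduce the bytes GitHub signed.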

Error handling

  • Dashboard down: Webhook delivery fails at GitHub's end. GitHub does not automatically redeliver failed deliveries, so the daemon poll loop is what picks up the issue on its next cycle.
  • Malformed payload: Caught during JSON parse. Logged to webhook-deliveries.jsonl with status: "error". Returns 400 Bad Request.
  • Queue file contention: Atomic rename prevents read/write races. If rename fails (file doesn't exist), daemon_check_webhook_queue returns immediately — no error.
  • Disk full / write failure: dashboard/server.ts catches write errors, logs them, returns 500. Issue will be picked up by poll fallback.
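
Taken together with the validation criteria, the cases above imply a fixed mapping from outcome to HTTP status. The sketch below shows one way to order the checks. The Deps interface and the handleDelivery name are illustrative assumptions, with verification, filtering, and the queue write injected as callbacks so the ordering is visible on its own.

```typescript
// Result of handling one delivery: the HTTP status to return and the value
// that would be recorded in webhook-deliveries.jsonl.
type Outcome = {
  status: 200 | 202 | 400 | 401 | 500;
  result: "queued" | "ignored" | "error";
};

// Hypothetical seams for the real verification, filtering, and queue write.
interface Deps {
  signatureValid: (rawBody: string) => boolean;
  isWatchedEvent: (payload: unknown) => boolean;
  enqueue: (payload: unknown) => void; // may throw, e.g. when the disk is full
}

function handleDelivery(rawBody: string, deps: Deps): Outcome {
  // 1. Reject unauthenticated deliveries before looking at the payload.
  if (!deps.signatureValid(rawBody)) return { status: 401, result: "error" };

  // 2. Malformed JSON is a client error.
  let payload: unknown;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, result: "error" };
  }

  // 3. Valid but uninteresting events are acknowledged and dropped.
  if (!deps.isWatchedEvent(payload)) return { status: 200, result: "ignored" };

  // 4. A failed queue write is a server error; the poll fallback still applies.
  try {
    deps.enqueue(payload);
  } catch {
    return { status: 500, result: "error" };
  }
  return { status: 202, result: "queued" };
}
```

Checking the signature first means an unauthenticated caller always sees 401, whether or not its body parses, which keeps the 400 path from revealing anything to senders without the secret.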

Alternatives Considered

  1. Daemon listens on HTTP directly (bash + netcat/socat)

    • Pros: No dashboard dependency, daemon is self-contained
    • Cons: Bash HTTP servers are fragile and insecure; HMAC-SHA256 verification in bash is awkward (need openssl dgst); would duplicate the dashboard's existing HTTP infrastructure; no TLS without additional tooling
  2. Named pipe / Unix socket between dashboard and daemon

    • Pros: Lower latency than file-based queue (~0 vs ~1 loop iteration)
    • Cons: Platform-specific behavior (macOS vs Linux named pipes differ); daemon must keep a reader open or messages are lost; breaks the file-based protocol convention used everywhere else in Shipwright; harder to debug (can't cat a pipe to inspect state)
  3. Webhook writes directly to GitHub issue comment, daemon reads comments

    • Pros: No local queue file needed
    • Cons: Adds GitHub API round-trip latency; rate limit concerns; doesn't solve the core problem (daemon still needs to poll for comments)

Implementation Plan

Files to create

None — all changes go into existing files.

Files to modify

  • scripts/sw-daemon.sh: Add WEBHOOK_SECRET/WEBHOOK_ENABLED/WEBHOOK_QUEUE to load_config() (~line 361). Add --webhook/--url flags to daemon_init() (~line 5155). Add daemon_check_webhook_queue() function. Call it at top of daemon_poll_loop before daemon_poll_issues.
  • dashboard/server.ts: Add POST /api/webhook/github route with HMAC verification, queue write, delivery logging. Add GET /api/webhook/status route. Add webhook secret loading from daemon config. Add to isPublicRoute().
  • dashboard/public/index.html: Add webhook status section to dashboard layout.
  • dashboard/public/app.js: Add webhook status panel: indicator, latency, delivery counts, recent deliveries table. Periodic fetch from /api/webhook/status.
  • .claude/daemon-config.json: Add "webhook": { "enabled": false, "secret": "", "url": "" } block.
  • scripts/sw-daemon-test.sh: Test daemon_check_webhook_queue (synthetic JSONL → verify enqueue), empty/missing queue → no-op, daemon_init --webhook → config correct (mock gh api), load_config webhook fields.

Dependencies

None new. Uses Bun built-in crypto.subtle for HMAC. Uses existing openssl and gh CLI for daemon init.

Risk areas

  • sw-daemon.sh is 5565 lines — the load_config and daemon_poll_loop modifications must be surgical to avoid breaking existing behavior. Test coverage via sw-daemon-test.sh mitigates this.
  • HMAC timing safety: the signature comparison must use timingSafeEqual (available in Bun through node:crypto), not string comparison, to prevent timing attacks.
  • Atomic file operations under concurrent writes: multiple near-simultaneous webhook deliveries append to the same JSONL file. Each delivery should be written as one newline-terminated append so that lines do not interleave, and the atomic rename in the daemon read path keeps the daemon from reading a file that new deliveries are still being appended to.
  • daemon_init --webhook calls gh api to create a webhook: it must handle an existing webhook gracefully (GitHub answers a duplicate hook with 422 Unprocessable Entity), and must not store secrets in command history.
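
The append-then-rename protocol behind the concurrency risk can be illustrated end to end in a few lines. The sketch is written in TypeScript to match the dashboard side. The real reader is the bash daemon, which would presumably do the same thing with mv and a while-read loop. The enqueue and drain names are hypothetical, and a leftover .processing file from a crashed run is deliberately not handled here.

```typescript
import { appendFileSync, mkdtempSync, readFileSync, renameSync, unlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Writer side: one newline-terminated append per delivery.
function enqueue(queuePath: string, entry: object): void {
  appendFileSync(queuePath, JSON.stringify(entry) + "\n");
}

// Reader side: claim the whole queue with a rename, then parse it at leisure.
function drain(queuePath: string): unknown[] {
  const processingPath = queuePath + ".processing";
  try {
    renameSync(queuePath, processingPath);
  } catch (err) {
    // No queue file means nothing was delivered since the last drain.
    if ((err as { code?: string }).code === "ENOENT") return [];
    throw err;
  }
  const entries: unknown[] = [];
  for (const line of readFileSync(processingPath, "utf8").split("\n")) {
    if (line.trim() === "") continue;
    try {
      entries.push(JSON.parse(line));
    } catch {
      // Skip a malformed line rather than abandoning the rest of the batch.
    }
  }
  unlinkSync(processingPath);
  return entries;
}

// Usage: exercise the protocol in a throwaway directory.
const demoQueue = join(mkdtempSync(join(tmpdir(), "sw-queue-")), "webhook-queue.jsonl");
enqueue(demoQueue, { issue: 101 });
enqueue(demoQueue, { issue: 102 });
const firstBatch = drain(demoQueue);
const secondBatch = drain(demoQueue);
console.log(firstBatch.length, secondBatch.length);
```

After the rename, the next delivery creates a fresh webhook-queue.jsonl, so the reader never parses a file the writer is still extending by path. A writer that opened the old file just before the rename could still land its line in the .processing file; the poll fallback covers that narrow window.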

Validation Criteria

  • POST /api/webhook/github with valid HMAC-SHA256 signature returns 202 Accepted and appends to webhook-queue.jsonl
  • POST /api/webhook/github with invalid/missing signature returns 401 Unauthorized with no queue write
  • Non-issues.labeled events return 200 OK with no queue write (acknowledged but ignored)
  • daemon_check_webhook_queue() processes queued issues within one loop iteration (<1s typical)
  • Empty or missing queue file causes no errors (silent no-op)
  • Concurrent webhook delivery + poll cycle for the same issue does not produce duplicate pipelines
  • shipwright daemon init --webhook --url <url> generates secret, writes config, creates GitHub webhook
  • All 22 existing test suites pass (npm test)
  • New daemon tests cover: queue processing, empty queue, missing queue, init --webhook config, load_config webhook fields
  • Dashboard webhook status panel shows delivery count, last delivery, latency, and errors
  • All bash additions are Bash 3.2 compatible (no associative arrays, no readarray, no ${var,,})
  • Webhook secret is never logged or exposed in error messages
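
The figures behind the status panel (total deliveries, last delivery, average latency, recent 10 entries, 24h error count) could be derived from webhook-deliveries.jsonl along these lines. The Delivery record shape (ts, status, latency_ms) and the summarize name are assumptions for illustration; the design does not specify the log schema beyond status: "error".

```typescript
// Assumed shape of one line in webhook-deliveries.jsonl.
interface Delivery {
  ts: string; // ISO 8601 timestamp
  status: "queued" | "ignored" | "error";
  latency_ms: number;
}

interface DeliveryStats {
  total: number;
  last_delivery: string | null;
  avg_latency_ms: number | null;
  recent: Delivery[]; // newest first, at most 10
  errors_24h: number;
}

function summarize(jsonl: string, now: Date): DeliveryStats {
  const deliveries: Delivery[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    try {
      deliveries.push(JSON.parse(line) as Delivery);
    } catch {
      // Ignore malformed log lines instead of failing the whole status call.
    }
  }

  const total = deliveries.length;
  const cutoff = now.getTime() - 24 * 60 * 60 * 1000;
  const latencySum = deliveries.reduce((sum, d) => sum + d.latency_ms, 0);

  return {
    total,
    // The log is append-only, so the last line is the most recent delivery.
    last_delivery: deliveries[total - 1]?.ts ?? null,
    avg_latency_ms: total > 0 ? latencySum / total : null,
    recent: deliveries.slice(-10).reverse(),
    errors_24h: deliveries.filter(
      (d) => d.status === "error" && Date.parse(d.ts) >= cutoff,
    ).length,
  };
}
```

Passing now in as a parameter keeps the 24-hour window testable. Since the log only grows, a production version would likely read just the tail of the file instead of the whole thing.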
