
Pipeline Design 25

Seth Ford edited this page Feb 13, 2026 · 1 revision



Design: Public real-time dashboard — live pipeline progress visible to anyone with a URL

Context

Shipwright's existing dashboard (dashboard/server.ts, ~4300 lines) is an admin-only tool protected by GitHub OAuth/PAT authentication. It serves the Fleet Command UI from dashboard/public/ via Bun, with real-time updates pushed over a single WebSocket channel (/ws) at 2-second intervals. The FleetState payload includes internal operational data (daemon PID, machine registry, cost info, agent details, developer presence) that is inappropriate for public exposure.

Pipeline progress is already tracked in two places: (1) pipeline-state.md in the worktree's .claude/ directory (YAML frontmatter + log), and (2) a GitHub issue comment updated via gh_update_progress() at each stage transition (tracked by PROGRESS_COMMENT_ID). Neither is consumable as a real-time public feed.

The goal is to let anyone with a URL watch a specific pipeline run in real-time — no login, no GitHub account, no admin access. The solution must:

  • Expose only sanitized, read-only data for a single pipeline run
  • Not weaken existing auth on the admin dashboard
  • Work without a running server (offline/static fallback)
  • Be a standalone HTML page (no build step, no framework)
  • Follow the existing codebase patterns (Bun server, bash pipeline, dark theme, no new dependencies)

Decision

Share Token System

A cryptographically random token (256-bit, hex-encoded) grants read-only access to a single pipeline's public state. Tokens are:

  • Generated by an authenticated admin via POST /api/share/:issue (requires existing session auth) or via shipwright dashboard live --issue <N> CLI
  • Stored in ~/.shipwright/share-tokens.json (file-backed, atomic writes via tmp+rename, same pattern as invite-tokens.json at server.ts:346-353)
  • Scoped to a single issue number
  • Time-limited with a 7-day TTL, cleaned up by a periodic interval (same pattern as invite token cleanup at server.ts:4249-4267)
  • Revocable via DELETE /api/share/:token (admin-only)
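The file-backed store described above can be sketched as follows. This is a minimal illustration, not the code in server.ts: the record shape (ShareTokenRecord) and the helper names saveTokens, loadTokens, and pruneExpired are hypothetical, and the example writes to a scratch directory instead of ~/.shipwright.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical record shape; the real share-tokens.json schema is not specified.
interface ShareTokenRecord {
  token: string;
  issue: number;
  expiresAt: number; // epoch milliseconds
}

const SHARE_TOKEN_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7-day TTL from the ADR

// Write to a temp file, then rename over the target, so a reader never
// observes a half-written file (rename is atomic on the same filesystem).
function saveTokens(path: string, records: ShareTokenRecord[]): void {
  const tmp = `${path}.tmp`;
  writeFileSync(tmp, JSON.stringify(records, null, 2));
  renameSync(tmp, path);
}

// A missing or unreadable file is treated as an empty store.
function loadTokens(path: string): ShareTokenRecord[] {
  try {
    return JSON.parse(readFileSync(path, "utf8")) as ShareTokenRecord[];
  } catch {
    return [];
  }
}

// Drop every record whose TTL has elapsed; intended to run on a periodic interval.
function pruneExpired(records: ShareTokenRecord[], nowMs: number): ShareTokenRecord[] {
  return records.filter((r) => r.expiresAt > nowMs);
}

// Usage: one fresh token and one already-expired token.
const storePath = join(mkdtempSync(join(tmpdir(), "share-tokens-")), "share-tokens.json");
const now = Date.now();
saveTokens(storePath, [
  { token: "sw_live_example_fresh", issue: 25, expiresAt: now + SHARE_TOKEN_TTL_MS },
  { token: "sw_live_example_stale", issue: 24, expiresAt: now - 1 },
]);
const active = pruneExpired(loadTokens(storePath), now);
console.log(active.map((r) => r.issue));
```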

Token format: sw_live_{64-char-hex} — the sw_live_ prefix allows grep-ability and prevents accidental use as other token types.
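A token in this format could be produced and checked roughly like this. The function and constant names are illustrative assumptions; the only parts taken from the design are the sw_live_ prefix and the 64 hex characters (32 random bytes).

```typescript
import { randomBytes } from "node:crypto";

const SHARE_TOKEN_PREFIX = "sw_live_";
// Prefix followed by exactly 64 lowercase hex characters (256 bits).
const SHARE_TOKEN_PATTERN = /^sw_live_[0-9a-f]{64}$/;

// Hypothetical helper: 32 bytes from the CSPRNG, hex-encoded.
function generateShareToken(): string {
  return SHARE_TOKEN_PREFIX + randomBytes(32).toString("hex");
}

// Cheap shape check that can reject malformed input before any store lookup.
function isWellFormedShareToken(value: string): boolean {
  return SHARE_TOKEN_PATTERN.test(value);
}

const exampleToken = generateShareToken();
console.log(exampleToken.length, isWellFormedShareToken(exampleToken));
```

Rejecting malformed strings up front keeps other token types (for example invite tokens) from ever reaching the share-token lookup.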

Public API Surface (No Auth Required)

Three new endpoints are added to isPublicRoute() (server.ts:370-384) so that they bypass session auth:

  1. GET /api/public/pipeline/:token — Returns a PublicPipelineState JSON object (REST polling endpoint):

    interface PublicPipelineState {
      issue: number;
      title: string;
      goal: string;
      status: "running" | "completed" | "failed";
      currentStage: string;
      stageProgress: Array<{
        id: string;
        status: "complete" | "active" | "pending" | "failed";
        duration_s?: number;
      }>;
      events: Array<{
        ts: string;
        type: string;
        stage?: string;
        message: string;
      }>;
      elapsed_s: number;
      prLink: string;
      startedAt: string;
      updatedAt: string;
    }

    No internal state (PIDs, worktree paths, machine info, cost data, agent details, secrets) is exposed. Events are limited to stage transitions and lifecycle events, capped at 50.
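The event sanitization rule can be sketched as an allowlist plus a field-by-field copy. Everything here is an assumption for illustration: the raw event shape, the event type names in PUBLIC_EVENT_TYPES, and the choice to keep the most recent 50 events (the design only says "capped at 50").

```typescript
// Hypothetical raw event as read from events.jsonl; may carry internal fields.
interface RawEvent {
  ts: string;
  type: string;
  stage?: string;
  message?: string;
  [extra: string]: unknown;
}

interface PublicEvent {
  ts: string;
  type: string;
  stage?: string;
  message: string;
}

// Assumed event type names; the real names in events.jsonl are not given in the ADR.
const PUBLIC_EVENT_TYPES = new Set([
  "pipeline.started",
  "stage.started",
  "stage.completed",
  "stage.failed",
  "pipeline.completed",
]);
const MAX_PUBLIC_EVENTS = 50;

function sanitizeEvents(raw: RawEvent[]): PublicEvent[] {
  return raw
    .filter((e) => PUBLIC_EVENT_TYPES.has(e.type))
    .slice(-MAX_PUBLIC_EVENTS) // keep the newest entries (assumption)
    .map((e) => {
      // Copy known fields one by one so unknown fields can never leak through.
      const out: PublicEvent = { ts: e.ts, type: e.type, message: e.message ?? "" };
      if (e.stage !== undefined) out.stage = e.stage;
      return out;
    });
}

const sample = sanitizeEvents([
  { ts: "2026-02-13T10:00:00Z", type: "stage.started", stage: "build", message: "Build started", pid: 4242 },
  { ts: "2026-02-13T10:00:05Z", type: "daemon.heartbeat", worktree: "/tmp/wt" },
]);
console.log(sample);
```

Building the public object from an explicit list of fields, rather than deleting known-bad fields, means a new internal field added later stays private by default.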

  2. GET /ws/live/:token — WebSocket upgrade for real-time push. Sends the current PublicPipelineState on connect. Subsequent pushes piggyback on the existing periodicPush interval (server.ts:4205). A wsLiveClients map, typed Map<ServerWebSocket, { token: string; issue: number }>, tracks each connection so pushes can be scoped to its issue.

  3. GET /live/:token — Serves live.html (standalone viewer, no auth).
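One way to recognize the three public routes and extract the token is shown below. The actual signature of isPublicRoute() is not given in this document, so matchPublicRoute and its return type are hypothetical stand-ins.

```typescript
type PublicRouteKind = "rest" | "ws" | "page";

interface PublicRouteMatch {
  kind: PublicRouteKind;
  token: string;
}

// Each pattern captures exactly one path segment as the token.
const PUBLIC_ROUTES: Array<[PublicRouteKind, RegExp]> = [
  ["rest", /^\/api\/public\/pipeline\/([^/]+)$/],
  ["ws", /^\/ws\/live\/([^/]+)$/],
  ["page", /^\/live\/([^/]+)$/],
];

function matchPublicRoute(method: string, pathname: string): PublicRouteMatch | null {
  if (method !== "GET") return null; // all three public endpoints are GET-only
  for (const [kind, pattern] of PUBLIC_ROUTES) {
    const token = pattern.exec(pathname)?.[1];
    if (token !== undefined) return { kind, token };
  }
  return null;
}

console.log(matchPublicRoute("GET", "/live/sw_live_example"));
```

Because the admin routes (POST /api/share/:issue, DELETE /api/share/:token) never match these patterns, they keep going through the existing session auth.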

Data Flow

sw-pipeline.sh (write_state)
       |
       |---> .claude/pipeline-state.md  (already exists)
       |---> events.jsonl               (already exists)
       +---> .claude/pipeline-artifacts/live-state.json  (NEW)
              |
              |---> server.ts reads this + events.jsonl -> PublicPipelineState
              +---> GitHub Actions pushes to shipwright-data branch (offline fallback)
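On the read side, the server has to turn events.jsonl into a list of events. A tolerant parser might look like the sketch below; the ts and type field names are borrowed from the PublicPipelineState events array and are only assumed to match what sw-pipeline.sh writes.

```typescript
interface PipelineEvent {
  ts: string;
  type: string;
  [key: string]: unknown;
}

// Parse JSON Lines text, skipping blank lines, lines that are not valid JSON
// (for example a line still being appended), and objects missing ts/type.
function parseEventsJsonl(text: string): PipelineEvent[] {
  const events: PipelineEvent[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (
        typeof parsed === "object" &&
        parsed !== null &&
        typeof (parsed as PipelineEvent).ts === "string" &&
        typeof (parsed as PipelineEvent).type === "string"
      ) {
        events.push(parsed as PipelineEvent);
      }
    } catch {
      // Malformed or partially written line: ignore it and keep going.
    }
  }
  return events;
}

const parsedEvents = parseEventsJsonl(
  '{"ts":"2026-02-13T10:00:00Z","type":"stage.started","stage":"build"}\n\n{"ts":"2026-02-13T10:0',
);
console.log(parsedEvents.length);
```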

Standalone Viewer (live.html + live.css + live.js)

  • Connection: WebSocket first, falls back to REST polling (5s) on failure, then static JSON if data-fallback-url is set
  • Layout: Stage timeline (horizontal dots/checkmarks), current stage pulse, activity feed, metadata bar
  • Styling: Matches existing dark theme (#060a14, #00d4ff cyan, Plus Jakarta Sans — same as server.ts:435-509)
  • No dependencies: Pure HTML/CSS/JS, consistent with dashboard/public/app.js
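The connection fallback order can be expressed as a small pure function, which keeps it easy to test separately from the DOM and network code. The names here are hypothetical, and returning null (meaning "nothing left to try, keep showing the banner") is an assumption about behavior the design does not spell out.

```typescript
type Transport = "websocket" | "rest" | "static";

const REST_POLL_INTERVAL_MS = 5000; // 5s polling interval from the design

// Given the transport that just failed, pick the next one to try.
// fallbackUrl stands in for the page's data-fallback-url attribute (null if unset).
function nextTransport(failed: Transport, fallbackUrl: string | null): Transport | null {
  switch (failed) {
    case "websocket":
      return "rest";
    case "rest":
      return fallbackUrl ? "static" : null;
    case "static":
      return null;
  }
}

console.log(nextTransport("websocket", null), REST_POLL_INTERVAL_MS);
```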

Pipeline Integration

  • write_state() (sw-pipeline.sh:1331): Also writes live-state.json using jq --arg
  • gh_build_progress_body() (sw-pipeline.sh:806): Appends live link when share token exists

CLI: shipwright dashboard live

  • --issue <N> — Creates share token, prints live URL
  • --list — Lists active tokens
  • --revoke <token> — Revokes a token

GitHub Actions (Offline Fallback)

shipwright-pipeline.yml is extended with a post-stage step that pushes live-state.json to an orphan shipwright-data branch (using the existing contents: write permission at line 49).

Error Handling

  • Invalid/expired tokens: 404 with {"error": "Invalid or expired share token"} (no existence leakage)
  • Server down: viewer degrades WebSocket -> REST -> static JSON with "Connecting..." banner
  • Stale data: warn when live-state.json is more than 5 minutes old while status is still "running"
  • Token cleanup: 15-minute background interval
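The stale-data rule can be sketched as a pure check. This version uses the updatedAt field from PublicPipelineState as a proxy for the age of live-state.json, which is an assumption (the design may intend the file's modification time); treating an unparseable timestamp as stale is also an assumption.

```typescript
const STALE_AFTER_MS = 5 * 60 * 1000; // 5 minutes

// True when a run claims to be running but has not been updated recently.
function isStale(state: { status: string; updatedAt: string }, nowMs: number): boolean {
  if (state.status !== "running") return false; // finished runs are never stale
  const updatedMs = Date.parse(state.updatedAt);
  if (Number.isNaN(updatedMs)) return true; // unparseable timestamp: assume stale
  return nowMs - updatedMs > STALE_AFTER_MS;
}

const checkTime = Date.parse("2026-02-13T10:10:00Z");
console.log(isStale({ status: "running", updatedAt: "2026-02-13T10:00:00Z" }, checkTime));
```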

Security

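  • 256-bit random hex tokens, sw_live_ prefix, Map lookup (O(1) on average; a hash lookup is not a timing-safe comparison, but the 256-bit token space makes guessing impractical)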
  • Zero internal state in public payloads
  • CORS already configured (server.ts:2067-2071)
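To avoid leaking whether a token ever existed, unknown and expired tokens can share a single response. The sketch below assumes an in-memory Map keyed by token; ShareEntry, TokenLookup, and resolveShareToken are hypothetical names.

```typescript
interface ShareEntry {
  issue: number;
  expiresAt: number; // epoch milliseconds
}

type TokenLookup =
  | { status: 200; issue: number }
  | { status: 404; body: { error: string } };

// One shared 404 value, so the two failure cases cannot drift apart.
const TOKEN_NOT_FOUND: TokenLookup = {
  status: 404,
  body: { error: "Invalid or expired share token" },
};

function resolveShareToken(
  store: Map<string, ShareEntry>,
  token: string,
  nowMs: number,
): TokenLookup {
  const entry = store.get(token);
  if (entry === undefined || entry.expiresAt <= nowMs) return TOKEN_NOT_FOUND;
  return { status: 200, issue: entry.issue };
}

const demoStore = new Map<string, ShareEntry>([
  ["sw_live_valid", { issue: 25, expiresAt: 2_000 }],
  ["sw_live_expired", { issue: 24, expiresAt: 500 }],
]);
console.log(resolveShareToken(demoStore, "sw_live_valid", 1_000));
```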

Alternatives Considered

  1. Separate lightweight server (Express/Hono) — Pros: Clean separation. Cons: Second process, duplicates data reading, doubles operational complexity. The existing Bun server already has all infrastructure.

  2. GitHub Pages only (static JSON polling) — Pros: Zero infrastructure, free. Cons: 30-60s latency (git push + Pages rebuild), no WebSocket. Used as fallback, not primary.

  3. Embed in existing dashboard with "public mode" query param — Pros: No new files. Cons: app.js has 80+ state variables tightly coupled to FleetState. Conditional branches would be brittle. Standalone viewer is simpler and purpose-built.

Implementation Plan

Files to create

  • dashboard/public/live.html — Standalone viewer page
  • dashboard/public/live.css — Viewer styles
  • dashboard/public/live.js — Connection logic + DOM rendering
  • scripts/sw-live-test.sh — Test suite

Files to modify

  • dashboard/server.ts — Share token store, 3 public routes, wsLiveClients, PublicPipelineState interface + sanitizer (~285 lines)
  • scripts/sw-pipeline.sh — write_state() writes live-state.json, gh_build_progress_body() live link (~25 lines)
  • scripts/sw-dashboard.sh — live subcommand (~80 lines)
  • .github/workflows/shipwright-pipeline.yml — Post-stage data branch push (~25 lines)
  • package.json — Register sw-live-test.sh

Dependencies

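  • None. Uses the existing Bun server, the built-in crypto API (32 random bytes via crypto.getRandomValues(); crypto.randomUUID() provides only 122 random bits, short of the 256-bit token spec), jq, and native browser APIs.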

Risk areas

  • server.ts size (4300+ lines): ~285 lines added is manageable; share token system is extractable later
  • WebSocket fan-out: Cache sanitized payload per issue per push cycle to avoid N redundant serializations
  • write_state() jq call: ~50ms per stage transition (infrequent), guarded by command -v jq
  • GitHub Actions data branch push: Silently fails if permissions missing (non-blocking)
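The fan-out mitigation, caching the serialized payload per issue within one push cycle, could look like the following. LiveClient is a simplified stand-in for a ServerWebSocket entry in wsLiveClients, and buildState represents whatever produces the sanitized PublicPipelineState; both are assumptions.

```typescript
// Minimal stand-in for a connected viewer: the issue it watches and a send method.
interface LiveClient {
  issue: number;
  send(data: string): void;
}

// Push to every client, serializing each issue's state at most once per cycle.
// Returns the number of serializations performed, which is useful for tests.
function pushCycle(
  clients: Iterable<LiveClient>,
  buildState: (issue: number) => unknown,
): number {
  const cache = new Map<number, string>(); // lives only for this cycle
  let serializations = 0;
  for (const client of clients) {
    let payload = cache.get(client.issue);
    if (payload === undefined) {
      payload = JSON.stringify(buildState(client.issue));
      cache.set(client.issue, payload);
      serializations++;
    }
    client.send(payload);
  }
  return serializations;
}

// Usage: three viewers across two issues.
const received: string[] = [];
const viewers: LiveClient[] = [25, 25, 31].map((issue) => ({
  issue,
  send: (data: string) => { received.push(data); },
}));
const serializedCount = pushCycle(viewers, (issue) => ({ issue, status: "running" }));
console.log(serializedCount, received.length);
```

Scoping the cache to a single cycle avoids any invalidation logic: the next periodicPush tick simply rebuilds it from fresh state.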

Validation Criteria

  • GET /api/public/pipeline/:token returns 200 with valid JSON for valid token, 404 for invalid/expired
  • PublicPipelineState contains zero internal fields (no PIDs, paths, costs, machine names, agent details)
  • WebSocket at /ws/live/:token sends initial state on connect and updates on state change
  • WebSocket with invalid token receives close frame with code 4001
  • live.html renders stage timeline, activity feed, and metadata correctly
  • Viewer falls back WebSocket -> REST polling -> static JSON gracefully
  • Share tokens expire after 7 days and are cleaned up automatically
  • shipwright dashboard live --issue 25 creates token and prints valid URL
  • write_state() produces valid live-state.json
  • Progress comment includes live URL when share token exists
  • sw-live-test.sh passes all mock scenarios
  • All 22 existing test suites pass (npm test)
  • No regression in admin dashboard auth enforcement

