---
name: cloud-headless-ec2
description: Run shipcode in Docker on EC2 with a web UI at shipcode.shipshit.dev so pipeline jobs no longer require a running desktop.
status: backlog
estimated_complexity: high
blast_radius: infra
---
# PRD: cloud-headless-ec2

## Executive Summary

Today shipcode only runs while the user's local machine is on. This PRD turns shipcode into a cloud-resident app: a Docker container on a self-hosted EC2 instance, fronted by a web UI at `shipcode.shipshit.dev`, with SSH for ops only and no SaaS dependencies. Pipeline jobs survive the user closing their laptop.

## Problem Statement

The Electron desktop app is both the control plane and the worker. If the laptop sleeps, pipeline threads pause. If the laptop is offline, no pipeline runs. The `claude`/`codex`/`gh` subprocesses, the SQLite DB, and the worktrees are all local. There is currently no way to start a pipeline from one machine and check on it from another. For long-running multi-issue runs (e.g. overnight planner+executor loops), this is the dominant operational pain.
## Goals

- A single Docker image that runs the shipcode pipeline backend headlessly on EC2.
- A web UI at `shipcode.shipshit.dev` serving the existing renderer surface against the remote backend.
- All current pipeline phases (plan → review → execute → verify → ship) run unchanged inside the container.
- Pipeline state, worktrees, and DB persist on EBS so a container restart does not lose threads.
- Auth gate in front of the public URL (Cloudflare Access or equivalent) so spend and git writes can never be triggered anonymously.
## Non-Goals

- Multi-tenant SaaS. Single-instance, single-user-per-instance only.
- In-app GitHub OAuth. The `gh` CLI keeps using a long-lived token mounted at container start.
- Replacing the Electron app. Desktop and cloud both ship from the same packages.
- Auto-scaling, queueing across machines, or load balancing.
- Migrating user data from a desktop install to a cloud install.
- Replacing CodeRabbit or any existing review tooling.
## User Stories

- As the operator, I want to start a pipeline run from my laptop and let it finish on EC2 even after I close the lid, so that long executor loops are not bound to my local power state.
  - Acceptance: start a thread from `shipcode.shipshit.dev`, kill the laptop, reopen the URL from a different machine, and see the same thread still progressing through phases with live terminal output.
- As the operator, I want to deploy updates to the cloud instance with a single command, so that operating it is not a bespoke ritual every time.
  - Acceptance: a documented one-command deploy (`bun run deploy:cloud` or equivalent) updates the running container without losing thread state.
- As the operator, I want all HTTP traffic to `shipcode.shipshit.dev` to be authenticated, so that anonymous internet traffic cannot trigger Claude/Codex spend or git writes.
  - Acceptance: an unauthenticated request to any pipeline-controlling endpoint returns 401/403 before reaching the backend.
- As the operator, I want SSH access to the EC2 instance for logs and debugging, with no other inbound ports open beyond the reverse proxy.
  - Acceptance: an `nmap` scan of the public IP shows only ports 22 (SSH) and 443 (HTTPS via reverse proxy).
## Functional Requirements

- The system must expose the existing `ShipCodeAPI` IPC contract (see `packages/shared/src/ipc-channels.ts`) over HTTP+WebSocket from a Node server process.
- The web client must implement a `window.shipcode` adapter that satisfies the same contract over WS+HTTP, so existing renderer components can be reused without behavioural changes.
- The pipeline backend must spawn `claude`, `codex`, and `gh` from inside the container, with API credentials sourced from container environment variables.
- The SQLite database must persist on a mounted volume (e.g. `/data/shipcode.db`) and survive container restarts.
- Worktrees must persist on a mounted volume (e.g. `/worktrees/...`) using the existing `AppSettings.worktreeRoot` mechanism, with no code changes to worktree path resolution.
- The system must reject any request whose authentication header is missing or invalid before it reaches pipeline-controlling handlers.
- The system must surface a basic spend ceiling (per-thread or per-day) that halts new phases when crossed, since headless mode removes the natural human-in-the-loop spend gate.
- A health endpoint must report: build SHA, DB migration version, `claude`/`codex`/`gh` versions + auth status, and free disk on the worktree volume.
- The system must provide a documented one-command deploy that updates the container without dropping in-flight thread state.
- The system must keep the existing Electron app working from the same packages, so desktop and cloud share `packages/pipeline`, `packages/agents`, `packages/db`, `packages/git`, and `packages/shared` verbatim.
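The adapter requirement above amounts to a thin mapping from IPC channel names to HTTP calls (with event streams riding a WebSocket, omitted here). A minimal sketch, with illustrative names only — the real channel names and method shapes live in `packages/shared/src/ipc-channels.ts`, and `createHttpAdapter` is a hypothetical helper, not existing code:

```typescript
// Hypothetical subset of the ShipCodeAPI contract: one generic invoke()
// per IPC channel. The real contract has typed methods per channel.
type Invoke = (channel: string, payload?: unknown) => Promise<unknown>;

// Builds a window.shipcode-shaped adapter that maps each IPC channel to an
// HTTP POST against the remote backend, so renderer code calls the same
// methods whether it runs in Electron or in a browser.
function createHttpAdapter(
  baseUrl: string,
  fetchImpl: typeof fetch = fetch,
): { invoke: Invoke } {
  return {
    async invoke(channel, payload) {
      const res = await fetchImpl(
        `${baseUrl}/ipc/${encodeURIComponent(channel)}`,
        {
          method: "POST",
          headers: { "content-type": "application/json" },
          body: JSON.stringify(payload ?? null),
        },
      );
      if (!res.ok) {
        throw new Error(`IPC over HTTP failed: ${channel} -> ${res.status}`);
      }
      return res.json();
    },
  };
}
```

Injecting `fetchImpl` keeps the adapter testable without a live backend; the browser build would just use the global `fetch`.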
## Non-Functional Requirements

- Persistence: thread state, plans, reviews, verifications, terminal events, costs, and worktrees all survive an EC2 reboot.
- Observability: structured JSON logs from the backend, with phase boundaries and subprocess spawn/exit events visible in `journalctl` or container logs.
- Security: no inbound port other than 22 (SSH) and 443 (HTTPS reverse proxy). API keys and `GH_TOKEN` live only in container env or a managed secrets store, never committed.
- Resource ceilings: worktree disk usage must auto-prune so a runaway issue loop cannot silently fill the EBS volume.
- Cost guard: headless runs must have a configurable max-spend-per-day cap that halts new pipeline phases when crossed.
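The cost guard reduces to a rolling 24-hour sum checked before each phase start. A sketch under assumed names — `SpendGuard` and `PhaseCost` are illustrative, not the real cost schema in `packages/db`:

```typescript
// Illustrative record of one phase's cost; the real schema lives in the DB.
interface PhaseCost {
  startedAt: number; // epoch ms
  usd: number;
}

// Gate for the max-spend-per-day cap: new phases are rejected once the
// rolling 24h total crosses the cap; already-running phases are unaffected.
class SpendGuard {
  constructor(
    private readonly dailyCapUsd: number,
    private readonly costs: PhaseCost[] = [],
  ) {}

  record(usd: number, now: number = Date.now()): void {
    this.costs.push({ startedAt: now, usd });
  }

  spentLast24h(now: number = Date.now()): number {
    const cutoff = now - 24 * 60 * 60 * 1000;
    return this.costs
      .filter((c) => c.startedAt >= cutoff)
      .reduce((sum, c) => sum + c.usd, 0);
  }

  canStartPhase(now: number = Date.now()): boolean {
    return this.spentLast24h(now) < this.dailyCapUsd;
  }
}
```

Because the window is rolling rather than calendar-day, the "cap reset" the success criteria call for falls out naturally as old costs age past the cutoff.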
## Success Criteria

- A pipeline thread started from `shipcode.shipshit.dev` continues running through all phases when the originating browser is closed, and the same thread is visible from any other browser session that authenticates against the URL.
- After the container is restarted (`docker compose restart` or equivalent), all previously running threads, their phases, terminal events, and worktrees are still present and resumable.
- An unauthenticated `curl https://shipcode.shipshit.dev/<any pipeline endpoint>` returns 401 or 403 and produces no side effects in the backend logs.
- The Electron desktop app still installs from the same monorepo and still passes `bun run typecheck` and `bun run test` after this work lands.
- A documented runbook exists under `apps/docs/` covering: provisioning the EC2 instance, building the image, deploying, rotating credentials, and reading logs.
- The deploy step is a single command and is reproducible from a clean `git clone`.
- Daily spend in any 24-hour window is bounded by the configured cap; once hit, new phase starts return a clear error, and the cap reset is documented.
## Out of Scope

- Multi-user accounts, RBAC, per-user spend ceilings.
- In-app GitHub OAuth or any user-facing GitHub login flow.
- Auto-scaling, multi-instance fan-out, or any cross-machine job scheduling.
- Replacing `node-pty` or the existing terminal streaming model with a different transport.
- Migrating data from existing desktop installs into the cloud instance.
- A managed SaaS offering with billing.
- Replacing CodeRabbit or building a separate AI PR review path.
- Switching the database engine away from SQLite.
## Dependencies

- Existing IPC contract: `packages/shared/src/ipc-channels.ts`.
- Existing pipeline state machine: `packages/pipeline/src/pipeline.ts` and `packages/pipeline/src/pipeline/`.
- Existing worktree path resolution: `packages/shared/src/worktree-path.ts`.
- Existing health check primitives: `packages/agents/src/health-check.ts`.
- External: AWS EC2, EBS volume, Cloudflare (DNS + Access) or equivalent reverse-proxy auth provider.
- External: `claude`, `codex`, `gh` CLIs installed inside the container; `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GH_TOKEN` provided via container env.
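The health endpoint required above would assemble these primitives into a single payload. The field names below are assumptions for illustration, not the real `packages/agents/src/health-check.ts` API:

```typescript
// Illustrative shape for the /health payload; real field names may differ.
interface HealthReport {
  buildSha: string;
  dbMigrationVersion: number;
  tools: Record<"claude" | "codex" | "gh", { version: string; authed: boolean }>;
  worktreeDiskFreeBytes: number;
}

// A simple readiness predicate over the report: every CLI must be
// authenticated and the worktree volume must have headroom left.
function isHealthy(r: HealthReport, minFreeBytes: number): boolean {
  const toolsOk = Object.values(r.tools).every((t) => t.authed);
  return toolsOk && r.worktreeDiskFreeBytes >= minFreeBytes;
}
```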
## Verification Plan

Tests:

- New integration test that boots the headless backend, drives a pipeline thread through `pipeline:start` → `pipeline:approve` → verify → ship against a test repo, and asserts state persists across a backend restart.
- New auth middleware test asserting that pipeline-controlling routes reject unauthenticated requests.
- New persistence test asserting that the SQLite path and the worktree path are both pulled from env vars and not from `app.getPath('userData')`.
- Existing `packages/pipeline`, `packages/agents`, `packages/db`, `packages/git`, and `packages/shared` test suites continue to pass without modification.
- Existing `apps/desktop` typecheck and tests continue to pass, confirming the Electron app was not regressed.
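The auth middleware test above boils down to a check that runs before any handler. A sketch of that check as a pure function, under assumptions: the header name and bearer-token comparison stand in for whatever the chosen reverse proxy (e.g. Cloudflare Access) actually injects:

```typescript
// Result of the pre-handler auth check: 401 when credentials are absent,
// 403 when present but wrong, 200 when valid.
interface AuthResult {
  ok: boolean;
  status: 200 | 401 | 403;
}

// Placeholder check; a Cloudflare Access deployment would instead verify
// the Cf-Access-Jwt-Assertion token, but the reject-before-handler shape
// is the same.
function checkAuth(
  headers: Record<string, string | undefined>,
  expectedToken: string,
): AuthResult {
  const raw = headers["authorization"];
  if (!raw) return { ok: false, status: 401 }; // missing credentials
  const token = raw.replace(/^Bearer\s+/i, "");
  if (token !== expectedToken) return { ok: false, status: 403 }; // wrong credentials
  return { ok: true, status: 200 };
}
```

Keeping the check pure makes the "rejects unauthenticated requests" test trivial to write without spinning up the server.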
Manual:

- Deploy to a real EC2 instance behind Cloudflare Access. Start a pipeline thread from a laptop browser. Close the laptop. Open `shipcode.shipshit.dev` from a phone or a different laptop after authenticating. Confirm the thread is still progressing.
- `docker compose restart` the running container during an active thread. Confirm the thread resumes from its last persisted phase boundary.
- `curl` an unauthenticated request to a pipeline route. Confirm 401/403 with no side effects in the logs.
- Trigger the spend cap. Confirm new phase starts are rejected with a clear error and that already-running phases finish cleanly.
## Risks & Open Questions

- `node:sqlite` is bundled with the Electron-shipped Node and is not portable to a vanilla Node/Bun container without a driver swap (likely `better-sqlite3`). Verifying that the swap is behaviour-identical is non-trivial.
- `node-pty` inside Docker requires correct TTY allocation. Worth a spike before committing to the image shape.
- A long-lived `GH_TOKEN` (fine-grained PAT) replaces the interactive `gh auth login` flow. Rotation cadence and revocation procedure need to be decided.
- EBS sizing for worktrees is unknown until we have real usage data; the auto-prune policy needs a concrete threshold (count of worktrees? days idle? disk pressure?).
- Cloudflare Access vs HTTP basic auth vs an OIDC reverse proxy is still open. Cloudflare Access has the lowest setup cost but introduces a dependency on Cloudflare being in the request path.
- The cutover sequence (Phase A: extract `apps/server`; Phase B: web control plane; Phase C: Dockerfile + EC2 + auth) needs to land without breaking the Electron app at any step. Each phase should ship green on its own.