PostGrip — Agent Setup Runbook (server side, Docker Hub)

This file is a runbook for a coding agent (Claude Code, Codex, or similar) to provision a brand-new PostGrip server-side environment from prebuilt Docker Hub images. Given one public URL, work through the phases in order.

Each phase has Prerequisites → Actions → Checklist → Verify. Treat every checklist item as a gate: do not start the next phase until the current phase's checklist is fully checked and its Verify step passes. Commands are copy-paste runnable and idempotent. Working state (config + secrets) is persisted to $DEPLOY_ROOT/.deploy.env and $DEPLOY_ROOT/.secrets.env in Phases 2–3, so if your shell dies you can resume in a fresh one (see Resuming in a new shell).

Scope. Four server-side services + bundled PostgreSQL: server, ui, workerorchestrator, agentorchestrator. Customer-hosted worker and agent nodes are enrolled later from the UI and are out of scope here.

Tested on Ubuntu 24.04 LTS. This runbook assumes Docker Engine and the Compose plugin are already installed from Docker's official apt repository and that the Linux post-install steps are complete (see Phase 1). The setup steps begin after Docker is working.


0. Inputs and decisions

Required inputs (you must be given / obtain these before starting)

  1. PUBLIC_URL — the public HTTPS URL the site will be served at, e.g. https://postgrip.example.com. If you were handed a bare domain, prefix https://.
  2. Cloudflare Email Service credentials: email is required, not optional. postgrip-server enforces email verification, so a working outbound mail channel is mandatory: the very first sign-up sends a verification email and cannot complete without it. PostGrip's only mail channel is the Cloudflare Email Service REST API (https://developers.cloudflare.com/email-service/). You need:
    • CLOUDFLARE_ACCOUNT_ID — the account that owns the verified sender.
    • CLOUDFLARE_API_TOKEN — an API token scoped to Account → Email Sending → Edit for that account.
    • FROM_EMAIL — an address on your Cloudflare-verified sending domain, which is the registrable domain (e.g. noreply@postgrip.app), not the www. subdomain. Cloudflare verifies postgrip.app, so noreply@www.postgrip.app will not send.
  3. GitHub deploy app: required, not optional. Deploying connected GitHub repos is core to what PostGrip does, so the GitHub deploy integration is part of a complete setup. You will create a GitHub App (recommended) or OAuth app in Phase 9 and, to actually run deploys, enroll at least one worker node (UI Add Worker wizard). No credentials are needed before Phase 9.

Other than the above, everything is generated (secrets) or derived from PUBLIC_URL.
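
The bare-domain rule in input 1 and the sender rule in input 2 can be sketched as two small helpers. Both function names are hypothetical (they are not part of PostGrip), and the noreply@ default simply follows the example address used above; treat this as an illustration, not the runbook's required method.

```shell
# Hypothetical helpers; the names are illustrative, not part of PostGrip.

# Print a PUBLIC_URL: keep an existing http(s):// scheme, otherwise prefix
# https://. A single trailing slash is dropped.
normalize_public_url() {
  local url="$1"
  case "$url" in
    http://*|https://*) ;;
    *) url="https://$url" ;;
  esac
  printf '%s\n' "${url%/}"
}

# Print a default sender for a PUBLIC_URL: host part only, with a leading
# www. stripped, because Cloudflare verifies the sending domain itself.
default_from_email() {
  local host="${1#*://}"   # drop the scheme, if any
  host="${host%%/*}"       # drop any path
  printf 'noreply@%s\n' "${host#www.}"
}
```

For example, `normalize_public_url postgrip.example.com` prints `https://postgrip.example.com`. If your verified sending domain differs from the site's host, set FROM_EMAIL by hand instead.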

Decisions you must resolve before Phase 7 (ask the user if unspecified)

  • DECISION — public exposure / TLS. Pick one:
    • A. Cloudflare Tunnel — requires a tunnel token. No inbound ports.
    • B. Your own reverse proxy + TLS (Caddy/Nginx/Traefik) terminating HTTPS in front of the UI container.

Optional integrations (skip unless the user provides credentials)

  • Google sign-in, GitHub sign-in, Stripe billing. Each has a clearly-marked optional section; the core install works without them. (Email and GitHub deploy are not in this list — both are required, see above.)

Defaults this runbook applies without asking

  • Billing off (POSTGRIP_BILLING_PROVIDER=none).
  • A single signup region is seeded from PUBLIC_URL (required for signup).
  • Orchestrator host ports bound to loopback only.
  • Email/password sign-in only (no OAuth) until you add it.
  • Images tracked at :latest (override per stack with POSTGRIP_IMAGE_VERSION).
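
Overriding a default such as the image tag comes down to changing one KEY=value line in a stack's .env once that file exists. The helper below is a minimal sketch of that; `set_env_value` is a hypothetical name, and the example assumes the app stack's .env written later in Phase 4.5.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing any existing
# line for KEY. The file is rewritten in place, so its mode is preserved.
set_env_value() {
  local file="$1" key="$2" value="$3" tmp
  [ -f "$file" ] || return 1
  tmp="$(mktemp)" || return 1
  # Copy every line except the old KEY= line. grep exits 1 when no lines
  # remain, which is not an error here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example (after Phase 4.5 has written the app stack's .env):
#   set_env_value "$DEPLOY_ROOT/app/.env" POSTGRIP_IMAGE_VERSION 1
#   ( cd "$DEPLOY_ROOT/app" && docker compose pull && docker compose up -d --wait )
```

The updated key is appended at the end of the file; Compose does not depend on the order of entries in .env.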

Architecture being built

| Service | Image | Internal port | Role |
| --- | --- | --- | --- |
| postgres | postgres:17.9 | 5432 | Shared DB (auth, main, worker, deploy, … schemas) |
| postgrip-server | postgrip/server | 3000 | Auth edge; runs DB migrations on startup |
| postgrip-ui | postgrip/ui | 80 | React SPA + Nginx that proxies /api/* to the backends |
| postgrip-workerorchestrator | postgrip/workerorchestrator | 4000 | Control plane (apps, workers, targets, deploy) |
| postgrip-agentorchestrator | postgrip/agentorchestrator | 4100 | Agent control plane |

Four Compose stacks share one external network postgrip-server-net. The data stack (postgres + cloudflared) creates the network and is the stable tier; the app stack (server + ui) is updated frequently; workerorchestrator and agentorchestrator are separate. All connect to the same PostgreSQL using the POSTGRES_* pieces (the app server and workerorchestrator build the connection string from them; the agentorchestrator takes a full POSTGRIP_AGENT_DATABASE_URL). Splitting the stateful database into its own stack means app updates never restart or risk it.
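
To make the "pieces versus full URL" distinction concrete, here is a minimal sketch of assembling a connection URL from the POSTGRES_* values. The `build_database_url` name is hypothetical, and the conventional postgres://user:password@host:port/db form is an assumption: this section does not state the exact format postgrip-agentorchestrator expects.

```shell
# Hypothetical helper: assemble a PostgreSQL URL from the POSTGRES_* pieces.
# Assumes user, password, and db contain no characters that need
# percent-encoding (a hex-only generated password satisfies this).
build_database_url() {
  local user="$1" password="$2" host="$3" port="$4" db="$5"
  printf 'postgres://%s:%s@%s:%s/%s\n' "$user" "$password" "$host" "$port" "$db"
}

# Example, using the in-network host and port:
#   build_database_url "$POSTGRES_USER" "$POSTGRES_PASSWORD" postgres 5432 "$POSTGRES_DB"
```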


Phase 1 — Preflight

Prerequisites: A host running Ubuntu 24.04 LTS (what this runbook is tested on) with shell access as root or a sudo-capable user (Phase 2 detects which), and Docker already installed from Docker's official apt repository with the Linux post-install steps complete:

  1. Install Docker Engine + Compose plugin: https://docs.docker.com/engine/install/ubuntu/ (use the official download.docker.com apt repo, not the distro's docker.io package; this provides docker compose v2).
  2. Linux post-install steps: https://docs.docker.com/engine/install/linux-postinstall/ (add your user to the docker group so docker runs without sudo, and enable the Docker service to start on boot).

Completing both is assumed by every later phase: the docker / docker compose commands are run without sudo, and the daemon is expected to come up on reboot. Other Debian/Ubuntu-family systemd hosts will likely work but were not verified.

Actions / Verify:

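docker version && docker compose version   # Docker Engine 24+ with Compose v2
docker run --rm hello-world                # must work without sudo; also proves Docker Hub is reachable
systemctl is-enabled docker                # expect: enabled
openssl version                            # for secret generation
command -v curl wget                       # for verification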

Confirm DNS and reachability (replace with your domain):

DOMAIN_CHECK=postgrip.example.com
getent hosts "$DOMAIN_CHECK" || echo "WARN: $DOMAIN_CHECK does not resolve yet"

Checklist:

  • OS is Ubuntu 24.04 LTS (or an equivalent systemd Linux host).
  • Docker Engine + Compose plugin installed from Docker's official apt repo (docker compose version works).
  • Post-install done: docker run --rm hello-world works without sudo (user is in the docker group), and systemctl is-enabled docker returns enabled.
  • openssl is available.
  • The host can reach Docker Hub (the hello-world pull above succeeded).
  • DNS for the domain points (or will point) at this host, or a Cloudflare Tunnel will provide ingress.
  • If using your own reverse proxy: ports 80/443 are available on the host.
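
The tool checks in this checklist can be collapsed into one pass. The helper below is a sketch under the assumption that "available" means "found on PATH"; `missing_tools` is a hypothetical name.

```shell
# Hypothetical helper: print each named command that is NOT on PATH.
# Prints nothing and returns 0 when every tool is present.
missing_tools() {
  local tool rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'MISSING %s\n' "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example:
#   missing_tools docker openssl curl wget getent
```

This only confirms that the binaries exist; it does not replace the hello-world run or the systemctl check, which exercise the daemon and group membership.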

Phase 2 — Scaffold and persist config

Prerequisites: Phase 1 complete.

Actions: set the two inputs, derive the rest, create the directory tree, and persist the non-secret config to $DEPLOY_ROOT/.deploy.env so later phases survive a new shell (see Resuming below). Re-running this phase is safe.

# --- inputs ---
export DEPLOY_ROOT="${DEPLOY_ROOT:-/opt/postgrip}"
export PUBLIC_URL="https://postgrip.example.com"     # <-- the URL you were given

# --- derived / fixed ---
export DOMAIN="$(printf '%s' "$PUBLIC_URL" | sed -E 's#^https?://##; s#/.*$##')"
# Mail domain = the registrable domain (strip a leading www.). Cloudflare verifies
# the sending domain (e.g. postgrip.app), NOT the www. subdomain.
export MAIL_DOMAIN="${DOMAIN#www.}"
export POSTGRES_USER=postgrip_admin
export POSTGRES_DB=postgrip

# --- REQUIRED: Cloudflare Email Service (sign-up sends a verification email) ---
export CLOUDFLARE_ACCOUNT_ID="REPLACE_WITH_CLOUDFLARE_ACCOUNT_ID"
# FROM_EMAIL MUST be an address on your Cloudflare-VERIFIED sending domain.
# Default strips www.; override if your verified domain differs.
export FROM_EMAIL="${FROM_EMAIL:-noreply@$MAIL_DOMAIN}"
# (the API token is a secret — it goes into .secrets.env in Phase 3)

# Create the deploy tree. Use sudo only if not already root and sudo exists;
# if you are non-root without sudo, pick a DEPLOY_ROOT you own (e.g. $HOME/postgrip).
SUDO=""; [ "$(id -u)" -eq 0 ] || SUDO="$(command -v sudo || true)"
${SUDO} mkdir -p "$DEPLOY_ROOT"/{data,app,workerorchestrator,agentorchestrator}
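# Hand the tree to the current user when it was created via sudo. The if form
# keeps the exit status 0 when no chown is needed (e.g. already root).
if [ -n "$SUDO" ]; then ${SUDO} chown -R "$(id -u):$(id -g)" "$DEPLOY_ROOT"; fi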
mkdir -p "$DEPLOY_ROOT/data/pgdata"

# Persist non-secret config (chmod 600).
( umask 077; cat > "$DEPLOY_ROOT/.deploy.env" <<EOF
DEPLOY_ROOT=$DEPLOY_ROOT
PUBLIC_URL=$PUBLIC_URL
DOMAIN=$DOMAIN
POSTGRES_USER=$POSTGRES_USER
POSTGRES_DB=$POSTGRES_DB
CLOUDFLARE_ACCOUNT_ID=$CLOUDFLARE_ACCOUNT_ID
FROM_EMAIL=$FROM_EMAIL
EOF
)
chmod 600 "$DEPLOY_ROOT/.deploy.env"

Checklist:

  • $DEPLOY_ROOT/{data,app,workerorchestrator,agentorchestrator} exist and are writable by your user.
  • $DEPLOY_ROOT/.deploy.env exists, is chmod 600, and contains PUBLIC_URL/DOMAIN.
  • CLOUDFLARE_ACCOUNT_ID is set to your real account ID (not the placeholder), and FROM_EMAIL is a sender verified in that Cloudflare account.
  • PUBLIC_URL, DOMAIN, DEPLOY_ROOT, POSTGRES_USER, POSTGRES_DB are exported in the shell.
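
The file-level items in this checklist (mode 600, real values instead of placeholders) can be checked mechanically. This is a minimal sketch; `env_file_ok` is a hypothetical helper, and it assumes placeholders keep the REPLACE_WITH_ prefix used in the commands above.

```shell
# Hypothetical helper: check that FILE is mode 600 and that each named KEY
# has a non-empty value that is not a REPLACE_WITH_* placeholder.
env_file_ok() {
  local file="$1" key value rc=0
  shift
  # ls -ld prints the mode string first; -rw------- corresponds to chmod 600.
  case "$(ls -ld "$file" 2>/dev/null)" in
    -rw-------*) ;;
    *) printf 'BAD MODE %s\n' "$file"; rc=1 ;;
  esac
  for key in "$@"; do
    # Take the last KEY= line, since the last assignment wins when sourced.
    value="$(sed -n "s/^${key}=//p" "$file" 2>/dev/null | tail -n 1)"
    case "$value" in
      ''|REPLACE_WITH_*) printf 'BAD VALUE %s\n' "$key"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Example:
#   env_file_ok "$DEPLOY_ROOT/.deploy.env" PUBLIC_URL DOMAIN CLOUDFLARE_ACCOUNT_ID FROM_EMAIL
```

It cannot tell whether FROM_EMAIL is actually verified in Cloudflare; that still has to be confirmed in the Cloudflare account.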

Phase 3 — Generate and persist secrets

Prerequisites: Phase 2 complete.

Actions: export your Cloudflare API token (a provided secret, required for email), then generate the rest once into $DEPLOY_ROOT/.secrets.env and source it. The [ -f ] guard makes this idempotent — re-running never mints new secrets, so it won't break an already-running deployment (which would otherwise hit a DB-password mismatch and invalidated sessions). Generated values use hex to avoid any .env quoting pitfalls.

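# REQUIRED before first run: the Cloudflare Email Service API token
# (Account -> Email Sending -> Edit). The first guard fails fast if it's unset;
# the second warns if the placeholder below was left in place.
export CLOUDFLARE_API_TOKEN="REPLACE_WITH_CLOUDFLARE_API_TOKEN"
: "${CLOUDFLARE_API_TOKEN:?export your Cloudflare API token first (Phase 0 input)}"
case "$CLOUDFLARE_API_TOKEN" in REPLACE_WITH_*) echo "ERROR: replace the CLOUDFLARE_API_TOKEN placeholder before continuing" >&2 ;; esac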

if [ ! -f "$DEPLOY_ROOT/.secrets.env" ]; then
  ( umask 077; cat > "$DEPLOY_ROOT/.secrets.env" <<EOF
POSTGRES_PASSWORD=$(openssl rand -hex 24)
BETTER_AUTH_SECRET=$(openssl rand -hex 32)
WORKER_TOKEN_SECRET=$(openssl rand -hex 32)
KEK_SECRET=$(openssl rand -hex 32)
INTERNAL_API_KEY=$(openssl rand -hex 32)
POSTGRIP_AGENT_TOKEN_SECRET=$(openssl rand -hex 32)
POSTGRIP_AGENT_ENROLLMENT_KEY=$(openssl rand -hex 32)
CLOUDFLARE_API_TOKEN=$CLOUDFLARE_API_TOKEN
EOF
  )
  chmod 600 "$DEPLOY_ROOT/.secrets.env"
fi

# Load config + secrets into the environment (safe to re-run).
set -a; . "$DEPLOY_ROOT/.deploy.env"; . "$DEPLOY_ROOT/.secrets.env"; set +a

Resuming in a new shell

If your shell dies at any point, you do not lose state. Re-export the one path and re-source both files, then continue from where you were:

export DEPLOY_ROOT=/opt/postgrip
set -a; . "$DEPLOY_ROOT/.deploy.env"; . "$DEPLOY_ROOT/.secrets.env"; set +a

Checklist:

  • $DEPLOY_ROOT/.secrets.env exists, is chmod 600, and has 8 KEY=value lines (7 generated + CLOUDFLARE_API_TOKEN).
  • CLOUDFLARE_API_TOKEN in .secrets.env is your real token, not the placeholder.
  • Re-running this phase did not change existing secrets (the [ -f ] guard preserved them).
  • Config + secrets are loaded now: for v in PUBLIC_URL KEK_SECRET POSTGRES_PASSWORD CLOUDFLARE_ACCOUNT_ID CLOUDFLARE_API_TOKEN FROM_EMAIL; do [ -n "${!v}" ] || echo "MISSING $v"; done prints nothing.
  • .deploy.env and .secrets.env are backed up securely and never committed.
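
The effect of the [ -f ] guard can be seen in isolation with a small write-once helper. This is a sketch of the same idea rather than the exact Phase 3 commands; `write_once` is a hypothetical name.

```shell
# Hypothetical helper: write stdin to FILE with mode 600, but only if FILE
# does not exist yet. An existing file is left untouched.
write_once() {
  local file="$1"
  if [ -f "$file" ]; then
    cat > /dev/null          # drain stdin so the writer does not block
    return 0
  fi
  ( umask 077; cat > "$file" )
}

# Example: the second call is a no-op, so the first value survives.
#   printf 'POSTGRES_PASSWORD=first\n'  | write_once /tmp/demo.secrets.env
#   printf 'POSTGRES_PASSWORD=second\n' | write_once /tmp/demo.secrets.env
```

The trade-off is deliberate: rotating a secret requires editing or removing the file by hand, which keeps a re-run from silently breaking a live deployment.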

Phase 4 — Data stack (PostgreSQL + Cloudflare Tunnel)

Prerequisites: Phases 2–3 complete (config + secrets sourced in this shell — re-source per Resuming if needed).

This is the stable tier: PostgreSQL plus (optionally) the Cloudflare Tunnel. It creates the shared postgrip-server-net network that every other stack joins. You rarely touch it after first bring-up, so app updates never restart or risk the database.

4a. Write docker-compose.yml

cat > "$DEPLOY_ROOT/data/docker-compose.yml" <<'YAML_EOF'
services:
  postgres:
    image: postgres:17.9-bookworm
    restart: unless-stopped
    ports:
      - "127.0.0.1:${POSTGRES_PORT:-5433}:5432"
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB}
    volumes:
      - ${POSTGRES_DATA_DIR}:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro
    expose:
      - "5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 5s
      timeout: 3s
      retries: 5
    networks:
      - postgrip_server_net

  # Optional Cloudflare Tunnel (Phase 7 option A). Only runs with --profile
  # cloudflare, so it stays stopped until you enable it in Phase 7. Living here
  # (the stable stack) keeps it out of the frequently-pulled app stack.
  cloudflared:
    profiles:
      - cloudflare
    image: cloudflare/cloudflared:latest
    restart: unless-stopped
    command: tunnel --no-autoupdate --protocol http2 run
    environment:
      TUNNEL_TOKEN: ${CLOUDFLARE_TUNNEL_TOKEN}
    networks:
      - postgrip_server_net

networks:
  postgrip_server_net:
    name: ${POSTGRIP_SERVER_NETWORK:-postgrip-server-net}
YAML_EOF

4b. Write init.sql

Pre-creates the schemas the migrations and orchestrators expect (runs only on first DB initialization).

cat > "$DEPLOY_ROOT/data/init.sql" <<'SQL_EOF'
CREATE SCHEMA IF NOT EXISTS auth;
CREATE SCHEMA IF NOT EXISTS main;
CREATE SCHEMA IF NOT EXISTS control;
CREATE SCHEMA IF NOT EXISTS deploy;
SQL_EOF

4c. Write .env

POSTGRES_PORT is the host port for admin access; POSTGRES_INTERNAL_PORT (5432, set in the app stack) is what services use inside the network.

cat > "$DEPLOY_ROOT/data/.env" <<EOF
# ---- Bundled PostgreSQL ----
POSTGRES_USER=${POSTGRES_USER}
POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
POSTGRES_DB=${POSTGRES_DB}
POSTGRES_PORT=5433
POSTGRES_DATA_DIR=${DEPLOY_ROOT}/data/pgdata

# ---- Shared Docker network (created by THIS stack) ----
POSTGRIP_SERVER_NETWORK=postgrip-server-net

# ---- Cloudflare Tunnel (only used with --profile cloudflare in Phase 7) ----
CLOUDFLARE_TUNNEL_TOKEN=
EOF
chmod 600 "$DEPLOY_ROOT/data/.env"

4d. Validate, pull, and start (must be first — creates the network + DB)

cd "$DEPLOY_ROOT/data"
docker compose config --quiet   # validate compose + .env interpolation
docker compose pull
docker compose up -d --wait     # postgres healthy; network created (cloudflared stays off)

Checklist:

  • docker-compose.yml, init.sql, .env exist in $DEPLOY_ROOT/data, and .env is chmod 600.
  • docker compose config --quiet passed.
  • docker compose up -d --wait exited 0 (postgres healthy).
  • docker network inspect postgrip-server-net succeeds (network created).
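
Because init.sql runs only on first database initialization, a pre-existing pgdata directory would skip it without any error. One way to confirm the schemas exist right after this phase is sketched below; `missing_schemas` is a hypothetical helper, and the psql invocation assumes the psql client that ships in the official postgres image.

```shell
# Hypothetical helper: read schema names (one per line) on stdin and print
# each expected schema that is absent. Prints nothing when all are present.
missing_schemas() {
  local present schema rc=0
  present="$(cat)"
  for schema in "$@"; do
    if ! printf '%s\n' "$present" | grep -qx -- "$schema"; then
      printf '%s\n' "$schema"
      rc=1
    fi
  done
  return "$rc"
}

# Example (run after Phase 4, before the app stack's migrations):
#   ( cd "$DEPLOY_ROOT/data" && docker compose exec -T postgres \
#       psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -Atc \
#       'SELECT schema_name FROM information_schema.schemata' ) \
#     | missing_schemas auth main control deploy
```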

Phase 4.5 — App stack (server + ui)

Prerequisites: Phase 4 healthy (network + DB up). This is the frequently updated tier: postgrip-server + postgrip-ui. It joins the data stack's network; postgrip-server connects to postgres over it and runs migrations on startup. Because postgres lives in another stack, postgrip-server can't gate on it with depends_on; restart: unless-stopped makes it retry until the DB answers, which is why the data stack comes up first.

4.5a. Write docker-compose.yml

cat > "$DEPLOY_ROOT/app/docker-compose.yml" <<'YAML_EOF'
services:
  postgrip-server:
    image: postgrip/server:${POSTGRIP_IMAGE_VERSION:-latest}
    user: "0:0"
    restart: unless-stopped
    expose:
      - "3000"
    environment:
      PORT: "3000"
      HOME: /tmp
      AUTH_ENABLED: "true"
      POSTGRES_HOST: ${POSTGRES_HOST:-postgres}
      POSTGRES_INTERNAL_PORT: ${POSTGRES_INTERNAL_PORT:-5432}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB}
      BETTER_AUTH_SECRET: ${BETTER_AUTH_SECRET}
      BETTER_AUTH_URL: ${PUBLIC_URL}
      PUBLIC_URL: ${PUBLIC_URL}
      CORS_ORIGINS: ${CORS_ORIGINS}
      POSTGRIP_AUTH_GOOGLE_CLIENT_ID: ${POSTGRIP_AUTH_GOOGLE_CLIENT_ID:-}
      POSTGRIP_AUTH_GOOGLE_CLIENT_SECRET: ${POSTGRIP_AUTH_GOOGLE_CLIENT_SECRET:-}
      POSTGRIP_AUTH_GITHUB_CLIENT_ID: ${POSTGRIP_AUTH_GITHUB_CLIENT_ID:-}
      POSTGRIP_AUTH_GITHUB_CLIENT_SECRET: ${POSTGRIP_AUTH_GITHUB_CLIENT_SECRET:-}
      POSTGRIP_HIBP_PASSWORD_CHECK: ${POSTGRIP_HIBP_PASSWORD_CHECK:-true}
      CLOUDFLARE_ACCOUNT_ID: ${CLOUDFLARE_ACCOUNT_ID:-}
      CLOUDFLARE_API_TOKEN: ${CLOUDFLARE_API_TOKEN:-}
      FROM_EMAIL: ${FROM_EMAIL:-noreply@example.com}
      POSTGRIP_BILLING_PROVIDER: ${POSTGRIP_BILLING_PROVIDER:-none}
      POSTGRIP_WORKERORCHESTRATOR_URL: ${POSTGRIP_WORKERORCHESTRATOR_URL:-http://postgrip-workerorchestrator:4000}
      POSTGRIP_AGENTORCHESTRATOR_URL: ${POSTGRIP_AGENTORCHESTRATOR_URL:-http://postgrip-agentorchestrator:4100}
      POSTGRIP_PUBLIC_AGENTORCHESTRATOR_URL: ${POSTGRIP_PUBLIC_AGENTORCHESTRATOR_URL:-}
      DEFAULT_CELL_WORKERORCHESTRATOR_ID: ${DEFAULT_CELL_WORKERORCHESTRATOR_ID:-}
      DEFAULT_CELL_WORKERORCHESTRATOR_DISPLAY_NAME: ${DEFAULT_CELL_WORKERORCHESTRATOR_DISPLAY_NAME:-}
      DEFAULT_CELL_WORKERORCHESTRATOR_REGION_CODE: ${DEFAULT_CELL_WORKERORCHESTRATOR_REGION_CODE:-}
      DEFAULT_CELL_PUBLIC_WORKERORCHESTRATOR_URL: ${DEFAULT_CELL_PUBLIC_WORKERORCHESTRATOR_URL:-}
      DEFAULT_CELL_WORKERORCHESTRATOR_VISIBLE: ${DEFAULT_CELL_WORKERORCHESTRATOR_VISIBLE:-true}
    healthcheck:
      test: ["CMD", "bun", "-e", "fetch('http://localhost:3000/api/auth/get-session').then(r => process.exit(r.status < 500 ? 0 : 1)).catch(() => process.exit(1))"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
    networks:
      - postgrip_server_net

  postgrip-ui:
    image: postgrip/ui:${POSTGRIP_IMAGE_VERSION:-latest}
    restart: unless-stopped
    expose:
      - "80"
    # For reverse-proxy exposure (Phase 7 option B), publish a loopback port:
    # ports:
    #   - "127.0.0.1:8080:80"
    environment:
      NGINX_CANONICAL_ORIGIN: ${NGINX_CANONICAL_ORIGIN:-${PUBLIC_URL}}
      NGINX_REDIRECT_HOST: ${NGINX_REDIRECT_HOST:-redirect-disabled.invalid}
    volumes:
      - ./nginx.conf.template:/etc/nginx/templates/default.conf.template:ro
    depends_on:
      postgrip-server:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "-qO-", "--tries=1", "http://127.0.0.1:80/healthz"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s
    networks:
      - postgrip_server_net

networks:
  postgrip_server_net:
    external: true
    name: ${POSTGRIP_SERVER_NETWORK:-postgrip-server-net}
YAML_EOF

4.5b. Write nginx.conf.template

The UI reverse-proxies /api/* to the backends. Nginx substitutes ${NGINX_*} from the environment at start; $host, $scheme, etc. are Nginx runtime variables (preserved verbatim by the quoted heredoc).

Run this to create nginx.conf.template:
cat > "$DEPLOY_ROOT/app/nginx.conf.template" <<'NGINX_EOF'
  server {
      listen 80 default_server;
      server_name _;
      root /usr/share/nginx/html;
      index index.html;

      # Host-level canonicalization: keep this in server context before routes.
      if ($host = ${NGINX_REDIRECT_HOST}) {
          return 308 ${NGINX_CANONICAL_ORIGIN}$request_uri;
      }

      # Re-resolve upstream hostnames via Docker's embedded DNS so proxy_pass
      # does not hold on to stale container IPs after an upstream restart.
      resolver 127.0.0.11 valid=10s ipv6=off;
      set $auth_upstream http://postgrip-server:3000;
      set $ctrl_upstream http://postgrip-workerorchestrator:4000;

      # Security headers
      add_header X-Frame-Options "DENY" always;
      add_header X-Content-Type-Options "nosniff" always;
      add_header X-XSS-Protection "1; mode=block" always;
      add_header Referrer-Policy "strict-origin-when-cross-origin" always;
      add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
      add_header Permissions-Policy "camera=(), microphone=(), geolocation=(), payment=(), usb=()" always;
      add_header Content-Security-Policy "default-src 'self'; script-src 'self' https://js.stripe.com 'sha256-Wn5wULmWNAS84lRO4JMQ9jE/kYWz1ADWjWy4z2jjosA='; style-src 'self'; style-src-elem 'self'; style-src-attr 'none'; img-src 'self' data: https:; font-src 'self' data:; connect-src 'self' https://api.stripe.com https://checkout.stripe.com https://r.stripe.com https://m.stripe.network https://q.stripe.com ${NGINX_CANONICAL_ORIGIN}; frame-src https://js.stripe.com https://hooks.stripe.com; object-src 'none'; frame-ancestors 'none'; base-uri 'self'; form-action 'self'" always;

      # Liveness probe used by docker compose's healthcheck. Returns
      # a static 200 without going through the SPA fallback or any
      # upstream proxy_pass, so a degraded auth/control-plane upstream
      # does not flap the UI container's health state. access_log is
      # off so the probe doesn't drown out real traffic in nginx logs.
      location = /healthz {
          access_log off;
          add_header Content-Type text/plain;
          return 200 "ok\n";
      }

      # Auth server routes
      location /api/auth {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/signup {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/data-servers {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/billing {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/account {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/account-profile {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/stripe {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/webhooks/stripe {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/support {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      location /api/admin {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      # Agent console routes — forwarded through the auth server, which proxies
      # the caller's session cookie to postgrip-agentorchestrator.
      location /api/agents {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
          proxy_http_version 1.1;
          proxy_request_buffering off;
          proxy_buffering off;
          proxy_read_timeout 300s;
          proxy_send_timeout 300s;
      }

      # Control-plane routes — forwarded through the auth server, which resolves
      # the caller's cell and proxies to the right postgrip-workerorchestrator.
      # Keep this regex in sync with WORKERORCHESTRATOR_PREFIXES in
      # postgrip-server/proxy/workerorchestrator-proxy.ts.
      location ~ ^/api/(apps|workers|targets|dashboard|connections|preferences|backup-schedules|backup-storage|jobs|deploy|integrations|connect|connect-direct|disconnect|query|run-query|bootstrap|preview-table|get-table-ddl|drop-table|truncate-table|get-editable-table-data|execute-dml|get-modify-table-info|alter-table|export-table-csv|create-schema|create-table|host-stats|monitoring-data|active-queries|backup-database|restore-database|list-backups|delete-backup|get-backup-dir|get-app-info)(/|$) {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
          proxy_http_version 1.1;
          proxy_request_buffering off;
          proxy_buffering off;
          proxy_read_timeout 300s;
          proxy_send_timeout 300s;
      }

      location ~ ^/api/admin/(vacuum-analyze|analyze|reindex-table|refresh-materialized-view|cancel-query|terminate-session)(/|$) {
          proxy_pass $auth_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
          proxy_http_version 1.1;
          proxy_request_buffering off;
          proxy_buffering off;
          proxy_read_timeout 300s;
          proxy_send_timeout 300s;
      }

      # Remaining /api/* routes — fallback to workerorchestrator for anything not
      # explicitly handled above.
      location /api {
          proxy_pass $ctrl_upstream$request_uri;
          proxy_set_header Host $host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header X-Forwarded-Proto $scheme;
      }

      # Static assets
      location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff2?)$ {
          try_files $uri @asset_not_found;
          expires 1y;
          add_header Cache-Control "public, immutable";
      }

      location @asset_not_found {
          add_header Cache-Control "no-store, no-cache, must-revalidate" always;
          add_header Pragma "no-cache" always;
          add_header Expires "0" always;
          return 404;
      }

      # SPA fallback — serve index.html for all non-asset routes, never cached
      location / {
          try_files $uri $uri/ /index.html;
          add_header Cache-Control "no-cache, no-store, must-revalidate" always;
          add_header Pragma "no-cache" always;
          add_header Expires "0" always;
      }
  }
NGINX_EOF
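
Since the template relies on ${NGINX_*} substitution at container start, it can help to list the placeholders it references and compare them with the environment block of postgrip-ui. The helper below is a sketch; `template_vars` is a hypothetical name.

```shell
# Hypothetical helper: list the distinct ${NGINX_*} placeholders that a
# template references, one name per line, sorted.
template_vars() {
  grep -oE '\$\{NGINX_[A-Z0-9_]+\}' "$1" | sort -u | sed -E 's/^\$\{(.*)\}$/\1/'
}

# Example:
#   template_vars "$DEPLOY_ROOT/app/nginx.conf.template"
```

For the template above, each name it prints should appear under postgrip-ui's environment in the app stack's docker-compose.yml. Nginx runtime variables such as $host are not listed because they carry no braces and no NGINX_ prefix.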

4.5c. Write .env

The default-cell block seeds the signup region — required, since signup needs at least one active visible region. The Cloudflare email vars (set in Phases 2–3) are written here too and are required — sign-up cannot complete without working email. POSTGRES_* here are how postgrip-server connects to the data stack's database (must match the data stack's values).

cat > "$DEPLOY_ROOT/app/.env" <<EOF
# ---- Database connection (must match the data stack) ----
POSTGRES_HOST=postgres
POSTGRES_INTERNAL_PORT=5432
POSTGRES_USER=${POSTGRES_USER}
POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
POSTGRES_DB=${POSTGRES_DB}

# ---- Public app + UI ----
PUBLIC_URL=${PUBLIC_URL}
CORS_ORIGINS=${PUBLIC_URL}
VITE_AUTH_URL=

# ---- Image tag (set to a number like 1 to pin) ----
POSTGRIP_IMAGE_VERSION=latest

# ---- Auth ----
BETTER_AUTH_SECRET=${BETTER_AUTH_SECRET}
POSTGRIP_HIBP_PASSWORD_CHECK=true
# Optional sign-in OAuth (leave blank to disable):
POSTGRIP_AUTH_GOOGLE_CLIENT_ID=
POSTGRIP_AUTH_GOOGLE_CLIENT_SECRET=
POSTGRIP_AUTH_GITHUB_CLIENT_ID=
POSTGRIP_AUTH_GITHUB_CLIENT_SECRET=

# ---- Email (REQUIRED: Cloudflare Email Service REST API) ----
# Sign-up enforces email verification; these must be set and working.
# https://developers.cloudflare.com/email-service/
CLOUDFLARE_ACCOUNT_ID=${CLOUDFLARE_ACCOUNT_ID}
CLOUDFLARE_API_TOKEN=${CLOUDFLARE_API_TOKEN}
FROM_EMAIL=${FROM_EMAIL}

# ---- Billing (off by default) ----
POSTGRIP_BILLING_PROVIDER=none

# ---- Compose infra ----
POSTGRIP_SERVER_NETWORK=postgrip-server-net
CLOUDFLARE_TUNNEL_TOKEN=

# ---- Internal service URLs (Docker DNS only) ----
POSTGRIP_WORKERORCHESTRATOR_URL=http://postgrip-workerorchestrator:4000
POSTGRIP_AGENTORCHESTRATOR_URL=http://postgrip-agentorchestrator:4100
# Public agent-orchestrator URL (set later, before enrolling agents):
POSTGRIP_PUBLIC_AGENTORCHESTRATOR_URL=

# ---- Default signup region (REQUIRED for signup) ----
DEFAULT_CELL_WORKERORCHESTRATOR_ID=default
DEFAULT_CELL_WORKERORCHESTRATOR_DISPLAY_NAME=Default
DEFAULT_CELL_WORKERORCHESTRATOR_REGION_CODE=default
DEFAULT_CELL_PUBLIC_WORKERORCHESTRATOR_URL=${PUBLIC_URL}
DEFAULT_CELL_WORKERORCHESTRATOR_VISIBLE=true
EOF
chmod 600 "$DEPLOY_ROOT/app/.env"

4.5d. Validate, pull, and start

cd "$DEPLOY_ROOT/app"
docker compose config --quiet   # validate compose + .env interpolation (fails fast on a typo)
docker compose pull
docker compose up -d --wait     # postgrip-server runs migrations, then ui; blocks until healthy

Checklist:

  • docker-compose.yml, nginx.conf.template, .env exist in $DEPLOY_ROOT/app, and .env is chmod 600.
  • The email vars are populated (not blank/placeholder): grep -E '^(CLOUDFLARE_ACCOUNT_ID|CLOUDFLARE_API_TOKEN|FROM_EMAIL)=' "$DEPLOY_ROOT/app/.env" shows real values.
  • docker compose config --quiet passed (no unset-variable / template errors).
  • docker compose up -d --wait exited 0 (migrations ran, server + ui healthy).

Verify:

( cd "$DEPLOY_ROOT/app" && docker compose ps )   # postgrip-server, postgrip-ui "healthy"
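
The requirement that the app stack's POSTGRES_* values match the data stack's can be checked without printing any secret. This is a minimal sketch; `env_value` and `env_mismatches` are hypothetical helpers.

```shell
# Hypothetical helper: print the value of KEY from an env file (last wins).
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Hypothetical helper: print each KEY whose value differs between two env
# files. Only key names are printed, never the values themselves.
env_mismatches() {
  local a="$1" b="$2" key rc=0
  shift 2
  for key in "$@"; do
    if [ "$(env_value "$a" "$key")" != "$(env_value "$b" "$key")" ]; then
      printf 'MISMATCH %s\n' "$key"
      rc=1
    fi
  done
  return "$rc"
}

# Example:
#   env_mismatches "$DEPLOY_ROOT/data/.env" "$DEPLOY_ROOT/app/.env" \
#     POSTGRES_USER POSTGRES_PASSWORD POSTGRES_DB
```

No output means the two stacks agree; a mismatch on POSTGRES_PASSWORD would typically show up as postgrip-server restarting in a loop while it fails to authenticate.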

Phase 5 — Workerorchestrator stack

Prerequisites: Phases 4 and 4.5 healthy (network + DB up, and the app's postgrip-server has applied migrations; it must run before this stack so that the control → worker schema rename happens first).

5a. Write docker-compose.yml

cat > "$DEPLOY_ROOT/workerorchestrator/docker-compose.yml" <<'YAML_EOF'
services:
  postgrip-workerorchestrator:
    image: postgrip/workerorchestrator:${POSTGRIP_IMAGE_VERSION:-latest}
    restart: unless-stopped
    expose:
      - "4000"
    ports:
      - "${POSTGRIP_WORKERORCHESTRATOR_PORT:-127.0.0.1:4000}:4000"
    environment:
      PORT: "4000"
      AUTH_ENABLED: ${AUTH_ENABLED:-true}
      AUTH_SERVER_URL: ${AUTH_SERVER_URL:-http://postgrip-server:3000}
      POSTGRES_HOST: ${POSTGRES_HOST:-postgres}
      POSTGRES_INTERNAL_PORT: ${POSTGRES_INTERNAL_PORT:-5432}
      POSTGRES_USER: ${POSTGRES_USER:?set POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?set POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB:?set POSTGRES_DB}
      WORKER_TOKEN_SECRET: ${WORKER_TOKEN_SECRET:-}
      KEK_SECRET: ${KEK_SECRET:?set KEK_SECRET}
      KEK_KEY_ID: ${KEK_KEY_ID:-kek-v1}
      INTERNAL_API_KEY: ${INTERNAL_API_KEY:?set INTERNAL_API_KEY}
      CORS_ORIGINS: ${CORS_ORIGINS:-}
      POSTGRIP_PUBLIC_WORKERORCHESTRATOR_URL: ${POSTGRIP_PUBLIC_WORKERORCHESTRATOR_URL:-}
      WORKER_EXPECTED_VERSION: ${WORKER_EXPECTED_VERSION:-1}
      WORKER_IMAGE: ${WORKER_IMAGE:-postgrip/worker:1}
      GITHUB_APP_ID: ${GITHUB_APP_ID:-}
      GITHUB_APP_SLUG: ${GITHUB_APP_SLUG:-}
      GITHUB_APP_PRIVATE_KEY: ${GITHUB_APP_PRIVATE_KEY:-}
      GITHUB_APP_SETUP_URL: ${GITHUB_APP_SETUP_URL:-}
      GITHUB_APP_WEBHOOK_SECRET: ${GITHUB_APP_WEBHOOK_SECRET:-}
      GITHUB_OAUTH_CLIENT_ID: ${GITHUB_OAUTH_CLIENT_ID:-}
      GITHUB_OAUTH_CLIENT_SECRET: ${GITHUB_OAUTH_CLIENT_SECRET:-}
      GITHUB_OAUTH_REDIRECT_URL: ${GITHUB_OAUTH_REDIRECT_URL:-}
      GITHUB_OAUTH_STATE_SECRET: ${GITHUB_OAUTH_STATE_SECRET:-}
      WORKER_ATTESTATION_MODE: ${WORKER_ATTESTATION_MODE:-enforce_enroll}
      WORKER_ATTESTATION_MAX_AGE: ${WORKER_ATTESTATION_MAX_AGE:-15m}
    healthcheck:
      test: ["CMD", "wget", "-qO-", "--tries=1", "http://localhost:4000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 10s
    networks:
      - postgrip_server_net

networks:
  postgrip_server_net:
    external: true
    name: ${POSTGRIP_SERVER_NETWORK:-postgrip-server-net}
YAML_EOF

5b. Write .env

cat > "$DEPLOY_ROOT/workerorchestrator/.env" <<EOF
POSTGRIP_WORKERORCHESTRATOR_PORT=127.0.0.1:4000
POSTGRIP_SERVER_NETWORK=postgrip-server-net
POSTGRIP_IMAGE_VERSION=latest

AUTH_ENABLED=true
AUTH_SERVER_URL=http://postgrip-server:3000

# Same database as the data stack
POSTGRES_HOST=postgres
POSTGRES_INTERNAL_PORT=5432
POSTGRES_USER=${POSTGRES_USER}
POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
POSTGRES_DB=${POSTGRES_DB}

# Control-plane secrets
WORKER_TOKEN_SECRET=${WORKER_TOKEN_SECRET}
KEK_SECRET=${KEK_SECRET}
KEK_KEY_ID=kek-v1
INTERNAL_API_KEY=${INTERNAL_API_KEY}

CORS_ORIGINS=${PUBLIC_URL}
POSTGRIP_PUBLIC_WORKERORCHESTRATOR_URL=${PUBLIC_URL}

# Customer worker self-upgrade target (relevant when you add workers)
WORKER_EXPECTED_VERSION=1
WORKER_IMAGE=postgrip/worker:1

# GitHub deploy integration (REQUIRED — filled in Phase 9)
GITHUB_APP_ID=
GITHUB_APP_SLUG=
GITHUB_APP_PRIVATE_KEY=
GITHUB_APP_SETUP_URL=
GITHUB_APP_WEBHOOK_SECRET=
GITHUB_OAUTH_CLIENT_ID=
GITHUB_OAUTH_CLIENT_SECRET=
GITHUB_OAUTH_REDIRECT_URL=
GITHUB_OAUTH_STATE_SECRET=

WORKER_ATTESTATION_MODE=enforce_enroll
WORKER_ATTESTATION_MAX_AGE=15m
EOF
chmod 600 "$DEPLOY_ROOT/workerorchestrator/.env"
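
Because the heredoc above is unquoted, the shell expands ${POSTGRES_USER}, ${KEK_SECRET} and the other references at the moment the file is written. In a fresh shell where the saved state has not been re-loaded, they expand to empty strings without any error. docker compose config later catches the variables declared with :? in the compose file, but WORKER_TOKEN_SECRET and the PUBLIC_URL-derived values use :- defaults and would pass through blank. A guard along these lines, run before the heredoc, could catch that; require_vars is a hypothetical helper name.

```shell
# Hypothetical helper: report every named shell variable that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"      # indirect lookup that also works in plain sh
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (before writing workerorchestrator/.env):
#   require_vars POSTGRES_USER POSTGRES_PASSWORD POSTGRES_DB \
#     WORKER_TOKEN_SECRET KEK_SECRET INTERNAL_API_KEY PUBLIC_URL || exit 1
```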

5c. Validate, pull, and start

cd "$DEPLOY_ROOT/workerorchestrator"
docker compose config --quiet   # validate compose + .env interpolation
docker compose pull
docker compose up -d --wait     # blocks until the healthcheck reports healthy

Checklist:

  • docker-compose.yml and .env exist in $DEPLOY_ROOT/workerorchestrator, and .env is chmod 600.
  • docker compose config --quiet passed.
  • Container is running and becomes healthy (docker compose ps).

Verify:

( cd "$DEPLOY_ROOT/workerorchestrator" && docker compose exec -T postgrip-workerorchestrator \
    wget -qO- http://localhost:4000/health ) ; echo

Phase 6 — Agentorchestrator stack

Prerequisites: Phases 4 and 4.5 healthy. This stack uses a full database URL (POSTGRIP_AGENT_DATABASE_URL) built from the same DB identity + password, on the internal port 5432.

6a. Write docker-compose.yml

cat > "$DEPLOY_ROOT/agentorchestrator/docker-compose.yml" <<'YAML_EOF'
services:
  postgrip-agentorchestrator:
    image: postgrip/agentorchestrator:${POSTGRIP_IMAGE_VERSION:-latest}
    user: "0:0"
    entrypoint: ["postgrip-agentorchestrator-entrypoint"]
    command: ["-addr", "0.0.0.0:4100"]
    restart: unless-stopped
    ports:
      - "${POSTGRIP_AGENTORCHESTRATOR_PORT:-127.0.0.1:4100}:4100"
    environment:
      AUTH_SERVER_URL: ${AUTH_SERVER_URL:-http://postgrip-server:3000}
      POSTGRIP_AGENT_TOKEN_SECRET: ${POSTGRIP_AGENT_TOKEN_SECRET:?set POSTGRIP_AGENT_TOKEN_SECRET}
      POSTGRIP_AGENT_ENROLLMENT_KEY: ${POSTGRIP_AGENT_ENROLLMENT_KEY:?set POSTGRIP_AGENT_ENROLLMENT_KEY}
      POSTGRIP_AGENT_DATABASE_URL: ${POSTGRIP_AGENT_DATABASE_URL:-}
      POSTGRIP_AGENTORCHESTRATOR_STATE_PATH: ${POSTGRIP_AGENTORCHESTRATOR_STATE_PATH:-/agent-state/state.json}
      POSTGRIP_AGENT_EXPECTED_VERSION: ${POSTGRIP_AGENT_EXPECTED_VERSION:-10}
      POSTGRIP_AGENT_IMAGE: ${POSTGRIP_AGENT_IMAGE:-postgrip/agent:10}
      POSTGRIP_AGENT_MAINTENANCE_WINDOW_START_MINUTE: ${POSTGRIP_AGENT_MAINTENANCE_WINDOW_START_MINUTE:-120}
      POSTGRIP_AGENT_MAINTENANCE_WINDOW_DURATION_MINUTES: ${POSTGRIP_AGENT_MAINTENANCE_WINDOW_DURATION_MINUTES:-60}
      POSTGRIP_AGENT_MAINTENANCE_WINDOW_TIMEZONE: ${POSTGRIP_AGENT_MAINTENANCE_WINDOW_TIMEZONE:-UTC}
      POSTGRIP_AGENT_ATTESTATION_MODE: ${POSTGRIP_AGENT_ATTESTATION_MODE:-observe}
      POSTGRIP_AGENT_ATTESTATION_MAX_AGE: ${POSTGRIP_AGENT_ATTESTATION_MAX_AGE:-24h}
    healthcheck:
      test: ["CMD", "wget", "-qO-", "--tries=1", "http://localhost:4100/healthz"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s
    volumes:
      - postgrip-agentorchestrator-state:/agent-state
    networks:
      - postgrip-agentorchestrator-net
      - postgrip-server-net

volumes:
  postgrip-agentorchestrator-state:

networks:
  postgrip-agentorchestrator-net:
  postgrip-server-net:
    external: true
    name: ${POSTGRIP_SERVER_NETWORK:-postgrip-server-net}
YAML_EOF

6b. Write .env

cat > "$DEPLOY_ROOT/agentorchestrator/.env" <<EOF
POSTGRIP_AGENTORCHESTRATOR_PORT=127.0.0.1:4100
POSTGRIP_SERVER_NETWORK=postgrip-server-net
POSTGRIP_IMAGE_VERSION=latest

AUTH_SERVER_URL=http://postgrip-server:3000
POSTGRIP_AGENT_TOKEN_SECRET=${POSTGRIP_AGENT_TOKEN_SECRET}
POSTGRIP_AGENT_ENROLLMENT_KEY=${POSTGRIP_AGENT_ENROLLMENT_KEY}

# Durable state in the shared Postgres (internal port 5432). The shared
# 'postgrip_admin' role already has the SELECT on auth.member the console needs.
POSTGRIP_AGENT_DATABASE_URL=postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}

POSTGRIP_AGENT_EXPECTED_VERSION=10
POSTGRIP_AGENT_IMAGE=postgrip/agent:10
POSTGRIP_AGENT_MAINTENANCE_WINDOW_START_MINUTE=120
POSTGRIP_AGENT_MAINTENANCE_WINDOW_DURATION_MINUTES=60
POSTGRIP_AGENT_MAINTENANCE_WINDOW_TIMEZONE=UTC
POSTGRIP_AGENT_ATTESTATION_MODE=observe
POSTGRIP_AGENT_ATTESTATION_MAX_AGE=24h
EOF
chmod 600 "$DEPLOY_ROOT/agentorchestrator/.env"
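
The database URL above is assembled by plain string interpolation, which is safe only while the user name and password contain no URL-reserved characters such as @, :, / or #. If the secrets generated in the earlier phases are hex strings, that already holds (an assumption worth checking). Otherwise, a percent-encoding helper such as this hypothetical urlencode could be applied to each component first. It is a sketch for ASCII input only.

```shell
# Hypothetical helper: percent-encode every character outside the
# RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~"). ASCII only.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    c=${s%"${s#?}"}   # first character of s
    s=${s#?}          # remainder of s
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
}

# Usage:
#   POSTGRIP_AGENT_DATABASE_URL="postgresql://$(urlencode "$POSTGRES_USER"):$(urlencode "$POSTGRES_PASSWORD")@postgres:5432/${POSTGRES_DB}"
```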

6c. Validate, pull, and start

cd "$DEPLOY_ROOT/agentorchestrator"
docker compose config --quiet   # validate compose + .env interpolation
docker compose pull
docker compose up -d --wait     # blocks until the healthcheck reports healthy

Checklist:

  • docker-compose.yml and .env exist in $DEPLOY_ROOT/agentorchestrator, and .env is chmod 600.
  • docker compose config --quiet passed.
  • POSTGRIP_AGENT_DATABASE_URL points at @postgres:5432/ (internal port).
  • Container is running and becomes healthy.
  • Deferred TODO: before enrolling your first agent, set POSTGRIP_PUBLIC_AGENTORCHESTRATOR_URL in app/.env (the only agent-public var the app stack wires) and recreate the app stack. It is correctly blank now, but agents can't be added until it's set.

Verify (per-service):

( cd "$DEPLOY_ROOT/agentorchestrator" && docker compose exec -T postgrip-agentorchestrator \
    wget -qO- http://localhost:4100/healthz ) ; echo

Phase 6.5 — Internal end-to-end gate (before exposing publicly)

Prerequisites: Phases 4–6 healthy. (This checks the app stack — UI → auth — not the agentorchestrator; run it here, before any public exposure.)

This proves the app stack works on its own, so any later failure can be pinned on the tunnel/DNS rather than the application. It exercises the same Nginx → server path the browser will use, but from inside the network:

cd "$DEPLOY_ROOT/app"
docker compose exec -T postgrip-ui wget -qO- http://127.0.0.1/healthz ; echo            # -> ok
docker compose exec -T postgrip-ui wget -qO- http://127.0.0.1/api/auth/get-session ; echo # -> null (HTTP 200)

Checklist:

  • /healthz returns ok from inside the UI container.
  • /api/auth/get-session returns null (a 200 with no session) — this confirms UI → server proxying works. A connection error or 502 here means fix the app before touching DNS/TLS.
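
To make this gate scriptable rather than eyeballed, the response body can be compared against the literal null. A minimal sketch: is_null_session is a hypothetical helper, and it assumes the endpoint returns exactly null (optionally surrounded by whitespace) for an anonymous request, as described above.

```shell
# Hypothetical helper: succeed when the body is the JSON literal null.
is_null_session() {
  # Strip all whitespace, then compare.
  [ "$(printf '%s' "$1" | tr -d '[:space:]')" = "null" ]
}

# Usage (from $DEPLOY_ROOT/app):
#   body=$(docker compose exec -T postgrip-ui wget -qO- http://127.0.0.1/api/auth/get-session)
#   is_null_session "$body" && echo "proxy OK" || echo "fix the app stack first"
```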

Phase 7 — Expose the site publicly (DECISION)

Prerequisites: Phases 4–6 healthy and the Phase 6.5 internal gate passed. The UI listens on port 80 inside the network only. Resolve the exposure DECISION from Phase 0.

Option A — Cloudflare Tunnel

Prerequisite: a tunnel token from the Cloudflare Zero Trust dashboard (Networks → Tunnels → create/select a tunnel → copy the token).

  1. Store the token in the data stack .env (cloudflared lives there):
    sed -i "s#^CLOUDFLARE_TUNNEL_TOKEN=.*#CLOUDFLARE_TUNNEL_TOKEN=PASTE_TOKEN_HERE#" "$DEPLOY_ROOT/data/.env"
  2. Create the public-hostname route — this is the real gate; it creates DNS. In the tunnel's Public Hostname tab add: hostname $DOMAIN, service http://postgrip-ui:80. Saving this writes a proxied CNAME for $DOMAIN into Cloudflare DNS. Until this exists, the domain does not resolve and any curl $PUBLIC_URL fails with "Could not resolve host" — which looks like an app failure but is not.
  3. Confirm the DNS record exists before starting the tunnel:
    dig +short "$DOMAIN" @1.1.1.1     # expect Cloudflare IPs (e.g. 104.x.x.x / 172.67.x.x)
  4. Start the tunnel sidecar:
    cd "$DEPLOY_ROOT/data" && docker compose --profile cloudflare up -d --wait
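
The sed command in step 1 only rewrites an existing CLOUDFLARE_TUNNEL_TOKEN= line; if that line is absent from data/.env, it silently changes nothing. A more defensive variant, sketched here with a hypothetical set_env_var helper, replaces the line or appends it. Note that it moves the key to the end of the file.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing any existing line.
set_env_var() {  # set_env_var <file> <key> <value>
  file=$1; key=$2; value=$3
  tmp=$(mktemp) || return 1
  # Copy every line except an existing KEY= line, then append the new value.
  grep -v "^${key}=" "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$file"   # overwrite in place so the file keeps its mode
  rm -f "$tmp"
}

# Usage:
#   set_env_var "$DEPLOY_ROOT/data/.env" CLOUDFLARE_TUNNEL_TOKEN "PASTE_TOKEN_HERE"
```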

Normal cloudflared log noise: it opens ~4 HA connections to the Cloudflare edge and may log already connected to this server, trying another address for some of them. This is cloudflared picking distinct edge endpoints, not an error.

Checklist (Option A):

  • Token is stored in data/.env and the cloudflared sidecar is running (cd "$DEPLOY_ROOT/data" && docker compose --profile cloudflare ps).
  • A public-hostname route maps $DOMAIN → http://postgrip-ui:80.
  • dig +short "$DOMAIN" @1.1.1.1 returns Cloudflare IPs (the DNS record exists).
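
The DNS item can be turned into a pass/fail test. This sketch assumes the tunnel hostname resolves to IPv4 A records; has_ipv4 is a hypothetical helper that succeeds when any line of its input looks like a dotted-quad address.

```shell
# Hypothetical helper: succeed if at least one input line is an IPv4-looking address.
has_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
}

# Usage:
#   has_ipv4 "$(dig +short "$DOMAIN" @1.1.1.1)" \
#     || echo "no A record yet; re-check the Public Hostname route"
```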

Option B — Your own reverse proxy + TLS

# 1. Publish the UI on a host loopback port: uncomment the `ports:` block under
#    postgrip-ui in $DEPLOY_ROOT/app/docker-compose.yml, then:
cd "$DEPLOY_ROOT/app" && docker compose up -d
# 2. Point your TLS-terminating proxy at http://127.0.0.1:8080, forwarding Host.
#    Example Caddy:  postgrip.example.com { reverse_proxy 127.0.0.1:8080 }

Checklist (Option B):

  • The reverse proxy terminates TLS for $DOMAIN and forwards the Host header to the UI port.
  • DNS for $DOMAIN points at the proxy/host.

Checklist (both):

  • PUBLIC_URL/CORS_ORIGINS in app/.env exactly match the public HTTPS origin.

Verify:

# The deploy host often can't resolve its own public domain (split-horizon or a
# negative-DNS cache from before the record existed). Try direct first, then fall
# back to forcing resolution via Cloudflare's resolver:
curl -fsS "$PUBLIC_URL/healthz" \
  || curl -fsS --resolve "$DOMAIN:443:$(dig +short "$DOMAIN" @1.1.1.1 | head -1)" "$PUBLIC_URL/healthz"
echo "   <- expect: ok"
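
Right after the route or DNS record is created, the first request can fail while the record propagates. A small retry wrapper avoids a false negative. retry_until_ok is a hypothetical helper; the attempt count and delay in the usage line are arbitrary choices.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
retry_until_ok() {  # retry_until_ok <attempts> <delay-seconds> <command...>
  attempts=$1; delay=$2; shift 2
  n=1
  while :; do
    if "$@" >/dev/null 2>&1; then return 0; fi
    [ "$n" -ge "$attempts" ] && return 1   # give up after the last attempt
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage:
#   retry_until_ok 10 3 curl -fsS "$PUBLIC_URL/healthz" && echo "public edge OK"
```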

Apex → www redirect (only if PUBLIC_URL uses a www. host)

If PUBLIC_URL is https://www.example.com, visitors (and email links) that hit the bare apex example.com should 308 to the canonical www origin, or they get a cert/host mismatch. Route the apex to the same ingress (a second public hostname on the tunnel, or another DNS record at your proxy), then turn on the built-in canonicalization in app/.env and recreate the app stack:

cat >> "$DEPLOY_ROOT/app/.env" <<EOF
NGINX_REDIRECT_HOST=${DOMAIN#www.}
NGINX_CANONICAL_ORIGIN=$PUBLIC_URL
EOF
( cd "$DEPLOY_ROOT/app" && docker compose up -d postgrip-ui )

The UI then answers the apex host with a 308 to $PUBLIC_URL. (Skip this entirely if PUBLIC_URL is already an apex domain.)
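
To confirm the redirect, request the apex host and inspect the status code and target. The sketch derives the apex name with the same ${DOMAIN#www.} expansion used above; apex_of is a hypothetical helper, and the curl line assumes the apex already routes to the same ingress.

```shell
# Hypothetical helper: strip one leading "www." label, if present.
apex_of() {
  printf '%s\n' "${1#www.}"
}

# Usage:
#   curl -sS -o /dev/null -w '%{http_code} %{redirect_url}\n' "https://$(apex_of "$DOMAIN")/"
#   # A 308 whose target starts with $PUBLIC_URL indicates the redirect is active.
```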


Phase 8 — End-to-end verification and first account

Prerequisites: Phases 4–7 complete.

# Per-stack status is least ambiguous if other containers run on the host:
for d in data app workerorchestrator agentorchestrator; do
  ( cd "$DEPLOY_ROOT/$d" && docker compose ps )
done
# (If the host can't resolve its own public domain, use the --resolve fallback
#  from Phase 7's Verify block.)
curl -fsS "$PUBLIC_URL/api/auth/get-session" >/dev/null && echo "auth edge OK"

⚠ Email must work for sign-up. postgrip-server runs with requireEmailVerification: true and sends a verification email on sign-up via the Cloudflare Email Service (configured as a required input in Phases 2–4.5). If CLOUDFLARE_ACCOUNT_ID / CLOUDFLARE_API_TOKEN / FROM_EMAIL are wrong, the token lacks Email Sending → Edit, or FROM_EMAIL isn't a verified sender, the verification email never arrives and the first sign-up cannot complete or log in — which reads as "the install is broken" but isn't. The step below confirms email is actually working.

Then open $PUBLIC_URL in a browser and sign up. Each sign-up creates an account with a personal organization, assigned to the seeded region, and triggers a verification email. The account is unusable until that email is confirmed, so a real (deliverable) address is the cleanest end-to-end test.

If a verification email does not arrive, check the server logs for the send result:

( cd "$DEPLOY_ROOT/app" && docker compose logs --tail=100 postgrip-server ) | grep -i -E "email|cloudflare|verif"
# A Cloudflare API/auth/sender error here points at the token scope or an
# unverified FROM_EMAIL, not at the application.

Core platform acceptance checklist (setup is not complete until Phase 9 is also done — GitHub deploy is required):

  • All five server-side containers report healthy (postgres, postgrip-server, postgrip-ui, postgrip-workerorchestrator, postgrip-agentorchestrator), plus cloudflared if you chose the tunnel (six total).
  • GET $PUBLIC_URL/healthz returns ok.
  • GET $PUBLIC_URL/api/auth/get-session returns a 2xx/JSON (not 5xx).
  • The sign-up page loads and a test account can be created.
  • A verification email is actually delivered to the sign-up address, and confirming it lets the account log in. (No email = check the Cloudflare token scope / verified FROM_EMAIL.)
  • After login, the dashboard loads (proves UI → server → workerorchestrator routing).
  • Secret files are backed up and excluded from version control (do it now — see below).
  • Proceed to Phase 9 (required): GitHub deploy integration is configured and connected.

Back up and protect the secret files

All credentials live in six files under $DEPLOY_ROOT: .deploy.env, .secrets.env, and data/.env, app/.env, workerorchestrator/.env, agentorchestrator/.env. Make one encrypted-at-rest copy and store it in a password manager / secrets vault — losing them means losing DB access and invalidating sessions.

# One restricted-permission archive of every secret-bearing file:
tar -czf "$HOME/postgrip-secrets-$(date +%Y%m%d).tgz" -C "$DEPLOY_ROOT" \
  .deploy.env .secrets.env \
  data/.env app/.env workerorchestrator/.env agentorchestrator/.env
chmod 600 "$HOME"/postgrip-secrets-*.tgz
# Move that .tgz to your password manager / vault, then delete the local copy.
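
The tar archive above is permission-restricted but not itself encrypted. If it has to sit on disk or travel before it reaches the vault, it could be encrypted first. A sketch using openssl enc with a passphrase read from an environment variable: encrypt_file, decrypt_file and ARCHIVE_PASSPHRASE are hypothetical names, and -pbkdf2 assumes OpenSSL 1.1.1 or newer.

```shell
# Hypothetical helpers. ARCHIVE_PASSPHRASE must be exported before calling them.
encrypt_file() {  # encrypt_file <plain-input> <encrypted-output>
  openssl enc -aes-256-cbc -pbkdf2 -salt -in "$1" -out "$2" -pass env:ARCHIVE_PASSPHRASE
}

decrypt_file() {  # decrypt_file <encrypted-input> <plain-output>
  openssl enc -d -aes-256-cbc -pbkdf2 -in "$1" -out "$2" -pass env:ARCHIVE_PASSPHRASE
}

# Usage:
#   export ARCHIVE_PASSPHRASE='choose-a-long-passphrase'
#   encrypt_file "$HOME/postgrip-secrets-20250101.tgz" "$HOME/postgrip-secrets-20250101.tgz.enc"
#   rm "$HOME/postgrip-secrets-20250101.tgz"
```

The date in the file name is only an example; use whatever name the tar command produced.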

If $DEPLOY_ROOT (or any copy of these files) is ever inside a git repo, ignore them:

printf '%s\n' '.env' '.deploy.env' '.secrets.env' '*/.env' 'data/pgdata/' \
  >> "$DEPLOY_ROOT/.gitignore"

Do not back up data/pgdata/ with these (it's the live database; use a proper pg_dump/volume backup for data). These files are credentials only.


Phase 9 — GitHub deploy integration (required)

Building/deploying connected GitHub repos (preview deploys for PRs, production deploys on push/merge) is core to PostGrip — treat this phase as required, not optional. A deployment without it can't do the thing the platform exists for. Configured on the workerorchestrator stack.

Deploys also need a worker. This phase wires up the GitHub integration (credentials + webhooks on the workerorchestrator). Builds and deployments actually execute on an enrolled worker node, which is customer-hosted and added from the UI (Add Worker wizard) — see Adding workers and agents. Configure the integration here; enroll at least one worker before expecting deploys to run.

This integration is separate from GitHub sign-in (POSTGRIP_AUTH_GITHUB_* on the server). Use a different GitHub App / OAuth app for deploys.

Prerequisites: Phases 4–8 done; the site is reachable from the public internet; POSTGRIP_PUBLIC_WORKERORCHESTRATOR_URL=$PUBLIC_URL (already set in Phase 5). GitHub-facing URLs all live on the public origin:

| Purpose | URL |
|---|---|
| OAuth/App callback & setup | $PUBLIC_URL/api/integrations/github/callback |
| GitHub App webhook | $PUBLIC_URL/api/webhooks/github/app |
| Per-repo webhook (OAuth mode, auto-created) | $PUBLIC_URL/api/webhooks/github/deploy/{projectId} |

Option A — GitHub App (recommended)

  1. Create a GitHub App (org Settings → Developer settings → GitHub Apps → New):
    • Callback URL and Setup URL: $PUBLIC_URL/api/integrations/github/callback (tick Redirect on update).
    • Webhook: Active; URL $PUBLIC_URL/api/webhooks/github/app; Secret = openssl rand -hex 32.
    • Repository permissions: Contents: Read-only, Metadata: Read-only, Pull requests: Read-only.
    • Subscribe to events: Pull request, Push.
  2. Collect: App ID, the slug from https://github.com/apps/<slug>, and a generated private key (.pem).
  3. Install the app on the target repos.
  4. Set in $DEPLOY_ROOT/workerorchestrator/.env:
    GITHUB_APP_ID=123456
    GITHUB_APP_SLUG=your-app-slug
    # One line; use literal \n between PEM lines (quotes are stripped):
    GITHUB_APP_PRIVATE_KEY=-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----\n
    GITHUB_APP_WEBHOOK_SECRET=<the webhook secret>
    GITHUB_APP_SETUP_URL=$PUBLIC_URL/api/integrations/github/callback

The workerorchestrator validates the private key at startup and refuses to start if GITHUB_APP_ID/GITHUB_APP_PRIVATE_KEY are set without GITHUB_APP_WEBHOOK_SECRET. App mode is active only when slug + ID + private key are all present.
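
Producing the one-line form of the key by hand is error-prone. One way to generate it from the downloaded .pem is a short awk filter. pem_to_env_line is a hypothetical helper name, and its output uses the literal \n separators shown in step 4.

```shell
# Hypothetical helper: print a PEM file as a single line, with each original
# line followed by a literal backslash-n (no real newlines are emitted).
pem_to_env_line() {
  awk 'NF { sub(/\r$/, ""); printf "%s\\n", $0 }' "$1"
}

# Usage (the .pem file name is a placeholder):
#   pem_to_env_line ./your-app.private-key.pem; echo
#   # Paste the printed value after GITHUB_APP_PRIVATE_KEY= in workerorchestrator/.env
```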

Option B — OAuth App (fallback)

Create an OAuth App (Authorization callback URL $PUBLIC_URL/api/integrations/github/callback), then set:

GITHUB_OAUTH_CLIENT_ID=...
GITHUB_OAUTH_CLIENT_SECRET=...
GITHUB_OAUTH_REDIRECT_URL=$PUBLIC_URL/api/integrations/github/callback
# Optional: GITHUB_OAUTH_STATE_SECRET (defaults to KEK_SECRET),
#           GITHUB_OAUTH_SCOPES (default: repo admin:repo_hook read:user)

OAuth mode registers a per-repo webhook automatically, which is why the default scopes include admin:repo_hook and why POSTGRIP_PUBLIC_WORKERORCHESTRATOR_URL must be public.

Apply and verify

cd "$DEPLOY_ROOT/workerorchestrator" && docker compose up -d
docker compose logs --tail=50 postgrip-workerorchestrator     # no GITHUB_APP_* errors

Checklist:

  • A GitHub App or OAuth app is created with the exact callback/webhook URLs above.
  • The matching GITHUB_* vars are set in the workerorchestrator .env and the container restarted cleanly.
  • In the UI at $PUBLIC_URL/console/deploy → Create deploy project → Connect GitHub succeeds (status shows Connected to GitHub as @login).
  • A test webhook delivery returns 2xx (GitHub App → Advanced → Recent Deliveries, or repo → Settings → Webhooks).

Event behavior: pull_request opened/reopened/synchronize/ready_for_review → preview build; pull_request closed → cleanup (and if merged, deploy the matching prod/branch env); push → deploy the matching prod/branch env.
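
The routing described above can be summarised as a small decision function. This is only an illustration of the documented behaviour, not the workerorchestrator's actual code. route_event and its output labels are hypothetical, and treating every other event as ignored is an assumption.

```shell
# Hypothetical illustration of the event-to-action mapping.
route_event() {  # route_event <event> <action> <merged: true|false>
  case "${1:-}:${2:-}" in
    pull_request:opened|pull_request:reopened|pull_request:synchronize|pull_request:ready_for_review)
      echo "preview-build" ;;
    pull_request:closed)
      # Always clean up the preview; a merged PR also deploys the matching env.
      if [ "${3:-false}" = "true" ]; then echo "cleanup+deploy"; else echo "cleanup"; fi ;;
    push:*)
      echo "deploy" ;;
    *)
      echo "ignore" ;;   # assumption: events not listed in the runbook are ignored
  esac
}

# Usage:
#   route_event pull_request synchronize false
#   route_event pull_request closed true
```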


Adding workers and agents

The server-side control plane (Phases 1–9) does not execute customer workloads itself. Deploys/builds (and SQL execution against customer databases) run on worker nodes, and automation runs on agent nodes. Both are customer-hosted (often a separate machine) and enrolled from the UI after the control plane is up:

  • Worker — required to actually run deploys. Without at least one enrolled worker, GitHub deploy (Phase 9) is configured but has nothing to build on. In the UI, use the Add Worker wizard: it generates a one-time enrollment key and a ready-to-run docker compose snippet for the worker host. The worker connects back to POSTGRIP_PUBLIC_WORKERORCHESTRATOR_URL (your public origin).
  • Agent — optional automation. Use the Add Agent wizard; before this, set POSTGRIP_PUBLIC_AGENTORCHESTRATOR_URL in app/.env (the Phase 6 deferred TODO) and recreate the app stack.

Worker/agent host setup is out of scope for this server-side runbook — the wizards generate the exact files for those machines. Ask if you want a companion worker-enrollment runbook.


Other optional integrations (quick reference)

Set on the app stack .env, then docker compose up -d from app/:

| Integration | Variables |
|---|---|
| Google sign-in | POSTGRIP_AUTH_GOOGLE_CLIENT_ID, POSTGRIP_AUTH_GOOGLE_CLIENT_SECRET (callback $PUBLIC_URL/api/auth/callback/google) |
| GitHub sign-in | POSTGRIP_AUTH_GITHUB_CLIENT_ID, POSTGRIP_AUTH_GITHUB_CLIENT_SECRET (callbacks $PUBLIC_URL/api/auth/callback/github and $PUBLIC_URL/api/signup/github/callback) |
| Stripe billing | POSTGRIP_BILLING_PROVIDER=stripe, STRIPE_SECRET_KEY, STRIPE_PUBLISHABLE_KEY, STRIPE_POSTGRIP_PLUS_PRICE_ID, STRIPE_POSTGRIP_PLUS_SERVICE_PRICE_ID, STRIPE_POSTGRIP_PRO_PRICE_ID, STRIPE_POSTGRIP_PRO_SERVICE_PRICE_ID, STRIPE_WEBHOOK_SECRET (webhook → $PUBLIC_URL/api/webhooks/stripe) |

Email (CLOUDFLARE_ACCOUNT_ID / CLOUDFLARE_API_TOKEN / FROM_EMAIL) is not here — it is required and configured in Phases 2–4.5. See §0 Required inputs.


Environment variable reference

Data stack (data/.env)

| Variable | Required | Description |
|---|---|---|
| POSTGRES_USER / POSTGRES_PASSWORD / POSTGRES_DB | | Bundled Postgres identity. Reuse the same values in the app and workerorchestrator stacks. |
| POSTGRES_PORT | | Host port mapping for admin access (default 5433, loopback). |
| POSTGRES_DATA_DIR | | Host path bind-mounted for Postgres data. |
| POSTGRIP_SERVER_NETWORK | | Shared external network name this stack creates (default postgrip-server-net). |
| CLOUDFLARE_TUNNEL_TOKEN | | Cloudflare Tunnel token (only with --profile cloudflare). |

App stack (app/.env)

| Variable | Required | Description |
|---|---|---|
| POSTGRES_HOST / POSTGRES_INTERNAL_PORT / POSTGRES_USER / POSTGRES_PASSWORD / POSTGRES_DB | | How postgrip-server connects to the data stack's DB (default host postgres, port 5432). User/password/db must match the data stack. |
| PUBLIC_URL | | Public site URL; also BETTER_AUTH_URL and invite-link base. |
| CORS_ORIGINS | | Comma-separated allowed origins. |
| BETTER_AUTH_SECRET | | Session signing secret (32+ chars). |
| POSTGRIP_IMAGE_VERSION | | Image tag for the app images (default latest; set a number to pin). |
| POSTGRIP_HIBP_PASSWORD_CHECK | | Breached-password check (default true; false for offline). |
| POSTGRIP_AUTH_GOOGLE_CLIENT_ID / ..._SECRET | | Google sign-in OAuth. |
| POSTGRIP_AUTH_GITHUB_CLIENT_ID / ..._SECRET | | GitHub sign-in OAuth. |
| CLOUDFLARE_ACCOUNT_ID / CLOUDFLARE_API_TOKEN / FROM_EMAIL | | Required. Cloudflare Email Service REST API (POST /accounts/{id}/email/sending/send), used for the sign-up verification email and org invites. Token scope Email Sending → Edit; FROM_EMAIL must be a verified sender. Sign-up fails without it. |
| POSTGRIP_BILLING_PROVIDER | | none (default) disables billing; stripe enables it with the STRIPE_* keys. |
| POSTGRIP_WORKERORCHESTRATOR_URL | | Internal URL the server uses to reach the workerorchestrator. |
| POSTGRIP_AGENTORCHESTRATOR_URL | | Internal URL the server uses to proxy /api/agents/*. |
| POSTGRIP_PUBLIC_AGENTORCHESTRATOR_URL | | Public agentorchestrator URL baked into customer agent setup (set before enrolling agents). |
| DEFAULT_CELL_WORKERORCHESTRATOR_ID | ✅* | Lowercase slug for the seeded signup region. *Required for signup unless you create a region in the admin UI. |
| DEFAULT_CELL_PUBLIC_WORKERORCHESTRATOR_URL | ✅* | Public workerorchestrator URL stored in the region (customer workers connect here); this is the site origin. |
| DEFAULT_CELL_WORKERORCHESTRATOR_DISPLAY_NAME / ..._REGION_CODE / ..._VISIBLE | | Region display name, region code, and signup visibility. |
| POSTGRIP_SERVER_NETWORK | | Shared external network the app stack joins (default postgrip-server-net). |
| NGINX_CANONICAL_ORIGIN / NGINX_REDIRECT_HOST | | Optional apex → www canonical-host 308 redirect. |

The server also accepts a full DATABASE_URL instead of the POSTGRES_* pieces. The pieces are preferred and are what this runbook uses.

Workerorchestrator stack (workerorchestrator/.env)

| Variable | Required | Description |
|---|---|---|
| POSTGRES_HOST / POSTGRES_INTERNAL_PORT / POSTGRES_USER / POSTGRES_PASSWORD / POSTGRES_DB | | Same database as the data stack (user/password/db are mandatory). |
| KEK_SECRET | | Envelope-encryption key. |
| INTERNAL_API_KEY | | Internal API key for the control plane. |
| WORKER_TOKEN_SECRET | recommended | Signs worker access tokens. |
| KEK_KEY_ID | | Active KEK id (default kek-v1). |
| AUTH_ENABLED / AUTH_SERVER_URL | | Validate sessions against postgrip-server. |
| CORS_ORIGINS | | Match the app stack. |
| POSTGRIP_PUBLIC_WORKERORCHESTRATOR_URL | | Public base URL (GitHub webhook registration; customer worker connect URL). |
| POSTGRIP_WORKERORCHESTRATOR_PORT | | Host port mapping (default 127.0.0.1:4000). |
| WORKER_EXPECTED_VERSION / WORKER_IMAGE | | Customer worker self-upgrade target. |
| GITHUB_APP_* / GITHUB_OAUTH_* | ✅ (one set) | GitHub deploy integration, required (core feature). Configure either the GitHub App vars (recommended) or the OAuth vars in Phase 9. |
| WORKER_ATTESTATION_MODE / ..._MAX_AGE | | Worker attestation policy. |

Agentorchestrator stack (agentorchestrator/.env)

| Variable | Required | Description |
|---|---|---|
| POSTGRIP_AGENT_TOKEN_SECRET | | Signs agent access tokens. |
| POSTGRIP_AGENT_ENROLLMENT_KEY | | Bootstrap enrollment key. |
| AUTH_SERVER_URL | | Auth server for console session validation. |
| POSTGRIP_AGENT_DATABASE_URL | ✅ in production | Full Postgres URL (use internal port 5432); needed for durable state and auth.member lookups. |
| POSTGRIP_AGENTORCHESTRATOR_PORT | | Host port mapping (default 127.0.0.1:4100). |
| POSTGRIP_AGENT_EXPECTED_VERSION / POSTGRIP_AGENT_IMAGE | | Customer agent self-upgrade target. |
| POSTGRIP_AGENT_MAINTENANCE_WINDOW_* | | Fallback self-upgrade window. |
| POSTGRIP_AGENT_ATTESTATION_MODE / ..._MAX_AGE | | Agent attestation policy. |

Upgrades

From each stack directory, in order (data → app, then orchestrators):

docker compose pull && docker compose up -d

Day-to-day you only update the app stack (cd app && docker compose pull && docker compose up -d); it recreates only postgrip-server/postgrip-ui and never touches the data stack. To move a stack to a new version, bump POSTGRIP_IMAGE_VERSION in that stack's .env (or edit the image tag) and run docker compose up -d; the stack is recreated at the new version. Running the app stack before the orchestrators lets new migrations apply first.
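
The full-upgrade order can be captured in a loop. A sketch, assuming the four stack directories from this runbook under $DEPLOY_ROOT; upgrade_order and upgrade_all are hypothetical helper names. If the Cloudflare tunnel is in use, the data stack would additionally need --profile cloudflare, which this sketch omits.

```shell
# Hypothetical helper: the stack order used for a full upgrade.
upgrade_order() {
  # Data first, then app (so migrations apply), then the orchestrators.
  printf '%s\n' data app workerorchestrator agentorchestrator
}

# Hypothetical helper: pull and recreate each stack in order, stopping on the first failure.
upgrade_all() {
  for stack in $(upgrade_order); do
    ( cd "$DEPLOY_ROOT/$stack" && docker compose pull && docker compose up -d ) || return 1
  done
}

# Usage:
#   upgrade_all
```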


Troubleshooting

| Symptom | Fix |
|---|---|
| postgrip-server crash-loops on a fresh DB | Bring up data → app in order (data with --wait first). The app's postgrip-server runs the migrations (incl. a controlworker schema rename) the orchestrators depend on, and restarts until the DB answers. init.sql + start order prevent the race. |
| network postgrip-server-net not found | Start the data stack first (it creates the network), or docker network create postgrip-server-net. |
| Service name won't resolve (postgres, postgrip-server) | Every stack's .env must use the same POSTGRIP_SERVER_NETWORK. |
| Workerorchestrator exits: set POSTGRES_USER/PASSWORD/DB | Those .env values are mandatory; ensure they match the data stack. |
| agentorchestrator 500s on console calls | POSTGRIP_AGENT_DATABASE_URL must use the shared postgrip_admin role (it has SELECT on auth.member) and the internal port 5432. |
| Signup shows no region / can't complete | Set the DEFAULT_CELL_WORKERORCHESTRATOR_* block (Phase 4.5c) or create a visible region in the admin UI. |
| Emails not sending | Set CLOUDFLARE_ACCOUNT_ID, CLOUDFLARE_API_TOKEN (Email Sending → Edit), and a verified FROM_EMAIL. |

Logs: cd <stack-dir> && docker compose logs -f <service>.
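
For the name-resolution symptom, the network name each stack uses can be compared directly. A sketch with a hypothetical helper, distinct_network_names, which assumes each stack's .env sets POSTGRIP_SERVER_NETWORK explicitly (a stack relying on the compose default would not be listed).

```shell
# Hypothetical helper: print the distinct POSTGRIP_SERVER_NETWORK values
# found across the four stack .env files under the given root.
distinct_network_names() {
  root=$1
  for d in data app workerorchestrator agentorchestrator; do
    f="$root/$d/.env"
    if [ -f "$f" ]; then
      sed -n 's/^POSTGRIP_SERVER_NETWORK=//p' "$f"
    fi
  done | sort -u
}

# Usage: more than one line of output means the stacks disagree.
#   distinct_network_names "$DEPLOY_ROOT"
```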
