Production Checklist

sarmakska edited this page May 31, 2026 · 2 revisions

Things to confirm before putting this in front of real webhook traffic.

Must-have

  • HTTPS only. Terminate TLS at Caddy, nginx, Cloudflare, or your platform's edge, and do not expose the plain-HTTP port publicly.
  • HMAC verification on every public source. Set WEBHOOK_SECRET. The verifier picks the right per-provider scheme from the source path. No exceptions for Stripe, GitHub, Cal.com or Linear.
  • Verified Resend sender domain. The default onresend.dev sender gets spam-filtered.
  • Persistent dead-letter volume. Mount a volume at /app/data so failures survive a redeploy.
  • Health check endpoint. Point your platform's health check at /health; docker-compose.yml already uses it.
  • Restart policy. restart: unless-stopped in compose, or platform equivalent.
  • Logs to stdout. Pipe to your log aggregator.
  • Protect /dead-letter. It returns stored payloads; keep it behind platform auth or a private network if your payloads are sensitive.
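
To illustrate the HMAC item above, here is a minimal sketch of checking a sha256=<hex> signature with Node's built-in crypto module. The helper name verifySignature and the single header format are assumptions taken from the smoke test on this page; the service's real verifier chooses a per-provider scheme, which this sketch does not reproduce.

```javascript
const crypto = require('node:crypto')

// Hypothetical helper: checks a "sha256=<hex>" header value against the raw request body.
// The header format mirrors the smoke test on this page; real providers each use their own scheme.
function verifySignature(rawBody, header, secret) {
  if (typeof header !== 'string' || !header.startsWith('sha256=')) return false
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest()
  const received = Buffer.from(header.slice('sha256='.length), 'hex')
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false
  return crypto.timingSafeEqual(received, expected)
}

const body = '{"test":true}'
const sig = 'sha256=' + crypto.createHmac('sha256', 'your-secret').update(body).digest('hex')
console.log(verifySignature(body, sig, 'your-secret'))       // true
console.log(verifySignature(body + ' ', sig, 'your-secret')) // false
```

The signature has to be computed over the raw bytes of the request, not over re-serialised JSON, and the comparison should be constant-time, which is why the sketch uses crypto.timingSafeEqual rather than ===.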

Should-have

Tune the retry queue

The defaults (5 attempts, 500ms base, 30s cap, full jitter) suit most setups. If Resend rate-limits you under bursts, raise RETRY_MAX_ATTEMPTS and RETRY_MAX_DELAY_MS so the backoff rides out the limit window rather than dead-lettering early.
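
To see how the two settings interact, here is a rough sketch of full-jitter backoff. It assumes the delay ceiling doubles on each attempt, which this page does not state explicitly; the function names and signatures are illustrative, not the service's API.

```javascript
// Full jitter: a uniform random delay in [0, min(cap, base * 2^attempt)).
// Assumption: the ceiling doubles per attempt; names and signature are illustrative.
function backoffDelayMs(attempt, baseMs = 500, capMs = 30_000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt)
  return Math.floor(random() * ceiling)
}

// Largest possible delay per retry, useful for checking whether a rate-limit window is covered.
function delayCeilings(retries, baseMs = 500, capMs = 30_000) {
  return Array.from({ length: retries }, (_, i) => Math.min(capMs, baseMs * 2 ** i))
}

console.log(JSON.stringify(delayCeilings(4))) // [500,1000,2000,4000]
console.log(JSON.stringify(delayCeilings(8))) // [500,1000,2000,4000,8000,16000,30000,30000]
```

Under this assumption, a handful of retries never gets near the 30s cap, so raising the cap alone changes little; the attempt count has to grow with it before the later, longer delays come into play.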

Alert on the dead-letter count

A rising count from GET /dead-letter is the single most useful signal that delivery is broken. Scrape it or alert on the log line, and triage by reading each entry's error field.
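
One possible shape for such a check is sketched below. It assumes GET /dead-letter returns JSON with a numeric count field, which you should confirm against your deployment; shouldAlert and pollDeadLetter are hypothetical names.

```javascript
// Hypothetical check: compares two successive dead-letter counts and reports whether to alert.
function shouldAlert(previousCount, currentCount, threshold = 1) {
  return currentCount - previousCount >= threshold
}

// Assumes GET /dead-letter returns JSON with a numeric `count` field; verify against your deployment.
async function pollDeadLetter(baseUrl, previousCount) {
  const res = await fetch(`${baseUrl}/dead-letter`) // global fetch needs Node 18+
  if (!res.ok) throw new Error(`dead-letter endpoint returned ${res.status}`)
  const { count } = await res.json()
  return { count, alert: shouldAlert(previousCount, count) }
}

console.log(shouldAlert(0, 0)) // false
console.log(shouldAlert(0, 2)) // true
```

Run pollDeadLetter on a timer and carry the returned count into the next call. If /dead-letter sits behind platform auth, the fetch call will need the matching credentials.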

Rate limiting

Prevent abuse if your URL leaks. Add express-rate-limit on /hooks, or rate-limit at the proxy:

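const rateLimit = require('express-rate-limit')
// Behind a reverse proxy, configure Express's 'trust proxy' setting first, or every request shares the proxy's IP.
// 100 requests per minute per client IP on the webhook routes.
app.use('/hooks', rateLimit({ windowMs: 60_000, max: 100, standardHeaders: true, legacyHeaders: false }))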

Body size cap

Default is 1MB (the bodyLimit passed to createApp). Most webhooks fit. Raise it only if a source genuinely needs more; above 10MB, reject at the load balancer.

IP allowlist (where possible)

Stripe and GitHub publish their webhook IP ranges. Allowlist them at the firewall as belt-and-braces alongside HMAC.

Nice-to-have

Replay tooling

The dead-letter file stores the original payload per line, so a short script can re-POST failures once the underlying problem is fixed. See Retry-and-Dead-Letter for an example.

Queue ahead of this service

If you process well beyond typical webhook volumes or need durability across hard crashes, run several instances behind a broker. The in-memory queue handles transient outages within a process; a broker handles process loss.

Per-source metrics

Expose per-source success rate and latency, and alert when a source's success rate drops.
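
As a starting point, per-source tracking could look something like the sketch below. SourceMetrics is a hypothetical in-memory class, not something the service ships; it resets on restart and would normally feed whatever metrics system you already run.

```javascript
// Hypothetical in-memory tracker; names and shape are illustrative, not part of the service.
class SourceMetrics {
  constructor() {
    this.stats = new Map() // source -> { ok, failed, totalMs }
  }

  // Record one delivery attempt for a source, with its outcome and latency.
  record(source, succeeded, latencyMs) {
    const s = this.stats.get(source) ?? { ok: 0, failed: 0, totalMs: 0 }
    if (succeeded) s.ok += 1
    else s.failed += 1
    s.totalMs += latencyMs
    this.stats.set(source, s)
  }

  // Returns success rate (0..1) and mean latency for one source, or null if unseen.
  summary(source) {
    const s = this.stats.get(source)
    if (!s) return null
    const total = s.ok + s.failed
    return { successRate: s.ok / total, meanLatencyMs: s.totalMs / total }
  }
}

const metrics = new SourceMetrics()
metrics.record('stripe', true, 120)
metrics.record('stripe', false, 480)
console.log(metrics.summary('stripe')) // { successRate: 0.5, meanLatencyMs: 300 }
```

An alert rule can then compare successRate for each source against a threshold over a recent window.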

Skip these

  • A database for every webhook. Only the failures are persisted, in the dead-letter file. That is enough for recovery.
  • Hand-written HTML emails. Templates return Markdown and the renderer styles it. Save the design budget for customer-facing email.

Smoke test post-deploy

curl https://your-domain.com/health
# {"ok":true}

SECRET="your-secret"
BODY='{"test":true}'
# Sign the exact bytes that will be sent; printf avoids a trailing newline.
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" -hex | awk '{print $NF}')
# -i prints the status line so the 202 is visible.
curl -i -X POST https://your-domain.com/hooks/smoketest \
  -H "Content-Type: application/json" \
  -H "X-Signature: sha256=$SIG" \
  --data-raw "$BODY"
# 202 {"ok":true,"queued":true,...}

curl https://your-domain.com/dead-letter
# confirm the count stays at 0
