Production Checklist

sarmakska edited this page May 3, 2026 · 2 revisions

Things to add before putting this in front of real webhook traffic.

Must-have

  • HTTPS only. Webhooks carry sensitive payloads. Terminate TLS at Caddy, nginx, Cloudflare, or your platform's own TLS layer.
  • HMAC verification on every public source. Set WEBHOOK_SECRET. No exceptions for Stripe / GitHub / Cal.com.
  • Verified Resend sender domain. The default onboarding@resend.dev sender gets spam-filtered.
  • Health check endpoint. Already at /health, used by docker-compose.yml.
  • Restart policy. restart: unless-stopped in compose, or platform equivalent.
  • Logs to stdout. Already done. Pipe to your log aggregator (BetterStack, Axiom, Datadog).
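
The HMAC item above can be sketched roughly as follows. This assumes the X-Signature: sha256=<hex> header format used in the smoke test at the end of this page; verifySignature is a hypothetical name, not necessarily what src/index.js uses. Real providers such as Stripe and GitHub each have their own header names and signing schemes, so treat this as an illustration of the generic case only.

```javascript
const crypto = require('node:crypto')

// Hypothetical helper: returns true only when `header` is "sha256=<hex HMAC of rawBody>".
// rawBody must be the exact bytes received, not re-serialized JSON.
function verifySignature(rawBody, header, secret) {
  if (typeof header !== 'string' || !header.startsWith('sha256=')) return false
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest()
  const received = Buffer.from(header.slice('sha256='.length), 'hex')
  // timingSafeEqual throws on a length mismatch, so compare lengths first
  return received.length === expected.length && crypto.timingSafeEqual(received, expected)
}
```

In Express, the raw bytes can be captured with the verify option of express.json, for example express.json({ verify: (req, res, buf) => { req.rawBody = buf } }), and then passed to the helper together with process.env.WEBHOOK_SECRET.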

Should-have

Rate limiting

Prevent webhook abuse if your URL leaks.

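const rateLimit = require('express-rate-limit')

// Behind a reverse proxy, req.ip is the proxy's address unless Express is told
// to trust it; adjust the hop count to match your setup.
app.set('trust proxy', 1)

app.use('/hooks', rateLimit({
  windowMs: 60_000,  // 1-minute window
  max: 100,  // 100 req/min per IP
  standardHeaders: true,
  legacyHeaders: false,
}))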

For per-source limits, key by source path:

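// Middleware mounted with app.use('/hooks', ...) has no req.params.source, so read it from the path
keyGenerator: (req) => `${req.ip}:${req.path.split('/')[1]}`,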

Body size cap

The service caps request bodies at 1MB (express.json({ limit: '1mb' })); Express's own default is 100kb. Most webhooks fit. If yours don't:

  • Tune up to 10MB if you genuinely need it
  • Above 10MB, you're getting attacked. Reject at the load balancer.

Failed delivery alerting

If Resend goes down, your webhooks silently 500. Add alerting:

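async function sendEmailWithRetry(opts) {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      // Resend's Node SDK returns { data, error } for API failures instead of throwing
      const { error } = await resend.emails.send(opts)
      if (error) throw new Error(error.message)
      return
    } catch (e) {
      if (attempt === 1) {
        // Final attempt failed: alert, then rethrow so the handler returns 500
        await alertOps({ error: e.message, opts })
        throw e
      }
      // Brief pause before the single retry
      await new Promise((resolve) => setTimeout(resolve, 500))
    }
  }
}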

alertOps could post to PagerDuty, Slack, or just write to a dead-letter file.
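
A minimal sketch of alertOps, assuming a Slack-style incoming webhook that accepts a JSON body of the form { text }. The makeAlertOps factory and its url and post parameters are hypothetical names introduced here for illustration. The one design point worth copying is that the alert path swallows its own failures, so the original send error is still the one that propagates.

```javascript
// Hypothetical factory: `post` defaults to the global fetch (Node 18+) and is
// injectable so the alert path can be exercised without network access.
function makeAlertOps({ url, post = fetch }) {
  return async function alertOps({ error, opts }) {
    // Send only routing metadata; the full payload may contain sensitive data
    const text = `webhook-to-email: send failed (${error}), subject: ${opts?.subject ?? 'n/a'}`
    try {
      await post(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text }),
      })
      return true
    } catch (alertError) {
      // Never throw from the alert path; log and let the caller rethrow the original error
      console.error('alertOps failed:', alertError.message)
      return false
    }
  }
}
```

Wire it up once at startup, for example const alertOps = makeAlertOps({ url: process.env.ALERT_WEBHOOK_URL }), where ALERT_WEBHOOK_URL is an assumed environment variable.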

IP allowlist (when possible)

Stripe publishes its webhook IPs, and GitHub publishes its ranges through its meta API endpoint. Check whether your other providers do the same, then allowlist them at the firewall:

# nginx
location /hooks/stripe {
  allow 3.18.12.63;
  allow 3.130.192.231;
  allow 13.235.14.237;
  # ... full list at https://stripe.com/files/ips/ips_webhooks.txt
  deny all;
  proxy_pass http://localhost:3000;
}

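Combined with HMAC, this is belt-and-braces. If nginx itself sits behind another proxy such as Cloudflare, allow and deny see the proxy's address unless the real client IP is restored first.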

Replay protection

The basic verifier doesn't prevent replays. If an attacker captures a valid request, they can replay it forever.

For Stripe, use stripe.webhooks.constructEvent, which checks the signed timestamp and by default rejects events older than 5 minutes.

For others, you'd need to track recent signatures (Redis) and reject duplicates within a window. Roughly 30 lines if you need it.
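
A rough sketch of that duplicate check, assuming the provider sends a timestamp that is covered by the signature (otherwise an attacker can simply change it). An in-memory Map stands in for Redis here so the example has no dependencies; with Redis, a SET with the NX and EX options would give the same check across multiple instances. createReplayGuard and isFresh are hypothetical names.

```javascript
// Hypothetical replay guard. An in-memory Map stands in for Redis here, so it
// only protects a single process; entries expire after `windowMs`.
function createReplayGuard({ windowMs = 5 * 60_000, now = Date.now } = {}) {
  const seen = new Map() // signature -> expiry time in ms

  return function isFresh(signature, timestampMs) {
    const t = now()
    // Drop expired entries so the map does not grow without bound
    for (const [sig, expiry] of seen) {
      if (expiry <= t) seen.delete(sig)
    }
    // Reject requests whose timestamp falls outside the tolerance window
    if (Math.abs(t - timestampMs) > windowMs) return false
    // Reject signatures already seen inside the window
    if (seen.has(signature)) return false
    seen.set(signature, t + windowMs)
    return true
  }
}
```

Call isFresh only after the HMAC check has passed, so unauthenticated requests cannot fill the map.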

Nice-to-have

Queue ahead of this service

If you process more than ~100 webhooks/sec or care about durability:

graph LR
  W[Webhook source] --> LB[Load balancer]
  LB --> Q[Redis queue]
  Q --> S1[Worker 1]
  Q --> S2[Worker 2]
  S1 --> R[Resend]
  S2 --> R

Use BullMQ (Redis-backed). Workers are this Express service minus the HTTP layer; they pop from the queue and email.

Adds operational complexity. Only do this when the throughput math actually requires it.

Dead letter queue

When email send fails after retries, persist the payload somewhere recoverable:

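const fs = require('node:fs/promises')

// On final failure:
await fs.appendFile(
  '/var/lib/webhook-to-email/dead-letter.jsonl',
  // Error objects serialize to {}, so store the message explicitly
  JSON.stringify({ ts: Date.now(), source, payload, error: error.message }) + '\n',
)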

Or push to S3, or to a Postgres failed_webhooks table. Replay later with a tiny script.
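
The "tiny script" could start from something like the reader below. It assumes the JSONL layout written by the snippet above (one JSON object per line); readDeadLetters is a hypothetical name, and what you do with each entry depends on how your handler is structured.

```javascript
const fs = require('node:fs')

// Hypothetical reader for the dead-letter file: one JSON object per line.
// Malformed lines are skipped and counted rather than aborting the whole replay.
function readDeadLetters(path) {
  const entries = []
  let skipped = 0
  for (const line of fs.readFileSync(path, 'utf8').split('\n')) {
    if (line.trim() === '') continue
    try {
      entries.push(JSON.parse(line))
    } catch {
      skipped++
    }
  }
  return { entries, skipped }
}
```

Loop over entries and pass each source and payload back through your normal send path; a non-zero skipped count is worth investigating before deleting the file.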

Webhook replay endpoint

Useful for debugging:

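app.post('/admin/replay/:id', authMiddleware, async (req, res) => {
  const event = await loadFromDeadLetter(req.params.id)
  if (!event) return res.status(404).json({ ok: false })
  // Re-run through normal processing; handleWebhook is a placeholder for your own handler
  await handleWebhook(event.source, event.payload)
  res.json({ ok: true })
})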

Auth-gate this. It's a footgun.

Per-source health metrics

Track per-source success rate and latency. When Stripe's success rate drops below 99 percent for an hour, get alerted.
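
One way to collect those numbers in-process is sketched below. createSourceMetrics, record, and snapshot are hypothetical names, and the counters are cumulative since startup; an alert over "the last hour" would need windowing or a real metrics backend on top of this.

```javascript
// Hypothetical in-process tracker: counts outcomes and latency per source.
// Counters reset on restart; export to a metrics backend for long-term history.
function createSourceMetrics() {
  const stats = new Map() // source -> { ok, failed, totalMs }

  function record(source, { ok, durationMs }) {
    const s = stats.get(source) ?? { ok: 0, failed: 0, totalMs: 0 }
    if (ok) s.ok++
    else s.failed++
    s.totalMs += durationMs
    stats.set(source, s)
  }

  function snapshot(source) {
    const s = stats.get(source)
    if (!s) return null
    const total = s.ok + s.failed
    return { total, successRate: s.ok / total, avgLatencyMs: s.totalMs / total }
  }

  return { record, snapshot }
}
```

Call record at the end of each webhook handler, then compare snapshot('stripe').successRate against 0.99 on a timer to decide when to alert.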

Skip these

  • Database for every webhook — adds I/O without benefit. Only persist on failure (dead letter).
  • Complex retry policies — one retry covers 95 percent of transient issues. More retries belong in a queue.
  • Templated HTML emails with React Email — the inline HTML in src/index.js is fine for ops emails. Save the design budget for customer-facing emails.

Smoke test post-deploy

# Verify the service is up
curl https://your-domain.com/health
# {"ok":true}

# Verify a known source works (with valid signature)
SECRET="..."
BODY='{"test":true}'
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" -hex | awk '{print $2}')
curl -X POST https://your-domain.com/hooks/smoketest \
  -H "Content-Type: application/json" \
  -H "X-Signature: sha256=$SIG" \
  -d "$BODY"

# Check email arrived
