Retry and Dead Letter
Delivery is decoupled from the request. POST /hooks/:source enqueues the rendered message and returns 202 immediately. A background worker drains the queue, retrying failed deliveries with exponential backoff, and writes anything that fails every attempt to a durable dead-letter inbox.
The queue (src/queue.js) attempts each job up to RETRY_MAX_ATTEMPTS times. Between attempts it waits an exponentially growing delay with full jitter:

    delay(attempt) = random(0, min(baseDelayMs * 2^attempt, maxDelayMs))

| Variable | Default | Meaning |
|---|---|---|
| RETRY_MAX_ATTEMPTS | 5 | Attempts before dead-lettering |
| RETRY_BASE_DELAY_MS | 500 | Base delay, doubled each attempt |
| RETRY_MAX_DELAY_MS | 30000 | Cap on a single backoff delay |
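
The delay formula and the defaults in the table can be sketched as a small function. This is an illustration, not the code in src/queue.js: the name `backoffDelay`, the injectable `random` argument, and the choice of a zero-based `attempt` are all assumptions made for the example.

```javascript
// Full-jitter backoff: a random delay in [0, min(base * 2^attempt, cap)).
// `random` is injectable so the function can be exercised deterministically.
// Assumption: `attempt` is zero-based (0 for the wait after the first failure).
function backoffDelay(attempt, opts = {}, random = Math.random) {
  const baseDelayMs = opts.baseDelayMs ?? 500;  // RETRY_BASE_DELAY_MS default
  const maxDelayMs = opts.maxDelayMs ?? 30000;  // RETRY_MAX_DELAY_MS default
  const ceiling = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
  return Math.floor(random() * ceiling);
}
```

Under that zero-based assumption, the upper bound of the window is 500 ms for attempt 0 and 4000 ms for attempt 3, and it only reaches the 30000 ms cap at attempt 6 (500 * 2^6 = 32000).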
Full jitter spreads retries randomly within the window, so a provider recovering from an outage is not hit by a synchronised burst. Only an email send failure triggers a retry; Slack and Telegram fan-out failures are logged and ignored.
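
A retry loop with this shape might look like the sketch below. It is a minimal illustration under stated assumptions: `deliverWithRetry`, `send`, `delayFor`, `sleep`, and `onDeadLetter` are hypothetical names introduced here, and the real worker may be structured differently.

```javascript
// Try `send(job)` up to `maxAttempts` times. If every attempt throws, hand the
// job to `onDeadLetter` together with the attempt count and the last error.
// All collaborators are injected; none of these names come from the service.
async function deliverWithRetry(job, send, opts) {
  const { maxAttempts, delayFor, sleep, onDeadLetter } = opts;
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await send(job);
      return { delivered: true, attempts: attempt + 1 };
    } catch (err) {
      lastError = err;
      // Waiting after the final attempt would only delay dead-lettering.
      if (attempt < maxAttempts - 1) await sleep(delayFor(attempt));
    }
  }
  const error = String(lastError?.message ?? lastError);
  await onDeadLetter({ ...job, attempts: maxAttempts, error });
  return { delivered: false, attempts: maxAttempts };
}
```

Injecting `sleep` and `delayFor` keeps the loop independent of real timers, which makes the retry and dead-letter paths easy to test.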
The GET / endpoint reports the current queue depth, and GET /dead-letter reports the count of recorded failures.
When a job exhausts every attempt, it is recorded in the dead-letter inbox (src/deadletter.js):
- Appended to a JSON Lines file at DEAD_LETTER_FILE (default ./data/dead-letter.jsonl) so it survives a restart.
- Held in a bounded in-memory ring (the most recent 100 by default) so the listing endpoint is fast.
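
The two storage paths above can be sketched as one small class. This is an illustration only, not the contents of src/deadletter.js: the class name `DeadLetterInbox` and its `record`, `list`, and `remove` methods are hypothetical.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Dead-letter inbox sketch: a durable JSONL append plus a bounded in-memory ring.
class DeadLetterInbox {
  constructor(file, maxInMemory = 100) {
    this.file = file;
    this.maxInMemory = maxInMemory;
    this.ring = [];
    fs.mkdirSync(path.dirname(file), { recursive: true });
  }

  record(entry) {
    // One JSON object per line, so the file is append-only and easy to stream.
    fs.appendFileSync(this.file, JSON.stringify(entry) + '\n');
    this.ring.push(entry);
    if (this.ring.length > this.maxInMemory) this.ring.shift(); // drop the oldest
  }

  // Newest entries first, at most `limit` of them.
  list(limit = 50) {
    return this.ring.slice().reverse().slice(0, limit);
  }

  // Remove from memory only; the line already written to disk is left alone.
  remove(id) {
    const i = this.ring.findIndex((e) => e.id === id);
    if (i === -1) return false;
    this.ring.splice(i, 1);
    return true;
  }
}
```

Writing to disk before touching the ring means an entry that appears in the listing has already been persisted.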
Each entry looks like:

    {
      "id": "ltm3k9-a1b2c3",
      "ts": "2026-05-31T10:00:00.000Z",
      "source": "stripe",
      "subject": "Invoice paid: 99.00 GBP",
      "attempts": 5,
      "error": "resend rejected: invalid api key",
      "payload": { "type": "invoice.paid", "data": { "object": { "amount_paid": 9900 } } }
    }

GET /dead-letter returns failures most recent first:

    curl http://localhost:3000/dead-letter   # most recent 50
    curl 'http://localhost:3000/dead-letter?limit=200'

Because entries contain the original payload, keep this endpoint behind your platform auth or a private network if your payloads are sensitive.
Set WEBHOOK_REPLAY_TOKEN to enable an authenticated replay endpoint. It re-renders a stored failure from its saved payload and re-enqueues it for delivery. This skips the verifier and the source entirely, so it is the right tool once you have fixed a template or a flaky provider has recovered.

    # find the id of the failure you want to replay
    curl http://localhost:3000/dead-letter | jq -r '.items[0].id'
    # replay it
    curl -X POST http://localhost:3000/dead-letter/<id>/replay \
      -H "Authorization: Bearer $WEBHOOK_REPLAY_TOKEN"

A success returns 202 {"ok":true,"replayed":true} and removes the entry from the in-memory inbox. The original line stays in the JSONL audit log on disk, so the record of the failure is never lost. Behaviour by case:

| Condition | Response |
|---|---|
| WEBHOOK_REPLAY_TOKEN unset | 404 (endpoint disabled) |
| Missing or wrong bearer token | 401 |
| Unknown id (aged out of the ring, or never existed) | 404 |
| Template now returns { skip: true } | 200 {"replayed":false,"skipped":true}, entry removed |
| Valid token and id | 202, entry re-enqueued and removed |

The token is compared in constant time. Because the in-memory ring holds the most recent 100 failures by default, the replay endpoint targets those; older entries live only in the JSONL file.
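
The cases above can be expressed as a single decision function. This is a sketch under assumptions, not the service's handler: `replay`, `safeEqual`, `render`, and `enqueue` are hypothetical names, and the inbox is modelled as a plain `Map` keyed by id. Hashing both values before calling `crypto.timingSafeEqual` is one common way to satisfy its equal-length requirement; the real implementation may compare differently.

```javascript
const crypto = require('node:crypto');

// Constant-time string comparison. Hashing first yields equal-length buffers,
// which crypto.timingSafeEqual requires (it throws on a length mismatch).
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(String(a)).digest();
  const hb = crypto.createHash('sha256').update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Decide the replay response. `inbox` is a Map of id -> entry; `render` and
// `enqueue` are hypothetical stand-ins for the template and the queue.
function replay({ token, authHeader, id, inbox, render, enqueue }) {
  if (!token) return { status: 404 };                  // endpoint disabled
  const match = /^Bearer (.+)$/.exec(authHeader || '');
  if (!match || !safeEqual(match[1], token)) return { status: 401 };
  const entry = inbox.get(id);
  if (!entry) return { status: 404 };                  // aged out or never existed
  const message = render(entry.source, entry.payload);
  inbox.delete(id);                                    // removed in both remaining cases
  if (message && message.skip) {
    return { status: 200, body: { replayed: false, skipped: true } };
  }
  enqueue(message);
  return { status: 202, body: { ok: true, replayed: true } };
}
```

Checking the token before looking up the id means an unauthenticated caller cannot use the 404 response to probe which ids exist.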
The stored payload is the original webhook body, so a full file replay is a short script that re-POSTs each line through the public hooks endpoint:

    # Re-POST every dead-lettered payload to the hook for its original source.
    while IFS= read -r line; do
      src=$(printf '%s\n' "$line" | jq -r .source)
      printf '%s\n' "$line" | jq -c .payload | \
        curl -sS -X POST "http://localhost:3000/hooks/$src" \
          -H "Content-Type: application/json" --data-binary @-
      echo
    done < data/dead-letter.jsonl

If WEBHOOK_SECRET is set you will need to sign each replayed request, since the verifier runs on the hooks endpoint. The built-in /dead-letter/:id/replay endpoint avoids that because it re-enqueues directly.
On SIGTERM or SIGINT the service flushes any undelivered jobs still in the queue to the dead-letter inbox before exiting, so a planned restart or redeploy never silently drops queued work. A hard crash can still lose a job that is mid-retry; if you need durability across crashes, put a real broker in front.
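
A shutdown flush of this kind could be wired up roughly as follows. This is a sketch only: `flushPending` and `installShutdownHooks` are hypothetical names, the pending queue is modelled as a plain array, and the `error` text written to each entry is an assumption.

```javascript
// Move every still-queued job into the dead-letter inbox; returns the count.
// `record` is whatever persists a dead-letter entry (hypothetical here).
function flushPending(pending, record, reason = 'shutdown before delivery') {
  let flushed = 0;
  while (pending.length > 0) {
    const job = pending.shift();
    record({ ...job, error: reason, ts: new Date().toISOString() });
    flushed++;
  }
  return flushed;
}

// Register the flush for both signals. Call once at startup.
function installShutdownHooks(pending, record, exit = process.exit) {
  for (const signal of ['SIGTERM', 'SIGINT']) {
    process.once(signal, () => {
      flushPending(pending, record);
      exit(0);
    });
  }
}
```

The flush is synchronous so that it completes before the process exits. It cannot run on SIGKILL or a crash, which is the gap the paragraph above describes.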
The Docker image creates /app/data, and the docker-compose file mounts a named volume there. Mount a volume at /app/data (or point DEAD_LETTER_FILE elsewhere) on any platform so the inbox survives redeploys.