Description
Queue.sendBatch() intermittently throws an undocumented error:
Error: Queue sendBatch failed: Queue is overloaded. Please back off.
This error is not documented anywhere — not in the Queues limits, error handling docs, or changelog. It is distinct from the documented Too Many Requests rate limit error.
Reproduction
- Context: `sendBatch()` called inside a Cloudflare Workflow `step.do()` callback (a minimal sketch of the calling pattern follows this list)
- Batch size: 100 messages (~14 msg/s throughput, well under the 5,000 msg/s documented limit)
- Failure duration: 57 seconds (request hangs, then fails)
- Retry behavior: Succeeds in <1s on immediate retry — clearly transient
- Frequency: Observed on multiple occasions across different workflow instances
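For concreteness, here is a minimal sketch of the calling pattern, assuming a standard Workflows setup. The class name, `SEND_QUEUE` binding, and `generateMessages` helper are illustrative placeholders, not our production code:

```ts
import { WorkflowEntrypoint, WorkflowEvent, WorkflowStep } from "cloudflare:workers";

// Hypothetical binding name, for illustration only.
type Env = {
  SEND_QUEUE: Queue;
};

export class ExampleWorkflow extends WorkflowEntrypoint<Env> {
  async run(event: WorkflowEvent<unknown>, step: WorkflowStep) {
    await step.do("enqueue batch", async () => {
      // 100 messages per call (~14 msg/s overall), well under the documented limits.
      const messages = generateMessages(100).map((body) => ({ body }));
      // Intermittently hangs for ~57s and then throws:
      // "Queue sendBatch failed: Queue is overloaded. Please back off."
      await this.env.SEND_QUEUE.sendBatch(messages);
    });
  }
}

// Placeholder payload generator.
function generateMessages(n: number): string[] {
  return Array.from({ length: n }, (_, i) => `message-${i}`);
}
```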
Evidence
From `wrangler workflows instances describe`:

```
┌───────────────────────┬───────────────────────┬────────────┬───────────┬──────────────────────────────────────────────────────────────────────┐
│ Start                 │ End                   │ Duration   │ State     │ Error                                                                │
├───────────────────────┼───────────────────────┼────────────┼───────────┼──────────────────────────────────────────────────────────────────────┤
│ 2/9/2026, 10:20:57 AM │ 2/9/2026, 10:21:54 AM │ 57 seconds │ ❌ Error   │ Error: Queue sendBatch failed: Queue is overloaded. Please back off. │
├───────────────────────┼───────────────────────┼────────────┼───────────┼──────────────────────────────────────────────────────────────────────┤
│ 2/9/2026, 10:22:04 AM │ 2/9/2026, 10:22:05 AM │ 1 second   │ ✅ Success │                                                                      │
└───────────────────────┴───────────────────────┴────────────┴───────────┴──────────────────────────────────────────────────────────────────────┘
```
Root Cause Analysis
From the workerd source (src/workerd/api/queue.c++):
```cpp
JSG_REQUIRE(response.statusCode == 200, Error,
    kj::str("Queue sendBatch failed: ", response.statusText));
```

The error is the literal HTTP statusText from the internal queue backend. Based on the Queues v2 architecture blog post, the backend uses Storage Shard Durable Objects. The 57-second hang plus the overload message is consistent with the DO's internal request queue exceeding capacity, likely because the randomly assigned shard was hot (other tenants' traffic or autoscaling lag).
Impact
When this error occurs inside a Workflow step.do(), the Workflows engine retries the entire callback. If the callback contained sendBatch() calls that already succeeded before the failure, all messages are re-sent with different message IDs — creating invisible duplicates that cannot be deduplicated at the message level.
In our case, this caused 840 duplicate queue messages per incident.
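One mitigation we are evaluating is to give every `sendBatch()` call its own step, since Workflows memoizes completed steps and should not replay a batch that already succeeded. A rough sketch under that assumption (`sendInSeparateSteps`, `batches`, and the queue binding are illustrative names, not our real code):

```ts
import type { WorkflowStep } from "cloudflare:workers";

// Illustrative helper, called from the workflow's run() method with the
// step object and queue binding that are already in scope there.
async function sendInSeparateSteps(
  step: WorkflowStep,
  queue: Queue,
  batches: { body: unknown }[][]
) {
  for (const [i, batch] of batches.entries()) {
    // One step per batch: if batch 5 hits the overload error, only
    // "send batch 5" is retried; batches 0-4 are not re-sent.
    await step.do(`send batch ${i}`, async () => {
      await queue.sendBatch(batch);
    });
  }
}
```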
Questions
- What causes this error? Is it DO shard overload as described above?
- Why does it trigger at ~14 msg/s when the documented limit is 5,000 msg/s?
- Can this be documented alongside the existing `Too Many Requests` error?
- Is there a recommended retry strategy beyond what the error message suggests? (A sketch of our current stopgap follows this list.)
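For reference, the stopgap we are using while this is open, assuming the step-level retry configuration is the intended knob here: back off exponentially on failure instead of retrying the whole callback immediately. A sketch with illustrative names, taken from inside the workflow's run() method:

```ts
await step.do(
  "enqueue batch",
  {
    // Retry up to 5 times, backing off exponentially from a 5s base delay.
    retries: { limit: 5, delay: "5 seconds", backoff: "exponential" },
    // Give up on a single attempt well before the ~57s hang we observed.
    timeout: "30 seconds",
  },
  async () => {
    await this.env.SEND_QUEUE.sendBatch(messages);
  }
);
```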
Related
- `MY_QUEUE.send()` regularly fails with `Internal Server Error` at runtime (#1483): `Queue.send()` failing with "Internal Server Error" (similar backend-level failures, open since 2023)