Durable Object ordering is not preserved when mixing RPC and fetch calls on the same stub #6561

@threepointone

Summary

When a caller fires stub.rpc() and stub.fetch() calls interleaved on the same Durable Object stub without awaiting each one, the DO does not process them in send order. Instead, all fetch calls are processed before all RPC calls, regardless of the order they were initiated. This violates the expected E-order / actor-model guarantee.

Same-type ordering works correctly — pure RPC or pure fetch calls maintain send order. The bug is exclusively about cross-type ordering.

// Caller sends: rpc-0, fetch-1, rpc-2, fetch-3, rpc-4, fetch-5, ...
//
// DO receives:  fetch-1, fetch-3, fetch-5, ..., rpc-0, rpc-2, rpc-4, ...
//               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^
//               all fetches first               then all RPCs

Reproduction

Repo: https://github.com/threepointone/test-do-rpc-fetch-ordering

Live production endpoint (confirms this is not a local-only issue):

curl 'https://test-ordering.threepointone.workers.dev/test-all?n=20'

The mixed result shows "inOrder": false with all fetches before all RPCs.
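For reference, an in-order check over the labels the DO recorded could look like the sketch below. This is an illustration only, not the code in the reproduction repo: the `kind-index` label format is borrowed from the diagram above, and `sendIndex` and `isInSendOrder` are hypothetical helper names.

```typescript
// Hypothetical helper: extract the send index from a label such as "rpc-4".
function sendIndex(label: string): number {
  const match = /-(\d+)$/.exec(label);
  if (match === null) {
    throw new Error(`unrecognized label: ${label}`);
  }
  return Number(match[1]);
}

// Returns true only if the labels were received in strictly increasing
// send order, which is what the actor-model guarantee would imply.
function isInSendOrder(received: string[]): boolean {
  for (let i = 1; i < received.length; i++) {
    if (sendIndex(received[i]) <= sendIndex(received[i - 1])) {
      return false;
    }
  }
  return true;
}
```

Applied to the receive order shown in the summary (all fetches, then all RPCs), this check returns false because `rpc-0` arrives after `fetch-5`.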

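| Test                   | Production | wrangler dev | vite dev |
|------------------------|------------|--------------|----------|
| RPC only (N=20)        | PASS       | PASS         | PASS     |
| fetch only (N=20)      | PASS       | PASS         | PASS     |
| mixed RPC+fetch (N=20) | FAIL       | FAIL         | FAIL     |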

Root Cause

stub.fetch() and stub.rpc() both create a WorkerInterface via ActorChannel::startRequest(), but they call different methods on it: request() for fetch, customEvent() for RPC. The two paths reach the DO's InputGate (the FIFO queue that serializes incoming events) after very different numbers of async hops:

fetch path (~1 hop to InputGate):

WorkerEntrypoint::request()
  → context.run()
    → InputGate::wait()     ← queued here
      → GlobalScope::request()

RPC path (~4+ hops to InputGate):

WorkerEntrypoint::customEvent()
  → JsRpcSessionCustomEvent::run()
    → incomingRequest->delivered()
    → creates EntrypointJsRpcTarget + Cap'n Proto server
    → fulfills capFulfiller
    → co_await donePromise
       ↓ (Cap'n Proto dispatches the pipelined call)
       → JsRpcTargetBase::call()
         → co_await kj::yield()       ← explicit extra yield
         → ctx.run()
           → InputGate::wait()        ← queued here

Since all operations originate from the same synchronous JS execution, all fetch calls (fewer hops) enqueue at the InputGate before any RPC calls (more hops) get there.
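The effect can be reproduced in a toy model that has nothing to do with the real workerd event loop. The sketch below assumes each hop costs exactly one turn, that every call starts travelling in the same turn, and uses the approximate hop counts from above (1 for fetch, 4 for RPC). `simulateGateOrder` and `interleaved` are hypothetical names.

```typescript
interface InFlightCall {
  label: string;
  hopsLeft: number; // async hops remaining before the call reaches the gate
}

// Each pass of the outer loop models one event-loop turn. Every in-flight
// call advances by one hop per turn; a call joins the FIFO gate queue in the
// turn its last hop completes. Calls are visited in send order within a turn.
function simulateGateOrder(calls: InFlightCall[]): string[] {
  const gate: string[] = [];
  let pending: InFlightCall[] = calls.map((call) => ({ ...call }));
  while (pending.length > 0) {
    const stillTravelling: InFlightCall[] = [];
    for (const call of pending) {
      const hopsLeft = call.hopsLeft - 1;
      if (hopsLeft <= 0) {
        gate.push(call.label);
      } else {
        stillTravelling.push({ label: call.label, hopsLeft });
      }
    }
    pending = stillTravelling;
  }
  return gate;
}

// Build the interleaved send pattern: rpc-0, fetch-1, rpc-2, fetch-3, ...
function interleaved(n: number, fetchHops: number, rpcHops: number): InFlightCall[] {
  const calls: InFlightCall[] = [];
  for (let i = 0; i < n; i++) {
    const isRpc = i % 2 === 0;
    calls.push({
      label: `${isRpc ? "rpc" : "fetch"}-${i}`,
      hopsLeft: isRpc ? rpcHops : fetchHops,
    });
  }
  return calls;
}
```

In this model, `simulateGateOrder(interleaved(6, 1, 4))` yields every fetch label before any RPC label, while giving both call types the same hop count restores send order. Same-type order is preserved in both cases, which matches the test results above.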

The kj::yield() in JsRpcTargetBase::call() (added in 9cf133b4f for ExternalPusher ordering) contributes one extra turn, but is not the sole cause — the Cap'n Proto session setup and capability fulfillment already add several turns before it.

Relevant Code

Possible Fix

The core issue is that InputGate::wait() is called at different points in the call chain for fetch vs RPC. One approach: eagerly acquire the InputGate position in JsRpcSessionCustomEvent::run() (right after delivered(), before the Cap'n Proto session setup), then thread the lock through to the first ctx.run() call. IoContext::run() already accepts a kj::Maybe<InputGate::Lock> parameter for exactly this pattern (used by ensureConstructedImpl). This would not require changing the CustomEvent::run() signature or affecting other custom event types.

The kj::yield() in JsRpcTargetBase::call() and ExternalPusher ordering would be unaffected — the InputGate position is reserved before the yield runs.
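As a rough illustration of why eager reservation helps, here is a toy model in the same spirit. It is not workerd code. It assumes that both call types reach their reservation point in send order (an assumption about delivery, not something verified here), and it models the gate as a list of slots that are claimed up front and filled later. `gateOrderWithEagerReservation` and `hopsAfterReservation` are hypothetical names.

```typescript
interface ReservedCall {
  label: string;
  hopsAfterReservation: number; // hops still needed after the slot is claimed
}

// Each call claims the next free gate slot at delivery time (assumed to be in
// send order), then keeps travelling. The gate drains slots in slot order, so
// a slow call holds its place no matter how many hops it still needs.
function gateOrderWithEagerReservation(calls: ReservedCall[]): string[] {
  const slots: (string | null)[] = calls.map(() => null);
  let pending = calls.map((call, slot) => ({ ...call, slot }));
  while (pending.length > 0) {
    const stillTravelling: typeof pending = [];
    for (const call of pending) {
      const hopsLeft = call.hopsAfterReservation - 1;
      if (hopsLeft <= 0) {
        slots[call.slot] = call.label; // the call arrives and fills its slot
      } else {
        stillTravelling.push({ ...call, hopsAfterReservation: hopsLeft });
      }
    }
    pending = stillTravelling;
  }
  // Every slot is filled by now; read them out in reservation order.
  return slots.filter((label): label is string => label !== null);
}
```

With the reservation made before the slow part of the RPC path, the interleaved pattern rpc-0, fetch-1, rpc-2, fetch-3 comes out of the model in send order even when the RPC calls need several more hops than the fetch calls.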

I'm happy to send a PR for this if the team agrees with the approach.


The root cause analysis and fix direction were developed in collaboration with Cursor.
