Skip to content

Discussion: should subAgent/parentAgent return types narrow through Rpc.Stub<T> instead of InstanceType<T>? #1399

@threepointone

Description

@threepointone

Context, not a feature request

Filing this as a question / discussion rather than something to act on immediately. Surfacing it because we just hit the failure mode this typing-question would have caught at compile time, and it seems worth the team having on its radar.

The signature today

subAgent (and parentAgent, getSubAgentByName) are typed as:

async subAgent<T extends typeof Agent>(
  Cls: T,
  name: string
): Promise<InstanceType<T>>;

i.e. you call methods on the helper as if it were a local instance of the class. The DX is great — methods read like regular method calls, properties access works, types come through cleanly.

The trade-off: InstanceType<T> skips the Rpc.Stubable filter that Rpc.Stub<T> would normally apply. Method return types and parameter types aren't checked against the constraints workerd enforces at runtime for native DO RPC.

What we hit

In a new examples/agents-as-tools (helpers-as-sub-agents pattern, not yet merged), I wrote:

export class Researcher extends Agent<Env> {
  startAndStream(query: string, helperId: string): ReadableStream<{ sequence: number; body: string }> {
    /* ... new ReadableStream({ start(controller) { controller.enqueue({ sequence, body }); ... } }) */
  }
}

class Assistant extends Think<Env> {
  // ...
  const helper = await this.subAgent(Researcher, helperId);
  const stream = await helper.startAndStream(query, helperId);  // ← inferred as ReadableStream<{...}>
  const reader = stream.getReader();
  await reader.read();  // ← throws "Network connection lost" at runtime
}

This compiles cleanly. At runtime, reader.read() immediately throws "Network connection lost", the helper's start(controller) callback never runs, and you spend hours chasing eviction / lifecycle / I/O timeout red herrings before realizing workerd's DO RPC ReadableStream only transports Uint8Array chunks.

That constraint is already encoded in @cloudflare/workers-types's Rpc.Stubable union (experimental/index.d.ts ~line 14122) — ReadableStream<Uint8Array> is in the union, arbitrary ReadableStream<T> is not. Raw DO RPC would have caught it:

const stub = env.RESEARCHER.idFromName("x").get(); // DurableObjectStub<Researcher> ≈ Stub<Researcher>
const stream = await stub.streamObjects();         // → narrowed to Promise<never>, TS compile error

But because subAgent returns InstanceType<T> and not Stub<T>, our setup happily inferred ReadableStream<{...}> and TS never complained.

The question

Should the agents framework's subAgent / parentAgent / getSubAgentByName return types narrow through Rpc.Stub<...> so that the Stubable constraint propagates to method calls on the helper?

async subAgent<T extends typeof Agent>(
  Cls: T,
  name: string
): Promise<Rpc.Stub<InstanceType<T>>>;

(Or some equivalent — Agent doesn't currently extend RpcTarget, so getting Stub<...> to apply cleanly probably requires Agent to carry the RpcTargetBranded brand or similar. That's a bigger ripple but a one-time change.)

Pro

  • TS would have caught our case at compile time. helper.startAndStream(...) would have inferred Promise<never> and the cast to ReadableStream<{...}> would have been forced — making the constraint visible exactly where it matters.
  • Aligns with the typing convention raw DO RPC already uses (DurableObjectStub<T> is Stub<T>-shaped). Same constraint, same enforcement, same surface.
  • Future runtime constraints baked into Stubable get caught for free.

Con

  • Loses some "feels like a local object" ergonomics. Properties / methods / return types not in Stubable would narrow to never, which is the desired behavior for RPC-incorrect cases but can be friction for code that's "morally local" (e.g., a helper class that consumers happen to import for shared types but never call across DO boundaries).
  • Existing code in the wild that depends on the looser typing might break. Probably small population since agents is still pre-1.0, but worth surveying.
  • The Stubable constraint is broad and not always intuitive. Folks would occasionally hit "why is my Map<K, V> return type narrowed?" surprises until they internalize the rule.

Open questions

  • Is Agent already meant to be RpcTarget-shaped? It runs on Durable Objects, calls cross DO boundaries are RPC, so morally yes. Whether the type system formalizes that is a separate question.
  • Are there cases where you'd want a sub-agent stub typed as InstanceType<T> deliberately? E.g. accessing protected state introspection methods from tests / coordination code that isn't crossing the RPC boundary. If so, an opt-out would be useful.
  • Migration cost. All examples that currently rely on the loose typing would need an audit. Given how new most of these examples are, probably manageable; worth a sweep.

Why this is "discuss, not do now"

The runtime error message that made us hit this is the bigger problem — that's tracked in cloudflare/workerd#6675, which proposes either a clearer error string or expanding Stubable to cover ReadableStream<StructuredCloneable>. If workerd ships option 2, the constraint relaxes and our case wouldn't have failed at all (in which case the typing tightening here would be slightly less urgent).

Also, the agents framework is mid-evolution on a bunch of related typing concerns (sub-agent routing types, multi-session, helper protocol — see wip/inline-sub-agent-events.md for context on where this surfaced). It's worth folding the typing question into those broader discussions rather than answering it in isolation.

So: noting it for later, with a concrete failure case for context. Not asking for action.

Concrete reproduction trail

  • The helpers-as-sub-agents prototype where this surfaced: examples/agents-as-tools (in-flight branch, not yet merged).
  • The actual fix once we knew the runtime constraint: switch Researcher.startAndStream to return ReadableStream<Uint8Array> (NDJSON-encoded) instead of an object stream. ~30 LOC; straightforward once visible.
  • Standalone workerd reproduction (no agents at all, just raw DO RPC + a cast to bypass the type system): see cloudflare/workerd#6675 — confirms the runtime behavior independent of agents.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions