Skip to content

getWritable() returns a new TransformStream per call — racing pipes reorder chunks when callers acquire per-write #2058

@rpinckne

Description

@rpinckne

Summary

Calling getWritable() more than once within a single step body produces non-deterministic chunk ordering on Vercel prod. The API surface gives no hint that re-acquisition is unsafe.

Where the race lives

@workflow/core/dist/step/writable-stream.js:16-39 (workflow@5.0.0-beta.6):

export function getWritable(options = {}) {
    // ...
    const serialize = getSerializeStream(getExternalReducers(...), ctx.encryptionKey);
    const serverWritable = new WorkflowServerWritableStream(runId, name);
    const state = createFlushableState();
    ctx.ops.push(state.promise);
    flushablePipe(serialize.readable, serverWritable, state).catch(() => {});
    pollWritableLock(serialize.writable, state);
    return serialize.writable;
}

Every invocation constructs a fresh TransformStream + WorkflowServerWritableStream + an independent background flushablePipe(...). The (runId, name) is shared, but the pipes are independent client-side and race to flush into the same downstream queue.

Repro

Sequential per-chunk acquisition in a step body, no parallelism:

'use step';
export async function emitStream(parts: UIMessageChunk[]) {
  for (const part of parts) {
    const writer = getWritable<UIMessageChunk>().getWriter();
    try {
      await writer.write(part);
    } finally {
      writer.releaseLock();
    }
  }
}

Emit ~6 small chunks (e.g. text-deltas "nov", "o", " e", "2", "e", " ok"). Read the resulting stream via workflowRun.getReadable<UIMessageChunk>(...).

  • Local dev (world-local): deltas arrive in order. Stream reads "novo e2e ok".
  • Vercel prod (world-vercel): deltas reorder deterministically. Stream reads e.g. "novo2e e ok", with text-delta arriving before text-start.

We verified the model output is correct upstream — direct streamText() and direct engine consumption both produce the right deltas in the right order. The reorder happens entirely between the per-chunk getWritable() calls and the downstream reader.

Mechanism

Documented in your own docs/changelog/eager-processing.mdx:330-331:

On local (world-local), stream writes go to the filesystem — effectively instant. On Vercel (world-vercel), writes go through HTTP to workflow-server → S3, adding 50-100ms latency.

That latency turns the per-pipe race window from microseconds (locally invisible) into tens of milliseconds (prod-observable). Small/fast deltas — exactly the shape of reasoning-model streaming output — surface the bug most reliably.

Canonical pattern (works)

docs/api-reference/workflow-ai/durable-agent.mdx:46 shows the right shape:

const writable = getWritable<UIMessageChunk>();
const result = await agent.stream({ messages, writable });

One call. pipeTo(writable) (or the equivalent inside DurableAgent) holds the writer lock for the pipe's lifetime, so there's one TransformStream + one flushablePipe and the race goes away. We fixed our consumer by switching to this pattern + the AI SDK's toUIMessageStream().pipeTo(writable, { preventClose: true, preventAbort: true }).

Suggestion

The API surface gives no hint that re-acquisition is unsafe. Two options that would have caught this at dev time without needing prod telemetry:

  1. Dev-mode console.warn when getWritable() is called more than once per step context. contextStorage already tracks per-step state — a simple counter on the ctx would flag the misuse the second time it happens.

  2. Memoize per (runId, namespace): repeat calls within the same step ctx return the same handle (same underlying TransformStream, same flushablePipe). Idempotent, no behavior change for the canonical pattern, makes the unsafe pattern simply correct instead of subtly broken.

Either would have caught the bug in our first local test run instead of letting it ship to prod.

Context

We're building Novo Agents (hosted multi-agent service) on Workflow. The full fix on our side is in rpinckne/harness#50. Happy to help with a repro repo or PR if useful.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions