[codex] Handle stream write failures without unhandled rejections#2142
Conversation
🦋 Changeset detectedLatest commit: 107df58 The changes in this PR will be included in the next version bump. This PR includes changesets to release 17 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
🧪 E2E Test Results✅ All tests passed Summary
Details by Category✅ ▲ Vercel Production
✅ 💻 Local Development
✅ 📦 Local Production
✅ 🐘 Local Postgres
✅ 🪟 Windows
✅ 📋 Other
|
📊 Benchmark Results
workflow with no steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 1 step💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) workflow with 10 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Next.js (Turbopack) | Express workflow with 25 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 50 sequential steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Next.js (Turbopack) | Express Promise.all with 10 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) Promise.all with 25 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Next.js (Turbopack) | Express Promise.all with 50 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Next.js (Turbopack) | Express Promise.race with 10 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Next.js (Turbopack) | Express Promise.race with 25 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Next.js (Turbopack) | Express Promise.race with 50 concurrent steps💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 10 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 25 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 50 sequential data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 10 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) workflow with 25 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) workflow with 50 concurrent data payload steps (10KB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) Stream Benchmarks (includes TTFB metrics)workflow with stream💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) stream pipeline with 5 transform steps (1MB)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Express | Nitro | Next.js (Turbopack) 10 parallel streams (1MB each)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) fan-out fan-in 10 streams (1MB each)💻 Local Development
▲ Production (Vercel)
🔍 Observability: Nitro | Express | Next.js (Turbopack) SummaryFastest Framework by WorldWinner determined by most benchmark wins
Fastest World by FrameworkWinner determined by most benchmark wins
Column Definitions
Worlds:
|
There was a problem hiding this comment.
Pull request overview
Prevents stream write failures from surfacing as unhandledRejection before the step runtime collects them via ops, and enriches Vercel stream HTTP errors with the request endpoint and Vercel correlation headers.
Changes:
- Attach a no-op
.catch()toFlushableStreamState.promiseat creation so early rejections are observed before the runtime awaits them. - Centralize Vercel stream write/close error construction in
createStreamRequestError(), includingPUTURL plusx-vercel-id/x-vercel-errorheaders. - Add tests for unhandled-rejection behavior and Vercel error diagnostics, plus a patch changeset.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| packages/core/src/flushable-stream.ts | Adds no-op rejection observer on state.promise in createFlushableState. |
| packages/core/src/flushable-stream.test.ts | New test verifying no unhandledRejection is emitted on early state rejection. |
| packages/world-vercel/src/streamer.ts | Adds createStreamRequestError helper; uses it in write/writeMulti/close failures. |
| packages/world-vercel/src/streamer.test.ts | New test asserting endpoint and Vercel correlation headers appear in the failure message. |
| .changeset/stream-failure-diagnostics.md | Patch changeset for @workflow/core and @workflow/world-vercel. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
TooTallNate
left a comment
There was a problem hiding this comment.
Approve — correct fix for a real bug, with a strong regression test
Two changes:
createFlushableStateattaches a no-op.catch(() => {})tostate.promiseat creation time, so an early rejection (before the runtime collectsstate.promiseinto itsopsarray andawaits it) doesn't fireunhandledRejection.world-vercelstream errors now includePUT <url>+x-vercel-id+x-vercel-errorheaders in the error message — useful for tracing transient stream failures back to a specific Vercel function invocation.
Why this fix is correct
The .catch(() => {}) is a standard Node.js pattern for "this rejection will be observed later." It does not swallow the error:
.catch()returns a NEW promise that resolves (because the handler returnedundefined).- The original
state.promiseis still rejected and still surfaces the same rejection when laterawaited. - Both
.then()/.catch()chains on the same promise observe the same rejection — they don't compete.
I traced where state.promise is consumed: serialization.ts (×6 sites) and writable-stream.ts all ops.push(state.promise), and the runtime does await Promise.all(ops) after user code returns. So the rejection still propagates through the normal error path. ✓
Test design is excellent
The new test (does not emit an unhandled rejection before the runtime awaits a failed operation):
- Listens for
process.on('unhandledRejection'). - Creates state, rejects it, waits a tick (long enough for Node to fire
unhandledRejectionif it would). - Asserts no
unhandledRejectionfired ANDstate.promisestill rejects when awaited.
The second assertion is critical — it proves the fix didn't accidentally swallow the rejection.
I verified the test fails without the fix by temporarily removing the .catch() line:
expected [] to deeply equal [
Error { "message": "Stream write failed" }
]
So the test correctly catches the regression it's protecting against.
What I verified locally
pnpm install --frozen-lockfile✓pnpm turbo run build --filter @workflow/core --filter @workflow/world-vercel✓cd packages/core && pnpm exec vitest run src/flushable-stream.test.ts✓ (10/10)pnpm --filter @workflow/world-vercel test✓ (110/110)- Reverted the
.catch(() => {})line and confirmed the regression test fails as expected — proves the test isn't a vacuous pass.
CI
106 success, 5 failures, all unrelated:
E2E Vercel Prod Tests (vite)— single workflow run flake on Vercel infrastructureBenchmark Vercel/Local (nitro-v3)— main also fails these regularlyBenchmark Community World (Redis + BullMQ)— unrelated to flushable-streamE2E Required Check— meta-check failing because the above failed
Restraint on retry is the right call
The PR body's note about not adding automatic retries is exactly right. Stream writes aren't idempotent server-side, and a response error after server-accepted bytes can't be distinguished from a pre-accept network failure — blindly retrying could duplicate chunks. The diagnostic improvement (request correlation headers in the error) is the better incremental win: now when a stream write fails, the user has the Vercel request ID to look up server-side state.
Approving.
Signed-off-by: Pranay Prakash <pranay.gp@gmail.com>
|
Backport PR opened against |
Summary
unhandledRejectionPUTendpoint and safe Vercel response correlation headers (x-vercel-id,x-vercel-error) in write/close failure errors while preserving the response body@workflow/coreand@workflow/world-vercelRoot cause
A streaming step can fail a server write while user code is still running.
flushablePipe()rejectsstate.promise, but the step runtime only waits on its collected operation promises after the step body returns. During that gap Node can classify the rejection as unhandled, allowing a transient stream failure to terminate the invocation instead of being handled through the runtime error path.This PR attaches an immediate no-op rejection observer to the existing promise while leaving that original rejected promise in place for the runtime to await and propagate normally.
Retry behavior
This intentionally does not add automatic retries for failed stream writes. Writes are not currently idempotent, and a response failure after server acceptance cannot be distinguished from a failed write; blind retries can duplicate stream chunks.
Validation
pnpm exec biome check --write packages/core/src/flushable-stream.ts packages/core/src/flushable-stream.test.ts packages/world-vercel/src/streamer.ts packages/world-vercel/src/streamer.test.tspnpm --filter @workflow/core... buildWORKFLOW_TARGET_WORLD=local pnpm --filter @workflow/core exec vitest run src(1,034 tests passed)pnpm --filter @workflow/world-vercel exec vitest run src(110 tests passed)pnpm changeset status --since=origin-https/maingit diff --check