[world-vercel] Retry transient response-body parse failures in the HTTP client #2204
…e HTTP client A sporadic failure reading/decoding a 2xx response body (truncated or terminated stream, connection reset mid-body, or a gateway returning a non-CBOR/JSON body) was surfaced immediately as a PARSE_ERROR. The shared RetryAgent only retries connection/5xx failures — body consumption happens after it returns the response, so these escape its retry logic. Retry such failures inside `makeRequest` with bounded exponential backoff, scoped to idempotent methods (GET/HEAD) so writes are never replayed. This fixes the reported `events.list` parse failure at the adapter layer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🦋 Changeset detected. Latest commit: 1706a5e. The changes in this PR will be included in the next version bump. This PR includes changesets to release 17 packages.
🧪 E2E Test Results: ✅ All tests passed
Details by category:
✅ ▲ Vercel Production
✅ 💻 Local Development
✅ 📦 Local Production
✅ 🐘 Local Postgres
✅ 🪟 Windows
✅ 📋 Other
📊 Benchmark Results (result tables omitted; each benchmark was run on 💻 Local Development and ▲ Production (Vercel), with 🔍 Observability links for Express)
- workflow with no steps
- workflow with 1 step
- workflow with 10, 25, and 50 sequential steps
- Promise.all with 10, 25, and 50 concurrent steps
- Promise.race with 10, 25, and 50 concurrent steps
- workflow with 10, 25, and 50 sequential data payload steps (10KB)
- workflow with 10, 25, and 50 concurrent data payload steps (10KB)
- Stream Benchmarks (includes TTFB metrics): workflow with stream; stream pipeline with 5 transform steps (1MB); 10 parallel streams (1MB each); fan-out fan-in 10 streams (1MB each)
❌ Some benchmark jobs failed. Check the workflow run for details.
Pairs with the world-vercel in-adapter retry: when a response-body parse failure survives the adapter's retries (or comes from a non-idempotent write that is never retried in-process), it must not fail the run. Re-throw such transient world errors from the replay loop so they propagate to the queue handler, which replays the whole run — safe because replay is idempotent. Schema-validation contract errors stay fatal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…queue" This reverts commit 7bb62e9.
pranaygp
left a comment
Reviewed the current adapter-only diff at 1706a5e61: response-body retries are bounded to idempotent GET/HEAD requests, POST writes are covered as non-retryable, and the PR description now matches the remaining scope. Required checks are green. The non-required Benchmark Vercel (nextjs-turbopack) stream-correctness failure should still be rerun or explicitly accepted before merge.
…TP client (#2204)

* fix(world-vercel): retry transient response-body parse failures in the HTTP client

A sporadic failure reading/decoding a 2xx response body (truncated or terminated stream, connection reset mid-body, or a gateway returning a non-CBOR/JSON body) was surfaced immediately as a PARSE_ERROR. The shared RetryAgent only retries connection/5xx failures - body consumption happens after it returns the response, so these escape its retry logic. Retry such failures inside `makeRequest` with bounded exponential backoff, scoped to idempotent methods (GET/HEAD) so writes are never replayed. This fixes the reported `events.list` parse failure at the adapter layer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(core): propagate exhausted transient world errors to the queue

Pairs with the world-vercel in-adapter retry: when a response-body parse failure survives the adapter's retries (or comes from a non-idempotent write that is never retried in-process), it must not fail the run. Re-throw such transient world errors from the replay loop so they propagate to the queue handler, which replays the whole run - safe because replay is idempotent. Schema-validation contract errors stay fatal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Revert "fix(core): propagate exhausted transient world errors to the queue"

This reverts commit 7bb62e9.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Peter Wielander <mittgfu@gmail.com>
Backport PR opened against |
* origin/main:
  [world-vercel] Retry transient response-body parse failures in the HTTP client (#2204)
  Add virtualization to the trace viewer (#2205)
  Trace viewer: scroll-load events past an auto-load cap (#2200)
  fix(core): resolve forwarded stream keys across deployments (#2191)
  [e2e] Improve error labeling in event-log-race-repro CI job (#2190)
  [core] Harden event pagination response parsing (#2180) (#2179)
  Add loading skeleton to the new trace viewer (#2164)
  Add tooltip components + apply on up/down detail pane (#2163)
  fix(swc-plugin): allow wasm host imports during link (#2174)
  [test] Forward-port reused-sleep replay divergence test (#2172)
  [e2e] Add `event-log-race-repro` label for triggering CI stress-test (#2159)
  Version Packages (beta) (#2162)
  fix(world-local): skip Nov 2025 ghost versions on npm (#2168)
  fix(core,errors): classify SDK encryption failures as RUNTIME_ERROR (#2145)
  [web-shared][web] Fix events tab search (#2107)
  Version Packages (beta) (#2147)
  Allow setting workflow attributes from steps (#2157)
  Better search handling on the trace viewer (#2144)
  [docs] Document experimental attributes feature (#2141)
…TP client (#2204) (#2207)

* fix(world-vercel): retry transient response-body parse failures in the HTTP client

* fix(core): propagate exhausted transient world errors to the queue

* Revert "fix(core): propagate exhausted transient world errors to the queue"

This reverts commit 7bb62e9.

---------

Signed-off-by: Peter Wielander <mittgfu@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Nathan Rajlich <n@n8.io>
Problem

A sporadic failure reading or decoding a successful (`2xx`) response body - for example, a truncated or terminated stream, a connection reset mid-body, or a gateway returning a non-CBOR/JSON body - surfaces as a `PARSE_ERROR`.

The shared `RetryAgent` in `http-client.ts` retries connection and 5xx failures, but body consumption happens after it has handed back the response, so transient response-body failures do not reach that retry logic.

Approach

`@workflow/world-vercel` now retries response-body read/decode failures directly in `makeRequest` with bounded exponential backoff:

- `MAX_BODY_PARSE_RETRIES = 2`
- `BODY_PARSE_RETRY_BASE_MS = 100`
- idempotent methods only (`GET`/`HEAD`)

This covers the reported `events.list` failure path while avoiding retries for writes that may already have been applied. Schema-validation failures remain non-retryable because retrying the same decoded payload would not make it valid.

If an idempotent request continues to fail after the retry budget is exhausted, or if a non-idempotent request encounters a body-parse failure, `makeRequest` continues to surface a `PARSE_ERROR`.

Tests

- `packages/world-vercel/src/utils.test.ts`: a `GET` retries a transient body-read failure and then succeeds.
- `packages/world-vercel/src/utils.test.ts`: a `GET` exhausts the retry budget and throws `PARSE_ERROR`.
- `packages/world-vercel/src/utils.test.ts`: a `POST` body-parse failure is not retried.
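
For readers who want a concrete picture of the retry shape described above, here is a minimal sketch. It is not the code from this PR: only the two constants and the `PARSE_ERROR` code come from the description, while `requestWithBodyRetry`, `ParseError`, the `attempt` callback, and the injectable `sleep` are hypothetical names. The doubling delay schedule is also an assumption, since the description only says "bounded exponential backoff".

```typescript
// Values taken from the PR description.
const MAX_BODY_PARSE_RETRIES = 2;
const BODY_PARSE_RETRY_BASE_MS = 100;

// Only idempotent methods are retried, so writes are never replayed.
const IDEMPOTENT_METHODS = new Set(["GET", "HEAD"]);

// Hypothetical error type standing in for the adapter's PARSE_ERROR.
class ParseError extends Error {
  readonly code = "PARSE_ERROR";
  constructor(message: string, readonly reason?: unknown) {
    super(message);
  }
}

type Sleep = (ms: number) => Promise<void>;
const defaultSleep: Sleep = (ms) =>
  new Promise((resolve) => setTimeout(resolve, ms));

// `attempt` is assumed to send the request AND read/decode the body, and to
// throw only on a body read/decode failure. Schema-validation failures would
// need to bypass this wrapper, because retrying cannot make them valid.
async function requestWithBodyRetry<T>(
  method: string,
  attempt: () => Promise<T>,
  sleep: Sleep = defaultSleep,
): Promise<T> {
  const retries = IDEMPOTENT_METHODS.has(method.toUpperCase())
    ? MAX_BODY_PARSE_RETRIES
    : 0;
  let lastError: unknown;
  for (let i = 0; i <= retries; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      // Assumed schedule: base delay doubled per retry (100 ms, then 200 ms).
      if (i < retries) await sleep(BODY_PARSE_RETRY_BASE_MS * 2 ** i);
    }
  }
  throw new ParseError(
    `Failed to read response body after ${retries + 1} attempt(s)`,
    lastError,
  );
}
```

Under these assumptions a `GET` is attempted at most three times (one initial attempt plus two retries), while a `POST` is attempted once and fails with `PARSE_ERROR` on the first body-parse failure. Passing `sleep` as a parameter is a design choice for the sketch: it lets tests substitute an instant fake and assert on the delays without waiting on real timers.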