Add first-class serialization for FatalError and RetryableError #1513
TooTallNate merged 6 commits into main
🦋 Changeset detected
Latest commit: c64a805
The changes in this PR will be included in the next version bump. This PR includes changesets to release 17 packages.
🧪 E2E Test Results
✅ All tests passed
Details by Category
- ✅ ▲ Vercel Production
- ✅ 💻 Local Development
- ✅ 📦 Local Production
- ✅ 🐘 Local Postgres
- ✅ 🪟 Windows
- ✅ 📋 Other
Pull request overview
Adds first-class devalue serialization support for Workflow DevKit error types FatalError and RetryableError, improving type preservation across serialization boundaries.
Changes:
- Add FatalError/RetryableError reducers and revivers to the common devalue pipeline.
- Extend serialization tests with new round-trip coverage for both error types and adjust cross-VM FatalError expectations.
- Add a changeset to release the update as a patch to @workflow/core.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/core/src/serialization.ts | Introduces new special reducers/revivers for FatalError and RetryableError in the common serialization pipeline. |
| packages/core/src/serialization.test.ts | Updates cross-VM FatalError assertions and adds new round-trip tests for Fatal/Retryable errors. |
| .changeset/fatal-retryable-error-serialization.md | Declares a patch release note for the new serialization behavior. |
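For readers unfamiliar with the reducer/reviver pattern mentioned in the overview, here is a minimal sketch. It hand-rolls a toy tagged JSON codec rather than using devalue or the real code in packages/core/src/serialization.ts; the `FatalError` stand-in class, `ErrorPayload`, `reducers`, `revivers`, `encode`, and `decode` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical stand-in for the FatalError class exported by @workflow/errors.
class FatalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'FatalError';
  }
}

type ErrorPayload = { message: string; stack?: string };

// Reducer: returns a plain payload for values it recognizes, otherwise undefined.
const reducers = {
  FatalError: (value: unknown): ErrorPayload | undefined =>
    value instanceof Error && value.name === 'FatalError'
      ? { message: value.message, stack: value.stack }
      : undefined,
};

// Reviver: rebuilds a real class instance from the plain payload.
const revivers = {
  FatalError: (payload: ErrorPayload): FatalError => {
    const error = new FatalError(payload.message);
    if (payload.stack !== undefined) error.stack = payload.stack;
    return error;
  },
};

// Toy wire format: a [tag, payload] pair encoded as JSON (not devalue's format).
function encode(value: unknown): string {
  const payload = reducers.FatalError(value);
  if (payload === undefined) throw new TypeError('unsupported value');
  return JSON.stringify(['FatalError', payload]);
}

function decode(text: string): FatalError {
  const [tag, payload] = JSON.parse(text) as [string, ErrorPayload];
  if (tag !== 'FatalError') throw new TypeError(`unknown tag: ${tag}`);
  return revivers.FatalError(payload);
}

const roundTripped = decode(encode(new FatalError('fatal!')));
console.log(roundTripped.name, roundTripped.message);
```

The point of the tag is that the receiving side can pick the matching reviver and construct the specific class, instead of falling back to a generic Error.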
c735685 to a30a601
a14b17e to 6ea2e0e
7a43bf9 to 909f31f
b27fa71 to 1c8650d
📊 Benchmark Results (💻 Local Development; result tables not preserved)
- workflow with no steps; workflow with 1 step; workflow with 10, 25, and 50 sequential steps
- Promise.all with 10, 25, and 50 concurrent steps
- Promise.race with 10, 25, and 50 concurrent steps
- workflow with 10, 25, and 50 sequential data payload steps (10KB)
- workflow with 10, 25, and 50 concurrent data payload steps (10KB)
- Stream Benchmarks (includes TTFB metrics): workflow with stream; stream pipeline with 5 transform steps (1MB); 10 parallel streams (1MB each); fan-out fan-in 10 streams (1MB each)
Addressed the outstanding review feedback in e4e692c:
- FatalError cause preservation (Copilot, line on reducer): Fixed.
- RetryableError realm/validation (Copilot): Fixed. Two changes:
- FatalError reviver comment (Copilot): No longer applicable. PR 3 was refactored to use the custom class serde flow (
- e2e test will always time out (vercel): Fixed. You were right: the step throw path always reconstructs errors as
…leError
Add custom serialization methods to FatalError and RetryableError in @workflow/errors, enabling the SWC plugin to discover and register them through the standard class serialization pipeline. This preserves class identity (instanceof), the fatal flag, and the retryAfter date when these errors cross serialization boundaries.
- Add @workflow/serde dependency to @workflow/errors
- Add WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE static methods to both classes
- Add unit tests verifying Instance-based round-trip serialization
- Add e2e workflow tests verifying class identity preservation end-to-end
- FatalError: preserve cause property when present (Copilot feedback)
- RetryableError: preserve cause property when present
- RetryableError: serialize retryAfter as numeric timestamp for realm safety (the Date reducer uses instanceof global.Date which fails across VM realms; timestamps sidestep that issue)
- Replace e2e tests with step return value serialization (step throw path always reconstructs errors as FatalError, so those tests don't exercise the new serde code path)
- Add unit tests for cause preservation on both classes
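The timestamp approach for retryAfter can be sketched as follows. This is only an illustration under assumptions: the `RetryableError` class here is a local stand-in (its constructor signature is invented for the example), and `RetryablePayload`, `reduceRetryable`, and `reviveRetryable` are hypothetical names, not the functions in the PR.

```typescript
// Hypothetical stand-in for RetryableError from @workflow/errors;
// the constructor signature is an assumption made for this sketch.
class RetryableError extends Error {
  retryAfter: Date;
  constructor(message: string, retryAfter: Date) {
    super(message);
    this.name = 'RetryableError';
    this.retryAfter = retryAfter;
  }
}

// The wire payload carries a number, never a Date object.
type RetryablePayload = { message: string; retryAfter: number };

function reduceRetryable(error: RetryableError): RetryablePayload {
  // getTime() returns epoch milliseconds, so the payload stays a plain
  // number and no instanceof Date check is needed on it.
  return { message: error.message, retryAfter: error.retryAfter.getTime() };
}

function reviveRetryable(payload: RetryablePayload): RetryableError {
  // The Date is constructed in the receiving realm from the number.
  return new RetryableError(payload.message, new Date(payload.retryAfter));
}

const original = new RetryableError('try again', new Date(Date.UTC(2030, 0, 1)));
const wire = JSON.stringify(reduceRetryable(original));
const revived = reviveRetryable(JSON.parse(wire) as RetryablePayload);
console.log(revived.retryAfter.toISOString()); // 2030-01-01T00:00:00.000Z
```

Because the Date is rebuilt on the receiving side, it belongs to that side's realm, so realm-local instanceof checks against it succeed there.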
Adding WORKFLOW_SERIALIZE / WORKFLOW_DESERIALIZE hooks to FatalError and RetryableError is a feature, not a bug fix.
Replace the WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE static methods on FatalError and RetryableError with dedicated reducers/revivers in the common serialization module. The Instance/Class pipeline relies on the SWC plugin discovering classes and registering them by classId, which means values constructed in environments that don't run the plugin (vitest e2e runner, ad-hoc Node scripts) can't be deserialized. Treating FatalError/RetryableError as first-class serialization targets makes them round-trip from any environment with no setup, matching the behavior of TypeError, RangeError, etc. added in the previous commit.
- Drop @workflow/serde dependency on @workflow/errors
- Remove WORKFLOW_SERIALIZE/DESERIALIZE statics from FatalError/RetryableError
- Add FatalError/RetryableError reducers to serialization/reducers/common.ts with cached base-reducer factories for the subclasses that wrap the shared shape (RetryableError, AggregateError)
- Migrate unit tests off registerSerializationClass setup
- Extend the errorSubclassRoundTripWorkflow e2e test to cover FatalError and RetryableError, and drop the parallel errorFatalSerdeRoundTrip / errorRetryableSerdeRoundTrip tests
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
- Soundness: split makeErrorSubclassReducer into a shared base helper (reduceErrorBase / reduceNamedErrorSubclassBase returning the BaseErrorPayload shape) plus a thin wrapper constrained to subclass keys whose serialized shape is exactly that base payload. The AggregateError and RetryableError reducers, which extend the base with extra fields, now consume reduceNamedErrorSubclassBase directly instead of calling makeErrorSubclassReducer with an unsound type cast. The compiler now rejects accidental misuse (SimpleErrorSubclassKey type guard).
- Realm safety: RetryableError reviver constructs retryAfter via new global.Date(...) to match the rest of the module and ensure the resulting Date passes instanceof global.Date checks in the target realm.
- Test strength: assert serialized payloads contain the literal devalue marker ["FatalError",N] / ["RetryableError",N] rather than the bare class name (which would also match a generic Error payload whose name happens to be "FatalError"). Also assert the generic ["Error",N] marker is absent.
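The "test strength" point can be illustrated with a small matcher. This is a sketch, not the assertion used in serialization.test.ts: `hasMarker` is a hypothetical helper, and the two sample strings are made up to show the shape of a tagged versus a generic payload rather than copied from real devalue output.

```typescript
// Hypothetical helper: true if the serialized text contains the tagged
// marker ["<tag>",N], where N is one or more digits.
function hasMarker(serialized: string, tag: string): boolean {
  // Escape regex metacharacters in the tag before embedding it in a pattern.
  const escaped = tag.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`\\["${escaped}",\\d+\\]`).test(serialized);
}

// Illustrative payloads only (invented for this sketch).
const tagged = '[["FatalError",1],{"message":2},"fatal!"]';
const generic = '[["Error",1],{"name":2,"message":3},"FatalError","fatal!"]';

// A bare substring check cannot tell the two apart...
console.log(tagged.includes('FatalError'), generic.includes('FatalError')); // true true

// ...but the marker check can.
console.log(hasMarker(tagged, 'FatalError'), hasMarker(generic, 'FatalError')); // true false
console.log(hasMarker(tagged, 'Error'), hasMarker(generic, 'Error')); // false true
```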
Bundlers like Turbopack compile `export class FatalError extends Error
{...}` into a registration call like `e.s(["FatalError", 0, class
extends Error {...}])` — passing an anonymous class expression as a
function argument. The resulting constructor function has `name === ''`,
which broke the previous `value.constructor?.name === subclassName`
match: an instance of the bundled FatalError class no longer matched the
dedicated FatalError reducer and instead fell through to the generic
`Error` reducer, losing class identity across the workflow boundary.
This was caught by the local-prod CI matrix, where each Next.js route
gets its own bundled chunk: a real `new FatalError('fatal!')` returned
from a workflow was serialized as a plain Error and revived without
`instanceof FatalError` holding on the consumer side.
Switch the match in `reduceNamedErrorSubclassBase` to `value.name`,
which:
- works for built-in subclasses (TypeError/RangeError/… all set
`name` automatically and aren't bundled, so behavior is unchanged
in practice).
- works for FatalError/RetryableError, whose constructors set
`this.name` explicitly — robust across realms AND bundlers.
- is consistent with how `FatalError.is()` / `RetryableError.is()`
already identify their values.
Two existing cross-VM Error tests (added in #1164) used `name = 'FatalError'`
on a plain Error to stand in for any cross-realm error — which now hits
the dedicated FatalError reducer (returning a host-realm FatalError)
instead of the generic Error reducer (which constructs a VM-realm Error).
Renamed the stand-in to `'CustomError'` so they continue to exercise the
intended path.
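
The difference between the two matching strategies can be reproduced in isolation. The sketch below is an assumption-laden illustration: `register` imitates the shape of the bundler registration call quoted in the commit message, and `matchesByConstructorName` / `matchesByInstanceName` are hypothetical helpers, not functions from the PR.

```typescript
// Imitates a bundler registration call that receives an anonymous class
// expression as part of its argument and hands the class back.
function register<T>(entry: [string, number, T]): T {
  return entry[2];
}

const BundledFatalError = register([
  'FatalError',
  0,
  class extends Error {
    constructor(message: string) {
      super(message);
      // Set explicitly, as the commit message says the real constructors do.
      this.name = 'FatalError';
    }
  },
]);

// Previous strategy: compare the constructor function's name.
const matchesByConstructorName = (value: Error, subclassName: string): boolean =>
  value.constructor?.name === subclassName;

// New strategy: compare the instance's own name property.
const matchesByInstanceName = (value: Error, subclassName: string): boolean =>
  value.name === subclassName;

const bundledError = new BundledFatalError('fatal!');
console.log(matchesByConstructorName(bundledError, 'FatalError')); // false
console.log(matchesByInstanceName(bundledError, 'FatalError')); // true
```

A class expression that appears as an array element or call argument is not given an inferred name, which is why the constructor-name comparison fails here while the instance-name comparison still succeeds.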
karthikscale3
left a comment
ai review: Wire-format rollback risk — the new ["FatalError", ...] / ["RetryableError", ...] devalue keys are not known to older SDK versions. If this release is rolled back after any workflows have executed steps that serialize these errors in the new format, the old deserializer will encounter an unknown reducer key and fail to hydrate those events. This is unlikely in practice (FatalError terminates execution immediately; RetryableError's retry scheduling doesn't re-read the serialized error payload), but worth being aware of if a hotfix rollback is ever needed after deploy.
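The rollback concern can be pictured with a toy reviver lookup. This is a simplified sketch of the general failure mode, not the SDK's or devalue's actual deserializer: `revive`, `oldRevivers`, and `newRevivers` are hypothetical, and throwing on an unknown key is an assumption made for the illustration.

```typescript
type Reviver = (payload: unknown) => unknown;

// Toy dispatcher: look up the reviver for a tag and fail if none is registered.
function revive(tag: string, payload: unknown, revivers: Map<string, Reviver>): unknown {
  const reviver = revivers.get(tag);
  if (reviver === undefined) {
    throw new Error(`Unknown reviver key: ${tag}`);
  }
  return reviver(payload);
}

const toMessage = (payload: unknown): string =>
  String((payload as { message?: unknown }).message);

// "Old" SDK: only knows the generic Error tag.
const oldRevivers = new Map<string, Reviver>([
  ['Error', (payload) => new Error(toMessage(payload))],
]);

// "New" SDK: additionally knows the FatalError tag.
const newRevivers = new Map<string, Reviver>(oldRevivers);
newRevivers.set('FatalError', (payload) => {
  const error = new Error(toMessage(payload));
  error.name = 'FatalError';
  return error;
});

// An event written by the new SDK...
const storedEvent = { tag: 'FatalError', payload: { message: 'fatal!' } };

// ...hydrates with the new revivers but not with the old ones.
console.log((revive(storedEvent.tag, storedEvent.payload, newRevivers) as Error).name);
try {
  revive(storedEvent.tag, storedEvent.payload, oldRevivers);
} catch (error) {
  console.log((error as Error).message);
}
```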
karthikscale3
left a comment
Claude flagged a minor issue related to rollbacks. Otherwise LGTM
Summary
Add first-class serialization for `FatalError` and `RetryableError` so they round-trip with class identity preserved across all serialization boundaries, including from environments that don't run the SWC plugin (e.g. the vitest e2e runner, ad-hoc Node scripts).
How it works
`FatalError` and `RetryableError` get dedicated reducers/revivers in `@workflow/core`'s common serialization module, alongside the built-in `Error` subclasses (`TypeError`, `RangeError`, etc.) added in #1511:
- `FatalError` uses the standard subclass reducer/reviver pair (`makeErrorSubclassReducer`/`makeErrorSubclassReviver`).
- `RetryableError` extends the shared shape with a numeric `retryAfter` epoch timestamp (instead of a `Date`) so it's realm-safe: the `Date` reducer uses `instanceof global.Date`, which fails across VM realms.
- […] `@workflow/errors` rather than reading them from `global` (since they're not built-ins).
Why not `WORKFLOW_SERIALIZE`/`WORKFLOW_DESERIALIZE`?
An earlier iteration of this PR routed `FatalError`/`RetryableError` through the `WORKFLOW_SERIALIZE`/`WORKFLOW_DESERIALIZE` static methods, relying on the SWC plugin to discover the classes and register them by `classId`. That approach has a real limitation: in environments without the SWC plugin (the vitest e2e runner, ad-hoc Node scripts), constructed instances can't be deserialized because the class registration never happens. Treating them as first-class targets makes them work everywhere with no setup, matching the behavior of `TypeError`, `RangeError`, etc.
Test plan
- […] `serialization.test.ts` covering type preservation, stack, `cause` chains, and `retryAfter` (no `registerSerializationClass` setup needed)
- `errorSubclassRoundTripWorkflow` e2e test (added in Add first-class serialization for built-in Error subclasses #1511) is extended to include `FatalError` and `RetryableError` alongside the built-in subclasses, verifying the full client → workflow → step → workflow → client round-trip
- […] `nextjs-turbopack` deployment
Stack
- `FatalError`/`RetryableError` first-class serde
- […] `run_failed`/`step_failed` errors through serialization pipeline #1851: Serialize `run_failed`/`step_failed` errors through the serialization pipeline (stacked on top)