First major release for V2. Upgrade target for users on the 1.x line (last release:
1.1.7). 2.0.0 ships six categories of change:
- withRetry: a new helper for retrying a chunk of durable logic
  (multi-operation blocks containing waitForCallback, invoke, etc.) with a
  configurable backoff strategy.
- Linear retry strategy: createLinearRetryStrategy and the new
  retryPresets.linear preset for fixed-increment backoff alongside the
  existing exponential strategies.
- Serdes upgrades: context.configureSerdes() for setting default serdes
  once, createFileSystemSerdes for offloading large payloads to a durable
  mount (Amazon S3 Files, EFS), and inline previews to keep PII out of
  GetDurableExecutionHistory and the AWS console.
- Cost / scale knob for batch operations: NestingType.FLAT on map
  and parallel skips the per-iteration CONTEXT operation for up to 2x cost
  reduction and up to 2x more iterations per execution, trading per-branch
  observability for throughput.
- More precise error types: promise combinators and callbacks throw
  specific error subclasses (PromiseCombinatorError, CallbackExternalError,
  CallbackTimeoutError, CallbackSubmitterError). This is the main source of
  breaking changes for code that branches on error type.
- Observability plugins (experimental): a DurableInstrumentationPlugin
  interface for emitting custom instrumentation. This API is experimental and
  may change in a backward-incompatible way in any release.
A number of operational fixes and security updates are also included.
⚠ Upgrade guide (breaking changes)
1. Promise combinator failures now throw PromiseCombinatorError
context.Promise.all, Promise.allSettled, Promise.any, and Promise.race
previously rejected with StepError. They now reject with
PromiseCombinatorError, which extends DurableOperationError directly — not
StepError or ChildContextError.
// v1.x: no longer matches in v2
try {
  await context.Promise.all([...]);
} catch (err) {
  if (err instanceof StepError) { /* ... */ }
}

// v2.0
import { PromiseCombinatorError } from "@aws/durable-execution-sdk-js";

try {
  await context.Promise.all([...]);
} catch (err) {
  if (err instanceof PromiseCombinatorError) {
    // err.cause is the original failure from the first rejecting branch
  }
}

If you don't care about distinguishing the operation type, catch the base class
DurableOperationError.
The error-type change is a side effect of reimplementing combinators on
runInChildContext (so idle branches no longer block Lambda termination). That
also changes the execution history shape: branches now appear as CONTEXT
operations rather than STEP operations. See the "Changed" section below.
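
To make the effect on instanceof checks concrete, here is a small self-contained model of the inheritance described in this section. The three classes are stand-ins written for illustration, not the SDK's own classes, and classify is a hypothetical helper.

```typescript
// Stand-in classes that mirror only the inheritance described in these notes.
// They are NOT the SDK's implementations.
class DurableOperationError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working even if the code is compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class StepError extends DurableOperationError {}
class PromiseCombinatorError extends DurableOperationError {}

// Hypothetical helper: classify an error the way a catch block might.
function classify(err: unknown): string {
  if (err instanceof StepError) return "step";
  if (err instanceof PromiseCombinatorError) return "combinator";
  if (err instanceof DurableOperationError) return "durable-operation";
  return "unknown";
}

const combinatorFailure = new PromiseCombinatorError("branch failed");
console.log(classify(combinatorFailure));                        // "combinator"
console.log(combinatorFailure instanceof StepError);             // false
console.log(combinatorFailure instanceof DurableOperationError); // true
```

A v1.x handler that only tests for StepError falls through to its default path for combinator failures, which is why the upgrade needs either the specific subclass or the base class.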
2. Callback failures now form a typed hierarchy under CallbackError
createCallback and waitForCallback previously threw CallbackError for every
failure mode. v2 organizes the specific callback errors into a hierarchy and
adds CallbackExternalError, which is thrown when the external entity completes a
callback with a failure (via SendDurableExecutionCallbackFailure):
CallbackError
+- CallbackExternalError // external entity reported failure (was CallbackError)
+- CallbackTimeoutError // callback timed out
+- CallbackSubmitterError // waitForCallback submitter function threw
| Scenario | v1.x error | v2 error |
|---|---|---|
| Callback completed with FAILED | CallbackError | CallbackExternalError |
| Callback timed out (TIMED_OUT) | CallbackError | CallbackTimeoutError |
| waitForCallback submitter threw | CallbackError | CallbackSubmitterError |
| Internal error (e.g. no callback ID) | CallbackError | CallbackError |
instanceof CallbackError is safe. Because CallbackExternalError,
CallbackTimeoutError, and CallbackSubmitterError all extend CallbackError,
existing catch (e) { if (e instanceof CallbackError) ... } code keeps matching
all callback failures. The break only affects code that relies on the exact
errorType / name string being "CallbackError" for external failures or
timeouts.
// v2.0: branch on the specific subtype, or catch the base class
import {
  CallbackError,
  CallbackExternalError,
  CallbackTimeoutError,
  CallbackSubmitterError,
} from "@aws/durable-execution-sdk-js";

try {
  await context.waitForCallback("approval", submitter, {
    timeout: { hours: 1 },
  });
} catch (err) {
  if (err instanceof CallbackTimeoutError) {
    // Approver missed the deadline
  } else if (err instanceof CallbackSubmitterError) {
    // The submitter function (e.g. publishing the approval URL) threw
  } else if (err instanceof CallbackExternalError) {
    // External system completed the callback with FAILED
  } else if (err instanceof CallbackError) {
    // Any other callback failure (base class)
  }
}

3. KMS exceptions during checkpoint / state APIs are non-retryable
KMS exceptions during CheckpointDurableExecution and GetDurableExecutionState
are now treated as non-retryable customer errors instead of being retried.
4. runInChildContext applies the serdes round-trip in all modes
runInChildContext previously ran deserialize(serialize(result)) only for
small (checkpointed) payloads; large payloads (replay-children mode) and virtual
contexts returned the raw in-memory result. With a non-identity serdes, the
value a caller received depended on payload size or the virtual flag. All three
modes now apply the same round-trip, so callers observe consistent results
regardless of payload size. No public API change, but observed values may change
if you use a serdes whose deserialize(serialize(x)) differs from x.
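
As an illustration of a serdes whose round trip is not the identity, plain JSON already qualifies. The SimpleSerdes interface below is a hypothetical, simplified shape used only for this sketch; the SDK's actual Serdes interface may differ (for example, it may be asynchronous).

```typescript
// Hypothetical minimal serdes shape for illustration only.
interface SimpleSerdes {
  serialize(value: unknown): string;
  deserialize(data: string): unknown;
}

// Plain JSON is a non-identity serdes for values such as Date or undefined.
const jsonSerdes: SimpleSerdes = {
  serialize: (value) => JSON.stringify(value),
  deserialize: (data) => JSON.parse(data),
};

const original = {
  createdAt: new Date(0),
  note: undefined as string | undefined,
};
const roundTripped = jsonSerdes.deserialize(
  jsonSerdes.serialize(original),
) as Record<string, unknown>;

console.log(original.createdAt instanceof Date); // true
console.log(typeof roundTripped.createdAt);      // "string"
console.log("note" in roundTripped);             // false: undefined fields are dropped
```

Code that previously received the raw in-memory Date from a large-payload or virtual child context would now receive the string form, the same as the checkpointed path always did.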
5. FileSystemSerdesConfig.mode renamed to storageMode
Only relevant if you adopted createFileSystemSerdes during the 2.0.0-alpha
line; 1.x users are unaffected.
Added
withRetry — retry a block of durable logic
A new helper for retrying chunks of logic that contain operations a step
cannot host (e.g. waitForCallback, invoke). Semantically a
runInChildContext with a retry policy wrapped around it.
import { withRetry, createRetryStrategy } from "@aws/durable-execution-sdk-js";

const result = await withRetry(
  context,
  "approval",
  (ctx, attempt) =>
    ctx.waitForCallback(`approval-${attempt}`, submitter, {
      timeout: { hours: 24 },
    }),
  {
    retryStrategy: createRetryStrategy({
      maxAttempts: 3,
      initialDelay: { seconds: 2 },
      backoffRate: 2,
    }),
  },
);

createLinearRetryStrategy + retryPresets.linear
Linear backoff with a configurable initial delay and increment.
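
As a rough illustration of the arithmetic, the delay schedules quoted in this section's example comments follow from adding a fixed increment per retry. The linearDelays function is a hypothetical helper, not part of the SDK, and the (6, 1, 1) arguments for the preset are inferred from the quoted "1,2,3,4,5s" schedule rather than confirmed.

```typescript
// Illustrative arithmetic only; not the SDK's implementation.
// Returns the wait (in seconds) before each retry. With maxAttempts total
// attempts there are maxAttempts - 1 waits.
function linearDelays(
  maxAttempts: number,
  initialSeconds: number,
  incrementSeconds: number,
): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(initialSeconds + retry * incrementSeconds);
  }
  return delays;
}

console.log(linearDelays(8, 2, 3)); // [2, 5, 8, 11, 14, 17, 20]
console.log(linearDelays(6, 1, 1)); // [1, 2, 3, 4, 5]
```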
import {
  createLinearRetryStrategy,
  retryPresets,
} from "@aws/durable-execution-sdk-js";

// Custom: 8 attempts, starting at 2s, +3s each attempt -> 2,5,8,11,14,17,20s
const strategy = createLinearRetryStrategy(8, 2, 3);
await context.step("flaky", fn, { retryStrategy: strategy });

// Or use the new preset (6 attempts: 1,2,3,4,5s)
await context.step("flaky", fn, { retryStrategy: retryPresets.linear });

context.configureSerdes() + SerdesConfig
Set default serdes once on the context instead of passing serdes: to every
operation. Defaults flow into step, runInChildContext, invoke, and
waitForCondition. Callbacks (createCallback, waitForCallback) require an
explicit defaultCallbackDeserializer — they keep the passthrough
deserializer otherwise so customer-provided callback payloads aren't accidentally
JSON-parsed. Per-operation serdes: arguments still win over the default.
createFileSystemSerdes(basePath, config?)
A built-in Serdes that writes each value to a file under basePath and stores
only a small file pointer in the checkpoint, keeping executions under the
per-checkpoint size limit (~256KB) when individual operations produce large
results. Supports FileSystemSerdesMode.ALWAYS (default) and
FileSystemSerdesMode.OVERFLOW (inline JSON until a threshold, then spill to a
file).
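
The OVERFLOW decision can be pictured with the sketch below. It assumes the threshold is compared against the persisted envelope ({ data: <json> }, the shape mentioned under "Fixed") rather than the raw JSON. The chooseStorage function, the StoredValue type, and the character-based threshold are hypothetical; the SDK's actual threshold is presumably measured in bytes.

```typescript
// Hypothetical types and threshold; only the { data: <json> } envelope shape
// comes from these release notes.
type StoredValue =
  | { kind: "inline"; envelope: string }
  | { kind: "file"; reason: string };

function chooseStorage(value: unknown, thresholdChars: number): StoredValue {
  const inlineJson = JSON.stringify(value);
  // The persisted form wraps the JSON string again, which re-escapes quotes
  // and backslashes and therefore grows the payload.
  const envelope = JSON.stringify({ data: inlineJson });
  if (envelope.length <= thresholdChars) {
    return { kind: "inline", envelope };
  }
  return {
    kind: "file",
    reason: `envelope is ${envelope.length} chars, limit ${thresholdChars}`,
  };
}

const payload = { msg: 'say "hi"' };
const rawLength = JSON.stringify(payload).length;
const envelopeLength = JSON.stringify({ data: JSON.stringify(payload) }).length;

// A threshold that the raw JSON fits under is still too small for the envelope.
console.log(chooseStorage(payload, rawLength).kind);      // "file"
console.log(chooseStorage(payload, envelopeLength).kind); // "inline"
```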
⚠ Use a durable, shared mount. S3 Files (Lambda S3 mount) and EFS are
supported. Do not point this at Lambda's /tmp: /tmp is
per-execution-environment, and a replay on a different sandbox won't find the
file, breaking deserialization.
It also accepts an optional generatePreview function (with the buildPreview
helper plus PreviewMode, FieldMatchMode) to store a redacted inline preview
in the checkpoint while keeping the full value on disk — useful for keeping PII
out of GetDurableExecutionHistory and the AWS console.
NestingType for map and parallel
Trade observability for cost on batch operations. The default is
NestingType.NESTED (existing behavior), so existing code is unaffected unless
you opt in. NestingType.FLAT skips per-iteration CONTEXT operations for up
to 2x cost reduction and higher per-execution scale, at the price of less
detailed history.
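
One way to see where "up to 2x" could come from is a back-of-envelope operation count. This is a simplified model, not the service's billing formula: it assumes NESTED adds exactly one CONTEXT operation per iteration and FLAT adds none, and operationCount is a hypothetical helper.

```typescript
// Back-of-envelope model, not the service's billing formula.
// Assumption: NESTED adds one CONTEXT operation per iteration, FLAT adds none.
function operationCount(
  iterations: number,
  opsPerIteration: number,
  nested: boolean,
): number {
  const contextOps = nested ? iterations : 0;
  return contextOps + iterations * opsPerIteration;
}

const nestedOps = operationCount(1000, 1, true);
const flatOps = operationCount(1000, 1, false);
console.log(nestedOps, flatOps, nestedOps / flatOps); // 2000 1000 2

// With more work per iteration the relative saving shrinks.
console.log(operationCount(1000, 4, true) / operationCount(1000, 4, false)); // 1.25
```

Under this model the full 2x only applies when each iteration performs a single operation, which is consistent with the "up to" wording.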
errorMapper on ChildConfig
runInChildContext, promise combinators, and withRetry accept a function to
remap the thrown error type — useful when you want a domain error class instead
of ChildContextError.
Instrumentation plugin system (experimental)
⚠ Experimental. This API is unstable and may change in a
backward-incompatible way in any future release, including minor and patch
versions. It is not covered by semantic versioning guarantees yet. Use with
caution in production and pin your SDK version if you depend on it.
A new DurableInstrumentationPlugin interface with lifecycle hooks for
execution, invocation, operation, and attempt-level events, composed by a plugin
runner and wired through the entire execution path (step, runInChildContext,
invoke, wait, waitForCondition, callbacks). Register plugins via the new
plugins field on DurableExecutionConfig. Plugin errors are isolated
(fire-and-forget) and can never alter customer output.
Safe DurableContext detection
DurableContextImpl now carries a package-namespaced
Symbol.for("@aws/durable-execution-sdk-js/durable-context") brand and a
Symbol.toStringTag of "DurableContext", so external libraries can detect a
durable context without importing the SDK or relying on name checks.
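
A library could use the brand roughly as follows. The symbol key and the "DurableContext" tag are taken from the text above; how the brand is stored (here, as a property keyed by the symbol) is an assumption, and looksLikeDurableContext and fakeContext are hypothetical names for this sketch.

```typescript
// The Symbol.for key and the "DurableContext" tag come from these notes.
// Storing the brand as a property keyed by the symbol is an assumption.
const DURABLE_CONTEXT_BRAND = Symbol.for(
  "@aws/durable-execution-sdk-js/durable-context",
);

function looksLikeDurableContext(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const hasBrand = DURABLE_CONTEXT_BRAND in value;
  const hasTag =
    Object.prototype.toString.call(value) === "[object DurableContext]";
  return hasBrand && hasTag;
}

// Stand-in object for demonstration; a real context would come from the SDK.
const fakeContext = {
  [DURABLE_CONTEXT_BRAND]: true,
  [Symbol.toStringTag]: "DurableContext",
};

console.log(looksLikeDurableContext(fakeContext)); // true
console.log(looksLikeDurableContext({}));          // false
```

Because Symbol.for reads from the global symbol registry, the check works across package copies without importing the SDK.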
Changed
- Promise combinators no longer block Lambda termination during idle waits.
  context.Promise.all, Promise.allSettled, Promise.any, and Promise.race
  previously ran each branch inside an internal ctx.step. A step keeps the
  Lambda invocation alive, so when every branch was idle (e.g. all waiting on a
  wait or waitForCallback) the function could not be torn down and you kept
  paying for idle compute. The combinators are now implemented on top of
  runInChildContext, which lets Lambda terminate while all branches are idle
  and resume on a later invocation.

  Consequence: execution history shape changes. Because each branch now runs
  in a child context instead of a step, GetDurableExecutionHistory (and the AWS
  console) now show a CONTEXT operation per branch instead of a STEP
  operation. Anything that parses history or asserts on operation types/counts
  for code using promise combinators must be updated. This change is also what
  enables the typed PromiseCombinatorError (upgrade guide §1).
- UserAgent header uses aws-durable-execution-sdk-js/<version> and appends
  -bundled when running from the Lambda bundled runtime path.
Fixed
- errorData is preserved across nested runInChildContext boundaries.
  DurableOperationError.toErrorObject now walks the cause chain (bounded to 10
  hops) to surface the first errorData it finds, instead of dropping it each
  time an error is re-wrapped in a fresh ChildContextError.
- Child context result matches replay on first run, including serdes handling
  in batch operations.
- PromiseCombinatorError survives replay. It was missing from
  DurableOperationError.fromErrorObject, so it deserialized as StepError on
  replay, breaking instanceof checks and error-identity determinism.
- buildPreview no longer leaks excluded subtrees. A field excluded via
  FieldMatchMode.PATH still had its descendant leaves surface under
  INCLUDE_ALL mode, leaking PII into GetDurableExecutionHistory. Excluded
  nodes now skip recursion entirely.
- createFileSystemSerdes OVERFLOW threshold accounts for double-encoding.
  The check now measures the final persisted envelope ({ data: <json> }, which
  re-escapes quotes/backslashes and inflates size 10–30%) rather than the raw
  inline JSON, so boundary payloads correctly overflow to file and stay under the
  ~256KB checkpoint limit.
- Interrupted step with shouldRetry: false no longer crashes on replay.
  An AtMostOncePerRetry step interrupted (e.g. Lambda timeout) whose retry
  strategy declines a retry now passes the required metadata when marking the
  operation complete, instead of throwing "metadata required on first call" on a
  fresh Lambda instance.
- eslint-plugin: remove context.getSourceCode() so the plugin works on
  ESLint v10.
- sdk: resolve ESLint warnings surfaced during the build.
- examples: prevent duplicate DeleteFunction calls during cleanup in
  integration tests.
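
The bounded cause-chain walk in the first fix can be sketched as below. ErrorLike is a hypothetical simplified shape and findErrorData is not the SDK's toErrorObject; the sketch interprets "bounded to 10 hops" as inspecting the error itself plus at most 10 causes, which is an assumption.

```typescript
// Hypothetical error shape; the SDK's DurableOperationError has its own fields.
interface ErrorLike {
  message: string;
  errorData?: string;
  cause?: ErrorLike;
}

const MAX_HOPS = 10; // bound taken from the release note above

// Returns the first errorData found along the cause chain, or undefined.
// The hop limit also guarantees termination on cyclic cause chains.
function findErrorData(
  error: ErrorLike | undefined,
  maxHops: number = MAX_HOPS,
): string | undefined {
  let current = error;
  for (let hop = 0; current !== undefined && hop <= maxHops; hop++) {
    if (current.errorData !== undefined) return current.errorData;
    current = current.cause;
  }
  return undefined;
}

const root: ErrorLike = { message: "db write failed", errorData: '{"orderId":42}' };
const wrapped: ErrorLike = {
  message: "outer child context",
  cause: { message: "inner child context", cause: root },
};
console.log(findErrorData(wrapped)); // {"orderId":42}
```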
Security
- Bump @aws-sdk/* deps to pull in the patched fast-xml-parser.
- Resolve high-severity advisories in fast-xml-parser, handlebars, flatted,
  lodash, minimatch, picomatch, and vite; update eslint-plugin-tsdoc to
  ^0.5.2.