Snapshot Runtime: QuickJS WASM VM with snapshot/restore for workflow execution #1300

TooTallNate wants to merge 134 commits into `main`
Conversation
…refix

Start of the serialization refactor (separate from snapshot-runtime). New files:

- serialization/types.ts — SerializationFormat enum, SerializableSpecial interface, Reducers/Revivers types
- serialization/codec.ts — Codec interface with formatPrefix, serialize, deserialize, and optional deserializeLegacy
- serialization/format.ts — format prefix encode/decode/peek, moved from the monolithic serialization.ts

The Codec interface enables future alternative formats (CBOR, JSON) while keeping the devalue implementation as the current default.
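The interface described above can be sketched as follows. This is an assumption-based illustration from the commit message, not the actual source — the member names (`formatPrefix`, `serialize`, `deserialize`, `deserializeLegacy`) come from the text, and the `jsonCodec` is a hypothetical alternative implementation to show how the abstraction extends:

```typescript
// 4-char [a-z0-9] tag such as 'devl' (devalue) or 'json'.
type FormatPrefix = string;

// Codec interface as described in the commit message (a sketch).
interface Codec {
  formatPrefix: FormatPrefix;
  serialize(value: unknown): Uint8Array;
  deserialize(data: Uint8Array): unknown;
  // Optional hook for blobs written before format prefixes existed.
  deserializeLegacy?(data: Uint8Array): unknown;
}

// A trivial JSON codec showing how an alternative format would plug in:
// the 4-byte prefix is prepended on serialize and stripped on deserialize.
const jsonCodec: Codec = {
  formatPrefix: 'json',
  serialize: (value) =>
    new TextEncoder().encode('json' + JSON.stringify(value)),
  deserialize: (data) =>
    JSON.parse(new TextDecoder().decode(data).slice(4)),
};
```

A caller that peeks the first four bytes of a blob can dispatch to whichever codec claims that prefix.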
Serialization refactor Phase 1: create the new module structure alongside the existing monolithic serialization.ts (which continues to work). New files:

- serialization/reducers/common.ts — Date, Error, Map, Set, URL, BigInt, typed arrays, Headers, Request, Response, RegExp, URLSearchParams
- serialization/reducers/class.ts — Class/Instance with WORKFLOW_SERIALIZE/DESERIALIZE support
- serialization/reducers/step-function.ts — StepFunction with closure vars
- serialization/codec-devalue.ts — devalue Codec implementation
- serialization/encryption.ts — composable encrypt/decrypt layer
- serialization/workflow.ts — synchronous, no encryption, for VM use
- serialization/step.ts — async with encryption, for the step handler
- serialization/client.ts — async with encryption, for the start() API
- serialization/index.ts — re-exports the public API
- serialization/serialization.test.ts — 25 focused tests

All modes compose their reducer/reviver sets from the shared building blocks. Cross-mode compatibility is verified: data serialized in any mode can be deserialized in any other mode (for common types). The existing 108 serialization tests continue to pass unchanged.
- Add ./serialization/workflow export to @workflow/core package.json
- Add ./internal/serialization re-export to workflow meta-package
- The workflow bundle can now import serialize/deserialize via:
import { serialize, deserialize } from 'workflow/internal/serialization'
Full test suite passes: 493 tests across 22 files (including 25 new
serialization module tests).
1. Fix reducer composition order: Class/Instance reducers now come BEFORE common reducers in all three modes (workflow, step, client). This ensures custom Error subclasses with WORKFLOW_SERIALIZE are handled by the Instance reducer before the generic Error reducer (devalue uses first-match-wins semantics).
2. Fix encryption decrypt() to fail fast when encrypted data is encountered without a decryption key, instead of silently returning encrypted bytes that would fail later with an unhelpful format error.
3. Remove Request/Response from the common reducers — they have no matching common revivers, so including them caused asymmetric behavior (serialize as Request, deserialize as a plain object). Request/Response handling belongs in mode-specific modules that can provide proper revivers.
4. Document the Node.js dependency in the workflow serialization re-export. The current implementation uses node:util and Buffer. For the QuickJS VM (snapshot runtime), these will need polyfills — tracked separately.
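The ordering bug in point 1 is easy to see in isolation. Below is a minimal sketch of first-match-wins dispatch (the names `instanceReducer`, `CustomError`, and the tuple shapes are illustrative, not the actual reducers): a custom Error subclass matches both the specific and the generic reducer, so only the list order decides which representation wins.

```typescript
// A reducer either claims a value (returns its reduced form) or declines
// by returning false — first match wins, like devalue's custom types.
type Reducer = (value: unknown) => unknown | false;

class CustomError extends Error {
  static WORKFLOW_SERIALIZE = true; // marker from the commit message
}

// Specific: only handles classes opted in via WORKFLOW_SERIALIZE.
const instanceReducer: Reducer = (v) =>
  v instanceof CustomError ? ['Instance', 'CustomError', v.message] : false;

// Generic: handles any Error — including CustomError, which is the trap.
const errorReducer: Reducer = (v) =>
  v instanceof Error ? ['Error', v.message] : false;

function reduce(value: unknown, reducers: Reducer[]): unknown {
  for (const r of reducers) {
    const result = r(value);
    if (result !== false) return result;
  }
  return value;
}
```

With `[instanceReducer, errorReducer]` a `CustomError` is reduced as an Instance; with the pre-fix order `[errorReducer, instanceReducer]` it is swallowed by the generic Error reducer and its class identity is lost.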
The Codec interface now takes a SerializationMode ('workflow', 'step',
'client') instead of raw reducers/revivers. The reducer/reviver
composition is internal to the devalue codec implementation.
This is the right abstraction because reducers/revivers are devalue-
specific concepts. A future CBOR codec would handle Date, typed arrays,
Map, Set natively via the CBOR type system — it wouldn't use reducers
at all. A JSON codec would only support standard JSON types.
The mode-specific modules (workflow.ts, step.ts, client.ts) are now
simpler — they just pass the mode string to the codec.
The format prefix is now a branded string type validated by
isFormatPrefix() — any 4-character [a-z0-9] string is valid.
This removes the hard-coded enum of known formats, making the system
truly open for extension:
type FormatPrefix = string & { __brand: 'FormatPrefix' };
function isFormatPrefix(value: string): value is FormatPrefix;
The SerializationFormat object still provides well-known constants
('devl', 'encr') but they're now just typed constants, not an
exhaustive enum.
peekFormatPrefix() and decodeFormatPrefix() use isFormatPrefix() for
validation instead of checking against a known list. Unknown but valid
prefixes (e.g. 'cbor', 'json', 'v2b1') are accepted — the caller
decides whether they can handle the format.
6 new isFormatPrefix tests covering: valid strings, too short, too long,
uppercase, special characters. 1 new test for unknown-but-valid prefixes.
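The branded-prefix validation described above can be sketched directly from the stated rule (any 4-character `[a-z0-9]` string is valid); the regex is the obvious encoding of that rule, though the actual implementation may differ:

```typescript
// Branded string type: only values that passed isFormatPrefix() can be
// assigned to FormatPrefix, even though it is a plain string at runtime.
type FormatPrefix = string & { __brand: 'FormatPrefix' };

// Exactly 4 lowercase-alphanumeric characters, per the stated rule.
function isFormatPrefix(value: string): value is FormatPrefix {
  return /^[a-z0-9]{4}$/.test(value);
}
```

Unknown-but-valid prefixes like `'cbor'` or `'v2b1'` pass this check; deciding whether the format is actually supported is left to the caller, which is what keeps the system open for extension.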
Proves that data serialized by the new modules can be deserialized by the old serialization.ts functions, and vice versa. This validates that the new modules are wire-format compatible and safe for incremental migration:

- new workflow.serialize → old hydrateStepReturnValue (primitives, Date, Map, nested)
- old dehydrateStepReturnValue → new workflow.deserialize (primitives, Date, nested)
- old dehydrateWorkflowArguments → new workflow.deserialize
- new client.serialize → old hydrateWorkflowArguments
- new step.serialize + encryption → old hydrateStepArguments + decryption
- old dehydrateStepArguments + encryption → new step.deserialize + decryption

All 11 tests pass, confirming the new and old modules produce identical wire formats and can coexist during the migration.
Phase 1 of the VM snapshot runtime (RFC #1298).

World interface changes (packages/world):
- Add SnapshotMetadata type (lastEventId, createdAt) with zod schema
- Add snapshots sub-interface to Storage: save(), load(), delete()
- Export the new types and schema from @workflow/world

world-local implementation (packages/world-local):
- Filesystem-based snapshot storage in {dataDir}/snapshots/
- {runId}.bin for serialized VM snapshot data
- {runId}.json for metadata (lastEventId, createdAt)
- save() overwrites existing snapshots (atomic via ensureDir + write)
- load() returns null if no snapshot exists
- delete() removes both files
- Wired into createStorage() with tracing instrumentation
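The world-local contract above (paired `{runId}.bin` / `{runId}.json` files, null on missing, delete removes both) can be sketched roughly as below. This is a simplified stand-in for the real package — no tracing, no atomicity guarantees beyond mkdir-then-write, and the factory name is invented:

```typescript
import { promises as fs } from 'node:fs';
import * as path from 'node:path';

interface SnapshotMetadata {
  lastEventId: string;
  createdAt: string;
}

// Hypothetical filesystem snapshot store matching the described layout.
function createSnapshotStore(dir: string) {
  const bin = (runId: string) => path.join(dir, `${runId}.bin`);
  const meta = (runId: string) => path.join(dir, `${runId}.json`);
  return {
    async save(runId: string, data: Uint8Array, m: SnapshotMetadata) {
      await fs.mkdir(dir, { recursive: true }); // ensureDir
      await fs.writeFile(bin(runId), data);      // overwrites if present
      await fs.writeFile(meta(runId), JSON.stringify(m));
    },
    async load(runId: string) {
      try {
        const data = await fs.readFile(bin(runId));
        const metadata: SnapshotMetadata = JSON.parse(
          await fs.readFile(meta(runId), 'utf8'),
        );
        return { data, metadata };
      } catch {
        return null; // no snapshot exists yet
      }
    },
    async delete(runId: string) {
      await fs.rm(bin(runId), { force: true });
      await fs.rm(meta(runId), { force: true });
    },
  };
}
```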
Phase 2 of the VM snapshot runtime (RFC #1298).

- Add quickjs-wasi dependency to @workflow/core
- Create snapshot-runtime.ts with the basic structure:
  - runSnapshotWorkflow() entry point
  - Fresh VM creation with a deterministic WASI clock and seeded Math.random
  - Snapshot restore path (TODO: event processing)
  - Host function stubs for useStep, sleep, createHook via Symbol.for()
  - Interrupt handler (30s timeout)
  - Memory limit (64MB)
  - Snapshot serialization on suspension

The useStep, sleep, and createHook host functions are stubs with TODO markers — the basic VM lifecycle and snapshot/restore flow is in place.
Demonstrates the core snapshot/restore mechanism with a compiled workflow pattern:

- useStep implemented inside QuickJS as JS code (not host functions)
- Pending step resolve/reject functions stored on globalThis.__resolvers
- Step metadata (stepId, args) preserved across snapshot/restore
- Multi-step workflow: snapshot at each suspension, restore and resolve, workflow continues from the exact suspension point
- Both tests pass: simple workflow + metadata preservation
The snapshot runtime (runSnapshotWorkflow) now handles the complete workflow lifecycle:

- First run: bootstrap the VM with workflow primitives, evaluate the compiled workflow bundle, start the workflow function, process any existing events
- Snapshot: capture VM state when the workflow suspends on a step/sleep
- Restore: deserialize the snapshot, process delta events to resolve/reject pending promises, execute pending jobs
- Completion: detect the workflow result or error

Workflow primitives (useStep, sleep) are implemented as JavaScript code inside the QuickJS VM, not as host function callbacks. This keeps the implementation simple — the host communicates by evaluating small JS snippets to resolve/reject promises.

7 tests covering: simple completion, step suspension, snapshot/restore with step completion, multi-step across 3 snapshots, sleep suspension and wake, step failure with try/catch.
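The resolver-parking pattern described above can be shown in plain JS, independent of QuickJS. This is a sketch under stated assumptions — `__resolvers` comes from the commit messages, while `useStep`'s shape and `resolveStep` are illustrative stand-ins for the code the runtime evaluates inside the VM:

```typescript
// Pending resolve/reject pairs are parked on globalThis so the host can
// find them later by stepId after a snapshot restore.
type Resolvers = Record<
  string,
  { resolve: (v: unknown) => void; reject: (e: unknown) => void }
>;
const g = globalThis as any;
g.__resolvers = {} as Resolvers;

// Inside the VM: suspend by returning a promise whose settle functions
// are stashed for later. A real runtime would snapshot the VM heap here.
function useStep(stepId: string): Promise<unknown> {
  return new Promise((resolve, reject) => {
    g.__resolvers[stepId] = { resolve, reject };
  });
}

// What the host evaluates inside the VM once the step handler finishes:
// a small snippet that settles the parked promise and cleans up.
function resolveStep(stepId: string, value: unknown) {
  g.__resolvers[stepId].resolve(value);
  delete g.__resolvers[stepId];
}
```

Because the resolver functions live in the VM heap, they survive a snapshot/restore cycle along with the suspended async frames that await them.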
…napshot flag

- Add snapshot-entrypoint.ts that handles the full lifecycle: snapshot load → event fetching → runSnapshotWorkflow → result handling (create events, queue steps, save/delete snapshots)
- Add a feature flag: set WORKFLOW_RUNTIME=snapshot to use the new runtime
- When enabled, the snapshot path runs before the event-replay path
- Step queuing matches the existing step handler's expected payload format
- Wait handling includes timeout calculation for delayed re-queuing
- Extract the workflow ID from the SWC-compiled bundle's manifest comment
The snapshot runtime now successfully:

1. Evaluates the compiled workflow bundle in QuickJS
2. Suspends on the first step call
3. Snapshots the VM state
4. Creates step_created events and queues step execution

Web API stubs were added for TransformStream, ReadableStream, WritableStream, TextEncoder, TextDecoder, Headers, URL, and console — these are referenced by the compiled bundle but not needed for basic step/sleep workflows.

Remaining issue: step_created events use raw JSON for step input args, but the step handler expects devalue-serialized data. This is the data serialization boundary that needs to be resolved (RFC #1298 discusses moving devalue inside the QuickJS VM).
…untime

The step_created events now contain properly devalue-serialized input data (a Uint8Array with the 'devl' format prefix) instead of raw JSON. This makes the step handler's hydrateStepArguments() work correctly.

When processing step_completed events, the output is deserialized via workflow.deserialize() on the host side before being passed to the QuickJS VM as JSON. This handles the devalue format prefix correctly. Also properly serializes the run_completed output.
Step arguments are now wrapped in { args: [...], closureVars?: {...} }
before being serialized with workflow.serialize(), matching the format
expected by the step handler's hydrateStepArguments().
The step handler successfully:
- Receives the step message
- Deserializes the step arguments
- Executes the step function (add(10, 7))
- Handles retry on retryable errors
- Completes the step and re-queues the workflow
New files:

- serialization/base64.ts — pure-JS base64 encode/decode (no Buffer)
- serialization/reducers/common-vm.ts — VM-compatible reducers using instanceof Error instead of types.isNativeError(), and pure-JS base64 instead of Buffer
- serialization/codec-devalue-vm.ts — devalue codec using the VM reducers
- serialization/workflow-vm.ts — VM workflow serialize/deserialize

The VM serializer produces the EXACT same wire format as the Node.js serializer (devl-prefixed devalue data). Verified by 14 tests, including critical cross-compatibility:

- VM serialize → Node.js hydrateStepArguments (step handler path)
- Node.js dehydrateStepReturnValue → VM deserialize (step result path)
- Pure-JS base64 matches Node.js Buffer base64

Sub-path export: @workflow/core/serialization/workflow-vm
Re-export: workflow/internal/serialization now points to workflow-vm
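A Buffer-free base64 encoder is small enough to sketch in full. This is an illustrative stand-in for `serialization/base64.ts` (encode direction only), showing why no Node.js APIs are needed inside the VM:

```typescript
const B64 =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Pure-JS base64 encode: 3 input bytes -> 4 output chars, '=' padding.
function encodeBase64(bytes: Uint8Array): string {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const a = bytes[i];
    const b = bytes[i + 1];
    const c = bytes[i + 2];
    out += B64[a >> 2] + B64[((a & 3) << 4) | ((b ?? 0) >> 4)];
    out += b === undefined ? '=' : B64[((b & 15) << 2) | ((c ?? 0) >> 6)];
    out += c === undefined ? '=' : B64[c & 63];
  }
  return out;
}
```

The cross-compatibility claim above reduces to: for any byte array, this must produce the identical string to `Buffer.from(bytes).toString('base64')`.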
Data now flows as format-prefixed devalue bytes (devl + devalue.stringify)
across the VM boundary, with no JSON conversion in the middle:
Step args: VM __wdk_serialize({args}) → Uint8Array → event input
Step results: event output Uint8Array → VM __wdk_deserialize → value
Workflow result: VM __wdk_serialize(result) → Uint8Array → event output
Host functions __wdk_serialize/__wdk_deserialize are installed on
globalThis and use the VM-compatible workflow serializer (pure JS,
no Node.js deps). They are re-installed after snapshot restore since
host callbacks don't survive the snapshot.
VM-compatible serializer (workflow-vm.ts) produces the EXACT same
wire format as the Node.js serializer — verified by cross-compatibility
tests.
The serializer (devalue + reducers + TextEncoder/TextDecoder polyfills) is now bundled as a 16.6KB IIFE that's evaluated inside the QuickJS VM during bootstrap. The serialize/deserialize functions are real JS functions running inside the VM, operating on QuickJS-native values (Date, Map, Set, etc.) that can't cross the VM boundary via dump().

Architecture:
- vm-bundle-entry.ts is bundled by esbuild into a self-contained IIFE
- esbuild's inject option ensures the TextEncoder/TextDecoder polyfills run before any module-level code
- The host only passes opaque Uint8Array blobs (devl-prefixed devalue) across the VM boundary
- On snapshot restore, the serde functions survive in the QuickJS heap (no re-registration needed)

New files:
- polyfills/text-encoder.ts — pure-JS TextEncoder (from nx.js)
- polyfills/text-decoder.ts — pure-JS TextDecoder (from nx.js)
- polyfills/install-text-coding.ts — installs the polyfills on globalThis
- serialization/vm-bundle-entry.ts — esbuild entry for the VM serde bundle
- runtime/vm-serde-bundle.generated.ts — auto-generated bundle string
- scripts/build-vm-serde-bundle.js — build script (runs during pnpm build)

Removed: installSerdeHostFunctions (no longer needed — serde is in-VM)
…ecution

The snapshot metadata now stores eventsCursor (the pagination cursor from events.list()) instead of lastEventId (the raw event ID). The world-local pagination expects cursors in 'timestamp|id' format, not raw event IDs.

This fix enables the full workflow lifecycle:

1. First invocation: the QuickJS VM evaluates the workflow and suspends on step_0
2. Step handler executes add(10, 7) = 17
3. Second invocation: snapshot restored, step_0 resolved, suspends on step_1
4. Step handler executes add(17, 8) = 25
5. Third invocation: snapshot restored, both steps resolved, workflow completes
6. run_completed event created, snapshot cleaned up

Verified end-to-end with the nextjs-turbopack workbench:

- All events created correctly (run_created → run_completed)
- Step retries work (the add function throws on first attempt)
- Snapshots are saved/restored/deleted at the correct lifecycle points
- Run status transitions to 'completed'
🦋 Changeset detected — latest commit: 8349c88. The changes in this PR will be included in the next version bump. This PR includes changesets to release 20 packages.
🧪 E2E Test Results — ✅ All tests passed

- ✅ ▲ Vercel Production
- ✅ 💻 Local Development
- ✅ 📦 Local Production
- ✅ 🐘 Local Postgres
- ✅ 🪟 Windows
- ✅ 📋 Other
- Extract workflow arguments from the run_created event and pass them to the workflow function via __wdk_deserialize()
- Call executePendingJobs() after each step_completed/step_failed/wait_completed event to allow async-function await resumptions to unwind one step at a time
- Add debug logging for workflow result bytes

The addTenWorkflow e2e test is still failing: the workflow result bytes are 'devl-1' (devalue for undefined) even though all steps complete successfully. The issue appears to be that the async function's return value is not propagating through the SWC-compiled workflow bundle's promise chain. This needs investigation — the unit tests with simple inline workflow code work correctly.
Adds snapshot.* semantic conventions and threads the parent
`WORKFLOW {workflowName}` span into the snapshot entrypoint and VM
runner so operators can see snapshot-restore latency, snapshot size,
encrypt/decrypt overhead, and event-fetch behavior in their traces.
Attributes attached to the parent span:
- snapshot.runtime ('snapshot' | 'replay')
- snapshot.invocation_kind ('first' | 'restore')
- snapshot.outcome ('completed' | 'suspended' | 'failed')
- snapshot.events.preloaded, .fetched_count, .fetched_pages
- snapshot.pending_ops_count, .events_cursor
- snapshot.{load,save,delete,decrypt,encrypt,deserialize,serialize}.duration_ms
- snapshot.{load,save}.bytes, snapshot.save.plaintext_bytes
Two child spans:
- snapshot.load — wraps world.snapshots.load + decrypt (deserialize
duration is recorded as an attribute since it occurs inside the
VM runner where the load span is no longer in scope).
- snapshot.save — wraps QuickJS.serializeSnapshot + encrypt +
world.snapshots.save.
No metrics histograms — the codebase has no metric pipeline yet, so
this matches the existing attributes-on-spans convention used by the
replay runtime.
Previously the seedrandom seed for each VM invocation was `runId:workflowName:startedAt` — constant across all resumptions of a run. Each restore re-initialized the RNG from that same seed and replayed the first N draws, so the VM's `__generateUlid` and `__generateNanoid` produced identical IDs on every resumption. That collapsed the hasCreatedEvent dedup guard and caused step/hook correlation IDs to drift between invocations.

Mix `existingSnapshot.metadata.eventsCursor` into the seed when restoring. The cursor is stable for retries of the same resumption (idempotent within a single resume) but advances across resumes, which is exactly the determinism boundary we want.
…invocations

Two queue messages for the same workflow run can be processed concurrently by separate workflow handler instances. The replay runtime is naturally idempotent (full event-log replay produces deterministic correlationIds via the seeded PRNG), but the snapshot runtime previously used `ulid(Date.now())` for correlationIds — concurrent VMs hit it at slightly different milliseconds and produced different ULIDs even though the seeded-PRNG portion was identical. The world had no way to recognize these as duplicates, so a single logical step became two step_created events with two independent step handlers. For workflows like fibonacciWorkflow that do `Promise.all([runA.returnValue, runB.returnValue])`, this manifested as 4 step_created events for 2 logical operations, with 2 of the 4 `Run#returnValue` proxies hanging because nothing wrote their step_completed.

Inject a deterministic timestamp (`workflowRun.startedAt`, constant per run) into the VM as `__ulidTimestamp`. The bundle's `__generateUlid` reads it instead of `Date.now()` when present, so concurrent VMs produce identical ULIDs. Distinctness across resumptions still comes from the cursor mixed into the seedrandom seed, which advances the PRNG sequence between resumes.
Three unit tests covering:

- Same fresh start (no snapshot) → identical correlationIds across two concurrent invocations.
- Same restore (snapshot + same events) → identical correlationIds across two concurrent invocations.
- Different resume (cursor advanced) → distinct correlationIds across resumes (so EntityConflictError doesn't falsely dedup unrelated steps).

The first two tests fail against the pre-fix runtime (different ULID timestamp portions across concurrent invocations); the third was already passing pre-fix because the cursor-mixed seedrandom seed already produced distinct random portions across resumes.
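The determinism argument above can be demonstrated with a toy ULID generator. This is a sketch, not the real `__generateUlid`: `mulberry32` stands in for seedrandom, and the ULID layout (10 Crockford-base32 time chars + 16 random chars) is the standard ULID format. With a fixed timestamp and the same seed, two "concurrent invocations" emit identical IDs; a different seed (the cursor advancing between resumes) diverges:

```typescript
const CROCKFORD = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// Tiny seeded PRNG standing in for seedrandom (assumption for this sketch).
function mulberry32(seed: number): () => number {
  return () => {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// ULID = 10 chars of time (48 bits) + 16 chars drawn from the PRNG.
// Passing a fixed timestamp plays the role of __ulidTimestamp.
function ulid(timestamp: number, rand: () => number): string {
  let time = '';
  for (let i = 9; i >= 0; i--) {
    time += CROCKFORD[Math.floor(timestamp / 32 ** i) % 32];
  }
  let random = '';
  for (let i = 0; i < 16; i++) {
    random += CROCKFORD[Math.floor(rand() * 32)];
  }
  return time + random;
}
```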
…-local Concurrent invocations producing identical correlationIds (as the snapshot runtime does by design across replays) previously both succeeded and persisted duplicate events. step_created had no guard at all; wait_created used a TOCTOU read-then-check that allowed both writers through under concurrency. Both now claim a per-(runId, correlationId) constraint file with O_CREAT|O_EXCL before writing, so the loser surfaces as EntityConflictError — which the runtime's dedup catch path already handles.
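The atomic claim described above relies on a single kernel-level primitive: opening a file with O_CREAT|O_EXCL fails if the file already exists, so exactly one concurrent writer wins. A minimal sketch in Node (the function name and `EntityConflictError` stand-in are assumptions; Node's `'wx'` flag maps to O_CREAT|O_EXCL):

```typescript
import { promises as fs } from 'node:fs';
import * as path from 'node:path';

// Stand-in for the real error type the runtime's dedup path catches.
class EntityConflictError extends Error {}

// Claim a per-(runId, correlationId) constraint file before writing the
// event. The second concurrent claimer gets EEXIST and surfaces as a
// conflict instead of a duplicate event.
async function claimCorrelationId(
  dir: string,
  runId: string,
  correlationId: string,
): Promise<void> {
  const file = path.join(dir, `${runId}.${correlationId}.lock`);
  try {
    const handle = await fs.open(file, 'wx'); // O_CREAT|O_EXCL
    await handle.close();
  } catch (err: any) {
    if (err.code === 'EEXIST') throw new EntityConflictError(correlationId);
    throw err;
  }
}
```

Unlike a read-then-check (TOCTOU), the existence check and the creation are one atomic syscall, so no interleaving lets two writers both succeed.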
…in world-postgres Adds a unique partial index on workflow_events(run_id, correlation_id, type) filtered to step_created/hook_created/wait_created, and translates the resulting unique-violation (pg code 23505, surfaced via DrizzleQueryError.cause) into EntityConflictError. The steps table already deduped via onConflictDoNothing, but the event row still inserted, leaving duplicate events in the log. Now both rows are kept consistent and the runtime's existing dedup catch path handles concurrent writers cleanly.
Three coupled changes in the snapshot entrypoint's suspension handler:

1. Build per-pending-op promises and await them with Promise.all instead of running them in a sequential for-loop. Mirrors the replay runtime's suspension-handler.ts pattern.
2. Run snapshot.save concurrently with the op dispatch via the same Promise.all. The snapshot is an optimization — if the save lags or fails, the next workflow invocation simply replays from events. Previously this blocked step queueing on a full storage round-trip.
3. Drop the redundant hooks.list pre-check from the hook_created branch. With deterministic correlationIds (snapshot runtime PRNG fix) and per-(runId, correlationId) uniqueness in worlds (world-local + world-postgres dedup fixes), EntityConflictError on events.create is the correct dedup signal and the pre-check is an unnecessary round-trip per pending hook.

CI run 25095263499 measured snapshot ~2.37x slower than replay per test on Vercel (sum: 2418s vs 1021s); these changes should narrow that gap considerably on cloud worlds where each storage call is a network round-trip.
Hook-related e2e tests (hookWorkflow, hookCleanupTestWorkflow,
hookDisposeTestWorkflow, hookWithSleepWorkflow, distributedAbortController)
previously slept a fixed 5 seconds before calling getHookByToken to wait
for the hook to be registered. On slower runtimes — notably the snapshot
runtime on Vercel where each workflow round-trip is several seconds longer
than replay — that fixed budget is too tight and the test fails with
HookNotFoundError. On faster runtimes it's unnecessarily slow.
Adds a waitForHook(token, { timeoutMs, intervalMs, runId }) helper that
polls until the hook resolves or the timeout (default 30s) expires, with
an optional runId filter for token-reuse tests where eventually-consistent
backends may briefly still report a stale hook. Each hook-wait site now
uses this helper. Non-hook fixed sleeps (workflow-progress polling for
sleepingWorkflow cancel tests, payload-processing waits in
hookWithSleepWorkflow) are left unchanged.
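The polling helper described above can be sketched as follows. This is a simplified illustration — the real `waitForHook` takes a token and a runId filter and calls `getHookByToken`; here the lookup is abstracted into a caller-supplied function:

```typescript
// Poll until the lookup returns a non-null value or the deadline passes.
// Replaces a fixed sleep: fast runtimes return early, slow ones get the
// full timeout budget instead of a hard-coded 5 seconds.
async function waitFor<T>(
  lookup: () => Promise<T | null>,
  { timeoutMs = 30_000, intervalMs = 100 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await lookup();
    if (result !== null) return result;
    if (Date.now() >= deadline) {
      throw new Error(`timed out after ${timeoutMs}ms waiting for hook`);
    }
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}
```

In the token-reuse case, the real helper's runId filter makes the lookup return null for a stale hook reported by an eventually-consistent backend, so polling simply continues until the fresh registration appears.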
The recursion-hazard fixes that motivated the blast-radius cap have all
landed:
1. Snapshot runtime correlationIds are now deterministic across
concurrent VM invocations (commit 83bcec — `__ulidTimestamp`
injection so same-resumption invocations produce identical ULIDs).
2. The seeded PRNG state is preserved by the VM heap snapshot itself
(commit a71503 — events cursor mixed into seed; ULID
monotonicFactory closure persists in the QuickJS heap).
3. Per-(runId, correlationId) uniqueness is enforced atomically in
world-local (commit ca0078) and via unique partial index in
world-postgres (commit 009a00) for step_created / hook_created /
wait_created.
With those guarantees the duplicate `start()` invocation that previously
fanned out hundreds of thousands of child runs on the fastify deployment
is no longer possible. Restore the full Vercel project matrix
(11 frameworks) and unskip fibonacciWorkflow on Vercel.
…aces

Pipelining world.snapshots.save with the per-pending-op events.create + queueMessage dispatch (introduced in 22ab779) opened a window where a fast-completing step could re-invoke the workflow handler before the new snapshot was persisted. The handler then loads a stale (or missing) snapshot whose coroutine state doesn't match the latest events, leaving the workflow stuck. CI run 25098135190 caught this: fetchWorkflow on Vercel snapshot mode regressed from ~16s passing to a 60s timeout. Diagnostics showed both step_completed events landing at +5.5s but no run_completed ever firing.

Restore the original ordering: await snapshot.save fully before any step is queued. Per-pending-op dispatch within a single suspension still runs in parallel via Promise.all, which retains the bulk of the wall-clock reduction (run 25098135190 measured ~568s saved on Vercel snapshot vs. the pre-parallelize baseline). Only the cross-invocation pipelining of save with queue is rolled back.
Wedges on the Vercel snapshot runtime under concurrent matrix load are opaque from CI logs alone — the workflow handler runs inside a function on Vercel and its console output isn't surfaced in the CI job. This commit adds two pieces of diagnostic plumbing:

1. Always-on checkpoint logs at every major step of the snapshot suspension/restore lifecycle (`SNAPSHOT_DIAG`), plus matching entry/exit logs in the workflow and step queue handlers (`WORKFLOW_HANDLER_DIAG`, `STEP_HANDLER_DIAG`). Each record carries a per-invocation id, runId, elapsed time, and structured fields (snapshot bytes, events fetched + counts by type, pending-op summary, outcome, exit action). Emitted at `warn` level so they show up in Vercel function logs without DEBUG=1.
2. An e2e diagnostic-harness extension that fetches matching function logs from `/v3/deployments/:id/events` for the wedged runId after a test failure and appends them to the existing run-diagnostic block. Only runs when `WORKFLOW_VERCEL_AUTH_TOKEN` / `WORKFLOW_VERCEL_TEAM` / `VERCEL_DEPLOYMENT_ID` are set (i.e. the Vercel-prod CI matrix); silently no-ops elsewhere.

Together these let a failed test surface the function-side activity for its wedged run — e.g. whether the snapshot runtime even reached its post-VM checkpoint, what its last successful save/queue operation was, whether the next handler invocation ever started, etc. That visibility is what we need to actually find the wedge cause.
…reserve Buffer body across retries
Wedge root cause for snapshot runtime on Vercel under concurrent matrix
load. The old save() in world-vercel/src/snapshots.ts used:
fetch(url, { method: 'PUT', body: compressed, dispatcher: getDispatcher() })
where getDispatcher() returns a RetryAgent. fetch() wraps Buffer/Uint8Array
bodies in a one-shot ReadableStream (web fetch spec), so when the
RetryAgent retries on a transient 5xx or network error, the second
attempt has nothing left to read — the iterable yields 0 bytes, undici
detects the mismatch with Content-Length, and throws
UND_ERR_REQ_CONTENT_LENGTH_MISMATCH. With 5–15 MB snapshot bodies the
bug fires under any meaningful network turbulence.
The downstream impact is a permanent wedge:
1. Save throws -> workflow handler returns 500.
2. Queue retries the handler with backoff.
3. Each retry repeats the same save -> same throw -> same 500.
4. Production logs showed attempt: 19 (≈1.5 hours of retries)
before the test framework gave up at the 60s test timeout.
Switch to undici.request() (the lower-level API), which hands the Buffer
to the connection layer directly without stream wrapping, so retries
can replay the same body. Verified locally with a vitest regression
test that reproduces the exact production stack trace
(AsyncWriter.end -> writeIterable -> UND_ERR_REQ_CONTENT_LENGTH_MISMATCH)
without the fix and passes with it.
Other world-vercel endpoints (events, hooks, runs, …) hit the same
underlying undici limitation but in practice rarely fail this way: their
bodies are tiny (KB CBOR-encoded payloads), so the chance of network
turbulence mid-stream is much lower. They remain on fetch() for now.
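The failure mechanism above can be demonstrated in isolation with web streams, no network required: a fixed byte body wrapped in a `ReadableStream` is one-shot, so a transport-level retry that re-reads the body sees zero bytes the second time — exactly the Content-Length mismatch undici detects. A sketch (helper names are illustrative):

```typescript
// Wrap a byte buffer in a one-shot ReadableStream, as fetch() does for
// Buffer/Uint8Array bodies per the web fetch spec.
function streamFrom(bytes: Uint8Array): ReadableStream<Uint8Array> {
  return new ReadableStream({
    start(controller) {
      controller.enqueue(bytes);
      controller.close();
    },
  });
}

// Count the bytes a consumer (the "request attempt") can read.
async function drain(stream: ReadableStream<Uint8Array>): Promise<number> {
  const reader = stream.getReader();
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    total += value!.length;
  }
  reader.releaseLock();
  return total;
}
```

The first drain sees the full body; a second drain of the same stream yields nothing. `undici.request()` avoids this by handing the raw buffer to the connection layer, which can replay it on retry.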
Avoid a guaranteed-404 round-trip to the snapshot storage backend on
the very first workflow handler invocation. The suspension handler in
this file always saves the snapshot BEFORE creating any
step_created / hook_created / wait_created events, so if the events
preloaded by events.create('run_started') contain only run_created /
run_started, no save cycle has run yet and no snapshot can exist.
Detected by the new exported `canSkipSnapshotLoad(preloadedEvents)`
helper, with 8 unit tests covering each event-type combination
(undefined / empty / run_created+run_started / run_started only /
step_* / hook_received / wait_completed). When the helper returns true,
`existingSnapshot` is set to null without calling
`world.snapshots.load()` and the entrypoint falls through to the
first-run path with the preloaded events.
The wfdiag('snapshot_loaded') checkpoint now also reports
`skippedLoad: true` when the fast path was taken so we can confirm
the optimization is firing in production logs.
Reduces 404 noise on workflow-server's `/v2/runs/:runId/snapshot`
endpoint and saves a network round-trip on every initial workflow
invocation. Falls back to the normal load path whenever
`preloadedEvents` is missing or contains any non-initial event.
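The helper's logic, as described, reduces to a small pure function. This is a sketch: the event shape is assumed to carry a string `type`, and the conservative treatment of an empty array (fall back to a real load) is an assumption where the commit message doesn't pin down the behavior:

```typescript
// Events that can exist before the suspension handler has ever saved a
// snapshot. If these are the only preloaded events, no snapshot can exist.
const INITIAL_EVENT_TYPES = new Set(['run_created', 'run_started']);

function canSkipSnapshotLoad(
  preloadedEvents?: { type: string }[],
): boolean {
  // Missing or empty: can't prove anything — take the normal load path.
  if (!preloadedEvents || preloadedEvents.length === 0) return false;
  return preloadedEvents.every((e) => INITIAL_EVENT_TYPES.has(e.type));
}
```

When it returns true the entrypoint sets `existingSnapshot` to null without the guaranteed-404 round-trip; any non-initial event (a `step_completed`, `hook_received`, etc.) forces the real load.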
…ming breakdown
Two changes that go together:
1. New `stripInlineSourceMap()` helper in `source-map.ts` (with 4 unit
tests). The runtime entrypoint now strips the trailing
`//# sourceMappingURL=data:…` comment from the workflow bundle
before passing it to `vm.evalCode()`. The original (unstripped)
string is kept in the host-side scope so `remapErrorStack` can
still resolve original source positions on workflow failures.
The map is purely host-side metadata for stack-trace remapping —
the VM never reads it. But QuickJS retains source text for
stack-trace line lookups, so the multi-MB base64 comment was being
carried into the VM heap and showing up in every snapshot save+load
round-trip. Empirically, on the example workbench's bundle:
- Bundle string drops 5.16 MB → 1.20 MB (-77%)
- QuickJS heap snapshot drops 11.75 MB → 8.00 MB (-32%)
That maps to ~1s saved per step round-trip on Vercel.
2. Extend the `SNAPSHOT_DIAG snapshot_loaded` and
`SNAPSHOT_DIAG snapshot_saved` checkpoint logs with per-stage byte
counts and timings:
- load: returnedBytes (post-decompress, pre-decrypt),
loadDurationMs (HTTP round-trip), decryptDurationMs
- save: plaintextBytes (raw QuickJS output),
handedToWorldBytes (after host-side encrypt),
encryptDurationMs, storeDurationMs
So the savings show up in CI-fetched function logs alongside the
existing OTel attributes. Naming clarified: 'returnedBytes' /
'handedToWorldBytes' instead of misleading 'wireBytes', because
the world (e.g. world-vercel) applies its own gzip layer below
this — true on-the-wire bytes are emitted by world-vercel's own
diagnostic (separate commit).
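The stripping helper from point 1 above is essentially a single trailing-comment regex. A sketch (the real `stripInlineSourceMap` in `source-map.ts` may handle more cases, e.g. `//@` syntax or non-data URLs):

```typescript
// Remove a trailing inline source-map comment from a bundle string so the
// multi-MB base64 payload never enters the VM heap. The caller keeps the
// original string host-side for stack-trace remapping.
function stripInlineSourceMap(code: string): string {
  return code.replace(/\n?\/\/# sourceMappingURL=data:[^\n]*\s*$/, '');
}
```

Code without a source-map comment passes through unchanged, so the helper is safe to apply unconditionally before `vm.evalCode()`.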
Adds `WORLD_SNAPSHOT_DIAG` checkpoint logs to the snapshot save and load paths.

Save reports inputBytes (what core handed in) → wireBytes (after gzipSync) → compressionRatio, plus separate gzipDurationMs and putDurationMs. Load reports the equivalents: wireBytes (raw HTTP body) → decompressedBytes (after gunzipSync), plus getDurationMs and gunzipDurationMs.

Pairs with the core `SNAPSHOT_DIAG` checkpoints from the previous commit so the entire snapshot lifecycle for any wedged run is grep-able by runId in Vercel function logs. Also covers the 404 (no-snapshot) case so a core `skippedLoad: true` checkpoint can be cross-referenced against the world's view: when both line up, the optimization is firing as intended; when only one side fires, something's off.

All emitted at `console.warn` level — no DEBUG required, matching the format and style of the core wfdiag helper.
…able
The snapshot save path was doing the wrong thing: each world (vercel,
postgres, local) gzipped the bytes BEFORE handing them to its
transport, but core's encryption wrapped them AFTER. Net result was
`gzip(encrypt(plain))` on the wire — encryption produces ciphertext
that doesn't compress, so the gzip step was largely wasted CPU.
Flip the order so compression goes BEFORE encryption (the standard
compress-then-encrypt pattern used for at-rest blob encryption — no
CRIME/BREACH applicability here since the snapshot is opaque, no
attacker injection, no per-request size leakage). Move compression
into core so it happens once, in the right place, and so the world
layers can be simplified to opaque-bytes transport.
Codec choice: zstd when available (Node 22.15+), gzip otherwise.
Benchmarked against an 8 MB QuickJS heap snapshot (representative
production payload):
| codec | ratio | compress | decompress |
|--------|-------|----------|------------|
| zstd-3 | 4.29x | 18 ms | 6 ms |
| gzip-6 | 4.02x | 127 ms | 11 ms |
zstd is faster AND smaller. The format prefix on each blob (`zstd`
or `gzip`) marks the codec, so deployments running different Node
versions remain interoperable.
Pipeline now:
- SAVE: serialize → compress → encrypt → world.snapshots.save
- LOAD: world.snapshots.load → decrypt → decompress → deserialize
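A minimal sketch of the pipeline's compression half, assuming a 4-byte ASCII codec prefix and `node:zlib` feature detection. The function names are illustrative; the real implementation lives in `serialization/compression.ts`:

```typescript
import * as zlib from 'node:zlib';

// Feature-detect zstd (node:zlib gained zstdCompressSync in Node 22.15+),
// falling back to gzip — the PREFERRED_CODEC idea described above.
const zstdAvailable = typeof (zlib as any).zstdCompressSync === 'function';

// A short ASCII codec prefix on every blob keeps mixed-Node deployments
// interoperable: the reader dispatches on the prefix, never on its own codec.
function compressSnapshot(bytes: Buffer): Buffer {
  if (zstdAvailable) {
    return Buffer.concat([Buffer.from('zstd'), (zlib as any).zstdCompressSync(bytes)]);
  }
  return Buffer.concat([Buffer.from('gzip'), zlib.gzipSync(bytes)]);
}

function decompressSnapshot(blob: Buffer): Buffer {
  const prefix = blob.subarray(0, 4).toString('ascii');
  const body = blob.subarray(4);
  if (prefix === 'zstd') return (zlib as any).zstdDecompressSync(body);
  if (prefix === 'gzip') return zlib.gunzipSync(body);
  // Unprefixed blob: treat as an uncompressed passthrough.
  return blob;
}

const snapshot = Buffer.from('q'.repeat(8192)); // stand-in for VM heap bytes
const wire = compressSnapshot(snapshot);
console.log(decompressSnapshot(wire).equals(snapshot)); // true
console.log(wire.length < snapshot.length); // true
```

Because the writer picks whichever codec its Node version supports and the reader dispatches on the prefix, a zstd-writing deployment and a gzip-only one can share the same snapshot store.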
`@workflow/core`:
* New `serialization/compression.ts` with `compress` /
`decompress` / `isCompressed` / `PREFERRED_CODEC`. 11 unit
tests covering codec selection, idempotency, format-prefix
dispatch, legacy-blob passthrough.
* New SerializationFormat constants `GZIP` / `ZSTD`.
* `runtime/snapshot-entrypoint.ts` save path: compress → encrypt
→ store. Load path: decrypt → decompress. New byte-count and
timing fields on `SNAPSHOT_DIAG snapshot_saved` /
`snapshot_loaded` (compressedBytes, compressionRatio,
compressionCodec, compressDurationMs, decompressDurationMs).
* 7 new tests in `runtime/snapshot-encryption.test.ts` covering
the full pipeline round-trip with and without encryption, plus
legacy-blob backward compatibility.
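For illustration, an AES-256-GCM round trip over already-compressed bytes might look like the following. The `iv + authTag + ciphertext` layout and the function names are assumptions for this sketch, not the actual `snapshot-entrypoint.ts` wire format:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Encrypt AFTER compression: ciphertext doesn't compress, so this ordering
// is what makes the compression step worthwhile.
function encryptSnapshot(plain: Buffer, key: Buffer): Buffer {
  const iv = randomBytes(12); // GCM-recommended 96-bit nonce
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plain), cipher.final()]);
  // Assumed layout: 12-byte iv, 16-byte auth tag, then ciphertext.
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

function decryptSnapshot(blob: Buffer, key: Buffer): Buffer {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const ciphertext = blob.subarray(28);
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag); // GCM authenticates as well as decrypts
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

const key = randomBytes(32);
const compressed = Buffer.from('compressed snapshot bytes');
const roundTripped = decryptSnapshot(encryptSnapshot(compressed, key), key);
console.log(roundTripped.equals(compressed)); // true
```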
`@workflow/world-vercel`:
* Drop `gzipSync` from save. Body is sent verbatim (already
compressed+encrypted by core upstream).
* Drop the `X-Snapshot-Content-Encoding: gzip` header on save.
* Load still gunzips when the response carries that header — for
backward compatibility with blobs written by older deployments.
`@workflow/world-postgres`:
* Drop `gzipSync` / `gunzipSync`. Stores opaque bytes.
Snapshots table is created per CI run; no migration concern.
`@workflow/world-local`:
* Save as `{runId}.bin` (was `.bin.gz`). Load still gunzips
legacy `.bin.gz` files via the `dataFile` metadata so a
developer's stale `.workflow-data/` directory keeps working.
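A sketch of the `{runId}.bin` + `{runId}.json` layout; `dataPath()` / `metadataPath()` echo the helper names mentioned in a later commit, but the bodies here are assumptions, not the real `snapshots-storage.ts`:

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Stand-in for the .workflow-data/ directory.
const dir = mkdtempSync(join(tmpdir(), 'wf-snap-'));
const dataPath = (runId: string) => join(dir, `${runId}.bin`);
const metadataPath = (runId: string) => join(dir, `${runId}.json`);

// Opaque bytes in {runId}.bin; eventsCursor + createdAt in {runId}.json.
function save(runId: string, bytes: Buffer, eventsCursor: number): void {
  writeFileSync(dataPath(runId), bytes);
  writeFileSync(
    metadataPath(runId),
    JSON.stringify({ eventsCursor, createdAt: new Date().toISOString() })
  );
}

function load(runId: string): { bytes: Buffer; eventsCursor: number } {
  const meta = JSON.parse(readFileSync(metadataPath(runId), 'utf8'));
  return { bytes: readFileSync(dataPath(runId)), eventsCursor: meta.eventsCursor };
}

save('run_123', Buffer.from([1, 2, 3]), 42);
console.log(load('run_123').eventsCursor); // 42
```

The bytes file stays opaque to this layer: compression and encryption already happened in core, so the world only shuttles buffers.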
The compress-then-encrypt pipeline that landed in 519bb1d added backward-compatibility code to read older snapshot blobs that were written under the previous SDK-side gzip scheme. The snapshot runtime is still on the snapshot-runtime feature branch and has no production deploy, so no such blob has ever been written under the old scheme that needs to outlive a feature-branch deploy.

world-vercel:
- Remove the X-Snapshot-Content-Encoding: gzip header round-trip on save and load.
- Drop the gunzipSync import.
- File header comment no longer mentions back-compat.

world-local:
- Drop the .bin.gz / dataFile metadata mechanism. Snapshots are now always stored as {runId}.bin alongside {runId}.json.
- Drop the gunzipSync import and the LocalSnapshotMetadataSchema extension; metadata is just SnapshotMetadataSchema (eventsCursor + createdAt).
- File-naming helpers extracted as dataPath() / metadataPath().

core: remove the now-irrelevant 'legacy snapshots saved before compression was added' test from snapshot-encryption.test.ts. The remaining 'plaintext bytes pass through unchanged' test still exercises the contract that decryptSerializedData() does not require prefixed input — that's a real pre-existing API contract used by non-snapshot callers, not snapshot back-compat.
Replaces 14 incremental per-commit changesets with 4 terse, package-scoped ones (one each for @workflow/core, world-vercel, world-postgres, world-local). The detailed per-change context is preserved in git history; CHANGELOG entries from changesets should describe what consumers need to know, not the implementation history.
This changeset is part of the serialization-refactor base branch (introduced in 6add40c) and was incorrectly deleted in the previous consolidation pass. Only changesets local to the snapshot-runtime branch should have been consolidated.
The file is regenerated on every build (`scripts/build-vm-serde-bundle.js`) and is already listed under turbo.json's outputs for caching. Tracking it just produced noisy diffs whenever someone built the package with a slightly different esbuild version.
…isites
Standardize on `Symbol.for('workflow-serialize')` /
`Symbol.for('workflow-deserialize')` everywhere — the parallel
`globalThis.__wdk_serialize` / `__wdk_deserialize` aliases have been
removed from `vm-bundle-entry.ts` and the snapshot runtime's inline
JS strings now use the symbol form directly. Single canonical name,
no duplication.
Drop the `?? Math.random` and `?? Date.now()` fallbacks from the
ULID generator setup. Both prerequisites
(`globalThis.__ulidTimestamp` and the host-replaced seeded
`Math.random`) are always set by `snapshot-runtime.ts` before the
serde bundle is evaluated; silently falling back to unseeded
`Math.random` or live `Date.now()` would re-introduce the
non-determinism we deliberately fixed (concurrent VM invocations of
the same resumption must produce identical correlationIds for the
world's EntityConflictError dedup to work). Now throws if
`__ulidTimestamp` isn't a number, and passes the seeded
`Math.random` reference explicitly to `monotonicFactory` so
upstream's `detectPRNG` never runs (it'd throw in QuickJS anyway,
since `crypto` is unavailable).
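The determinism requirement above can be sketched as follows: two concurrent VM invocations of the same resumption, given the same seed and the same `__ulidTimestamp`, must emit identical correlationIds. Everything here is illustrative (`makeSeededRandom`, `makeCorrelationIdFactory`, and the simplified non-monotonic random part are assumptions, not the real `vm-bundle-entry.ts` code):

```typescript
// mulberry32: a tiny deterministic PRNG, standing in for the host-seeded
// Math.random that snapshot-runtime.ts installs before the bundle runs.
function makeSeededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const CROCKFORD = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';

// ULID-shaped ids: 10-char base32 time part + 16-char random part.
function makeCorrelationIdFactory(timestamp: unknown, random: () => number) {
  if (typeof timestamp !== 'number') {
    // Mirrors the new behavior: fail loudly rather than silently falling
    // back to Date.now(), which would reintroduce non-determinism.
    throw new Error('__ulidTimestamp must be a number before the serde bundle runs');
  }
  const fixedTime: number = timestamp;
  return () => {
    let time = '';
    let t = fixedTime;
    for (let i = 0; i < 10; i++) {
      time = CROCKFORD[t % 32] + time;
      t = Math.floor(t / 32);
    }
    let rand = '';
    for (let i = 0; i < 16; i++) {
      rand += CROCKFORD[Math.floor(random() * 32)];
    }
    return time + rand;
  };
}

// Two "invocations" with the same seed + timestamp agree on every id, which
// is what lets the world's EntityConflictError dedup collapse duplicates.
const a = makeCorrelationIdFactory(1700000000000, makeSeededRandom(42));
const b = makeCorrelationIdFactory(1700000000000, makeSeededRandom(42));
console.log(a() === b() && a() === b()); // true
```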
Drop the `URL` / `URLSearchParams` / `DOMException` availability
guards in `common-vm.ts`. quickjs-wasi's URL extension is always
loaded (`url.so`) and DOMException is always constructible — the
guards were dead code carried over from when those weren't reliably
available. The reducer/reviver code is now straightforward
`instanceof URL` / `new URL(...)` / `new DOMException(...)`.
Remove `packages/core/src/serialization/base64.ts` and its
sub-path exports (`./serialization/workflow`,
`./serialization/workflow-vm`). The pure-JS base64 helpers were
leftover from before `base64.so` shipped `btoa`/`atob` natively;
the VM-side reducers in `common-vm.ts` now build base64 strings via
the native ones. The sub-path exports had zero consumers in this
repo (the same cleanup landed on the `serialization-refactor`
branch in 05e0fee but never made it onto `snapshot-runtime`
because the branches diverged earlier).
Remove `packages/workflow/src/internal/serialization.ts` and its
`./internal/serialization` package.json export. Same story — zero
consumers, previously removed in #1082, then accidentally
reintroduced via `f04fd8e91`.
The `/v3/deployments/:id/events` endpoint mostly returned empty results in our wedge-debugging usage and the runId-substring filter made it slow when it did return data. The function-log fetch belongs in a dedicated diagnostic CLI command rather than baked into the test diagnostic block. Dropping for now; can be revived in a follow-up PR if needed.
Updates the per-package changesets to match AGENTS.md guidance and the
current state of the PR:
- Bump from `patch` to `minor` (snapshot runtime is a new feature, not
a bug fix; correctness matters when the changesets land on `stable`)
- Correct snapshot-runtime-core.md: snapshot is now the default, with
replay available via `WORKFLOW_RUNTIME=replay` (was incorrectly
describing snapshot as opt-in)
- Drop the misleading 'enforces uniqueness' line from
snapshot-runtime-world-vercel.md (no uniqueness work happens in this
package; that lives in workflow-server)
- Tighten language across all four changesets per AGENTS.md
('Keep the changesets terse')
…stack regression

Per CI history (runs 25100278265 vs 25130930859), the regression boundary for the 'basic step error preserves message and stack trace' / 'cross-file step error preserves message and function names in stack' e2e tests on astro local-dev is commit 770c433 ('Add CI-visible runtime diagnostics for snapshot wedges'), NOT the later 9168353 source-map-strip commit. The astro-dev failure reproduces on both replay and snapshot runtimes with identical symptoms (function name shows up as `__getOwnPropDesc` instead of the actual step function name in the source-mapped stack), which rules out any snapshot-runtime specific cause.

The STEP_HANDLER_DIAG entries were always-on `runtimeLogger.warn` calls inside the step queue handler. They didn't add real diagnostic value beyond what the existing OTel spans already cover; their main purpose was to grep-correlate step activity with SNAPSHOT_DIAG checkpoints in Vercel function logs during the wedge-debugging session that's now resolved. SNAPSHOT_DIAG and WORKFLOW_HANDLER_DIAG are kept; only the STEP_HANDLER_DIAG pair is removed.

The exact mechanism by which the diagnostic warns affect the `stepFn.apply()` stack frame's source-mapped function name is still unclear (the most plausible explanation is that the line-shift in step-handler.ts perturbed Vite's dev-mode module graph in a way that changes which export getter wraps the step function reference at the `__copyProps` site shared with the namespace import in `_workflows.ts`). Reverting the diagnostic is sufficient to restore the test, and the diagnostic itself is not load-bearing.
Pull request overview
Implements the new default snapshot-based workflow runtime (QuickJS WASM VM with snapshot/restore) and wires snapshot persistence into world backends, while keeping the existing event-replay runtime as an opt-out via WORKFLOW_RUNTIME=replay.
Changes:
- Add snapshot runtime execution path in `@workflow/core` (VM bootstrap, snapshot save/load pipeline with compression + optional encryption, runtime-mode dispatch, and new telemetry attributes).
- Introduce `snapshots.save/load/delete` to the `@workflow/world` storage interface and implement it for `world-vercel`, `world-postgres`, and `world-local`.
- Expand CI/E2E coverage to run tests against both runtimes and reduce E2E flakiness by polling for hook registration instead of fixed sleeps.
Reviewed changes
Copilot reviewed 53 out of 54 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| scripts/create-test-matrix.mjs | Duplicates app matrix across snapshot and replay runtime axes. |
| pnpm-lock.yaml | Adds quickjs-wasi@2.0.0 lock entries. |
| packages/world/src/snapshots.ts | Adds SnapshotMetadataSchema (eventsCursor, createdAt). |
| packages/world/src/interfaces.ts | Extends Storage with snapshots.save/load/delete. |
| packages/world/src/index.ts | Exposes snapshot types/schema from @workflow/world. |
| packages/world-vercel/src/storage.ts | Wires snapshots into Vercel storage and instrumentation. |
| packages/world-vercel/src/snapshots.ts | Implements snapshot storage via workflow-server snapshot endpoints. |
| packages/world-vercel/src/snapshots.test.ts | Adds tests for PUT body correctness and retry behavior. |
| packages/world-postgres/test/storage.test.ts | Adds tests asserting dedup behavior for entity-creation races. |
| packages/world-postgres/src/storage.ts | Maps pg unique-violation for entity-creating events to EntityConflictError. |
| packages/world-postgres/src/snapshots.ts | Implements Postgres snapshot upsert/load/delete storage. |
| packages/world-postgres/src/index.ts | Wires snapshots storage into Postgres createStorage. |
| packages/world-postgres/src/drizzle/schema.ts | Adds snapshots table + entity-creation partial unique index. |
| packages/world-postgres/src/drizzle/migrations/meta/_journal.json | Registers new migrations in drizzle journal. |
| packages/world-postgres/src/drizzle/migrations/0010_add_snapshots_table.sql | Creates workflow.workflow_snapshots table. |
| packages/world-postgres/src/drizzle/migrations/0011_add_events_entity_creation_unique_index.sql | Adds partial unique index for step/hook/wait creation events. |
| packages/world-local/src/storage/snapshots-storage.ts | Adds filesystem-backed snapshot storage (bytes + metadata files). |
| packages/world-local/src/storage/index.ts | Wires snapshots storage into local storage and instrumentation. |
| packages/world-local/src/storage/events-storage.ts | Adds atomic lock-file dedup for step_created and wait_created. |
| packages/world-local/src/storage.test.ts | Adds race tests for local step/wait creation dedup behavior. |
| packages/world-local/src/queue.ts | Logs queue handler errors with stack for debugging. |
| packages/core/turbo.json | Adds generated VM bundle/assets files to build outputs. |
| packages/core/src/telemetry/semantic-conventions.ts | Adds snapshot runtime semantic convention attributes. |
| packages/core/src/source-map.ts | Adds stripInlineSourceMap() to reduce VM heap/snapshot size. |
| packages/core/src/source-map.test.ts | Tests stripInlineSourceMap() behavior. |
| packages/core/src/serialization/workflow-vm.ts | Adds VM-safe workflow-mode serializer/deserializer. |
| packages/core/src/serialization/workflow-vm.test.ts | Tests VM serializer and VM↔Node compatibility. |
| packages/core/src/serialization/vm-bundle-entry.ts | VM bundle entry: installs serde + deterministic ULID generator. |
| packages/core/src/serialization/types.ts | Adds compression format prefixes (gzip, zstd). |
| packages/core/src/serialization/reducers/common-vm.ts | Adds VM-safe reducers/revivers (base64 via btoa/atob). |
| packages/core/src/serialization/compression.ts | Adds compress/decompress layer with gzip/zstd feature detection. |
| packages/core/src/serialization/compression.test.ts | Tests compression layer behavior and codec selection. |
| packages/core/src/serialization/compat.test.ts | Adds compatibility tests between modular and legacy serialization APIs. |
| packages/core/src/serialization/codec-devalue.ts | Adds clarifying notes about modular modules vs legacy runtime path. |
| packages/core/src/serialization/codec-devalue-vm.ts | Adds VM-compatible devalue codec using VM reducers/revivers. |
| packages/core/src/runtime/start.ts | Propagates WORKFLOW_RUNTIME choice into executionContext. |
| packages/core/src/runtime/snapshot-runtime.ts | Implements QuickJS snapshot/restore runtime engine. |
| packages/core/src/runtime/snapshot-runtime.test.ts | Unit tests for snapshot runtime behavior and determinism. |
| packages/core/src/runtime/snapshot-entrypoint.ts | Integrates snapshot runtime into devkit entrypoint + storage pipeline. |
| packages/core/src/runtime/snapshot-entrypoint.test.ts | Tests snapshot-load skip heuristic. |
| packages/core/src/runtime/snapshot-encryption.test.ts | Tests compress→encrypt→decrypt→decompress contract. |
| packages/core/src/runtime/runtime-mode.ts | Adds WORKFLOW_RUNTIME parsing/validation. |
| packages/core/src/runtime/runtime-mode.test.ts | Tests runtime-mode env parsing. |
| packages/core/src/runtime.ts | Switches default runtime to snapshot with replay fallback. |
| packages/core/scripts/build-vm-serde-bundle.js | Generates VM serde bundle source used by snapshot runtime. |
| packages/core/scripts/build-quickjs-assets.js | Generates embedded quickjs-wasi wasm/extension assets. |
| packages/core/package.json | Adds quickjs-wasi dependency and generators to build script. |
| packages/core/e2e/e2e.test.ts | Replaces fixed hook sleeps with polling helper to reduce flakiness. |
| packages/core/.gitignore | Ignores generated VM bundle/assets files. |
| .github/workflows/tests.yml | Expands CI matrix across runtimes and avoids ARG_MAX in sticky comment. |
| .changeset/snapshot-runtime-world-vercel.md | Changeset for world-vercel snapshot storage + undici.request rationale. |
| .changeset/snapshot-runtime-world-postgres.md | Changeset for world-postgres snapshots + event uniqueness fix. |
| .changeset/snapshot-runtime-world-local.md | Changeset for world-local snapshots + event dedup fix. |
| .changeset/snapshot-runtime-core.md | Changeset for core snapshot runtime default + replay opt-out. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
| "scripts": { | ||
| "build": "genversion --es6 src/version.ts && tsc", | ||
| "build": "genversion --es6 src/version.ts && node scripts/build-vm-serde-bundle.js && node scripts/build-quickjs-assets.js && tsc", | ||
| "dev": "genversion --es6 src/version.ts && tsc --watch", | ||
| "clean": "tsc --build --clean && rm -rf dist src/version.ts docs ||:", |
```ts
 * The binary data is stored gzip-compressed in the `data` column.
 * Metadata (`eventsCursor`, `createdAt`) lives alongside for cheap loads.
 */
```
```ts
const escapedCid = cid.replace(/"/g, '\\"');
const eventData =
```
```ts
function arrayBufferToBase64(
  value: ArrayBufferLike,
  offset: number,
  length: number
): string {
  if (length === 0) return '.';
  // btoa requires a binary string. Build it from the byte view.
  const uint8 = new Uint8Array(value, offset, length);
  let binary = '';
  for (let i = 0; i < uint8.length; i++) {
    binary += String.fromCharCode(uint8[i]!);
  }
  return btoa(binary);
}
```
Summary
Implements the snapshot-based workflow runtime described in RFC #1298. Instead of replaying the full event log on every workflow handler invocation, workflows run inside a QuickJS WASM VM that is snapshotted at suspension points and restored on resumption — so each invocation only fetches and processes events that arrived since the last save.
The snapshot runtime is the default in this PR. The previous event-replay runtime remains available as an opt-out via `WORKFLOW_RUNTIME=replay` or `executionContext.workflowRuntime: 'replay'`.

How it works

- On suspension, the VM snapshot bytes go through the `compress → encrypt` pipeline (zstd on Node 22.15+, gzip fallback; AES-256-GCM when an encryption key is configured) and are persisted via `world.snapshots.save`.
- On resumption, `world.snapshots.load` returns the bytes, the inverse `decrypt → decompress` pipeline restores them, and `vm.restore()` resumes the VM at the exact suspension point.
- The restored workflow then fetches only the events that arrived after `eventsCursor`, processes them, and either resolves to a result, suspends on a new pending op, or fails.

Most of the snapshot-runtime work lives in `@workflow/core` (`runtime/snapshot-runtime.ts`, `runtime/snapshot-entrypoint.ts`, `serialization/compression.ts`, `serialization/vm-bundle-entry.ts`); each world implements `snapshots.save/load/delete` for its storage backend.

Scope of this PR

- `@workflow/core`: snapshot runtime, VM bootstrap, event-cursor-driven resume, deterministic correlationIds (seeded ULIDs across concurrent VM invocations of the same resumption), encryption and compression pipeline, `WORKFLOW_RUNTIME` env-var dispatch with replay-runtime fallback, OTel spans/attributes for the snapshot lifecycle, CI-visible diagnostic checkpoints (`SNAPSHOT_DIAG`).
- `@workflow/world`: new `Snapshots` interface (save/load/delete) and metadata schema.
- `@workflow/world-vercel`: workflow-server snapshot endpoints (`PUT/GET/DELETE /v2/runs/:runId/snapshot`), opaque-bytes transport, switch to `undici.request()` for retry-with-Buffer-body correctness, atomic per-(run, correlation) uniqueness for entity-creating events.
- `@workflow/world-postgres`: new `workflow_snapshots` table, unique partial index on `workflow_events` `(run_id, correlation_id, type)` for entity-creating events.
- `@workflow/world-local`: filesystem-backed snapshot storage (`{runId}.bin` + `{runId}.json`), atomic correlationId uniqueness for `step_created` / `wait_created`.
- CI: the test matrix is duplicated across `[snapshot, replay]`, with full Vercel-prod E2E coverage of the snapshot runtime across 11 frameworks.

Custom serializers (`Symbol.for('workflow-serialize')` / `Symbol.for('workflow-deserialize')`) and workflow-side `DOMException` / `WorkflowFunction` round-trip through the VM serde bundle alongside the standard reducers.

Out of scope / future work

- A dedicated diagnostic CLI for fetching function logs by `runId` (`getVercelFunctionLogs` was removed from the e2e diagnostic harness — belongs in its own PR).
- Shrinking the VM bundle, which currently pulls in `@opentelemetry/api`, `zod`, `ai-sdk`, etc. — tree-shaking those out is a builder-side change worth pursuing later.
- Reducing suspension overhead: each suspension costs a `snapshot.save` + storage RTT; further work could batch saves or skip them entirely for ops the runtime can recompute.

Based on `serialization-refactor` (PR #1299).