feat(core): add workflow time-travel deterministic replay APIs #1099
🦋 Changeset detected
Latest commit: 6829072
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
📝 Walkthrough
Adds deterministic "time travel" replay APIs to Workflow and WorkflowChain, implements streaming and resume semantics, persists replay lineage metadata on workflow state, updates observability/tracing for replay provenance, exposes public types, adds server endpoints/handlers, tests, and docs.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Workflow
participant MemoryV2 as Memory (V2)
participant Engine as ExecutionEngine
Client->>Workflow: timeTravel({executionId, stepId, ...})
activate Workflow
Workflow->>MemoryV2: fetch source execution + per-step data
activate MemoryV2
MemoryV2-->>Workflow: execution record & step snapshots
deactivate MemoryV2
Workflow->>Workflow: validate, locate target step, reconstruct input/state, build lineage metadata
Workflow->>Engine: start replay execution (new executionId) from target step
activate Engine
Engine->>MemoryV2: persist replay execution, steps, lineage metadata
activate MemoryV2
MemoryV2-->>Engine: persisted
deactivate MemoryV2
Engine-->>Workflow: emit replay events / suspension / completion
deactivate Engine
Workflow-->>Client: return WorkflowExecutionResult or stream events
deactivate Workflow
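The flow in the diagram above can be sketched with a small in-memory model. This is a hypothetical illustration only: `InMemoryStore`, `ExecutionRecord`, `StepSnapshot`, and `planReplay` are invented names, not VoltAgent's actual classes, and the real implementation in `packages/core/src/workflow/core.ts` may differ substantially. The lineage field names `replayedFromExecutionId` and `replayFromStepId` are taken from the review comments on this PR.

```typescript
// Hypothetical in-memory model of the replay flow; not VoltAgent's implementation.
type ExecutionStatus = "running" | "completed" | "suspended" | "error";

interface StepSnapshot {
  stepId: string;
  input: unknown;
}

interface ExecutionRecord {
  id: string;
  status: ExecutionStatus;
  steps: StepSnapshot[];
  // Lineage metadata, named after the fields mentioned in the review.
  replayedFromExecutionId?: string;
  replayFromStepId?: string;
}

class InMemoryStore {
  private records = new Map<string, ExecutionRecord>();

  get(id: string): ExecutionRecord | undefined {
    return this.records.get(id);
  }

  save(record: ExecutionRecord): void {
    this.records.set(record.id, record);
  }
}

let replayCounter = 0;

// Validate the source execution, locate the target step, and persist a new
// execution record that carries lineage metadata back to the source run.
function planReplay(
  store: InMemoryStore,
  executionId: string,
  stepId: string,
  inputOverride?: unknown,
): { record: ExecutionRecord; input: unknown } {
  const source = store.get(executionId);
  if (!source) {
    throw new Error(`Source execution not found: ${executionId}`);
  }
  if (source.status === "running") {
    throw new Error(`Cannot replay a running execution: ${executionId}`);
  }

  const index = source.steps.findIndex((step) => step.stepId === stepId);
  const target = source.steps[index];
  if (!target) {
    throw new Error(`Step not found in source execution: ${stepId}`);
  }

  replayCounter += 1;
  const record: ExecutionRecord = {
    id: `${executionId}_replay_${replayCounter}`,
    status: "running",
    // Steps before the target are carried over as history.
    steps: source.steps.slice(0, index),
    replayedFromExecutionId: executionId,
    replayFromStepId: stepId,
  };
  store.save(record);

  // Use the override when given, otherwise the historical input of the step.
  return { record, input: inputOverride ?? target.input };
}
```

In this sketch the replay always gets a fresh execution ID and never mutates the source record, which mirrors the "new executionId" step in the diagram.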
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
2 issues found across 10 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/core/src/workflow/core.ts">
<violation number="1" location="packages/core/src/workflow/core.ts:126">
P2: Silently defaulting an unrecognized step status to `"success"` can mask data corruption during replay. Consider throwing an error (or at least logging a warning) when the status is invalid, rather than silently promoting it to success.</violation>
<violation number="2" location="packages/core/src/workflow/core.ts:2587">
P1: Using `.find()` returns the first matching step-complete event instead of the last. For retried steps, this retrieves data from the initial (failed) attempt rather than the final (successful) one. Use a reverse search to get the most recent event for the step.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
    continue;
  }

  const fallbackEvent = sourceStepCompleteEvents.find(
P1: Using .find() returns the first matching step-complete event instead of the last. For retried steps, this retrieves data from the initial (failed) attempt rather than the final (successful) one. Use a reverse search to get the most recent event for the step.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/workflow/core.ts, line 2587:
<comment>Using `.find()` returns the first matching step-complete event instead of the last. For retried steps, this retrieves data from the initial (failed) attempt rather than the final (successful) one. Use a reverse search to get the most recent event for the step.</comment>
<file context>
@@ -2493,6 +2518,193 @@ export function createWorkflow<
+ continue;
+ }
+
+ const fallbackEvent = sourceStepCompleteEvents.find(
+ (event) =>
+ event.from === step.id ||
</file context>
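The reverse search suggested in the comment above can be sketched as follows. This is a minimal illustration, not the PR's code: the `StepCompleteEvent` shape (apart from the `from` field visible in the diff) and the `findLastEvent` helper are assumptions. On ES2023 targets the built-in `Array.prototype.findLast` does the same job.

```typescript
// Assumed event shape; only `from` appears in the reviewed diff.
interface StepCompleteEvent {
  from: string;
  status: "error" | "success";
  output: unknown;
}

// Walk the array from the end so the most recent matching event wins.
function findLastEvent<T>(
  events: readonly T[],
  predicate: (event: T) => boolean,
): T | undefined {
  for (let i = events.length - 1; i >= 0; i--) {
    const event = events[i];
    if (event !== undefined && predicate(event)) {
      return event;
    }
  }
  return undefined;
}

// A retried step emits two step-complete events: a failure, then a success.
const events: StepCompleteEvent[] = [
  { from: "step-1", status: "error", output: null },
  { from: "step-1", status: "success", output: { value: 42 } },
];

// `.find()` returns the first (failed) attempt.
const first = events.find((event) => event.from === "step-1");

// The reverse search returns the final (successful) attempt.
const last = findLastEvent(events, (event) => event.from === "step-1");
```

A hand-written loop avoids copying the array (as `[...events].reverse().find(...)` would) and works on compile targets older than ES2023.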
Deploying voltagent with Cloudflare Pages

| Latest commit: | 6829072 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://078a49e8.voltagent.pages.dev |
| Branch Preview URL: | https://feat-workflow-time-travel-de.voltagent.pages.dev |
Actionable comments posted: 1
🧹 Nitpick comments (3)
packages/core/src/workflow/chain.ts (1)
991-1023: Consider adding the same memory-requirement note as `restart` to both JSDoc comments.
`timeTravel` and `timeTravelStream` need access to the source execution state in persistent memory, just like `restart`. The existing `restart` and `restartAllActive` methods already document this explicitly (lines 1025–1030 and 1047–1050). Without the same caveat, callers using an ephemeral/default in-memory store may get a confusing runtime error.
📝 Proposed JSDoc additions
-  /**
-   * Replay a historical execution from the selected step
-   */
+  /**
+   * Replay a historical execution from the selected step.
+   * This recreates a workflow instance via `createWorkflow(...)` on each call.
+   * Use persistent/shared memory (or register the workflow) so prior execution state is discoverable.
+   * For ephemeral setup patterns, prefer `chain.toWorkflow().timeTravel(...)` and reuse that instance.
+   */
   async timeTravel(
-  /**
-   * Stream a historical replay from the selected step
-   */
+  /**
+   * Stream a historical replay from the selected step.
+   * This recreates a workflow instance via `createWorkflow(...)` on each call.
+   * Use persistent/shared memory (or register the workflow) so prior execution state is discoverable.
+   * For ephemeral setup patterns, prefer `chain.toWorkflow().timeTravelStream(...)` and reuse that instance.
+   */
   timeTravelStream(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/chain.ts` around lines 991 - 1023, Add the same persistent-memory requirement note used on restart/restartAllActive to the JSDoc for timeTravel and timeTravelStream: update the comments for the timeTravel and timeTravelStream methods in chain.ts (methods named timeTravel and timeTravelStream) to explicitly state that these operations require access to the source execution state in a persistent store (not the ephemeral/default in-memory store) so callers know to configure a persistent memory backend to avoid runtime errors.
packages/core/src/workflow/time-travel.spec.ts (1)
9-13: Add a test for the explicitly documented `running`-state rejection.
The PR notes call out: "timeTravel rejects source executions in the running state (crash recovery continues to use restart(...))", and the overview docs repeat this under "Notes". This is a load-bearing invariant that currently has no test coverage.
A test for a non-existent `executionId` would also be worth adding as a companion edge case.
✅ Suggested test additions
it("should reject time travel on a running execution", async () => {
  const memory = new Memory({ storage: new InMemoryStorageAdapter() });
  const workflow = createWorkflow(
    {
      id: "time-travel-running-rejection",
      name: "Time Travel Running Rejection",
      input: z.object({ value: z.number() }),
      result: z.object({ value: z.number() }),
      memory,
    },
    andThen({ id: "step-1", execute: async ({ data }) => data }),
  );

  const registry = WorkflowRegistry.getInstance();
  registry.registerWorkflow(workflow);

  // Manually persist a "running" state to simulate an in-flight execution
  const runningId = "exec-running-test";
  await memory.setWorkflowState(runningId, {
    id: runningId,
    workflowId: "time-travel-running-rejection",
    workflowName: "Time Travel Running Rejection",
    status: "running",
    createdAt: new Date(),
    updatedAt: new Date(),
  });

  await expect(
    workflow.timeTravel({ executionId: runningId, stepId: "step-1" }),
  ).rejects.toThrow(/running/i);
});

it("should fail with actionable error when execution does not exist", async () => {
  const memory = new Memory({ storage: new InMemoryStorageAdapter() });
  const workflow = createWorkflow(
    {
      id: "time-travel-missing-exec",
      name: "Time Travel Missing Exec",
      input: z.object({ value: z.number() }),
      result: z.object({ value: z.number() }),
      memory,
    },
    andThen({ id: "step-1", execute: async ({ data }) => data }),
  );

  const registry = WorkflowRegistry.getInstance();
  registry.registerWorkflow(workflow);

  await expect(
    workflow.timeTravel({ executionId: "non-existent-id", stepId: "step-1" }),
  ).rejects.toThrow();
});
Want me to open a new issue to track these missing test cases?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/time-travel.spec.ts` around lines 9 - 13, Add two unit tests to packages/core/src/workflow/time-travel.spec.ts: one that persists a workflow state with status "running" (use Memory with InMemoryStorageAdapter, createWorkflow and WorkflowRegistry.registerWorkflow) and asserts that workflow.timeTravel({ executionId: runningId, stepId }) rejects with an error mentioning "running"; and a second that calls workflow.timeTravel with a non-existent executionId and asserts it rejects with an actionable error. Use the existing createWorkflow/andThen helpers and Memory/WorkflowRegistry symbols so the tests simulate the running/inexistent states without starting actual executions.
packages/core/src/workflow/core.ts (1)
122-127: Silent default to `"success"` for unrecognized statuses.
This is used when reconstructing historical step data for replay. If a step had an unexpected status value (e.g., data corruption), it would silently be treated as `"success"`. Consider logging a warning when the fallback is used, so replay anomalies are observable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/core.ts` around lines 122 - 127, The toWorkflowStepStatus function silently defaults unrecognized values to "success", which can hide replay anomalies; update toWorkflowStepStatus to log a warning (including the raw value) whenever isWorkflowStepStatus(value) is false before returning "success". Use the project's preferred logger (or console.warn if none available) and reference toWorkflowStepStatus, isWorkflowStepStatus, and WorkflowStepData["status"] so the warning makes replay data issues observable.
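One way the warning fallback might look is sketched below. The function names `toWorkflowStepStatus` and `isWorkflowStepStatus` come from the review, but everything else is an assumption: the status list is a guess at what `WorkflowStepData["status"]` contains, and the injectable `warn` parameter stands in for whatever logger the project prefers.

```typescript
// Assumed status set; the real WorkflowStepData["status"] union may differ.
const KNOWN_STATUSES = [
  "running",
  "success",
  "error",
  "suspended",
  "cancelled",
  "skipped",
] as const;

type StepStatus = (typeof KNOWN_STATUSES)[number];

type WarnFn = (message: string, context: Record<string, unknown>) => void;

function isWorkflowStepStatus(value: unknown): value is StepStatus {
  return (
    typeof value === "string" &&
    (KNOWN_STATUSES as readonly string[]).includes(value)
  );
}

// Keeps the existing "success" fallback but makes it observable by logging
// the raw value whenever the fallback is taken.
function toWorkflowStepStatus(
  value: unknown,
  warn: WarnFn = console.warn,
): StepStatus {
  if (isWorkflowStepStatus(value)) {
    return value;
  }
  warn('Unrecognized workflow step status; defaulting to "success"', {
    rawStatus: value,
  });
  return "success";
}
```

The stricter alternative raised in the cubic review, throwing instead of defaulting, would replace the `warn` call and the final `return` with a single `throw`.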
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@website/docs/workflows/streaming.md`:
- Around line 45-50: The docs list of VoltAgent methods omits the new
.timeTravel() API; update the methods list in
website/docs/workflows/streaming.md to include `.timeTravel()` alongside
`.timeTravelStream()`, `.stream()`, `.run()`, and `.startAsync()` so the list
reflects all five public methods and their brief descriptions.
---
Nitpick comments:
In `@packages/core/src/workflow/chain.ts`:
- Around line 991-1023: Add the same persistent-memory requirement note used on
restart/restartAllActive to the JSDoc for timeTravel and timeTravelStream:
update the comments for the timeTravel and timeTravelStream methods in chain.ts
(methods named timeTravel and timeTravelStream) to explicitly state that these
operations require access to the source execution state in a persistent store
(not the ephemeral/default in-memory store) so callers know to configure a
persistent memory backend to avoid runtime errors.
In `@packages/core/src/workflow/core.ts`:
- Around line 122-127: The toWorkflowStepStatus function silently defaults
unrecognized values to "success", which can hide replay anomalies; update
toWorkflowStepStatus to log a warning (including the raw value) whenever
isWorkflowStepStatus(value) is false before returning "success" — use the
project's preferred logger (or console.warn if none available) and reference
toWorkflowStepStatus, isWorkflowStepStatus, and WorkflowStepData["status"] so
the warning makes replay data issues observable.
In `@packages/core/src/workflow/time-travel.spec.ts`:
- Around line 9-13: Add two unit tests to
packages/core/src/workflow/time-travel.spec.ts: one that persists a workflow
state with status "running" (use Memory with InMemoryStorageAdapter,
createWorkflow and WorkflowRegistry.registerWorkflow) and asserts that
workflow.timeTravel({ executionId: runningId, stepId }) rejects with an error
mentioning "running"; and a second that calls workflow.timeTravel with a
non-existent executionId and asserts it rejects with an actionable error. Use
the existing createWorkflow/andThen helpers and Memory/WorkflowRegistry symbols
so the tests simulate the running/inexistent states without starting actual
executions.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
website/docs/api/endpoints/workflows.md (1)
623-647: ⚠️ Potential issue | 🟡 Minor
Replay lineage fields (`replayedFromExecutionId`, `replayFromStepId`) absent from state response example.
The changeset documents that workflow state records now carry `replayedFromExecutionId` and `replayFromStepId`. API consumers polling the state endpoint on a replay execution will receive these fields with no reference in the docs.
📝 Proposed addition to the state response example
   "context": {
     "userId": "user-123",
     "priority": "high"
-  }
+  },
+  "replayedFromExecutionId": "exec_1234567890_abc123",
+  "replayFromStepId": "approval-required"
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@website/docs/api/endpoints/workflows.md` around lines 623 - 647, Update the state response example to include the new replay lineage fields by adding "replayedFromExecutionId" and "replayFromStepId" into the top-level "data" object of the JSON example (use realistic sample values like "exec_0987654321_replay" and "step-approval-1"); ensure these keys appear alongside existing fields such as "executionId", "workflowId", "status", and "suspension" so API consumers polling a replay execution will see the replay lineage in the documented response.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@website/docs/api/endpoints/workflows.md`:
- Line 566: Fix the grammar in the error description string "400 - Invalid
replay parameters (for example invalid `stepId` or source execution still
running)" by inserting a comma after "for example" so it reads "...(for example,
invalid `stepId` or source execution still running)"; update that exact line in
workflows.md accordingly.
- Around line 546-562: Add the missing "suspended" response variant to the
replay response docs for the Execute endpoint: after the existing completed
example include a JSON example where "status": "suspended", "endAt": null,
"result": null and a "suspension" object containing "suspendedAt", "reason", and
"suspendedStepIndex" (keep executionId/startAt values consistent with the
example); ensure field names match the existing schema (executionId, startAt,
endAt, status, result, suspension) so replay documentation covers both completed
and suspended outcomes.
- Around line 513-613: Remove the entire "Replay Workflow" documentation block
that declares the POST /workflows/:id/executions/:executionId/replay endpoint
because there is no server route for it; instead, update docs to reflect that
replay/time-travel functionality is SDK-only (referencing the timeTravelStream
API) and ensure any examples, request/response schemas, and error cases for that
endpoint are deleted so the docs only show implemented REST endpoints (execute,
stream, suspend, cancel, resume, state retrieval).
---
Outside diff comments:
In `@website/docs/api/endpoints/workflows.md`:
- Around line 623-647: Update the state response example to include the new
replay lineage fields by adding "replayedFromExecutionId" and "replayFromStepId"
into the top-level "data" object of the JSON example (use realistic sample
values like "exec_0987654321_replay" and "step-approval-1"); ensure these keys
appear alongside existing fields such as "executionId", "workflowId", "status",
and "suspension" so API consumers polling a replay execution will see the replay
lineage in the documented response.
## Replay Workflow

Create a deterministic replay execution from a historical run and selected step.

**Endpoint:** `POST /workflows/:id/executions/:executionId/replay`

**Request Body:**

```json
{
  "stepId": "approval-step",
  "inputData": {
    "amount": 2500
  },
  "resumeData": {
    "approved": true,
    "approvedBy": "ops-user-1"
  },
  "workflowStateOverride": {
    "replayReason": "incident-1234"
  }
}
```

**Parameters:**

| Field                   | Type   | Description                             |
| ----------------------- | ------ | --------------------------------------- |
| `stepId`                | string | Historical step ID to replay from       |
| `inputData`             | any    | Optional selected-step input override   |
| `resumeData`            | any    | Optional resume payload override        |
| `workflowStateOverride` | object | Optional shared workflow state override |

**Response:**

```json
{
  "success": true,
  "data": {
    "executionId": "exec_replay_123",
    "startAt": "2024-01-15T11:00:00.000Z",
    "endAt": "2024-01-15T11:00:02.250Z",
    "status": "completed",
    "result": {
      "approved": true,
      "finalAmount": 2500
    }
  }
}
```

**Error Cases:**

- `400` - Invalid replay parameters (for example invalid `stepId` or source execution still running)
- `404` - Workflow or source execution not found
- `500` - Replay failed due to server error

**cURL Example (Default Replay):**

```bash
curl -X POST http://localhost:3141/workflows/order-approval/executions/exec_123/replay \
  -H "Content-Type: application/json" \
  -d '{
    "stepId": "approval-step"
  }'
```

**cURL Example (Replay With Overrides):**

```bash
curl -X POST http://localhost:3141/workflows/order-approval/executions/exec_123/replay \
  -H "Content-Type: application/json" \
  -d '{
    "stepId": "approval-step",
    "inputData": { "amount": 2500 },
    "resumeData": { "approved": true, "approvedBy": "ops-user-1" },
    "workflowStateOverride": { "replayReason": "incident-1234" }
  }'
```

**JavaScript Example:**

```javascript
const response = await fetch(
  "http://localhost:3141/workflows/order-approval/executions/exec_123/replay",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      stepId: "approval-step",
      inputData: { amount: 2500 },
      resumeData: { approved: true, approvedBy: "ops-user-1" },
      workflowStateOverride: { replayReason: "incident-1234" },
    }),
  }
);

const replay = await response.json();
console.log("Replay execution ID:", replay.data.executionId);
console.log("Replay status:", replay.data.status);
```
🧩 Analysis chain
🏁 Script executed:
rg -n --type ts -C3 "replay.*stream|stream.*replay" --ignore-case -g '!*.spec.ts' -g '!*.md'
Repository: VoltAgent/voltagent
Length of output: 6140
🏁 Script executed:
# Also search for route/endpoint definitions related to replay
rg -n --type ts -C2 "POST.*replay|replay.*endpoint" --ignore-case -g '!*.spec.ts' -g '!*.md'
Repository: VoltAgent/voltagent
Length of output: 45
🏁 Script executed:
# Search for timeTravelStream method definitions
rg -n "timeTravelStream" --type ts -g '!*.spec.ts' -g '!*.md' -C3
Repository: VoltAgent/voltagent
Length of output: 2446
🏁 Script executed:
# Look for streaming endpoint definitions (like /stream endpoints) to understand the pattern
rg -n "\/stream|streaming.*endpoint" --type ts -C2 -g '!*.spec.ts' -g '!*.md' | head -100
Repository: VoltAgent/voltagent
Length of output: 157
🏁 Script executed:
# Look for all replay route definitions
rg -n "replay" --type ts packages/server-core/src/routes/ -C3
Repository: VoltAgent/voltagent
Length of output: 856
🏁 Script executed:
# Check if there are any timeTravelStream route handlers
rg -n "timeTravelStream" --type ts packages/server-elysia/src/routes/ packages/server-core/src/routes/ -C3
Repository: VoltAgent/voltagent
Length of output: 45
🏁 Script executed:
# Find all POST endpoint definitions in routes
rg -n "POST|method.*POST" --type ts packages/server-core/src/routes/definitions.ts -C2 | grep -A2 "replay"
Repository: VoltAgent/voltagent
Length of output: 45
🏁 Script executed:
# Find all route/path definitions in the routes files
rg -n "path.*replay|replay" --type ts packages/server-core/src/routes/definitions.ts packages/server-elysia/src/routes/ -C2
Repository: VoltAgent/voltagent
Length of output: 1196
🏁 Script executed:
# Check the actual route file to see all registered paths
head -150 packages/server-core/src/routes/definitions.ts | tail -100
Repository: VoltAgent/voltagent
Length of output: 3727
🏁 Script executed:
# List all routes in routes/definitions.ts to see the complete API
rg "path:" --type ts packages/server-core/src/routes/definitions.ts | head -20
Repository: VoltAgent/voltagent
Length of output: 737
🏁 Script executed:
# Get the complete list of paths to see all workflow-related routes
rg "path: \"/workflows" --type ts packages/server-core/src/routes/definitions.ts
Repository: VoltAgent/voltagent
Length of output: 515
🏁 Script executed:
# Search more broadly in the entire routes definitions file
wc -l packages/server-core/src/routes/definitions.ts
Repository: VoltAgent/voltagent
Length of output: 114
🏁 Script executed:
# Look for the replay endpoint in the entire file
cat packages/server-core/src/routes/definitions.ts | grep -n "replay" -i
Repository: VoltAgent/voltagent
Length of output: 282
🏁 Script executed:
# Check if there's a replay endpoint implementation elsewhere
rg -n "replay" --type ts packages/server-core/src/ packages/server-elysia/src/ -C2 | grep -i "post\|method\|handler\|route"
Repository: VoltAgent/voltagent
Length of output: 5370
🏁 Script executed:
# Check the handlers directory for replay-related code
ls -la packages/server-core/src/handlers/
Repository: VoltAgent/voltagent
Length of output: 1320
🏁 Script executed:
# Look at the workflow handlers to see what operations are exposed
rg -n "export.*function|export.*const" packages/server-core/src/handlers/workflow.handlers.ts | head -30
Repository: VoltAgent/voltagent
Length of output: 550
🏁 Script executed:
# Check if there's a replay handler defined somewhere
rg -n "handleReplay\|replay.*handler" --type ts packages/server-core/src/ packages/server-elysia/src/
Repository: VoltAgent/voltagent
Length of output: 45
🏁 Script executed:
# Get the actual list of all endpoints in routes/definitions.ts with method and path
rg "method:|path:" --type ts packages/server-core/src/routes/definitions.ts | grep -A1 "workflow" | head -50
Repository: VoltAgent/voltagent
Length of output: 800
🏁 Script executed:
# Search in the elysia routes file for any replay endpoint
cat packages/server-elysia/src/routes/workflow.routes.ts | grep -n "replay\|time.*travel" -i
Repository: VoltAgent/voltagent
Length of output: 265
🏁 Script executed:
# Double-check by searching the entire routes definitions for "replay" with full context
rg -n "replay" --type ts packages/server-core/src/routes/definitions.ts -B5 -A5
Repository: VoltAgent/voltagent
Length of output: 679
🏁 Script executed:
# Check the elysia routes file more comprehensively
cat packages/server-elysia/src/routes/workflow.routes.ts | grep -n "post\|get\|put\|delete" | head -40
Repository: VoltAgent/voltagent
Length of output: 423
🏁 Script executed:
# Check if the /replay endpoint exists in the route definitions at all
cat packages/server-core/src/routes/definitions.ts | grep -n "/replay" -i
Repository: VoltAgent/voltagent
Length of output: 45
🏁 Script executed:
# Search the entire codebase for any /replay route
rg "executions.*replay|/replay" --type ts -i
Repository: VoltAgent/voltagent
Length of output: 144
🏁 Script executed:
# Check if there's a handler for the replay endpoint
rg -n "handleReplay\|replayWorkflow" --type ts
Repository: VoltAgent/voltagent
Length of output: 45
🏁 Script executed:
# Check the elysia routes file for all POST endpoints to see if replay is there
cat packages/server-elysia/src/routes/workflow.routes.ts | grep -B2 "app.post"
Repository: VoltAgent/voltagent
Length of output: 505
🏁 Script executed:
# Check the actual documentation file to see what endpoints are documented
cat website/docs/api/endpoints/workflows.md | sed -n '513,613p'
Repository: VoltAgent/voltagent
Length of output: 2715
🏁 Script executed:
# Check if there are ANY replay endpoints documented further in the file
cat website/docs/api/endpoints/workflows.md | grep -n "replay\|time.*travel" -i
Repository: VoltAgent/voltagent
Length of output: 1213
🏁 Script executed:
# Verify that the /replay endpoint is actually NOT in the routes
wc -l packages/server-core/src/routes/definitions.ts && \
cat packages/server-core/src/routes/definitions.ts | grep -c "path:"
Repository: VoltAgent/voltagent
Length of output: 117
🏁 Script executed:
# Get a complete list of all endpoint definitions
rg "^\s+[a-zA-Z]+WorkflowOperations?:" --type ts packages/server-core/src/routes/definitions.ts | head -20
Repository: VoltAgent/voltagent
Length of output: 45
🏁 Script executed:
# Check if replay endpoint might be registered separately
find packages/server-elysia/src/routes -name "*.ts" -exec grep -l "replay" {} \;
Repository: VoltAgent/voltagent
Length of output: 115
🏁 Script executed:
# Search for any route handler that might implement replay
rg -n "replay" --type ts packages/server-elysia/src/routes/workflow.routes.ts
Repository: VoltAgent/voltagent
Length of output: 181
🏁 Script executed:
# Final verification: list all endpoint keys in definitions.ts
rg "^\s+[a-zA-Z]+Workflow[a-zA-Z]*:\s*\{" --type ts packages/server-core/src/routes/definitions.ts
Repository: VoltAgent/voltagent
Length of output: 270
🏁 Script executed:
# Check the documentation to confirm the endpoint claim is actually in the docs
grep -n "POST /workflows/:id/executions/:executionId/replay" website/docs/api/endpoints/workflows.md
Repository: VoltAgent/voltagent
Length of output: 133
Remove the entire "Replay Workflow" section (lines 513-613) — this endpoint does not exist in the server implementation.
The documented POST /workflows/:id/executions/:executionId/replay endpoint has no corresponding route handler in packages/server-core/src/routes/definitions.ts or packages/server-elysia/src/routes/workflow.routes.ts. The server implements only these workflow execution endpoints: execute, stream, suspend, cancel, resume, and state retrieval. The timeTravelStream method introduced in this PR exists as an SDK-only API (packages/core/src/workflow/core.ts), not as a REST endpoint.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@website/docs/api/endpoints/workflows.md` around lines 513 - 613, Remove the
entire "Replay Workflow" documentation block that declares the POST
/workflows/:id/executions/:executionId/replay endpoint because there is no
server route for it; instead, update docs to reflect that replay/time-travel
functionality is SDK-only (referencing the timeTravelStream API) and ensure any
examples, request/response schemas, and error cases for that endpoint are
deleted so the docs only show implemented REST endpoints (execute, stream,
suspend, cancel, resume, state retrieval).
Actionable comments posted: 3
🧹 Nitpick comments (5)
packages/server-core/src/schemas/agent.schemas.ts (1)
459-460: Consider adding `.min(1)` to the `stepId` schema field.
The handler rejects empty `stepId` with a 400 (line 841–847 of `workflow.handlers.ts`), but the schema allows it, so an empty `stepId` passes schema validation and is rejected only at the handler level. Adding `.min(1)` to the schema would surface the error earlier and produce a more informative validation message.
-  stepId: z.string().describe("Step ID to replay from"),
+  stepId: z.string().min(1).describe("Step ID to replay from"),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/server-core/src/schemas/agent.schemas.ts` around lines 459 - 460, The WorkflowReplayRequestSchema currently allows empty stepId values; update the stepId validator in WorkflowReplayRequestSchema (the z.string().describe("Step ID to replay from") entry) to enforce non-empty input by adding .min(1) so schema validation rejects empty strings earlier and returns a proper validation error instead of relying on the handler-level 400 check.
packages/core/src/workflow/time-travel.spec.ts (1)
10-13: Test cleanup should also clear `activeExecutions` to prevent cross-test interference in the sequential suite.
The test's `beforeEach` only clears the `workflows` Map, but the registry has a second stateful property: `public activeExecutions` (line 49 of registry.ts), which tracks active workflow suspensions and is actively modified during execution. Since the suite uses `describe.sequential()`, tests run one after another, and stale entries in `activeExecutions` from a previous test could interfere with subsequent tests.
Add `(registry as any).activeExecutions.clear()` to the beforeEach, or expose a public reset method in `WorkflowRegistry` to clean both maps without type-casting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/time-travel.spec.ts` around lines 10 - 13, The beforeEach in the test clears only the workflows Map on WorkflowRegistry, but leaves the registry's public activeExecutions state intact which can leak between sequential tests; update the beforeEach to also clear activeExecutions by calling (registry as any).activeExecutions.clear(), or preferably add and call a new public reset/clear method on WorkflowRegistry that clears both workflows and activeExecutions so tests start from a clean registry state before each test.
packages/server-core/src/handlers/workflow.handlers.ts (1)
858-872: Import and use the exported `WorkflowTimeTravelOptions` type instead of the inline definition.
The inline type at lines 859–864 is a simplified replica of the exported `WorkflowTimeTravelOptions` from `@voltagent/core`. The differences include:
- `inputData` and `resumeData` should use the typed `DangerouslyAllowAny` instead of `unknown`
- `workflowStateOverride` should reference `WorkflowStateStore` instead of a raw `Record<string, unknown>`
Additionally, the return type (lines 865–871) is overly generic and differs from the actual `WorkflowExecutionResult<RESULT_SCHEMA, RESUME_SCHEMA>` signature: it includes an undefined `startAt` field and loses type safety for `status` and `result`.
Import `WorkflowTimeTravelOptions` from `@voltagent/core` and apply it to the type cast to maintain consistency with the library's type definitions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/server-core/src/handlers/workflow.handlers.ts` around lines 858 - 872, Replace the inline time-travel type on workflowWithReplay with the library's exported types: import WorkflowTimeTravelOptions (and related types like WorkflowExecutionResult, DangerouslyAllowAny, WorkflowStateStore) from `@voltagent/core` and use WorkflowTimeTravelOptions as the parameter type for timeTravel on registeredWorkflow.workflow; also update the timeTravel return type to the proper WorkflowExecutionResult generic signature instead of the ad-hoc object so the method signature on workflowWithReplay matches the core definitions (refer to workflowWithReplay and registeredWorkflow.workflow to locate the cast).
packages/core/src/workflow/core.ts (2)
2947-2960: The trailing `.catch()` is dead code.
`.then(onFulfilled, onRejected)` already handles both settlement cases; exactly one handler is called. Since neither the `onFulfilled` nor the `onRejected` branch here throws or returns a rejected promise, the promise returned by `.then()` always fulfills, and the trailing `.catch(() => {})` never fires.
♻️ Remove the unreachable catch
  replayPromise
    .then(
      (result) => {
        if (result.status !== "suspended") {
          streamController.close();
        }
      },
      () => {
        streamController.close();
      },
-   )
-   .catch(() => {
-     // Error is surfaced through promise-backed fields on stream result.
-   });
+   );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/core.ts` around lines 2947 - 2960, The trailing .catch on replayPromise is dead code; remove the final .catch(() => {}) and keep the existing .then(result => { if (result.status !== "suspended") streamController.close(); }, () => { streamController.close(); }) so both fulfillment and rejection cases close the streamController; locate the call by the unique symbols replayPromise and streamController in the file and delete only the trailing .catch handler (or replace the whole chain with a .finally that closes streamController if you prefer a single lifecycle handler).
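The dead-code claim can be checked in isolation. The following is a minimal sketch, not the code from `core.ts`: `settle` is a hypothetical helper that attaches the same `.then(onFulfilled, onRejected)` shape plus a trailing `.catch()` and records which handlers ran. It assumes, as the review does, that neither handler throws.

```typescript
// Hypothetical helper: records which handlers fire for a given promise.
function settle(source: Promise<string>): Promise<string[]> {
  const fired: string[] = [];
  return source
    .then(
      () => {
        fired.push("onFulfilled");
      },
      () => {
        fired.push("onRejected");
      },
    )
    .catch(() => {
      // Only reachable if one of the two handlers above throws or rejects.
      fired.push("catch");
    })
    .then(() => fired);
}

// Each call logs a single-element array; "catch" never appears.
settle(Promise.resolve("done")).then((fired) => console.log(fired));
settle(Promise.reject(new Error("boom"))).then((fired) => console.log(fired));
```

If `streamController.close()` could throw, the trailing `.catch()` would no longer be unreachable, so that assumption is worth confirming before removing it.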
1038-1097: Minor observability gap: logs at lines 1061–1066 lack execution context.
The `logger` used here is the workflow-level `LoggerProxy` without `executionId` or `traceId`, because `runLogger` isn't created until line 1123. The warn on line 1063 ("No source trace IDs found in replay workflow state metadata") won't carry the replay execution ID, making it hard to correlate with a specific replay when debugging production issues.
The lookup logic and `replayedFrom` construction are otherwise correct.
♻️ Consider propagating executionId in the log call
- logger.warn("No source trace IDs found in replay workflow state metadata");
+ logger.warn("No source trace IDs found in replay workflow state metadata", {
+   replayExecutionId: executionId,
+   sourceExecutionId: options.replayFrom.executionId,
+ });

- logger.warn("Failed to get source trace IDs for replay:", { error });
+ logger.warn("Failed to get source trace IDs for replay:", {
+   error,
+   replayExecutionId: executionId,
+   sourceExecutionId: options.replayFrom.executionId,
+ });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/core.ts` around lines 1038 - 1097, The warning logs when resolving replay/resume trace IDs (the logger.warn calls in the replay branch around the handling of options.replayFrom.executionId and the corresponding catch) lack execution context; update those logger.warn invocations to include the replay executionId (options.replayFrom.executionId) and any available execution identifiers (e.g., executionId) in the metadata/object passed to the logger so the message can be correlated to the specific replay (refer to replayedFrom, executionMemory.getWorkflowState, and logger); do the same for the catch logger.warn that logs the error (include the executionId value in the logged object) and mirror this small change in the resume branch warnings so all trace-resolution warnings include executionId context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 2629-2644: The fallback matching is ambiguous because
event.from/event.name can equal a non-unique step.name; update the matching
logic in the sourceStepCompleteEvents.find call to first attempt an unambiguous
match using a stable identifier (prefer event.stepIndex === step.index if
stepIndex is present on events, or event.stepId === step.id if you emit stepId)
before falling back to name-based checks; adjust the search to check
event.stepIndex/event.stepId first, then only use event.from === step.id or
event.name === step.name as last resorts, and ensure replayStepData assignment
(and the call to toWorkflowStepStatus) uses the matched event.
- Around line 3048-3054: The resume path for time-travel streams currently
delegates from resumedStreamResult.resume to streamResult.resume which rebuilds
resume options from the original replay's metadata
(replayResult.suspension.checkpoint) causing second suspensions to replay from
the wrong checkpoint; change resumedStreamResult.resume (and the analogous logic
in the stream method) to construct resumeOptions from the current suspension
metadata (nextResult.suspension) instead of relying on
replayPromise/replayResult, i.e., use nextResult.suspension.checkpoint and
nextResult.suspension.metadata when calling resume so that subsequent resumes
start from the most recent suspension rather than the original one.
In `@packages/server-core/src/handlers/workflow.handlers.ts`:
- Around line 905-912: The httpStatus decision incorrectly treats any error
message containing "step" as a 400; update the conditional logic around
normalizedMessage in the httpStatus assignment (the variable named httpStatus
using normalizedMessage.includes(...)) to remove the broad
normalizedMessage.includes("step") check and either delete that clause entirely
or replace it with a precise matcher for missing-step cases (e.g., only match
"step not found" or a regex like /step.*not found/ or /step.*'[^']*' not found/)
so only genuine "step not found" errors map to 404/400 as intended and other
"step" mentions continue to result in 500.
---
Nitpick comments:
In `@packages/server-core/src/schemas/agent.schemas.ts`:
- Around line 459-460: The WorkflowReplayRequestSchema currently allows empty
stepId values; update the stepId validator in WorkflowReplayRequestSchema (the
z.string().describe("Step ID to replay from") entry) to enforce non-empty input
by adding .min(1) so schema validation rejects empty strings earlier and returns
a proper validation error instead of relying on the handler-level 400 check.
2 issues found across 5 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="website/docs/api/endpoints/workflows.md">
<violation number="1" location="website/docs/api/endpoints/workflows.md:650">
P3: `replayedFromExecutionId` is documented elsewhere as the original execution ID; this example value is labeled like a replay execution ID, which inverts the lineage and can mislead users. Use an original/source execution ID in the example.</violation>
</file>
<file name="packages/core/src/workflow/core.ts">
<violation number="1" location="packages/core/src/workflow/core.ts:158">
P1: `Number("")` returns `0` which passes `Number.isInteger`, so an empty-string `metadata.stepIndex` is silently treated as step index 0. This can cause a false match against step 0 in the time-travel fallback logic, loading incorrect historical data into the replay. Add a guard for non-empty strings and non-negative values.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
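The P1 finding about `Number("")` can be illustrated with a small guard. This is a sketch under assumptions: `parseStepIndex` is a hypothetical stand-in for the metadata parsing in `core.ts`, not the function from the PR, and it treats blank strings, non-integers, and negative values as "no index".

```typescript
// Hypothetical stand-in for the metadata parsing discussed above.
function parseStepIndex(value: unknown): number | undefined {
  if (typeof value === "number") {
    return Number.isInteger(value) && value >= 0 ? value : undefined;
  }
  if (typeof value === "string") {
    const trimmed = value.trim();
    // Number("") and Number("  ") are both 0, so reject blank strings first.
    if (trimmed === "") {
      return undefined;
    }
    const parsed = Number(trimmed);
    return Number.isInteger(parsed) && parsed >= 0 ? parsed : undefined;
  }
  return undefined;
}

console.log(parseStepIndex(""), parseStepIndex("2"), parseStepIndex("-1")); // undefined 2 undefined
```

Returning `undefined` instead of `0` lets the caller fall through to the next matching strategy rather than matching step 0 by accident.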
    return metadataStepIndex;
  }

  if (typeof metadataStepIndex === "string") {
P1: Number("") returns 0 which passes Number.isInteger, so an empty-string metadata.stepIndex is silently treated as step index 0. This can cause a false match against step 0 in the time-travel fallback logic, loading incorrect historical data into the replay. Add a guard for non-empty strings and non-negative values.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/workflow/core.ts, line 158:
<comment>`Number("")` returns `0` which passes `Number.isInteger`, so an empty-string `metadata.stepIndex` is silently treated as step index 0. This can cause a false match against step 0 in the time-travel fallback logic, loading incorrect historical data into the replay. Add a guard for non-empty strings and non-negative values.</comment>
<file context>
@@ -139,6 +139,49 @@ const toWorkflowStepStatus = (
+ return metadataStepIndex;
+ }
+
+ if (typeof metadataStepIndex === "string") {
+ const parsed = Number(metadataStepIndex);
+ if (Number.isInteger(parsed)) {
</file context>
  "executionId": "exec_1234567890_abc123",
  "workflowId": "order-approval",
  "status": "suspended",
  "replayedFromExecutionId": "exec_0987654321_replay",
P3: replayedFromExecutionId is documented elsewhere as the original execution ID; this example value is labeled like a replay execution ID, which inverts the lineage and can mislead users. Use an original/source execution ID in the example.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At website/docs/api/endpoints/workflows.md, line 650:
<comment>`replayedFromExecutionId` is documented elsewhere as the original execution ID; this example value is labeled like a replay execution ID, which inverts the lineage and can mislead users. Use an original/source execution ID in the example.</comment>
<file context>
@@ -627,6 +647,8 @@ Retrieve the current state of a workflow execution.
"executionId": "exec_1234567890_abc123",
"workflowId": "order-approval",
"status": "suspended",
+ "replayedFromExecutionId": "exec_0987654321_replay",
+ "replayFromStepId": "step-approval-1",
"startAt": "2024-01-15T10:00:00.000Z",
</file context>
-  "replayedFromExecutionId": "exec_0987654321_replay",
+  "replayedFromExecutionId": "exec_0987654321_original",
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/core/src/workflow/time-travel.spec.ts (1)
10-14: Registry cleanup accesses private members via `any` cast.
This pattern (`(registry as any).workflows.clear()`) is fragile: if the internal property is renamed or restructured, the test will silently stop cleaning up. Consider whether `WorkflowRegistry` could expose a `reset()` or `clear()` method for test use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/time-travel.spec.ts` around lines 10 - 14, The test is reaching into WorkflowRegistry internals by casting to any and calling (registry as any).workflows.clear() and activeExecutions.clear(), which is brittle; add a public test-only method on WorkflowRegistry such as reset() or clear() that encapsulates clearing its internal maps, then update the spec to call WorkflowRegistry.getInstance().reset() (or .clear()) instead of accessing workflows/activeExecutions directly; reference the existing WorkflowRegistry class and its getInstance method and the internal maps named workflows and activeExecutions when implementing the new method.
packages/server-core/src/handlers/workflow.handlers.ts (1)
848-848: `body` parameter is typed as `any`; consider using the Zod-inferred type.
Since `WorkflowReplayRequestSchema` exists in the schemas package, you could type `body` as `z.infer<typeof WorkflowReplayRequestSchema>` for stronger compile-time guarantees. This matches how the schema is enforced at the route layer and would surface mismatches earlier.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/server-core/src/handlers/workflow.handlers.ts` at line 848, The handler currently destructures properties from an untyped body; update the handler parameter type to z.infer<typeof WorkflowReplayRequestSchema> (importing WorkflowReplayRequestSchema from the schemas package and z from 'zod' if needed) so that body is strongly typed, then keep the existing destructuring of stepId, inputData, resumeData, workflowStateOverride from body; this will align the handler with the route validation and surface type mismatches at compile time.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 2627-2820: prepareTimeTravelExecution always uses defaultMemory
which breaks replays for executions stored in a custom memory adapter; add an
optional memory?: MemoryAdapter (or appropriate type) to
WorkflowTimeTravelOptions and update prepareTimeTravelExecution signature to
accept/use a memory parameter (falling back to defaultMemory) so calls to
getWorkflowState and setWorkflowState use that adapter; update callers
timeTravel and timeTravelStream to pass options.memory through (mirror how run()
and restartExecution() handle options?.memory) and ensure
restartCheckpoint/getRestartCheckpointFromMetadata logic still works with the
chosen adapter.
---
Duplicate comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 2686-2712: The name-based fallback can produce false matches when
multiple steps share the same step.name; change the fallback to only perform
name-based matching when the name is unambiguous: before evaluating the
name-based branch in the fallbackEvent search (the code that references
getEventStepIndex, getEventStepId, event.from, event.name, step.id, step.name on
sourceStepCompleteEvents), check that no other workflow step shares step.name
(i.e., verify uniqueness among the steps collection) and only then allow
matching by name; if the name is not unique, skip the name checks so
fallbackEvent only matches by step index or step id, then continue populating
replayStepData[step.id] with input/output/status via toWorkflowStepStatus as
before.
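
The lookup order described in this comment can be sketched as follows. The `Step` and `StepEvent` shapes and the `findStepEvent` helper are assumptions made for illustration; the real event and step types in `core.ts` carry more fields and may name them differently.

```typescript
interface Step {
  id: string;
  name: string;
  index: number;
}

interface StepEvent {
  stepIndex?: number;
  stepId?: string;
  name?: string;
}

// Hypothetical sketch of the lookup order: index, then id, then a unique name.
function findStepEvent(step: Step, steps: Step[], events: StepEvent[]): StepEvent | undefined {
  const byIndex = events.find((event) => event.stepIndex === step.index);
  if (byIndex) return byIndex;

  const byId = events.find((event) => event.stepId === step.id);
  if (byId) return byId;

  // Name matching is only safe when no other step shares the name.
  const nameIsUnique = steps.filter((other) => other.name === step.name).length === 1;
  return nameIsUnique ? events.find((event) => event.name === step.name) : undefined;
}
```

With two steps both named `fetch`, an event that carries only `name: "fetch"` matches neither of them, which is the intended outcome: no historical data is better than data from the wrong step.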
---
Nitpick comments:
In `@packages/core/src/workflow/time-travel.spec.ts`:
- Around line 10-14: The test is reaching into WorkflowRegistry internals by
casting to any and calling (registry as any).workflows.clear() and
activeExecutions.clear(), which is brittle; add a public test-only method on
WorkflowRegistry such as reset() or clear() that encapsulates clearing its
internal maps, then update the spec to call
WorkflowRegistry.getInstance().reset() (or .clear()) instead of accessing
workflows/activeExecutions directly; reference the existing WorkflowRegistry
class and its getInstance method and the internal maps named workflows and
activeExecutions when implementing the new method.
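
The `reset()` suggestion for `WorkflowRegistry` could look like the sketch below. `RegistrySketch` is a hypothetical stand-in written for this example; only the map names `workflows` and `activeExecutions` and the `getInstance` accessor come from the review, and the other methods exist just to make the sketch exercisable.

```typescript
// Hypothetical stand-in; the real WorkflowRegistry has more state and methods.
class RegistrySketch {
  private static instance: RegistrySketch | undefined;
  private readonly workflows = new Map<string, unknown>();
  private readonly activeExecutions = new Map<string, unknown>();

  static getInstance(): RegistrySketch {
    if (!RegistrySketch.instance) {
      RegistrySketch.instance = new RegistrySketch();
    }
    return RegistrySketch.instance;
  }

  register(id: string, workflow: unknown): void {
    this.workflows.set(id, workflow);
  }

  trackExecution(executionId: string, handle: unknown): void {
    this.activeExecutions.set(executionId, handle);
  }

  size(): number {
    return this.workflows.size + this.activeExecutions.size;
  }

  // Clears all internal maps so each test starts from a clean registry.
  reset(): void {
    this.workflows.clear();
    this.activeExecutions.clear();
  }
}
```

A spec would then call `RegistrySketch.getInstance().reset()` in `beforeEach` instead of casting to `any`, so a later rename of an internal map becomes a compile error rather than a test-isolation bug.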
In `@packages/server-core/src/handlers/workflow.handlers.ts`:
- Line 848: The handler currently destructures properties from an untyped body;
update the handler parameter type to z.infer<typeof WorkflowReplayRequestSchema>
(importing WorkflowReplayRequestSchema from the schemas package and z from 'zod'
if needed) so that body is strongly typed, then keep the existing destructuring
of stepId, inputData, resumeData, workflowStateOverride from body; this will
align the handler with the route validation and surface type mismatches at
compile time.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 3070-3081: The resumeOptions object created for replay/stream
resumes (the one that sets executionId and resumeFrom) fails to include the
effective memory adapter, so resumes use defaultMemory and lose state; update
the resumeOptions construction in the resume path (where
resumedSuspendController is created) to include the same memory adapter used by
the original run (e.g., add memory: effectiveMemory or memoryAdapter/the
workflow run's memory property) so the resumed WorkflowRunOptions carries the
same memory instance for reads/writes during replay/stream resumes.
In `@packages/server-core/src/handlers/workflow.handlers.ts`:
- Around line 904-912: The current httpStatus classification using
message/normalizedMessage in workflow.handlers.ts assigns many replay-prep
errors a 500; update the logic that sets httpStatus (the normalizedMessage
checks that produce 404/400/500) to treat user-correctable replay-prep
messages—e.g., phrases like "missing historical snapshots", "missing snapshot",
"missing input", "no historical", or "missing history"—as 400 (bad request)
instead of 500 while keeping the existing "not found" -> 404 mapping; modify the
conditional that computes httpStatus (refer to the variables message,
normalizedMessage, and httpStatus) to include these additional
normalizedMessage.includes(...) checks so those errors return 400.
          const resumedSuspendController = createDefaultSuspendController();
          const resumeOptions: WorkflowRunOptions = {
            executionId: suspendedResult.executionId,
            resumeFrom: {
              executionId: suspendedResult.executionId,
              checkpoint: suspendedResult.suspension.checkpoint,
              resumeStepIndex,
              resumeData: resumeInput,
            },
            suspendController: resumedSuspendController,
          };
Propagate the effective memory adapter into replay/stream resumes.
Line 3072–3079 and Line 3248–3254: resume options omit memory, so a replay/stream that started with a custom memory adapter will resume against defaultMemory and fail to find state. Carry the same memory through resume to keep state reads/writes consistent.
🧩 Suggested fix
timeTravelStream: (timeTravelOptions: WorkflowTimeTravelOptions) => {
const streamController = new WorkflowStreamController();
const executionId = randomUUID();
const startAt = new Date();
+ const replayMemory = timeTravelOptions.memory ?? defaultMemory;
const suspendController = createDefaultSuspendController();
@@
- const replayExecutionOptions: WorkflowRunOptions = {
- ...preparedReplay.executionOptions,
- suspendController,
- };
+ const replayExecutionOptions: WorkflowRunOptions = {
+ ...preparedReplay.executionOptions,
+ memory: replayMemory,
+ suspendController,
+ };
@@
- const resumeOptions: WorkflowRunOptions = {
- executionId: suspendedResult.executionId,
- resumeFrom: {
- executionId: suspendedResult.executionId,
- checkpoint: suspendedResult.suspension.checkpoint,
- resumeStepIndex,
- resumeData: resumeInput,
- },
- suspendController: resumedSuspendController,
- };
+ const resumeOptions: WorkflowRunOptions = {
+ executionId: suspendedResult.executionId,
+ memory: replayMemory,
+ resumeFrom: {
+ executionId: suspendedResult.executionId,
+ checkpoint: suspendedResult.suspension.checkpoint,
+ resumeStepIndex,
+ resumeData: resumeInput,
+ },
+ suspendController: resumedSuspendController,
+ };
@@
stream: (input: WorkflowInput<INPUT_SCHEMA>, options?: WorkflowRunOptions) => {
const streamController = new WorkflowStreamController();
const executionId = options?.executionId || crypto.randomUUID();
+ const executionMemory = options?.memory ?? defaultMemory;
@@
const executionOptions: WorkflowRunOptions = {
...options,
executionId,
suspendController,
+ memory: executionMemory,
};
@@
- const resumeOptions: WorkflowRunOptions = {
- executionId: suspendedResult.executionId,
- resumeFrom: {
- executionId: suspendedResult.executionId,
- checkpoint: suspendedResult.suspension.checkpoint,
- resumeStepIndex,
- resumeData: resumeInput,
- },
- suspendController: resumedSuspendController,
- };
+ const resumeOptions: WorkflowRunOptions = {
+ executionId: suspendedResult.executionId,
+ memory: executionMemory,
+ resumeFrom: {
+ executionId: suspendedResult.executionId,
+ checkpoint: suspendedResult.suspension.checkpoint,
+ resumeStepIndex,
+ resumeData: resumeInput,
+ },
+ suspendController: resumedSuspendController,
+ };
Also applies to: 3246-3256
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/core/src/workflow/core.ts` around lines 3070 - 3081, The
resumeOptions object created for replay/stream resumes (the one that sets
executionId and resumeFrom) fails to include the effective memory adapter, so
resumes use defaultMemory and lose state; update the resumeOptions construction
in the resume path (where resumedSuspendController is created) to include the
same memory adapter used by the original run (e.g., add memory: effectiveMemory
or memoryAdapter/the workflow run's memory property) so the resumed
WorkflowRunOptions carries the same memory instance for reads/writes during
replay/stream resumes.
    const message = error instanceof Error ? error.message : "Failed to replay workflow";
    const normalizedMessage = message.toLowerCase();
    const httpStatus = normalizedMessage.includes("not found")
      ? 404
      : normalizedMessage.includes("cannot time travel") ||
          normalizedMessage.includes("still running") ||
          normalizedMessage.includes("belongs to workflow")
        ? 400
        : 500;
Classify replay-prep errors as 400 instead of 500.
Line 906–912: errors like “missing historical snapshots/input” are user-correctable; returning 500 hides actionable feedback. Consider mapping those to 400.
🧩 Suggested update
- const httpStatus = normalizedMessage.includes("not found")
+ const httpStatus = normalizedMessage.includes("not found")
? 404
- : normalizedMessage.includes("cannot time travel") ||
- normalizedMessage.includes("still running") ||
- normalizedMessage.includes("belongs to workflow")
+ : normalizedMessage.includes("cannot time travel") ||
+ normalizedMessage.includes("still running") ||
+ normalizedMessage.includes("belongs to workflow") ||
+ normalizedMessage.includes("missing historical") ||
+ normalizedMessage.includes("missing persisted workflow input") ||
+ normalizedMessage.includes("missing historical input data")
? 400
        : 500;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/server-core/src/handlers/workflow.handlers.ts` around lines 904 -
912, The current httpStatus classification using message/normalizedMessage in
workflow.handlers.ts assigns many replay-prep errors a 500; update the logic
that sets httpStatus (the normalizedMessage checks that produce 404/400/500) to
treat user-correctable replay-prep messages—e.g., phrases like "missing
historical snapshots", "missing snapshot", "missing input", "no historical", or
"missing history"—as 400 (bad request) instead of 500 while keeping the existing
"not found" -> 404 mapping; modify the conditional that computes httpStatus
(refer to the variables message, normalizedMessage, and httpStatus) to include
these additional normalizedMessage.includes(...) checks so those errors return
400.
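
One way to keep this mapping readable as more phrases are added is an ordered rule table instead of a nested ternary. The sketch below is an alternative shape, not the handler's actual code: `classifyReplayError` and `REPLAY_ERROR_RULES` are hypothetical names, and the phrases are the ones quoted in this review, so the real error messages should be checked before relying on them. Using full phrases also avoids broad substring checks such as `"step"`.

```typescript
type ReplayErrorRule = { status: number; phrases: string[] };

// Ordered rules: the first rule with a matching phrase wins.
// The phrases mirror the ones quoted in this review; treat them as assumptions.
const REPLAY_ERROR_RULES: ReplayErrorRule[] = [
  { status: 404, phrases: ["not found"] },
  {
    status: 400,
    phrases: [
      "cannot time travel",
      "still running",
      "belongs to workflow",
      "missing historical",
      "missing persisted workflow input",
    ],
  },
];

function classifyReplayError(error: unknown): number {
  const message = error instanceof Error ? error.message : "Failed to replay workflow";
  const normalized = message.toLowerCase();
  const rule = REPLAY_ERROR_RULES.find((candidate) =>
    candidate.phrases.some((phrase) => normalized.includes(phrase)),
  );
  // Anything unrecognised is treated as a server-side failure.
  return rule ? rule.status : 500;
}
```

String matching on messages stays brittle whichever shape is used; typed error classes or error codes thrown from `@voltagent/core` would be a sturdier basis if the library exposes them.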
PR Checklist
Please check if your PR fulfills the following requirements:
Bugs / Features
What is the current behavior?
VoltAgent workflows support restart/crash recovery and async start, but do not expose a deterministic replay API for re-running historical executions from a selected step.
What is the new behavior?
Adds deterministic replay APIs for workflow and chain surfaces:
- `workflow.timeTravel(options)`
- `workflow.timeTravelStream(options)`
- `workflowChain.timeTravel(options)`
- `workflowChain.timeTravelStream(options)`
Also adds:
- `WorkflowTimeTravelOptions` public export
- `replayedFromExecutionId`
- `replayFromStepId`
- `website/docs/workflows/overview.md`
- `website/docs/workflows/suspend-resume.md`
- `website/docs/workflows/streaming.md`
- `@voltagent/core` minor release
Smoke tests run:
- `pnpm --filter @voltagent/core test:single src/workflow/time-travel.spec.ts`
- `pnpm --filter @voltagent/core test:single src/workflow/chain.spec.ts`
- `pnpm --filter @voltagent/core typecheck`
fixes (issue)
N/A
Notes for reviewers
`timeTravel` rejects `running` source executions; crash recovery remains `restart(...)` scope.
Summary by cubic
Adds deterministic time-travel replay for workflows and chains, letting you re-run a past execution from a specific step into a new execution while keeping the original run unchanged. Includes a replay REST endpoint and OpenTelemetry lineage linking for better debugging.
New Features
Bug Fixes
Written for commit 6829072. Summary will update on new commits.
Summary by CodeRabbit
New Features
SDK
API
Observability
Documentation
Tests
Chores