feat(core): add workflow time-travel deterministic replay APIs by omeraplak · Pull Request #1099 · VoltAgent/voltagent

omeraplak · 2026-02-22T02:39:57Z

PR Checklist

Please check if your PR fulfills the following requirements:

The commit message follows our guidelines: https://voltagent.dev/docs/community/contributing/#commit-convention

Bugs / Features

Related issue(s) linked
Tests for the changes have been added
Docs have been added / updated
Changesets have been added https://voltagent.dev/docs/community/contributing/#creating-a-changeset

What is the current behavior?

VoltAgent workflows support restart/crash recovery and async start, but do not expose a deterministic replay API for re-running historical executions from a selected step.

What is the new behavior?

Adds deterministic replay APIs for workflow and chain surfaces:

workflow.timeTravel(options)
workflow.timeTravelStream(options)
workflowChain.timeTravel(options)
workflowChain.timeTravelStream(options)

Also adds:

WorkflowTimeTravelOptions public export
Replay lineage fields persisted on workflow state:
- replayedFromExecutionId
- replayFromStepId
Documentation updates with usage examples in:
- website/docs/workflows/overview.md
- website/docs/workflows/suspend-resume.md
- website/docs/workflows/streaming.md
Changeset for @voltagent/core minor release

Smoke tests run:

pnpm --filter @voltagent/core test:single src/workflow/time-travel.spec.ts
pnpm --filter @voltagent/core test:single src/workflow/chain.spec.ts
pnpm --filter @voltagent/core typecheck

fixes (issue)

N/A

Notes for reviewers

timeTravel rejects source executions that are still running; crash recovery remains the job of restart(...).
Replay writes a new execution state and preserves original run history.
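The two rules above can be sketched as a small guard. This is a minimal illustration, not the code in this PR: the `ExecutionState` shape, the status values, and `prepareReplayState` are hypothetical, and only the `replayedFromExecutionId` / `replayFromStepId` field names come from the description.

```typescript
// Hypothetical shapes for illustration; only the two lineage field names come from the PR.
type ExecutionStatus = "running" | "completed" | "suspended" | "error" | "cancelled";

interface ExecutionState {
  id: string;
  workflowId: string;
  status: ExecutionStatus;
  replayedFromExecutionId?: string;
  replayFromStepId?: string;
}

// Builds the state for a replay run without mutating the source record.
function prepareReplayState(
  source: ExecutionState,
  stepId: string,
  newExecutionId: string,
): ExecutionState {
  if (source.status === "running") {
    // In-flight runs are rejected; recovering them is restart(...) territory.
    throw new Error(`Cannot time travel execution ${source.id}: it is still running`);
  }
  return {
    id: newExecutionId,
    workflowId: source.workflowId,
    status: "running",
    replayedFromExecutionId: source.id,
    replayFromStepId: stepId,
  };
}

const originalRun: ExecutionState = {
  id: "exec-1",
  workflowId: "order-approval",
  status: "completed",
};
const replayRun = prepareReplayState(originalRun, "approval-step", "exec-2");
```

The source record is only read, so the original run history stays intact while the new record carries the lineage back to it.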

Summary by cubic

Adds deterministic time-travel replay for workflows and chains, letting you re-run a past execution from a specific step into a new execution while keeping the original run unchanged. Includes a replay REST endpoint and OpenTelemetry lineage linking for better debugging.

New Features
- APIs: workflow.timeTravel/timeTravelStream and workflowChain.timeTravel/timeTravelStream; export WorkflowTimeTravelOptions.
- Replay engine: validates workflow/step, blocks running sources; reconstructs prior steps from checkpoints and step-complete events (by id/index/name); infers selected-step input with optional inputData/resumeData/workflowStateOverride; writes a new execution with lineage fields.
- Streaming: supports timeTravelStream with resume() and optional stepId override during replay; same stream interface as .stream().
- Observability: persists replayedFromExecutionId/replayFromStepId on workflow state; adds OTEL replay links and attributes (skips resume attrs for replays).
- REST: POST /workflows/:id/executions/:executionId/replay with typed request/response and error mapping (400/404/500); protected route wired in Elysia, Hono, and serverless-Hono.
- Docs & tests: endpoint docs with cURL/JS, updated workflow guides (overview/streaming/suspend-resume), new time-travel and OTEL trace-context tests; registry reset utility for deterministic tests.
Bug Fixes
- Streaming resumes now preserve the configured memory adapter across replay/resume calls.
- Server maps replay preparation issues (for example, missing historical snapshots) to HTTP 400 for clearer errors.
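The 400/404/500 mapping described above might look roughly like the following. This is a sketch under assumptions: the `ReplayFailure` union and `replayFailureToHttpStatus` are hypothetical names, and the real handler's error types are not shown in this summary.

```typescript
// Hypothetical failure categories; the actual handler may classify errors differently.
type ReplayFailure =
  | { kind: "not_found"; message: string } // workflow or source execution missing
  | { kind: "invalid_request"; message: string } // bad stepId, running source, missing snapshot
  | { kind: "internal"; message: string }; // anything unexpected

function replayFailureToHttpStatus(failure: ReplayFailure): 400 | 404 | 500 {
  switch (failure.kind) {
    case "not_found":
      return 404;
    case "invalid_request":
      return 400;
    case "internal":
      return 500;
  }
}

// A replay preparation problem is reported as a client error rather than a 500.
const snapshotStatus = replayFailureToHttpStatus({
  kind: "invalid_request",
  message: "missing historical snapshot",
});
```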

Written for commit 6829072. Summary will update on new commits.

Summary by CodeRabbit

New Features
- Deterministic "time travel" replay for workflows (standard + streaming), with optional input/resume/state overrides and persisted replay lineage.
SDK
- New workflow and workflow-chain methods to initiate deterministic replay and streaming replay; new public type to configure replays.
API
- New REST endpoint to create deterministic replay executions; request/response schemas updated to include replay options and lineage fields.
Observability
- Added replay-related trace/span attributes for provenance.
Documentation
- Comprehensive docs and examples for time travel, streaming, and suspend/resume.
Tests
- End-to-end tests covering replay flows, overrides, errors, and streaming events.
Chores
- Registry reset utility added for deterministic test isolation.

changeset-bot · 2026-02-22T02:40:02Z

🦋 Changeset detected

Latest commit: 6829072

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name | Type |
| --- | --- |
| @voltagent/core | Minor |


coderabbitai · 2026-02-22T02:40:16Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


Walkthrough

Adds deterministic "time travel" replay APIs to Workflow and WorkflowChain, implements streaming and resume semantics, persists replay lineage metadata on workflow state, updates observability/tracing for replay provenance, exposes public types, adds server endpoints/handlers, tests, and docs.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Public types & memory `packages/core/src/workflow/types.ts`, `packages/core/src/workflow/index.ts`, `packages/core/src/memory/types.ts` | Introduce `WorkflowTimeTravelOptions`, `WorkflowReplayOptions`, export the new type, and add `replayedFromExecutionId` / `replayFromStepId` to in-memory `WorkflowStateEntry`. |
| Core workflow logic `packages/core/src/workflow/core.ts`, `packages/core/src/workflow/types.ts` | Implement `timeTravel()` and `timeTravelStream()` with preparation/validation, reconstruction of input/state, replay metadata, streaming/resume mechanics, and helper utilities; add internal preparation types and extend run options for replay lineage. |
| WorkflowChain integration `packages/core/src/workflow/chain.ts` | Add `timeTravel()` and `timeTravelStream()` helpers that recreate a Workflow instance and delegate to workflow APIs. |
| Observability / tracing `packages/core/src/workflow/open-telemetry/trace-context.ts`, `packages/core/src/workflow/open-telemetry/trace-context.spec.ts`, `packages/core/src/observability/types.ts` | Add replay lineage links/attributes and span attributes for replay provenance; update tests to assert replay attributes and increase retry timing. |
| Server API & handlers `packages/server-core/src/handlers/workflow.handlers.ts`, `packages/server-core/src/auth/defaults.ts`, `packages/server-core/src/routes/definitions.ts`, `packages/server-core/src/schemas/agent.schemas.ts` | New handler `handleReplayWorkflow`, request/response schemas, route definition for POST /workflows/:id/executions/:executionId/replay, and protect the route in auth defaults. |
| Server frameworks - route wiring `packages/server-elysia/src/routes/workflow.routes.ts`, `packages/server-elysia/src/schemas.ts`, `packages/server-hono/src/routes/agent.routes.ts`, `packages/server-hono/src/routes/index.ts`, `packages/serverless-hono/src/routes.ts` | Register replay route across adapters and expose schemas/handlers; some files contain duplicate route blocks to review. |
| Tests `packages/core/src/workflow/time-travel.spec.ts` | Add Vitest suite covering replay from steps, input/resume overrides, error cases, preservation of original history, and streaming event assertions. |
| Docs - user guides & API docs `website/docs/workflows/overview.md`, `website/docs/workflows/streaming.md`, `website/docs/workflows/suspend-resume.md`, `website/docs/api/endpoints/workflows.md` | Add Time Travel / Deterministic Replay guides, streaming examples, and REST API docs; some duplicated sections/blocks appear and should be deduplicated. |
| Misc / infra & registry `.changeset/four-cobras-yawn.md`, `packages/core/src/workflow/registry.ts`, test retry tuning | Changelog entry, add `WorkflowRegistry.reset()` for test isolation, adjust test retry timing, and small manifest updates. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Workflow
    participant MemoryV2 as Memory (V2)
    participant Engine as ExecutionEngine

    Client->>Workflow: timeTravel({executionId, stepId, ...})
    activate Workflow
    Workflow->>MemoryV2: fetch source execution + per-step data
    activate MemoryV2
    MemoryV2-->>Workflow: execution record & step snapshots
    deactivate MemoryV2

    Workflow->>Workflow: validate, locate target step, reconstruct input/state, build lineage metadata
    Workflow->>Engine: start replay execution (new executionId) from target step
    activate Engine
    Engine->>MemoryV2: persist replay execution, steps, lineage metadata
    activate MemoryV2
    MemoryV2-->>Engine: persisted
    deactivate MemoryV2
    Engine-->>Workflow: emit replay events / suspension / completion
    deactivate Engine

    Workflow-->>Client: return WorkflowExecutionResult or stream events
    deactivate Workflow
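The "locate target step, reconstruct input/state" stage in the diagram can be illustrated with a small planning function. This is a sketch under assumptions, not the engine in this PR: `StepSnapshot`, `ReplayPlan`, and `planReplay` are hypothetical, and the real implementation draws on checkpoints and step-complete events whose shapes are not shown here.

```typescript
// Hypothetical snapshot shape; the real persisted step data is not shown in this walkthrough.
interface StepSnapshot {
  stepId: string;
  input: unknown;
  output: unknown;
}

interface ReplayPlan {
  startIndex: number; // index of the step the replay starts from
  priorSteps: StepSnapshot[]; // history carried over unchanged
  input: unknown; // input handed to the selected step
}

function planReplay(
  snapshots: StepSnapshot[], // ordered as executed in the source run
  stepId: string,
  inputOverride?: unknown,
): ReplayPlan {
  for (let i = 0; i < snapshots.length; i++) {
    const snapshot = snapshots[i];
    if (snapshot.stepId === stepId) {
      return {
        startIndex: i,
        priorSteps: snapshots.slice(0, i),
        // Assumption: prefer the caller's override, else reuse the recorded step input.
        input: inputOverride !== undefined ? inputOverride : snapshot.input,
      };
    }
  }
  throw new Error(`Step "${stepId}" not found in source execution`);
}

const sourceSnapshots: StepSnapshot[] = [
  { stepId: "validate", input: { amount: 1000 }, output: { amount: 1000 } },
  { stepId: "approve", input: { amount: 1000 }, output: { approved: false } },
  { stepId: "notify", input: { approved: false }, output: { sent: true } },
];

const defaultPlan = planReplay(sourceSnapshots, "approve");
const overriddenPlan = planReplay(sourceSnapshots, "approve", { amount: 2500 });
```

Steps before the selected one are reused as recorded history, and only the selected step onward would run again in the new execution.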

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

fix: preserve request correlation with root workflow and agent spans #1060 — overlaps changes to OpenTelemetry trace-context (span links/attributes) and tests for replay lineage.
fix(memory): add workflow state parity for SQL adapters #1082 — modifies workflow state model and persisted state fields, closely related to added replay state fields.
fix(core): persist workflow context mutations across steps #1078 — touches workflow core/state reconstruction and context propagation, related to time-travel preparation and state initialization.

Poem

🐇 I hopped back through steps and tracked each tiny clue,

A new run took the carrot while the old one stayed true,
Inputs nudged like treats, state tucked in neat,
Lineage stamped and tidy, every hop a beat,
Streams chimed each bounce — replay, precise and new.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main feature being added: workflow time-travel deterministic replay APIs for the core module. |
| Description check | ✅ Passed | The PR description is comprehensive and complete, with all required sections properly filled and detailed information about the new feature. |


cubic-dev-ai

2 issues found across 10 files

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/src/workflow/core.ts">

<violation number="1" location="packages/core/src/workflow/core.ts:126">
P2: Silently defaulting an unrecognized step status to `"success"` can mask data corruption during replay. Consider throwing an error (or at least logging a warning) when the status is invalid, rather than silently promoting it to success.</violation>

<violation number="2" location="packages/core/src/workflow/core.ts:2587">
P1: Using `.find()` returns the first matching step-complete event instead of the last. For retried steps, this retrieves data from the initial (failed) attempt rather than the final (successful) one. Use a reverse search to get the most recent event for the step.</violation>
</file>


cubic-dev-ai · 2026-02-22T02:44:09Z

+        continue;
+      }
+
+      const fallbackEvent = sourceStepCompleteEvents.find(


P1: Using .find() returns the first matching step-complete event instead of the last. For retried steps, this retrieves data from the initial (failed) attempt rather than the final (successful) one. Use a reverse search to get the most recent event for the step.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/workflow/core.ts, line 2587: <comment>Using `.find()` returns the first matching step-complete event instead of the last. For retried steps, this retrieves data from the initial (failed) attempt rather than the final (successful) one. Use a reverse search to get the most recent event for the step.</comment> <file context> @@ -2493,6 +2518,193 @@ export function createWorkflow< + continue; + } + + const fallbackEvent = sourceStepCompleteEvents.find( + (event) => + event.from === step.id || </file context>
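The difference between a first-match and a last-match lookup for a retried step can be shown in isolation. This is a minimal sketch: the `StepCompleteEvent` shape and `findLastEventForStep` are hypothetical (only the `from` field appears in the quoted diff), and a plain backwards loop stands in for whatever reverse search the fix ends up using.

```typescript
// Hypothetical event shape; only the `from` field appears in the quoted diff.
interface StepCompleteEvent {
  from: string;
  status: "error" | "success";
  output: unknown;
}

// Walks the array backwards so the most recent matching event wins.
function findLastEventForStep(
  events: StepCompleteEvent[],
  stepId: string,
): StepCompleteEvent | undefined {
  for (let i = events.length - 1; i >= 0; i--) {
    const event = events[i];
    if (event.from === stepId) {
      return event;
    }
  }
  return undefined;
}

// A step that failed once and then succeeded on retry.
const retriedEvents: StepCompleteEvent[] = [
  { from: "charge-card", status: "error", output: null },
  { from: "charge-card", status: "success", output: { receipt: "r-1" } },
];

// First match (what a forward search yields) versus last match.
const firstMatch = retriedEvents.filter((e) => e.from === "charge-card")[0];
const lastMatch = findLastEventForStep(retriedEvents, "charge-card");
```

Here `firstMatch` is the failed attempt and `lastMatch` is the successful retry. On runtimes that support ES2023, `Array.prototype.findLast` expresses the same reverse search directly.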

cloudflare-workers-and-pages · 2026-02-22T02:46:57Z

Deploying voltagent with Cloudflare Pages

Latest commit:	`6829072`
Status:	✅ Deploy successful!
Preview URL:	https://078a49e8.voltagent.pages.dev
Branch Preview URL:	https://feat-workflow-time-travel-de.voltagent.pages.dev


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

packages/core/src/workflow/chain.ts (1)

991-1023: Consider adding the same memory-requirement note as restart to both JSDoc comments.

timeTravel and timeTravelStream need access to the source execution state in persistent memory, just like restart. The existing restart and restartAllActive methods already document this explicitly (lines 1025–1030 and 1047–1050). Without the same caveat, callers using an ephemeral/default in-memory store may get a confusing runtime error.

📝 Proposed JSDoc additions

-  /**
-   * Replay a historical execution from the selected step
-   */
+  /**
+   * Replay a historical execution from the selected step.
+   * This recreates a workflow instance via `createWorkflow(...)` on each call.
+   * Use persistent/shared memory (or register the workflow) so prior execution state is discoverable.
+   * For ephemeral setup patterns, prefer `chain.toWorkflow().timeTravel(...)` and reuse that instance.
+   */
   async timeTravel(

-  /**
-   * Stream a historical replay from the selected step
-   */
+  /**
+   * Stream a historical replay from the selected step.
+   * This recreates a workflow instance via `createWorkflow(...)` on each call.
+   * Use persistent/shared memory (or register the workflow) so prior execution state is discoverable.
+   * For ephemeral setup patterns, prefer `chain.toWorkflow().timeTravelStream(...)` and reuse that instance.
+   */
   timeTravelStream(

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflow/chain.ts` around lines 991 - 1023, Add the same
persistent-memory requirement note used on restart/restartAllActive to the JSDoc
for timeTravel and timeTravelStream: update the comments for the timeTravel and
timeTravelStream methods in chain.ts (methods named timeTravel and
timeTravelStream) to explicitly state that these operations require access to
the source execution state in a persistent store (not the ephemeral/default
in-memory store) so callers know to configure a persistent memory backend to
avoid runtime errors.

packages/core/src/workflow/time-travel.spec.ts (1)

9-13: Add a test for the explicitly documented running-state rejection.

The PR notes call out: "timeTravel rejects source executions in the running state (crash recovery continues to use restart(...))", and the overview docs repeat this under "Notes". This is a load-bearing invariant that currently has no test coverage.

A test for a non-existent executionId would also be worth adding as a companion edge case.

✅ Suggested test additions

it("should reject time travel on a running execution", async () => {
  const memory = new Memory({ storage: new InMemoryStorageAdapter() });

  const workflow = createWorkflow(
    {
      id: "time-travel-running-rejection",
      name: "Time Travel Running Rejection",
      input: z.object({ value: z.number() }),
      result: z.object({ value: z.number() }),
      memory,
    },
    andThen({ id: "step-1", execute: async ({ data }) => data }),
  );

  const registry = WorkflowRegistry.getInstance();
  registry.registerWorkflow(workflow);

  // Manually persist a "running" state to simulate an in-flight execution
  const runningId = "exec-running-test";
  await memory.setWorkflowState(runningId, {
    id: runningId,
    workflowId: "time-travel-running-rejection",
    workflowName: "Time Travel Running Rejection",
    status: "running",
    createdAt: new Date(),
    updatedAt: new Date(),
  });

  await expect(
    workflow.timeTravel({ executionId: runningId, stepId: "step-1" }),
  ).rejects.toThrow(/running/i);
});

it("should fail with actionable error when execution does not exist", async () => {
  const memory = new Memory({ storage: new InMemoryStorageAdapter() });

  const workflow = createWorkflow(
    {
      id: "time-travel-missing-exec",
      name: "Time Travel Missing Exec",
      input: z.object({ value: z.number() }),
      result: z.object({ value: z.number() }),
      memory,
    },
    andThen({ id: "step-1", execute: async ({ data }) => data }),
  );

  const registry = WorkflowRegistry.getInstance();
  registry.registerWorkflow(workflow);

  await expect(
    workflow.timeTravel({ executionId: "non-existent-id", stepId: "step-1" }),
  ).rejects.toThrow();
});

Want me to open a new issue to track these missing test cases?

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflow/time-travel.spec.ts` around lines 9 - 13, Add two
unit tests to packages/core/src/workflow/time-travel.spec.ts: one that persists
a workflow state with status "running" (use Memory with InMemoryStorageAdapter,
createWorkflow and WorkflowRegistry.registerWorkflow) and asserts that
workflow.timeTravel({ executionId: runningId, stepId }) rejects with an error
mentioning "running"; and a second that calls workflow.timeTravel with a
non-existent executionId and asserts it rejects with an actionable error. Use
the existing createWorkflow/andThen helpers and Memory/WorkflowRegistry symbols
so the tests simulate the running/nonexistent states without starting actual
executions.

packages/core/src/workflow/core.ts (1)

122-127: Silent default to "success" for unrecognized statuses.

This is used when reconstructing historical step data for replay. If a step had an unexpected status value (e.g., data corruption), it would silently be treated as "success". Consider logging a warning when the fallback is used, so replay anomalies are observable.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflow/core.ts` around lines 122 - 127, The
toWorkflowStepStatus function silently defaults unrecognized values to
"success", which can hide replay anomalies; update toWorkflowStepStatus to log a
warning (including the raw value) whenever isWorkflowStepStatus(value) is false
before returning "success" — use the project's preferred logger (or console.warn
if none available) and reference toWorkflowStepStatus, isWorkflowStepStatus, and
WorkflowStepData["status"] so the warning makes replay data issues observable.
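One way the suggested warning could look is sketched below. The function names mirror those in the comment, but the bodies are hypothetical: the list of status values is an assumption (the real `WorkflowStepData["status"]` union is not shown here), and the injectable `warn` callback stands in for the project's logger.

```typescript
// Assumed status values for illustration; the real WorkflowStepData["status"] union may differ.
const KNOWN_STEP_STATUSES = ["running", "success", "error", "suspended", "cancelled", "skipped"] as const;
type StepStatus = (typeof KNOWN_STEP_STATUSES)[number];

function isWorkflowStepStatus(value: unknown): value is StepStatus {
  return (
    typeof value === "string" &&
    (KNOWN_STEP_STATUSES as readonly string[]).indexOf(value) !== -1
  );
}

// Keeps the existing "success" fallback but makes its use observable.
function toWorkflowStepStatus(
  value: unknown,
  warn: (message: string) => void = (message) => console.warn(message),
): StepStatus {
  if (isWorkflowStepStatus(value)) {
    return value;
  }
  warn(`Unrecognized workflow step status ${JSON.stringify(value)}; defaulting to "success"`);
  return "success";
}

// Collect warnings in an array instead of printing them.
const statusWarnings: string[] = [];
const recovered = toWorkflowStepStatus("bogus", (m) => statusWarnings.push(m));
const passedThrough = toWorkflowStepStatus("error", (m) => statusWarnings.push(m));
```

Throwing instead of warning, as the P2 finding also suggests, would be a stricter variant of the same check.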

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@website/docs/workflows/streaming.md`:
- Around line 45-50: The docs list of VoltAgent methods omits the new
.timeTravel() API; update the methods list in
website/docs/workflows/streaming.md to include `.timeTravel()` alongside
`.timeTravelStream()`, `.stream()`, `.run()`, and `.startAsync()` so the list
reflects all five public methods and their brief descriptions.


coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

website/docs/api/endpoints/workflows.md (1)
623-647: ⚠️ Potential issue | 🟡 Minor

Replay lineage fields (replayedFromExecutionId, replayFromStepId) absent from state response example.

The changeset documents that workflow state records now carry replayedFromExecutionId and replayFromStepId. API consumers polling the state endpoint on a replay execution will receive these fields with no reference in the docs.

📝 Proposed addition to the state response example

     "context": {
       "userId": "user-123",
       "priority": "high"
-    }
+    },
+    "replayedFromExecutionId": "exec_1234567890_abc123",
+    "replayFromStepId": "approval-required"
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@website/docs/api/endpoints/workflows.md` around lines 623 - 647, Update the
state response example to include the new replay lineage fields by adding
"replayedFromExecutionId" and "replayFromStepId" into the top-level "data"
object of the JSON example (use realistic sample values like
"exec_0987654321_replay" and "step-approval-1"); ensure these keys appear
alongside existing fields such as "executionId", "workflowId", "status", and
"suspension" so API consumers polling a replay execution will see the replay
lineage in the documented response.
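A client polling the state endpoint could use the two lineage fields to tell replays from original runs. The sketch below assumes a simplified payload: `WorkflowStateData` and `describeLineage` are hypothetical, and only the `replayedFromExecutionId` and `replayFromStepId` field names come from this PR.

```typescript
// Hypothetical subset of the state payload; only the two replay field names come from the PR.
interface WorkflowStateData {
  executionId: string;
  status: string;
  replayedFromExecutionId?: string;
  replayFromStepId?: string;
}

// Summarizes where an execution came from based on the optional lineage fields.
function describeLineage(state: WorkflowStateData): string {
  if (state.replayedFromExecutionId === undefined) {
    return `${state.executionId} is an original run`;
  }
  const step = state.replayFromStepId !== undefined ? state.replayFromStepId : "an unknown step";
  return `${state.executionId} replays ${state.replayedFromExecutionId} from ${step}`;
}

const replayLineage = describeLineage({
  executionId: "exec_replay_123",
  status: "completed",
  replayedFromExecutionId: "exec_1234567890_abc123",
  replayFromStepId: "approval-required",
});
```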

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@website/docs/api/endpoints/workflows.md`:
- Line 566: Fix the grammar in the error description string "400 - Invalid
replay parameters (for example invalid `stepId` or source execution still
running)" by inserting a comma after "for example" so it reads "...(for example,
invalid `stepId` or source execution still running)"; update that exact line in
workflows.md accordingly.
- Around line 546-562: Add the missing "suspended" response variant to the
replay response docs for the Execute endpoint: after the existing completed
example include a JSON example where "status": "suspended", "endAt": null,
"result": null and a "suspension" object containing "suspendedAt", "reason", and
"suspendedStepIndex" (keep executionId/startAt values consistent with the
example); ensure field names match the existing schema (executionId, startAt,
endAt, status, result, suspension) so replay documentation covers both completed
and suspended outcomes.
- Around line 513-613: Remove the entire "Replay Workflow" documentation block
that declares the POST /workflows/:id/executions/:executionId/replay endpoint
because there is no server route for it; instead, update docs to reflect that
replay/time-travel functionality is SDK-only (referencing the timeTravelStream
API) and ensure any examples, request/response schemas, and error cases for that
endpoint are deleted so the docs only show implemented REST endpoints (execute,
stream, suspend, cancel, resume, state retrieval).


coderabbitai · 2026-02-22T05:52:02Z

+## Replay Workflow
+
+Create a deterministic replay execution from a historical run and selected step.
+
+**Endpoint:** `POST /workflows/:id/executions/:executionId/replay`
+
+**Request Body:**
+
+```json
+{
+  "stepId": "approval-step",
+  "inputData": {
+    "amount": 2500
+  },
+  "resumeData": {
+    "approved": true,
+    "approvedBy": "ops-user-1"
+  },
+  "workflowStateOverride": {
+    "replayReason": "incident-1234"
+  }
+}
+```
+
+**Parameters:**
+
+| Field                   | Type   | Description                             |
+| ----------------------- | ------ | --------------------------------------- |
+| `stepId`                | string | Historical step ID to replay from       |
+| `inputData`             | any    | Optional selected-step input override   |
+| `resumeData`            | any    | Optional resume payload override        |
+| `workflowStateOverride` | object | Optional shared workflow state override |
+
+**Response:**
+
+```json
+{
+  "success": true,
+  "data": {
+    "executionId": "exec_replay_123",
+    "startAt": "2024-01-15T11:00:00.000Z",
+    "endAt": "2024-01-15T11:00:02.250Z",
+    "status": "completed",
+    "result": {
+      "approved": true,
+      "finalAmount": 2500
+    }
+  }
+}
+```
+
+**Error Cases:**
+
+- `400` - Invalid replay parameters (for example invalid `stepId` or source execution still running)
+- `404` - Workflow or source execution not found
+- `500` - Replay failed due to server error
+
+**cURL Example (Default Replay):**
+
+```bash
+curl -X POST http://localhost:3141/workflows/order-approval/executions/exec_123/replay \
+  -H "Content-Type: application/json" \
+  -d '{
+    "stepId": "approval-step"
+  }'
+```
+
+**cURL Example (Replay With Overrides):**
+
+```bash
+curl -X POST http://localhost:3141/workflows/order-approval/executions/exec_123/replay \
+  -H "Content-Type: application/json" \
+  -d '{
+    "stepId": "approval-step",
+    "inputData": { "amount": 2500 },
+    "resumeData": { "approved": true, "approvedBy": "ops-user-1" },
+    "workflowStateOverride": { "replayReason": "incident-1234" }
+  }'
+```
+
+**JavaScript Example:**
+
+```javascript
+const response = await fetch(
+  "http://localhost:3141/workflows/order-approval/executions/exec_123/replay",
+  {
+    method: "POST",
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify({
+      stepId: "approval-step",
+      inputData: { amount: 2500 },
+      resumeData: { approved: true, approvedBy: "ops-user-1" },
+      workflowStateOverride: { replayReason: "incident-1234" },
+    }),
+  }
+);
+
+const replay = await response.json();
+console.log("Replay execution ID:", replay.data.executionId);
+console.log("Replay status:", replay.data.status);
+```


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -n --type ts -C3 "replay.*stream|stream.*replay" --ignore-case -g '!*.spec.ts' -g '!*.md'

Repository: VoltAgent/voltagent

Length of output: 6140

🏁 Script executed:

# Also search for route/endpoint definitions related to replay
rg -n --type ts -C2 "POST.*replay|replay.*endpoint" --ignore-case -g '!*.spec.ts' -g '!*.md'

Repository: VoltAgent/voltagent

Length of output: 45

🏁 Script executed:

# Search for timeTravelStream method definitions
rg -n "timeTravelStream" --type ts -g '!*.spec.ts' -g '!*.md' -C3

Repository: VoltAgent/voltagent

Length of output: 2446

🏁 Script executed:

# Look for streaming endpoint definitions (like /stream endpoints) to understand the pattern
rg -n "\/stream|streaming.*endpoint" --type ts -C2 -g '!*.spec.ts' -g '!*.md' | head -100

Repository: VoltAgent/voltagent

Length of output: 157

🏁 Script executed:

# Look for all replay route definitions
rg -n "replay" --type ts packages/server-core/src/routes/ -C3

Repository: VoltAgent/voltagent

Length of output: 856

🏁 Script executed:

# Check if there are any timeTravelStream route handlers
rg -n "timeTravelStream" --type ts packages/server-elysia/src/routes/ packages/server-core/src/routes/ -C3

Repository: VoltAgent/voltagent

Length of output: 45

🏁 Script executed:

# Find all POST endpoint definitions in routes
rg -n "POST|method.*POST" --type ts packages/server-core/src/routes/definitions.ts -C2 | grep -A2 "replay"

Repository: VoltAgent/voltagent

Length of output: 45

🏁 Script executed:

# Find all route/path definitions in the routes files
rg -n "path.*replay|replay" --type ts packages/server-core/src/routes/definitions.ts packages/server-elysia/src/routes/ -C2

Repository: VoltAgent/voltagent

Length of output: 1196

🏁 Script executed:

# Check the actual route file to see all registered paths
head -150 packages/server-core/src/routes/definitions.ts | tail -100

Repository: VoltAgent/voltagent

Length of output: 3727

🏁 Script executed:

# List all routes in routes/definitions.ts to see the complete API
rg "path:" --type ts packages/server-core/src/routes/definitions.ts | head -20

Repository: VoltAgent/voltagent

Length of output: 737

🏁 Script executed:

# Get the complete list of paths to see all workflow-related routes
rg "path: \"/workflows" --type ts packages/server-core/src/routes/definitions.ts

Repository: VoltAgent/voltagent

Length of output: 515

🏁 Script executed:

# Search more broadly in the entire routes definitions file
wc -l packages/server-core/src/routes/definitions.ts

Repository: VoltAgent/voltagent

Length of output: 114

🏁 Script executed:

# Look for the replay endpoint in the entire file
cat packages/server-core/src/routes/definitions.ts | grep -n "replay" -i

Repository: VoltAgent/voltagent

Length of output: 282

🏁 Script executed:

# Check if there's a replay endpoint implementation elsewhere
rg -n "replay" --type ts packages/server-core/src/ packages/server-elysia/src/ -C2 | grep -i "post\|method\|handler\|route"

Repository: VoltAgent/voltagent

Length of output: 5370

🏁 Script executed:

# Check the handlers directory for replay-related code
ls -la packages/server-core/src/handlers/

Repository: VoltAgent/voltagent

Length of output: 1320

🏁 Script executed:

# Look at the workflow handlers to see what operations are exposed
rg -n "export.*function|export.*const" packages/server-core/src/handlers/workflow.handlers.ts | head -30

Repository: VoltAgent/voltagent

Length of output: 550

🏁 Script executed:

# Check if there's a replay handler defined somewhere
rg -n "handleReplay\|replay.*handler" --type ts packages/server-core/src/ packages/server-elysia/src/

Repository: VoltAgent/voltagent

Length of output: 45

🏁 Script executed:

# Get the actual list of all endpoints in routes/definitions.ts with method and path
rg "method:|path:" --type ts packages/server-core/src/routes/definitions.ts | grep -A1 "workflow" | head -50

Repository: VoltAgent/voltagent

Length of output: 800

🏁 Script executed:

# Search in the elysia routes file for any replay endpoint
cat packages/server-elysia/src/routes/workflow.routes.ts | grep -n "replay\|time.*travel" -i

Repository: VoltAgent/voltagent

Length of output: 265

🏁 Script executed:

# Double-check by searching the entire routes definitions for "replay" with full context
rg -n "replay" --type ts packages/server-core/src/routes/definitions.ts -B5 -A5

Repository: VoltAgent/voltagent

Length of output: 679

🏁 Script executed:

# Check the elysia routes file more comprehensively
cat packages/server-elysia/src/routes/workflow.routes.ts | grep -n "post\|get\|put\|delete" | head -40

Repository: VoltAgent/voltagent

Length of output: 423

🏁 Script executed:

# Check if the /replay endpoint exists in the route definitions at all
cat packages/server-core/src/routes/definitions.ts | grep -n "/replay" -i

Repository: VoltAgent/voltagent

Length of output: 45

🏁 Script executed:

# Search the entire codebase for any /replay route
rg "executions.*replay|/replay" --type ts -i

Repository: VoltAgent/voltagent

Length of output: 144

🏁 Script executed:

# Check if there's a handler for the replay endpoint
rg -n "handleReplay\|replayWorkflow" --type ts

Repository: VoltAgent/voltagent

Length of output: 45

🏁 Script executed:

# Check the elysia routes file for all POST endpoints to see if replay is there
cat packages/server-elysia/src/routes/workflow.routes.ts | grep -B2 "app.post"

Repository: VoltAgent/voltagent

Length of output: 505

🏁 Script executed:

# Check the actual documentation file to see what endpoints are documented
cat website/docs/api/endpoints/workflows.md | sed -n '513,613p'

Repository: VoltAgent/voltagent

Length of output: 2715

🏁 Script executed:

# Check if there are ANY replay endpoints documented further in the file
cat website/docs/api/endpoints/workflows.md | grep -n "replay\|time.*travel" -i

Repository: VoltAgent/voltagent

Length of output: 1213

🏁 Script executed:

# Verify that the /replay endpoint is actually NOT in the routes
wc -l packages/server-core/src/routes/definitions.ts && \
cat packages/server-core/src/routes/definitions.ts | grep -c "path:"

Repository: VoltAgent/voltagent

Length of output: 117

🏁 Script executed:

# Get a complete list of all endpoint definitions
rg "^\s+[a-zA-Z]+WorkflowOperations?:" --type ts packages/server-core/src/routes/definitions.ts | head -20

Repository: VoltAgent/voltagent

Length of output: 45

🏁 Script executed:

# Check if replay endpoint might be registered separately
find packages/server-elysia/src/routes -name "*.ts" -exec grep -l "replay" {} \;

Repository: VoltAgent/voltagent

Length of output: 115

🏁 Script executed:

# Search for any route handler that might implement replay
rg -n "replay" --type ts packages/server-elysia/src/routes/workflow.routes.ts

Repository: VoltAgent/voltagent

Length of output: 181

🏁 Script executed:

# Final verification: list all endpoint keys in definitions.ts
rg "^\s+[a-zA-Z]+Workflow[a-zA-Z]*:\s*\{" --type ts packages/server-core/src/routes/definitions.ts

Repository: VoltAgent/voltagent

Length of output: 270

🏁 Script executed:

# Check the documentation to confirm the endpoint claim is actually in the docs
grep -n "POST /workflows/:id/executions/:executionId/replay" website/docs/api/endpoints/workflows.md

Repository: VoltAgent/voltagent

Length of output: 133

Remove the entire "Replay Workflow" section (lines 513-613) — this endpoint does not exist in the server implementation.

The documented POST /workflows/:id/executions/:executionId/replay endpoint has no corresponding route handler in packages/server-core/src/routes/definitions.ts or packages/server-elysia/src/routes/workflow.routes.ts. The server implements only these workflow execution endpoints: execute, stream, suspend, cancel, resume, and state retrieval. The timeTravelStream method introduced in this PR exists as an SDK-only API (packages/core/src/workflow/core.ts), not as a REST endpoint.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@website/docs/api/endpoints/workflows.md` around lines 513 - 613, Remove the entire "Replay Workflow" documentation block that declares the POST /workflows/:id/executions/:executionId/replay endpoint because there is no server route for it; instead, update docs to reflect that replay/time-travel functionality is SDK-only (referencing the timeTravelStream API) and ensure any examples, request/response schemas, and error cases for that endpoint are deleted so the docs only show implemented REST endpoints (execute, stream, suspend, cancel, resume, state retrieval).

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (5)

packages/server-core/src/schemas/agent.schemas.ts (1)
459-460: Consider adding .min(1) to the stepId schema field.

The handler rejects empty stepId with a 400 (line 841–847 of workflow.handlers.ts), but the schema allows it — so an empty stepId passes schema validation and is rejected only at the handler level. Adding .min(1) to the schema would surface the error earlier and produce a more informative validation message.
-  stepId: z.string().describe("Step ID to replay from"),
+  stepId: z.string().min(1).describe("Step ID to replay from"),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/server-core/src/schemas/agent.schemas.ts` around lines 459 - 460,
The WorkflowReplayRequestSchema currently allows empty stepId values; update the
stepId validator in WorkflowReplayRequestSchema (the z.string().describe("Step
ID to replay from") entry) to enforce non-empty input by adding .min(1) so
schema validation rejects empty strings earlier and returns a proper validation
error instead of relying on the handler-level 400 check.
packages/core/src/workflow/time-travel.spec.ts (1)
10-13: Test cleanup should also clear activeExecutions to prevent cross-test interference in the sequential suite.

The test's beforeEach only clears the workflows Map, but the registry has a second stateful property: public activeExecutions (line 49 of registry.ts), which tracks active workflow suspensions and is actively modified during execution. Since the suite uses describe.sequential(), tests run one after another, and stale entries in activeExecutions from a previous test could interfere with subsequent tests.

Add (registry as any).activeExecutions.clear() to the beforeEach, or expose a public reset method in WorkflowRegistry to clean both maps without type-casting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflow/time-travel.spec.ts` around lines 10 - 13, The
beforeEach in the test clears only the workflows Map on WorkflowRegistry, but
leaves the registry's public activeExecutions state intact which can leak
between sequential tests; update the beforeEach to also clear activeExecutions
by calling (registry as any).activeExecutions.clear(), or preferably add and
call a new public reset/clear method on WorkflowRegistry that clears both
workflows and activeExecutions so tests start from a clean registry state before
each test.
packages/server-core/src/handlers/workflow.handlers.ts (1)
858-872: Import and use the exported WorkflowTimeTravelOptions type instead of the inline definition.

The inline type at lines 859–864 is a simplified replica of the exported WorkflowTimeTravelOptions from @voltagent/core. The differences include:

inputData and resumeData should use the typed DangerouslyAllowAny instead of unknown

workflowStateOverride should reference WorkflowStateStore instead of a raw Record<string, unknown>

Additionally, the return type (lines 865–871) is overly generic and differs from the actual WorkflowExecutionResult<RESULT_SCHEMA, RESUME_SCHEMA> signature—it includes an undefined startAt field and loses type safety for status and result.

Import WorkflowTimeTravelOptions from @voltagent/core and apply it to the type cast to maintain consistency with the library's type definitions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/server-core/src/handlers/workflow.handlers.ts` around lines 858 -
872, Replace the inline time-travel type on workflowWithReplay with the
library's exported types: import WorkflowTimeTravelOptions (and related types
like WorkflowExecutionResult, DangerouslyAllowAny, WorkflowStateStore) from
`@voltagent/core` and use WorkflowTimeTravelOptions as the parameter type for
timeTravel on registeredWorkflow.workflow; also update the timeTravel return
type to the proper WorkflowExecutionResult generic signature instead of the
ad-hoc object so the method signature on workflowWithReplay matches the core
definitions (refer to workflowWithReplay and registeredWorkflow.workflow to
locate the cast).
packages/core/src/workflow/core.ts (2)
2947-2960: The trailing .catch() is dead code.

.then(onFulfilled, onRejected) already handles both settlement cases — exactly one handler is called. Since neither the onFulfilled nor the onRejected branch here throws or returns a rejected promise, the promise returned by .then() always fulfills, and the trailing .catch(() => {}) never fires.
♻️ Remove the unreachable catch
  replayPromise
    .then(
      (result) => {
        if (result.status !== "suspended") {
          streamController.close();
        }
      },
      () => {
        streamController.close();
      },
-   )
-   .catch(() => {
-     // Error is surfaced through promise-backed fields on stream result.
-   });
+   );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflow/core.ts` around lines 2947 - 2960, The trailing
.catch on replayPromise is dead code; remove the final .catch(() => {}) and keep
the existing .then(result => { if (result.status !== "suspended")
streamController.close(); }, () => { streamController.close(); }) so both
fulfillment and rejection cases close the streamController; locate the call by
the unique symbols replayPromise and streamController in the file and delete
only the trailing .catch handler (or replace the whole chain with a .finally
that closes streamController if you prefer a single lifecycle handler).
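
The dead-code claim can be checked with plain promises. The sketch below is illustrative only: `settle` and the recorded labels are hypothetical names and are not part of `core.ts`. It attaches the same `.then(onFulfilled, onRejected).catch(...)` shape and records which handlers actually run.

```typescript
// Hypothetical helper: records which handlers fire for a given source promise.
async function settle(source: Promise<string>): Promise<string[]> {
  const calls: string[] = [];
  await source
    .then(
      () => {
        calls.push("onFulfilled");
      },
      () => {
        // Handles the rejection and returns undefined, so the chain fulfills.
        calls.push("onRejected");
      },
    )
    .catch(() => {
      // Reachable only if one of the two handlers above throws.
      calls.push("catch");
    });
  return calls;
}

async function demo(): Promise<void> {
  console.log(await settle(Promise.resolve("done"))); // ["onFulfilled"]
  console.log(await settle(Promise.reject(new Error("boom")))); // ["onRejected"]
}

void demo();
```

In both cases "catch" is never recorded, which matches the observation that the handlers here neither throw nor return a rejected promise.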
1038-1097: Minor observability gap: logs at lines 1061–1066 lack execution context.

The logger used here is the workflow-level LoggerProxy without executionId or traceId, because runLogger isn't created until line 1123. The warn on line 1063 ("No source trace IDs found in replay workflow state metadata") won't carry the replay execution ID, making it hard to correlate with a specific replay when debugging production issues.

The lookup logic and replayedFrom construction are otherwise correct.
♻️ Consider propagating executionId in the log call
-      logger.warn("No source trace IDs found in replay workflow state metadata");
+      logger.warn("No source trace IDs found in replay workflow state metadata", {
+        replayExecutionId: executionId,
+        sourceExecutionId: options.replayFrom.executionId,
+      });
-      logger.warn("Failed to get source trace IDs for replay:", { error });
+      logger.warn("Failed to get source trace IDs for replay:", {
+        error,
+        replayExecutionId: executionId,
+        sourceExecutionId: options.replayFrom.executionId,
+      });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflow/core.ts` around lines 1038 - 1097, The warning
logs when resolving replay/resume trace IDs (the logger.warn calls in the replay
branch around the handling of options.replayFrom.executionId and the
corresponding catch) lack execution context; update those logger.warn
invocations to include the replay executionId (options.replayFrom.executionId)
and any available execution identifiers (e.g., executionId) in the
metadata/object passed to the logger so the message can be correlated to the
specific replay (refer to replayedFrom, executionMemory.getWorkflowState, and
logger); do the same for the catch logger.warn that logs the error (include the
executionId value in the logged object) and mirror this small change in the
resume branch warnings so all trace-resolution warnings include executionId
context.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 2629-2644: The fallback matching is ambiguous because
event.from/event.name can equal a non-unique step.name; update the matching
logic in the sourceStepCompleteEvents.find call to first attempt an unambiguous
match using a stable identifier (prefer event.stepIndex === step.index if
stepIndex is present on events, or event.stepId === step.id if you emit stepId)
before falling back to name-based checks; adjust the search to check
event.stepIndex/event.stepId first, then only use event.from === step.id or
event.name === step.name as last resorts, and ensure replayStepData assignment
(and the call to toWorkflowStepStatus) uses the matched event.
- Around line 3048-3054: The resume path for time-travel streams currently
delegates from resumedStreamResult.resume to streamResult.resume which rebuilds
resume options from the original replay's metadata
(replayResult.suspension.checkpoint) causing second suspensions to replay from
the wrong checkpoint; change resumedStreamResult.resume (and the analogous logic
in the stream method) to construct resumeOptions from the current suspension
metadata (nextResult.suspension) instead of relying on
replayPromise/replayResult, i.e., use nextResult.suspension.checkpoint and
nextResult.suspension.metadata when calling resume so that subsequent resumes
start from the most recent suspension rather than the original one.

In `@packages/server-core/src/handlers/workflow.handlers.ts`:
- Around line 905-912: The httpStatus decision incorrectly treats any error
message containing "step" as a 400; update the conditional logic around
normalizedMessage in the httpStatus assignment (the variable named httpStatus
using normalizedMessage.includes(...)) to remove the broad
normalizedMessage.includes("step") check and either delete that clause entirely
or replace it with a precise matcher for missing-step cases (e.g., only match
"step not found" or a regex like /step.*not found/ or /step.*'[^']*' not found/)
so only genuine "step not found" errors map to 404/400 as intended and other
"step" mentions continue to result in 500.


cubic-dev-ai

2 issues found across 5 files (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="website/docs/api/endpoints/workflows.md">

<violation number="1" location="website/docs/api/endpoints/workflows.md:650">
P3: `replayedFromExecutionId` is documented elsewhere as the original execution ID; this example value is labeled like a replay execution ID, which inverts the lineage and can mislead users. Use an original/source execution ID in the example.</violation>
</file>

<file name="packages/core/src/workflow/core.ts">

<violation number="1" location="packages/core/src/workflow/core.ts:158">
P1: `Number("")` returns `0` which passes `Number.isInteger`, so an empty-string `metadata.stepIndex` is silently treated as step index 0. This can cause a false match against step 0 in the time-travel fallback logic, loading incorrect historical data into the replay. Add a guard for non-empty strings and non-negative values.</violation>
</file>


cubic-dev-ai · 2026-02-22T06:37:23Z

+    return metadataStepIndex;
+  }
+
+  if (typeof metadataStepIndex === "string") {


P1: Number("") returns 0 which passes Number.isInteger, so an empty-string metadata.stepIndex is silently treated as step index 0. This can cause a false match against step 0 in the time-travel fallback logic, loading incorrect historical data into the replay. Add a guard for non-empty strings and non-negative values.
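
A minimal sketch of the guard described above, assuming the step index may arrive as either a number or a string. The name `parseStepIndex` is hypothetical; the actual helper in `core.ts` may be shaped differently.

```typescript
// Hypothetical helper: returns a non-negative integer step index, or undefined.
function parseStepIndex(value: unknown): number | undefined {
  if (typeof value === "number") {
    return Number.isInteger(value) && value >= 0 ? value : undefined;
  }

  if (typeof value === "string") {
    const trimmed = value.trim();
    // Number("") and Number("   ") both evaluate to 0, so reject blanks before parsing.
    if (trimmed === "") {
      return undefined;
    }
    const parsed = Number(trimmed);
    return Number.isInteger(parsed) && parsed >= 0 ? parsed : undefined;
  }

  return undefined;
}

console.log(parseStepIndex("")); // undefined
console.log(parseStepIndex("3")); // 3
console.log(parseStepIndex("-1")); // undefined
```

Returning `undefined` for blank or negative values lets the caller fall through to the next matching strategy instead of silently matching step 0.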

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/src/workflow/core.ts, line 158:
<comment>`Number("")` returns `0` which passes `Number.isInteger`, so an empty-string `metadata.stepIndex` is silently treated as step index 0. This can cause a false match against step 0 in the time-travel fallback logic, loading incorrect historical data into the replay. Add a guard for non-empty strings and non-negative values.</comment>
<file context>
@@ -139,6 +139,49 @@ const toWorkflowStepStatus = (
+    return metadataStepIndex;
+  }
+
+  if (typeof metadataStepIndex === "string") {
+    const parsed = Number(metadataStepIndex);
+    if (Number.isInteger(parsed)) {
</file context>

cubic-dev-ai · 2026-02-22T06:37:23Z

    "executionId": "exec_1234567890_abc123",
    "workflowId": "order-approval",
    "status": "suspended",
+    "replayedFromExecutionId": "exec_0987654321_replay",


P3: replayedFromExecutionId is documented elsewhere as the original execution ID; this example value is labeled like a replay execution ID, which inverts the lineage and can mislead users. Use an original/source execution ID in the example.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At website/docs/api/endpoints/workflows.md, line 650:
<comment>`replayedFromExecutionId` is documented elsewhere as the original execution ID; this example value is labeled like a replay execution ID, which inverts the lineage and can mislead users. Use an original/source execution ID in the example.</comment>
<file context>
@@ -627,6 +647,8 @@ Retrieve the current state of a workflow execution.
     "executionId": "exec_1234567890_abc123",
     "workflowId": "order-approval",
     "status": "suspended",
+    "replayedFromExecutionId": "exec_0987654321_replay",
+    "replayFromStepId": "step-approval-1",
     "startAt": "2024-01-15T10:00:00.000Z",
</file context>

Suggested change

-    "replayedFromExecutionId": "exec_0987654321_replay",
+    "replayedFromExecutionId": "exec_0987654321_original",

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

packages/core/src/workflow/time-travel.spec.ts (1)
10-14: Registry cleanup accesses private members via any cast.

This pattern ((registry as any).workflows.clear()) is fragile — if the internal property is renamed or restructured, the test will silently stop cleaning up. Consider whether WorkflowRegistry could expose a reset() or clear() method for test use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflow/time-travel.spec.ts` around lines 10 - 14, The
test is reaching into WorkflowRegistry internals by casting to any and calling
(registry as any).workflows.clear() and activeExecutions.clear(), which is
brittle; add a public test-only method on WorkflowRegistry such as reset() or
clear() that encapsulates clearing its internal maps, then update the spec to
call WorkflowRegistry.getInstance().reset() (or .clear()) instead of accessing
workflows/activeExecutions directly; reference the existing WorkflowRegistry
class and its getInstance method and the internal maps named workflows and
activeExecutions when implementing the new method.
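
One possible shape for the suggested method is sketched below. `WorkflowRegistrySketch` is a stand-in written for this comment, not the real `WorkflowRegistry`; only the map names `workflows` and `activeExecutions` and the `getInstance` accessor come from the review, and everything else is assumed.

```typescript
// Stand-in for the real registry; field names follow the review comment.
class WorkflowRegistrySketch {
  private static instance: WorkflowRegistrySketch | undefined;

  private readonly workflows = new Map<string, unknown>();
  public readonly activeExecutions = new Map<string, unknown>();

  static getInstance(): WorkflowRegistrySketch {
    if (!WorkflowRegistrySketch.instance) {
      WorkflowRegistrySketch.instance = new WorkflowRegistrySketch();
    }
    return WorkflowRegistrySketch.instance;
  }

  registerWorkflow(id: string, workflow: unknown): void {
    this.workflows.set(id, workflow);
  }

  get workflowCount(): number {
    return this.workflows.size;
  }

  /** Test-only: clears every piece of state the registry holds. */
  reset(): void {
    this.workflows.clear();
    this.activeExecutions.clear();
  }
}

// In a spec, beforeEach would then need only a single call and no `as any` cast:
WorkflowRegistrySketch.getInstance().reset();
```

Keeping the cleanup inside the class means a renamed or added internal map is handled in one place rather than in every spec file.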
packages/server-core/src/handlers/workflow.handlers.ts (1)
848-848: body parameter is typed as any — consider using the Zod-inferred type.

Since WorkflowReplayRequestSchema exists in the schemas package, you could type body as z.infer<typeof WorkflowReplayRequestSchema> for stronger compile-time guarantees. This matches how the schema is enforced at the route layer and would surface mismatches earlier.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/server-core/src/handlers/workflow.handlers.ts` at line 848, The
handler currently destructures properties from an untyped body; update the
handler parameter type to z.infer<typeof WorkflowReplayRequestSchema> (importing
WorkflowReplayRequestSchema from the schemas package and z from 'zod' if needed)
so that body is strongly typed, then keep the existing destructuring of stepId,
inputData, resumeData, workflowStateOverride from body; this will align the
handler with the route validation and surface type mismatches at compile time.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 2627-2820: prepareTimeTravelExecution always uses defaultMemory
which breaks replays for executions stored in a custom memory adapter; add an
optional memory?: MemoryAdapter (or appropriate type) to
WorkflowTimeTravelOptions and update prepareTimeTravelExecution signature to
accept/use a memory parameter (falling back to defaultMemory) so calls to
getWorkflowState and setWorkflowState use that adapter; update callers
timeTravel and timeTravelStream to pass options.memory through (mirror how run()
and restartExecution() handle options?.memory) and ensure
restartCheckpoint/getRestartCheckpointFromMetadata logic still works with the
chosen adapter.

---

Duplicate comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 2686-2712: The name-based fallback can produce false matches when
multiple steps share the same step.name; change the fallback to only perform
name-based matching when the name is unambiguous: before evaluating the
name-based branch in the fallbackEvent search (the code that references
getEventStepIndex, getEventStepId, event.from, event.name, step.id, step.name on
sourceStepCompleteEvents), check that no other workflow step shares step.name
(i.e., verify uniqueness among the steps collection) and only then allow
matching by name; if the name is not unique, skip the name checks so
fallbackEvent only matches by step index or step id, then continue populating
replayStepData[step.id] with input/output/status via toWorkflowStepStatus as
before.
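
The matching order described above could look roughly like this. The `StepSketch` and `StepEventSketch` shapes and the `findStepEvent` name are assumptions made for illustration; the real code reads these values through `getEventStepIndex` and `getEventStepId`.

```typescript
// Assumed shapes, for illustration only.
interface StepSketch {
  id: string;
  name: string;
  index: number;
}

interface StepEventSketch {
  stepIndex?: number;
  stepId?: string;
  name?: string;
}

function findStepEvent(
  step: StepSketch,
  allSteps: StepSketch[],
  events: StepEventSketch[],
): StepEventSketch | undefined {
  // 1. Stable identifiers first: step index, then step id.
  const byIndex = events.find((event) => event.stepIndex === step.index);
  if (byIndex) return byIndex;

  const byId = events.find((event) => event.stepId === step.id);
  if (byId) return byId;

  // 2. Name-based fallback only when no other step shares the name.
  const nameIsUnique = allSteps.filter((other) => other.name === step.name).length === 1;
  if (!nameIsUnique) return undefined;

  return events.find((event) => event.name === step.name);
}
```

With two steps both named "process", an event that carries only `name: "process"` matches neither of them, so no historical data is attributed to the wrong step.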


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/src/workflow/core.ts`:
- Around line 3070-3081: The resumeOptions object created for replay/stream
resumes (the one that sets executionId and resumeFrom) fails to include the
effective memory adapter, so resumes use defaultMemory and lose state; update
the resumeOptions construction in the resume path (where
resumedSuspendController is created) to include the same memory adapter used by
the original run (e.g., add memory: effectiveMemory or memoryAdapter/the
workflow run's memory property) so the resumed WorkflowRunOptions carries the
same memory instance for reads/writes during replay/stream resumes.

In `@packages/server-core/src/handlers/workflow.handlers.ts`:
- Around line 904-912: The current httpStatus classification using
message/normalizedMessage in workflow.handlers.ts assigns many replay-prep
errors a 500; update the logic that sets httpStatus (the normalizedMessage
checks that produce 404/400/500) to treat user-correctable replay-prep
messages—e.g., phrases like "missing historical snapshots", "missing snapshot",
"missing input", "no historical", or "missing history"—as 400 (bad request)
instead of 500 while keeping the existing "not found" -> 404 mapping; modify the
conditional that computes httpStatus (refer to the variables message,
normalizedMessage, and httpStatus) to include these additional
normalizedMessage.includes(...) checks so those errors return 400.

coderabbitai · 2026-02-22T06:56:11Z

+        const resumedSuspendController = createDefaultSuspendController();
+        const resumeOptions: WorkflowRunOptions = {
+          executionId: suspendedResult.executionId,
+          resumeFrom: {
+            executionId: suspendedResult.executionId,
+            checkpoint: suspendedResult.suspension.checkpoint,
+            resumeStepIndex,
+            resumeData: resumeInput,
+          },
+          suspendController: resumedSuspendController,
+        };
+


⚠️ Potential issue | 🟠 Major

Propagate the effective memory adapter into replay/stream resumes.

Line 3072–3079 and Line 3248–3254: resume options omit memory, so a replay/stream that started with a custom memory adapter will resume against defaultMemory and fail to find state. Carry the same memory through resume to keep state reads/writes consistent.

🧩 Suggested fix

     timeTravelStream: (timeTravelOptions: WorkflowTimeTravelOptions) => {
       const streamController = new WorkflowStreamController();
       const executionId = randomUUID();
       const startAt = new Date();
+      const replayMemory = timeTravelOptions.memory ?? defaultMemory;
       const suspendController = createDefaultSuspendController();
@@
-      const replayExecutionOptions: WorkflowRunOptions = {
-        ...preparedReplay.executionOptions,
-        suspendController,
-      };
+      const replayExecutionOptions: WorkflowRunOptions = {
+        ...preparedReplay.executionOptions,
+        memory: replayMemory,
+        suspendController,
+      };
@@
-        const resumeOptions: WorkflowRunOptions = {
-          executionId: suspendedResult.executionId,
-          resumeFrom: {
-            executionId: suspendedResult.executionId,
-            checkpoint: suspendedResult.suspension.checkpoint,
-            resumeStepIndex,
-            resumeData: resumeInput,
-          },
-          suspendController: resumedSuspendController,
-        };
+        const resumeOptions: WorkflowRunOptions = {
+          executionId: suspendedResult.executionId,
+          memory: replayMemory,
+          resumeFrom: {
+            executionId: suspendedResult.executionId,
+            checkpoint: suspendedResult.suspension.checkpoint,
+            resumeStepIndex,
+            resumeData: resumeInput,
+          },
+          suspendController: resumedSuspendController,
+        };
@@
     stream: (input: WorkflowInput<INPUT_SCHEMA>, options?: WorkflowRunOptions) => {
       const streamController = new WorkflowStreamController();
       const executionId = options?.executionId || crypto.randomUUID();
+      const executionMemory = options?.memory ?? defaultMemory;
@@
       const executionOptions: WorkflowRunOptions = {
         ...options,
         executionId,
         suspendController,
+        memory: executionMemory,
       };
@@
-        const resumeOptions: WorkflowRunOptions = {
-          executionId: suspendedResult.executionId,
-          resumeFrom: {
-            executionId: suspendedResult.executionId,
-            checkpoint: suspendedResult.suspension.checkpoint,
-            resumeStepIndex,
-            resumeData: resumeInput,
-          },
-          suspendController: resumedSuspendController,
-        };
+        const resumeOptions: WorkflowRunOptions = {
+          executionId: suspendedResult.executionId,
+          memory: executionMemory,
+          resumeFrom: {
+            executionId: suspendedResult.executionId,
+            checkpoint: suspendedResult.suspension.checkpoint,
+            resumeStepIndex,
+            resumeData: resumeInput,
+          },
+          suspendController: resumedSuspendController,
+        };

Also applies to: 3246-3256

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/core/src/workflow/core.ts` around lines 3070 - 3081, The resumeOptions object created for replay/stream resumes (the one that sets executionId and resumeFrom) fails to include the effective memory adapter, so resumes use defaultMemory and lose state; update the resumeOptions construction in the resume path (where resumedSuspendController is created) to include the same memory adapter used by the original run (e.g., add memory: effectiveMemory or memoryAdapter/the workflow run's memory property) so the resumed WorkflowRunOptions carries the same memory instance for reads/writes during replay/stream resumes.
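
For illustration, the pattern the suggested diff follows (resolve the effective memory once, then reuse that same instance for both the initial run and any later resume) can be sketched in isolation. This is a minimal sketch, not the code in `core.ts`: `MemoryAdapter`, `RunOptions`, `createRun`, and `buildResumeOptions` are hypothetical, simplified stand-ins for the real VoltAgent types and helpers.

```typescript
// Hypothetical, simplified stand-ins for the real VoltAgent types.
interface MemoryAdapter {
  readonly name: string;
}

interface RunOptions {
  executionId: string;
  memory?: MemoryAdapter;
  resumeFrom?: { executionId: string; resumeStepIndex: number; resumeData?: unknown };
}

// Assumed fallback adapter, mirroring the `defaultMemory` name in the diff.
const defaultMemory: MemoryAdapter = { name: "default" };

// Resolve the effective memory once and close over it, so every resume
// built for this run carries the same adapter instance as the initial run.
function createRun(executionId: string, options?: { memory?: MemoryAdapter }) {
  const effectiveMemory = options?.memory ?? defaultMemory;

  const initialOptions: RunOptions = { executionId, memory: effectiveMemory };

  const buildResumeOptions = (resumeStepIndex: number, resumeData?: unknown): RunOptions => ({
    executionId,
    memory: effectiveMemory, // same instance as the initial run
    resumeFrom: { executionId, resumeStepIndex, resumeData },
  });

  return { initialOptions, buildResumeOptions };
}
```

With this shape, a caller that passes a custom adapter gets it back on every resume, and a caller that passes nothing gets the default on both paths, so the two cannot drift apart.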

coderabbitai · 2026-02-22T06:56:11Z

+    const message = error instanceof Error ? error.message : "Failed to replay workflow";
+    const normalizedMessage = message.toLowerCase();
+    const httpStatus = normalizedMessage.includes("not found")
+      ? 404
+      : normalizedMessage.includes("cannot time travel") ||
+          normalizedMessage.includes("still running") ||
+          normalizedMessage.includes("belongs to workflow")
+        ? 400
+        : 500;


⚠️ Potential issue | 🟡 Minor

Classify replay-prep errors as 400 instead of 500.

Line 906–912: errors like “missing historical snapshots/input” are user-correctable; returning 500 hides actionable feedback. Consider mapping those to 400.

🧩 Suggested update

-    const httpStatus = normalizedMessage.includes("not found")
+    const httpStatus = normalizedMessage.includes("not found")
       ? 404
-      : normalizedMessage.includes("cannot time travel") ||
-          normalizedMessage.includes("still running") ||
-          normalizedMessage.includes("belongs to workflow")
+      : normalizedMessage.includes("cannot time travel") ||
+          normalizedMessage.includes("still running") ||
+          normalizedMessage.includes("belongs to workflow") ||
+          normalizedMessage.includes("missing historical") ||
+          normalizedMessage.includes("missing persisted workflow input") ||
+          normalizedMessage.includes("missing historical input data")
         ? 400
         : 500;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/server-core/src/handlers/workflow.handlers.ts` around lines 904 - 912, The current httpStatus classification using message/normalizedMessage in workflow.handlers.ts assigns many replay-prep errors a 500; update the logic that sets httpStatus (the normalizedMessage checks that produce 404/400/500) to treat user-correctable replay-prep messages—e.g., phrases like "missing historical snapshots", "missing snapshot", "missing input", "no historical", or "missing history"—as 400 (bad request) instead of 500 while keeping the existing "not found" -> 404 mapping; modify the conditional that computes httpStatus (refer to the variables message, normalizedMessage, and httpStatus) to include these additional normalizedMessage.includes(...) checks so those errors return 400.
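
As a rough sketch of the classification being discussed, the nested ternary can be expressed as a small standalone function. The name `classifyReplayError` and the `BAD_REQUEST_PHRASES` constant are hypothetical and not part of `workflow.handlers.ts`; the matched phrases are the ones listed in the suggested update above, and whether the real error messages contain exactly these phrases is an assumption.

```typescript
// Hypothetical phrase list, copied from the suggested update; any message
// containing one of these is treated as a user-correctable request error.
const BAD_REQUEST_PHRASES: readonly string[] = [
  "cannot time travel",
  "still running",
  "belongs to workflow",
  "missing historical",
  "missing persisted workflow input",
];

// Hypothetical helper: maps a thrown value to an HTTP status code.
function classifyReplayError(error: unknown): 404 | 400 | 500 {
  // Same fallback message as the handler diff uses for non-Error values.
  const message = error instanceof Error ? error.message : "Failed to replay workflow";
  const normalized = message.toLowerCase();

  if (normalized.includes("not found")) return 404;
  if (BAD_REQUEST_PHRASES.some((phrase) => normalized.includes(phrase))) return 400;
  return 500;
}
```

Note that the suggested check for "missing historical input data" is already covered by the shorter "missing historical" substring, so the sketch lists only the latter. Matching on message text remains sensitive to wording changes in the thrown errors.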

…rors

feat(core): add workflow time-travel replay APIs

c187e24

cubic-dev-ai bot reviewed Feb 22, 2026
