
fix(sdk): make --resume work for script workflows#725

Merged: khaliqgant merged 2 commits into main from fix/workflow-resume on Apr 13, 2026

@khaliqgant khaliqgant commented Apr 13, 2026

Summary

Script workflows (TS files that call workflow(...).run()) could not be resumed. Three linked bugs in packages/sdk/src/workflows/builder.ts:

  1. WorkflowBuilder.run() constructed a WorkflowRunner without a db option, so it silently defaulted to InMemoryWorkflowDb and .agent-relay/workflow-runs.jsonl was never written for script workflows. --resume then found no db entry.
  2. The cache-reconstruction fallback (reconstructRunFromCache) bailed out for script workflows because, when no config is passed, it looks for relay.yaml on disk, and script workflows do not have a relay.yaml.
  3. Both runner.resume() call sites in builder.ts passed only (resumeRunId, options.vars) and never forwarded the config the script had just built. Even if (2) had been fixed, the config still would not have reached the fallback.

Net effect: a script workflow that fails mid-DAG with cached step outputs on disk still threw Run "<id>" not found (no database entry or cached step outputs) on resume.
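
To make the failure mode concrete, here is a simplified model of the resume lookup. It is a sketch only, not the SDK's code: the `Config` and `Run` shapes, `loadYamlConfig`, `stepCache`, and the function bodies are all hypothetical stand-ins. Only the three-argument `resume` shape and the error text come from the description above.

```typescript
// Hypothetical, simplified model of the resume lookup; not the SDK's actual code.
type Config = { name: string; steps: string[] };
type Run = { id: string; config: Config; completed: string[] };

// Stand-in for the run database. It is empty here, as it would be in a new
// process when the previous run only used an in-memory db (bug 1).
const db = new Map<string, Run>();

// Stand-in for .agent-relay/step-outputs/<run-id>/: run id -> cached step names.
const stepCache = new Map<string, string[]>([["run-1", ["plan"]]]);

// Stand-in for reading relay.yaml. Script workflows have none, so this yields null (bug 2).
function loadYamlConfig(): Config | null {
  return null;
}

function reconstructFromCache(runId: string, config?: Config): Run | null {
  const resolved = config ?? loadYamlConfig();
  const cached = stepCache.get(runId);
  if (!resolved || !cached) return null;
  return { id: runId, config: resolved, completed: cached };
}

function resume(runId: string, vars?: Record<string, string>, config?: Config): Run {
  // vars is unused in this model; it is kept only to mirror the argument order.
  const run = db.get(runId) ?? reconstructFromCache(runId, config);
  if (!run) {
    throw new Error(`Run "${runId}" not found (no database entry or cached step outputs)`);
  }
  return run;
}

const built: Config = { name: "my-workflow", steps: ["plan", "build"] };

// Two-argument call (bug 3): the cache exists, but no config reaches the fallback.
let failedWithoutConfig = false;
try {
  resume("run-1", {});
} catch {
  failedWithoutConfig = true;
}

// Three-argument call: the fallback can rebuild the run from the cache.
const resumed = resume("run-1", {}, built);
```

In this model the two-argument call throws even though cached outputs exist, while the three-argument call returns a run with `plan` already marked complete.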

Fix

  • Default WorkflowBuilder.run() to JsonFileWorkflowDb pointed at <cwd>/.agent-relay/workflow-runs.jsonl — matching the YAML CLI code path in cli.ts:338.
  • Forward the built config as the third arg into both runner.resume() call sites so reconstructRunFromCache can use it without needing relay.yaml.
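
Why the db default matters can be shown with a small, self-contained sketch. Everything here is an assumption for illustration: the `RunDb` interface, `MemoryRunDb`, `JsonlRunDb`, and `resolveDb` are hypothetical names, and the real `JsonFileWorkflowDb` may store records differently. Only the `.agent-relay/workflow-runs.jsonl` location is taken from the PR.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type RunRecord = { id: string; status: "running" | "failed" | "completed" };

// Hypothetical minimal interface; the SDK's real db interface is likely richer.
interface RunDb {
  save(run: RunRecord): void;
  find(id: string): RunRecord | undefined;
}

// State lives only in this object, so it is lost when the process exits.
class MemoryRunDb implements RunDb {
  private runs = new Map<string, RunRecord>();
  save(run: RunRecord): void {
    this.runs.set(run.id, run);
  }
  find(id: string): RunRecord | undefined {
    return this.runs.get(id);
  }
}

// Append-only JSONL file: one JSON record per line.
class JsonlRunDb implements RunDb {
  constructor(private readonly file: string) {}
  save(run: RunRecord): void {
    fs.mkdirSync(path.dirname(this.file), { recursive: true });
    fs.appendFileSync(this.file, JSON.stringify(run) + "\n");
  }
  find(id: string): RunRecord | undefined {
    if (!fs.existsSync(this.file)) return undefined;
    const lines = fs.readFileSync(this.file, "utf8").split("\n").filter(Boolean);
    // Later lines win, so the newest record for an id is returned.
    let latest: RunRecord | undefined;
    for (const line of lines) {
      const rec = JSON.parse(line) as RunRecord;
      if (rec.id === id) latest = rec;
    }
    return latest;
  }
}

// Default to the file-backed db unless the caller supplies one.
function resolveDb(cwd: string, db?: RunDb): RunDb {
  return db ?? new JsonlRunDb(path.join(cwd, ".agent-relay", "workflow-runs.jsonl"));
}

// Use a temp directory as the working directory for the demonstration.
const cwd = fs.mkdtempSync(path.join(os.tmpdir(), "resume-sketch-"));
resolveDb(cwd).save({ id: "run-1", status: "failed" });

// A fresh instance simulates a second process invoked with --resume.
const foundOnDisk = resolveDb(cwd).find("run-1");

// The in-memory variant: a fresh instance knows nothing about the earlier run.
const mem = new MemoryRunDb();
mem.save({ id: "run-1", status: "failed" });
const foundInFreshMemory = new MemoryRunDb().find("run-1");
```

The file-backed lookup finds the failed run from a new instance, while the in-memory one does not, which mirrors the "no database entry" half of the original error.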

Test plan

  • New vitest coverage: builder-resume-persistence.test.ts with two tests — "run state persisted to JSONL by default" and "resume reconstructs from cached step outputs when jsonl is missing"
  • Full packages/sdk test suite green (regression sweep covers resume-fallback, builder-deterministic, start-from)
  • Dist-level verification: rebuilt packages/sdk and grepped the compiled dist/workflows/builder.js for JsonFileWorkflowDb and the 3-arg runner.resume(...) call
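
The dist-level check could also be scripted rather than grepped by hand. The sketch below is an assumption about how one might do that, not what the PR ran: `missingMarkers` and the regular expressions are hypothetical, and the three-argument pattern is a rough heuristic that will not handle nested calls or commas inside arguments.

```typescript
// Returns the names of markers whose pattern does not appear in the source text.
function missingMarkers(source: string, markers: Record<string, RegExp>): string[] {
  return Object.keys(markers).filter((name) => !markers[name].test(source));
}

const markers: Record<string, RegExp> = {
  jsonFileDb: /\bJsonFileWorkflowDb\b/,
  // Heuristic: .resume( followed by exactly three simple comma-separated arguments.
  threeArgResume: /\.resume\(\s*[^,()]+,\s*[^,()]+,\s*[^,()]+\)/,
};

// Inline samples stand in for the compiled file contents.
const fixedSample =
  "const db = new JsonFileWorkflowDb(p); await runner.resume(resumeRunId, options.vars, config);";
const brokenSample = "await runner.resume(resumeRunId, options.vars);";

const fixedMissing = missingMarkers(fixedSample, markers);
const brokenMissing = missingMarkers(brokenSample, markers);
```

Against a real build, `source` would be the contents of the compiled dist/workflows/builder.js read from disk, and a non-empty result would indicate a stale build.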

Repro before this fix

  1. Run any script workflow: agent-relay run workflows/my.ts
  2. Workflow fails at some step; note the printed run id
  3. agent-relay run workflows/my.ts --resume <run-id> → throws Run "<id>" not found (no database entry or cached step outputs) even though .agent-relay/step-outputs/<run-id>/ has cached .md files on disk

🤖 Generated with Claude Code



khaliqgant and others added 2 commits April 12, 2026 22:12
Script workflows (TS files that call workflow(...).run()) could not
be resumed. Two linked bugs:

1. WorkflowBuilder.run() constructed a WorkflowRunner without a db
   option, so it silently defaulted to InMemoryWorkflowDb and the
   .agent-relay/workflow-runs.jsonl file was never written. --resume
   then found no db entry.

2. The cache-reconstruction fallback (reconstructRunFromCache) bailed
   out for script workflows because builder.run() called
   runner.resume(runId, vars) with only two args, never forwarding
   the config it just built. reconstructRunFromCache then tried to
   read relay.yaml from disk (which does not exist for script
   workflows) and returned null, triggering the "Run not found
   (no database entry or cached step outputs)" error even when
   step outputs WERE cached on disk.

Fix both by:
- defaulting WorkflowBuilder.run() to JsonFileWorkflowDb pointed
  at {cwd}/.agent-relay/workflow-runs.jsonl, matching the YAML
  CLI code path in cli.ts
- forwarding the built config into both runner.resume() call sites

Adds vitest coverage in builder-resume-persistence.test.ts for both
the "run state persisted" and "resume from cache without jsonl"
paths. Full sdk test suite remains green.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@devin-ai-integration devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.


@khaliqgant khaliqgant merged commit 285656b into main Apr 13, 2026
38 checks passed
@khaliqgant khaliqgant deleted the fix/workflow-resume branch April 13, 2026 05:02