Skip to content

Add @rivetkit/world-workflow package for Vercel Workflow SDK#4699

Open
jog1t wants to merge 2 commits intomainfrom
claude/rivet-world-workflow-pdBpl
Open

Add @rivetkit/world-workflow package for Vercel Workflow SDK#4699
jog1t wants to merge 2 commits intomainfrom
claude/rivet-world-workflow-pdBpl

Conversation

@jog1t
Copy link
Copy Markdown
Contributor

@jog1t jog1t commented Apr 22, 2026

Description

This PR introduces @rivetkit/world-workflow, a new package that implements the Vercel Workflow SDK's World interface using Rivet Actors as the backing storage and queue infrastructure.

The implementation replaces traditional Postgres + queue infrastructure with three Rivet actors:

  • workflowRunActor — One per workflow run (keyed by runId). Owns:

    • Append-only event log for the run
    • Materialized run, steps, and hooks state
    • Named streams keyed by stream name
    • All mutations go through createEvent to maintain the event log as the source of truth
  • coordinatorActor — Singleton (keyed by ["coordinator"]). Serves as a cross-run index for:

    • Paginated runs.list() queries
    • Global hook token lookup via hooks.getByToken()
    • Hook token uniqueness enforcement
  • queueActor — One per ValidQueueName (e.g., __wkf_workflow_0). Handles:

    • Message enqueueing with idempotency and retry tracking
    • Message dispatch to user-provided handlers
    • Retry backoff based on configurable policies

The package exports:

  • createRivetWorld(config) — Factory function to create a World instance
  • registry — Pre-configured RivetKit registry containing all three actors
  • Type definitions mirroring the Vercel Workflow SDK's public contract

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

The package compiles successfully as a standalone workspace member. The implementation provides working implementations for the Storage, Queue, and Streamer interfaces with the following known limitations documented in the README:

  • readFromStream live streaming is not yet wired to the streamAppended actor event
  • hooks.get(hookId) lacks a global index; callers must use hooks.getByToken
  • events.listByCorrelationId returns empty results (correlation IDs are not globally indexed)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • Package compiles and type-checks successfully

https://claude.ai/code/session_014t3My5a42LHJ1umwV3vGgh

Adds a new package implementing the Vercel Workflow SDK World interface
backed by three Rivet actors: a per-run workflow actor holding the
event log and materialized state, a singleton coordinator for cross-run
indexes and hook token uniqueness, and a per-queue runner with retry
and idempotency handling. Includes local World type definitions,
createRivetWorld factory, registry, and README documenting current
gaps (readFromStream live streaming, global hook lookup, correlation
id index).
@railway-app
Copy link
Copy Markdown

railway-app Bot commented Apr 22, 2026

🚅 Deployed to the rivet-pr-4699 environment in rivet-frontend

Service Status Web Updated (UTC)
frontend-inspector ❌ Build Failed (View Logs) Web Apr 22, 2026 at 10:42 pm
kitchen-sink ❌ Build Failed (View Logs) Web Apr 22, 2026 at 10:42 pm
frontend-cloud ❌ Build Failed (View Logs) Web Apr 22, 2026 at 10:42 pm
website ❌ Build Failed (View Logs) Web Apr 22, 2026 at 10:22 pm
mcp-hub ✅ Success (View Logs) Web Apr 22, 2026 at 10:21 pm
ladle ❌ Build Failed (View Logs) Web Apr 22, 2026 at 10:21 pm

@claude
Copy link
Copy Markdown

claude Bot commented Apr 22, 2026

Code Review: @rivetkit/world-workflow (PR #4699)

This is a solid architectural foundation. The three-actor decomposition (event-sourced run actor, singleton coordinator index, per-queue actor) is a natural fit for the Rivet Actor model and maps cleanly onto the Vercel Workflow SDK World interface. The event-log-as-source-of-truth pattern in workflowRunActor is good. Notes below range from critical bugs to minor observations.


Critical Bugs

1. In-flight messages become permanently stuck after actor restart (queue.ts)

inFlight is persisted in actor state, but claimNext only scans pending, not inFlight. If the actor hibernates or restarts while a message is claimed, that message is orphaned and never retried. This is a durability hole in a system otherwise designed to be durable.

Fix: on actor startup (or at the top of claimNext), requeue any messages that have been in inFlight for longer than their expected dispatch window.

2. Date fields survive as strings over the RPC boundary, not Date objects

runToPublic, stepToPublic, etc. inside the actor construct new Date(timestamp). When serialized over the RPC boundary to the client, Date instances become ISO 8601 strings. The caller receives strings where the World interface declares Date.

revivePaginated in world.ts casts unknown[] as T[] without any deserialization, so steps.list, events.list, and hooks.list all return objects with string dates where Date objects are expected.

Fix: actor-side *ToPublic helpers should return plain millisecond timestamps, and world.ts should convert them to Date objects after the RPC call.


Significant Issues

3. Unbounded state growth across all three actors

  • workflowRunActor: c.state.events is an unbounded append-only array with no pruning path.
  • workflowRunActor: c.state.idempotencyKeys is acknowledged as unbounded in a code comment.
  • coordinatorActor: c.state.runs and c.state.hookTokens hold every run and hook ever created. As a singleton, this is the biggest accumulator in the system.
  • queueActor: idempotencyKeys has the same issue.

These are acceptable in an early scaffold, but each should have a tracking issue.

4. Offset-based cursor pagination is not stable under concurrent writes (both actors)

The cursor is a stringified integer array index. New items inserted between pages shift all subsequent pages. A keyset cursor would give stable pages. This is especially problematic in coordinator.listRuns where new runs register concurrently with list calls.

5. events.listByCorrelationId silently returns empty

Returning { data: [], cursor: null, hasMore: false } with no indication that the operation is unimplemented will cause silent failures when the SDK exercises this path. Throwing a NotImplementedError or logging a warning would surface the gap clearly.

6. dispatchMessage errors are fully swallowed

Failed dispatches are completely invisible (dispatchMessage(queueName).catch(() => {})). At minimum, log the error so operators know when queue dispatch is failing.

7. Partial failure can leak a hook token in the coordinator (world.ts)

In events.create for hook_created, the coordinator registers the token first, then the run actor writes the event. If the run actor write fails, the coordinator holds a live token reservation but the run actor has no matching hook. The token is leaked until disposeHookTokensForRun clears it at run completion.


Minor Issues

8. Dead code in shared.ts

serializeDates, deserializeDates, toDate, and RUN_DATE_FIELDS are exported but not imported anywhere in the package. Remove them or use them.

9. coordinatorHandle called multiple times in one request path (world.ts)

In the run-status-update branch of events.create, coordinatorHandle(client) is called up to three times. Assign it to a const once at the top of the branch.

10. nowMs() adds no value (shared.ts)

It is a one-liner wrapper over Date.now() with no additional behavior. Inline it.

11. No tests included

The PR description states the goal is to pass 84 SDK E2E tests, but no test files are included. Stub tests for the three documented gaps (live streaming, hooks.get, events.listByCorrelationId) would help track progress and catch regressions as gaps are closed.


Style Notes

  • README bullet points use em dashes. Project conventions (CLAUDE.md) specify no em dashes; use periods to separate sentences instead.
  • The package correctly uses rivet.dev throughout.

Summary

The overall design is sound and the event-sourcing model in workflowRunActor is a good fit for Rivet Actors. The two critical bugs (stuck in-flight messages on actor restart, Date serialization mismatch across the RPC boundary) will cause hard-to-diagnose test failures and should be addressed before running against the SDK test suite. The unbounded state growth and offset-cursor stability issues are important to document as tracked follow-ups if not fixed immediately.

…nt readFromStream and queue dispatch

Major changes:
- Rewrite types.ts to match vercel/workflow packages/world/src exactly:
  runId/stepId/hookId fields, eventType/eventData naming, Streamer.streams
  namespace, Wait type, specVersion support, HealthCheckPayload
- Implement readFromStream via actor event subscription (streams.get
  drains existing chunks then subscribes to streamAppended broadcast)
- Add in-process queue dispatch: createQueueHandler registers handler,
  queue() dispatches immediately after enqueue via fire-and-forget
- Add coordinator hookId index for hooks.get(hookId) lookups
- Update workflow-run actor for new event types (hook_received, wait_*)
  and field renaming (completedAt, correlationId-based step/hook keying)
- All files pass tsc --noEmit

https://claude.ai/code/session_014t3My5a42LHJ1umwV3vGgh
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants