refactor(workflows): unify workflow state with the RuntimeEvent stream #6

@lespaceman

Problem

The workflow state machine (`idle → starting → running → completing → done`, ~2.4k LOC in `src/core/workflows/`) is persisted separately from the `RuntimeEvent` stream. The SQLite schema holds both:

  • `runtime_events` / `feed_events` — the canonical event log, replayed by `FeedMapper.bootstrap()` on resume.
  • `workflow_runs` — workflow metadata and `latest_state` snapshots, written via `persistRunState(snapshot)` from `workflowRunner`.

On resume, `useWorkflowSessionController` must manually call `getLatestRun()` and hydrate the runner separately from the event stream. Run identity (in `RunLifecycle`) and workflow identity (in `workflowRunner`) are tracked by two parallel state machines.

Why this matters

  • Two-tier state model: answering "what is the active workflow context for this permission request?" requires reading from two modules.
  • Workflow-aware features (sub-run branching, workflow-aware permissions, automatic retry-on-permission-deny) hit this seam.
  • A future agent reading the codebase has to learn two persistence stories; one would be enough.

Proposal sketch (needs grilling)

  • Define new `RuntimeEvent` kinds: `workflow.start`, `workflow.transition`, `workflow.iteration`, `workflow.end`.
  • `workflowRunner` shifts from imperative `persistRunState(snapshot)` to declarative event emission via the `RuntimeEvent` channel.
  • `FeedMapper` adds an internal seam (`WorkflowLifecycle`?) that owns workflow state derivation alongside `RunLifecycle`.
  • The `workflow_runs` table is either removed (state derived from events) or becomes a read-only projection.
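A minimal sketch of what "state derived from events" could look like, assuming the four proposed kinds carry roughly the payloads below. Only the `kind` strings and the five state names come from this issue; `workflowId`, `to`, `iteration`, `WorkflowSnapshot`, and `deriveWorkflowState` are hypothetical names for illustration, not existing code.

```typescript
// Hypothetical shapes: field names are assumptions, not the real RuntimeEvent schema.
type WorkflowState = "idle" | "starting" | "running" | "completing" | "done";

type WorkflowEvent =
  | { kind: "workflow.start"; workflowId: string }
  | { kind: "workflow.transition"; workflowId: string; to: WorkflowState }
  | { kind: "workflow.iteration"; workflowId: string; iteration: number }
  | { kind: "workflow.end"; workflowId: string };

interface WorkflowSnapshot {
  workflowId: string | null;
  state: WorkflowState;
  iteration: number;
}

const initialSnapshot: WorkflowSnapshot = {
  workflowId: null,
  state: "idle",
  iteration: 0,
};

// Pure reducer: the same event log always yields the same snapshot,
// which is what would let resume replay events instead of reading a stored snapshot.
function applyWorkflowEvent(
  snap: WorkflowSnapshot,
  ev: WorkflowEvent,
): WorkflowSnapshot {
  switch (ev.kind) {
    case "workflow.start":
      return { workflowId: ev.workflowId, state: "starting", iteration: 0 };
    case "workflow.transition":
      return { ...snap, state: ev.to };
    case "workflow.iteration":
      return { ...snap, iteration: ev.iteration };
    case "workflow.end":
      return { ...snap, state: "done" };
  }
}

// Fold the whole log, as a bootstrap-style replay might do.
function deriveWorkflowState(
  events: readonly WorkflowEvent[],
): WorkflowSnapshot {
  return events.reduce(applyWorkflowEvent, initialSnapshot);
}
```

If something like this holds, a `WorkflowLifecycle` seam would mostly wrap the reducer; whether every real transition can be expressed this way is exactly the first open question below.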

Files involved

  • `src/core/workflows/workflowRunner.ts` (~320 LOC)
  • `src/infra/sessions/store.ts` (~450 LOC)
  • `src/infra/sessions/schema.ts` (~273 LOC)
  • `src/app/providers/RuntimeProvider.tsx`
  • `src/core/feed/internals/` (new `workflowLifecycle.ts`?)

Open questions

  • Is workflow state truly derivable from events, or are there imperative-only transitions (timer-driven, external signals) that don't fit?
  • Schema migration: keep `workflow_runs` as a denormalized cache, or fully eliminate it?
  • Does `WorkflowLifecycle` belong in `FeedMapper`, which today handles event normalization rather than business logic?
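To make the "denormalized cache" option concrete, here is a rough sketch of `workflow_runs` as a projection that can be rebuilt from the log at any time. It uses an in-memory `Map` rather than SQLite, and `LoggedEvent`, `seq`, `cause`, `WorkflowRunRow`, and `projectWorkflowRuns` are all hypothetical. The optional `cause` field is one possible answer to the timer/external-signal question: the transition is still recorded as an event, tagged with what triggered it.

```typescript
// Hypothetical shapes for illustration; not the real schema in schema.ts.
type RunState = "idle" | "starting" | "running" | "completing" | "done";

type LoggedEvent =
  | { seq: number; workflowId: string; kind: "workflow.start" }
  | {
      seq: number;
      workflowId: string;
      kind: "workflow.transition";
      to: RunState;
      cause?: "runner" | "timer" | "signal"; // assumed tag for imperative triggers
    }
  | { seq: number; workflowId: string; kind: "workflow.iteration" }
  | { seq: number; workflowId: string; kind: "workflow.end" };

interface WorkflowRunRow {
  workflowId: string;
  latestState: RunState;
  lastSeq: number;
}

// Rebuild one row per workflow from the event log. Nothing writes to the
// projection directly, so it can be dropped and recomputed without data loss.
function projectWorkflowRuns(
  log: readonly LoggedEvent[],
): Map<string, WorkflowRunRow> {
  const rows = new Map<string, WorkflowRunRow>();
  // Copy before sorting so the caller's log is not mutated.
  const ordered = [...log].sort((a, b) => a.seq - b.seq);
  for (const ev of ordered) {
    let latestState: RunState = rows.get(ev.workflowId)?.latestState ?? "idle";
    if (ev.kind === "workflow.start") latestState = "starting";
    else if (ev.kind === "workflow.transition") latestState = ev.to;
    else if (ev.kind === "workflow.end") latestState = "done";
    rows.set(ev.workflowId, {
      workflowId: ev.workflowId,
      latestState,
      lastSeq: ev.seq,
    });
  }
  return rows;
}
```

The trade-off under this sketch: keeping the table costs a rebuild step and a consistency check, while eliminating it means every "latest state" read replays or indexes the log.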

Blast radius

Large. SQLite schema changes ripple to all resume paths; `FeedMapper` learns a new responsibility; bootstrap path is touched.

Pre-req

Recommended to land #5 (RelayCoordinator) first: the workflow runner also fires relay-shaped requests and would benefit from the coordinator already existing.

Provenance

Surfaced by `/improve-codebase-architecture` + `/zoom-out` analysis. Ranked #2 of three candidates.
