
feat(gastown): Workflows — visual DAG builder with scheduled and event-triggered execution #1943


Summary

Add a workflow system to Gastown that lets users define multi-step automation pipelines. Workflows can run on a schedule (cron), on events (bead closed, convoy landed, PR merged), on webhooks, or manually. Each node in the workflow is an action — sling a bead, run a convoy, call an API, send a notification, run a script, or talk to the Mayor.

This system is built around the canonical Gastown/Gascity mental model: formula (authored reusable definition) → molecule (durable runtime instance) → step runs (per-step tracked work). The implementation treats the existing in-repo molecule internals (createMolecule, advanceMoleculeStep, rig_molecules table) as disposable legacy and rebuilds from a clean definition/run model.

Architecture Review: This spec addresses all 15 findings from the Architecture Review comment. See the Critique Findings Tracker at the bottom for a mapping of each finding to its resolution.


Motivation

Convoys are great for one-shot batches of related work. But many real-world patterns are recurring or event-driven:

  • "Every Monday, scan for dependency updates and create PRs for outdated packages"
  • "When a PR is merged to main, run integration tests and deploy to staging"
  • "Every night, generate a changelog from the last 24h of merged PRs"
  • "When an issue is labeled gastown, sling it as a bead automatically"
  • "Weekly: audit code for security vulnerabilities, create beads for anything critical"

Workflows turn Gastown from a task executor into an automation platform.


Naming / Canonical Compatibility

| Canonical Term | Cloud Product Label | Data Model Name | Description |
| --- | --- | --- | --- |
| Formula | Workflow Definition | workflow_definition | Authored, reusable, versioned graph definition |
| Molecule | Workflow Run | workflow_run | Instantiated durable execution of a definition revision |
| Step Run | Step / Task | step_run | Per-step runtime state within a run |
| Wisp | Ephemeral Run (future) | — | Lightweight run that doesn't persist long-term |
| Digest | Run Summary | workflow_run.summary | Final output/result of a completed run |

Important: Avoid bare workflow as the sole object name in APIs where we actually mean definition or run — that ambiguity compounds quickly. Use definition vs run explicitly in the data model and API names.

Dual-label in UI: e.g. "Workflow Definitions (Formulas)", "Run Details (Molecule)". Preserve Gastown aliases in docs/tooltips/API descriptions.


Architecture Decision: One WorkflowDO Per Town

[Addresses P0-1] — Revised from "one WorkflowDO per definition" to "one WorkflowDO per town."

A new WorkflowDO Durable Object namespace. One WorkflowDO instance per town, keyed by townId. All workflow definitions, runs, and step states for a town live in a single DO's SQLite database.

Routing:  getWorkflowDOStub(env, townId)  // trivial — same as TownDO lookup

Why one-per-town, not one-per-definition?

  • Event matching is local: When TownDO writes a durable notification (bead closed, convoy landed), WorkflowDO reads it on the next alarm tick via a local SQLite query over all definitions' trigger configs. No fan-out RPC to N definition-scoped DOs.
  • Trivial routing: tRPC handlers just need townId to find the right WorkflowDO. One-per-definition would require a lookup table or definition→town mapping on every request.
  • SQLite is more than enough: Even a busy town will have dozens of definitions and thousands of run records — well within DO SQLite capacity.
  • Single alarm loop: One alarm manages all cron evaluations, event matching, and step progression for the town. No per-definition alarm overhead.

Why not TownDO? The TownDO alarm is already doing too much (#1855 — 47+ ops/tick). Workflows have fundamentally different scheduling needs (cron-based, potentially long-running waits). A dedicated DO keeps concerns separated and lets workflows scale independently.


Cross-DO Communication: Durable Outbox Pattern

[Addresses P0-2] — Replaced lossy cross-DO RPC with a durable outbox, mirroring the town_events pattern.

Cross-DO RPC is fire-and-forget. If the call fails, or either DO is evicted between the event and the notification, the workflow step stays waiting forever. We do not use direct RPC for event delivery.

Instead, we use a durable outbox pattern — the same pattern used by town_events for the TownDO reconciler (see events.ts: insertEvent → drainEvents → markProcessed).

How it works

TownDO side — when a workflow-relevant event fires (bead closed, convoy landed, PR merged), TownDO writes a row into a workflow_notifications table in its own SQLite:

INSERT INTO workflow_notifications (
  notification_id, event_type, town_id, bead_id, convoy_id, payload, created_at, processed_at
) VALUES (?, ?, ?, ?, ?, ?, ?, NULL)

This is a synchronous SQL write inside the existing alarm tick — zero latency, zero failure risk. TownDO also re-arms the WorkflowDO alarm to ensure timely processing.

WorkflowDO side — on each alarm tick, the engine drains unprocessed notifications:

SELECT * FROM workflow_notifications WHERE processed_at IS NULL ORDER BY created_at ASC

For each notification, it checks all enabled definitions' trigger configs for a match. On match, it starts a run. After processing, it marks the notification as processed (markProcessed pattern).
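The matching step can be sketched as a pure function. The shapes below (TriggerConfig, Notification, matchNotification) are illustrative, not the final API:

```typescript
// Hypothetical sketch of the per-tick trigger matching. A trigger matches when
// its event_type equals the notification's and every filter key matches the payload.
type TriggerConfig = {
  event_type: string;                  // e.g. 'bead_closed'
  filter?: Record<string, string>;     // optional payload equality filter
};

type Notification = {
  event_type: string;
  payload: Record<string, string>;
};

type Definition = { definition_id: string; triggers: TriggerConfig[] };

// Returns the ids of definitions whose triggers match the notification.
function matchNotification(n: Notification, defs: Definition[]): string[] {
  return defs
    .filter(d =>
      d.triggers.some(t =>
        t.event_type === n.event_type &&
        Object.entries(t.filter ?? {}).every(([k, v]) => n.payload[k] === v),
      ),
    )
    .map(d => d.definition_id);
}
```

Because the scan is a local in-memory (or local SQLite) operation over the town's definitions, there is no fan-out RPC on the matching path.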

Sequence

TownDO                                    WorkflowDO
  │                                          │
  ├── bead.closed fires                      │
  ├── INSERT INTO workflow_notifications     │
  ├── workflowStub.ping() ─────────────────→│  (best-effort, just re-arms alarm)
  │                                          │
  │                                          ├── alarm tick
  │                                          ├── drain workflow_notifications
  │                                          ├── match event → definition triggers
  │                                          ├── start run(s)
  │                                          ├── mark notification processed
  │                                          └── re-arm alarm

The ping() RPC is purely an optimization to reduce latency — it re-arms the WorkflowDO alarm. If it fails, the WorkflowDO's own alarm (ticking every 60s when idle) will pick up the notification on the next cycle. No data is lost.

Notification table (in TownDO's SQLite)

CREATE TABLE IF NOT EXISTS workflow_notifications (
  notification_id TEXT PRIMARY KEY,
  event_type TEXT NOT NULL,            -- 'bead_closed', 'convoy_landed', 'pr_merged', etc.
  town_id TEXT NOT NULL,
  bead_id TEXT,
  convoy_id TEXT,
  payload TEXT NOT NULL DEFAULT '{}',  -- JSON: additional context
  created_at TEXT NOT NULL,
  processed_at TEXT                     -- NULL = unprocessed
);
CREATE INDEX idx_wf_notif_unprocessed ON workflow_notifications (processed_at) WHERE processed_at IS NULL;

Pruning: Processed notifications older than 7 days are cleaned up in TownDO's housekeeping phase (same pattern as pruneOldEvents).


Alarm Loop: Async Execution Model

[Addresses P0-3] — Node execution is async, not blocking. Mirrors TownDO's Phase 1 (SQL) → Phase 2 (side effects) separation.

Cloudflare DOs have a 30-second execution limit per alarm invocation. Node execution (Mayor Prompts, HTTP calls, script runs) must not run synchronously in the alarm tick.

The WorkflowDO alarm follows the same proven pattern as TownDO's reconciler:

Phase 0: Drain notifications

Drain unprocessed notifications written by TownDO (via the workflow_notifications drain described above). Match each against enabled definitions' triggers. For matches, create new workflow_run records and seed initial step states.

Phase 1: Reconcile — compute ready nodes, apply SQL mutations

Walk all active runs. For each run, evaluate the DAG:

  • Identify nodes whose dependencies are satisfied (all upstream nodes completed)
  • Transition ready nodes to dispatched status (SQL mutation)
  • Transition completed wait nodes based on received events
  • Collect async side effects as deferred functions (same as applyAction returning (() => Promise<void>) | null)

All SQL mutations happen synchronously in this phase. No I/O.

Phase 2: Execute side effects (async, best-effort)

// Identical to TownDO's Phase 2 pattern
if (sideEffects.length > 0) {
  const results = await Promise.allSettled(sideEffects.map(fn => fn()));
  for (const r of results) {
    if (r.status === 'fulfilled') metrics.sideEffectsSucceeded++;
    else metrics.sideEffectsFailed++;
  }
}

Side effects include:

  • townStub.slingBead(...) — create a bead on TownDO for action_sling_bead nodes
  • fetch(url, ...) — execute HTTP calls for action_http nodes
  • mayorStub.prompt(...) — invoke Mayor for action_mayor_prompt nodes

Phase 3: Housekeeping

Prune old completed runs, prune processed notifications, emit metrics.

Result delivery: self-notification pattern

When a side effect completes, it writes the result back as a workflow event (into WorkflowDO's own workflow_events table) and re-arms the alarm:

// Inside a side effect for an action_http node:
async () => {
  try {
    const response = await fetch(url, options);
    const body = await response.text();
    workflowEvents.insertEvent(sql, 'step_completed', {
      run_id: runId,
      step_run_id: stepRunId,
      payload: { status: response.status, body },
    });
  } catch (err) {
    workflowEvents.insertEvent(sql, 'step_failed', {
      run_id: runId,
      step_run_id: stepRunId,
      payload: { error: String(err) },
    });
  }
  // Re-arm alarm so next tick processes the result
  ctx.storage.setAlarm(Date.now() + 100);
}

The next alarm tick drains these events, updates step status, and continues DAG traversal. This means:

  • No blocking: The alarm tick returns quickly. Side effects run concurrently.
  • Durable: Results are written to SQLite. If the DO is evicted mid-execution, the next alarm retries any dispatched steps that never wrote a result (with exponential backoff).
  • Observable: Every state transition is recorded.

Step status lifecycle

pending → ready → dispatched → completed
                            → failed
                            → timed_out
         ready → skipped    (upstream failure with on_failure: 'skip')
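The lifecycle above can be encoded as an explicit transition table, which lets the engine reject illegal transitions. Names are illustrative; per the run-cancellation rules later in this spec, this sketch also allows pending/ready/dispatched → skipped:

```typescript
// Illustrative encoding of the step status lifecycle; not the final engine code.
type StepStatus =
  | 'pending' | 'ready' | 'dispatched'
  | 'completed' | 'failed' | 'timed_out' | 'skipped';

const TRANSITIONS: Record<StepStatus, StepStatus[]> = {
  pending: ['ready', 'skipped'],                           // skipped via cancellation
  ready: ['dispatched', 'skipped'],                        // skipped via upstream failure or cancellation
  dispatched: ['completed', 'failed', 'timed_out', 'skipped'], // skipped via cancellation
  completed: [],                                           // terminal
  failed: [],                                              // terminal
  timed_out: [],                                           // terminal
  skipped: [],                                             // terminal
};

function canTransition(from: StepStatus, to: StepStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```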

Failure Handling: on_failure Semantics

[Addresses P0-4] — Diamond pattern and configurable failure propagation.

Each edge in the graph can specify an on_failure policy (default: 'skip'):

type WorkflowEdge = {
  id: string;
  source: string;       // source node ID
  target: string;       // target node ID
  condition?: EdgeCondition | null;
  on_failure: 'skip' | 'continue' | 'fail_run';
};

Semantics:

| Policy | Behavior |
| --- | --- |
| skip (default) | If the source node failed, downstream nodes reachable only through this edge are marked skipped. Other paths to the same node are unaffected. |
| continue | Downstream nodes execute regardless. The failed step's output is null. |
| fail_run | The entire run is marked failed immediately. All pending/ready/dispatched steps are cancelled. |

Diamond resolution: In a diamond (A→B, A→C, B→D, C→D), if B completes and C fails:

  • Edge C→D has on_failure: 'skip' → D is skipped (both paths must succeed)
  • Edge C→D has on_failure: 'continue' → D executes with C's output as null
  • Edge C→D has on_failure: 'fail_run' → entire run fails

The engine evaluates all incoming edges to a node. A node is ready when:

  • All incoming edges with on_failure: 'skip' or 'continue' have their source completed or failed
  • No incoming edge with on_failure: 'fail_run' has a failed source
  • At least one incoming edge has a completed (not failed) source, OR all failed sources have on_failure: 'continue'
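These rules can be sketched as a per-node evaluation over incoming edges. Shapes are illustrative; this sketch treats a skipped source like a failed one for propagation purposes:

```typescript
// Hypothetical per-node readiness evaluation under the on_failure rules above.
type SourceState = 'completed' | 'failed' | 'skipped' | 'running';
type OnFailure = 'skip' | 'continue' | 'fail_run';
type IncomingEdge = { on_failure: OnFailure; sourceState: SourceState };

type Readiness = 'wait' | 'ready' | 'skipped' | 'fail_run';

function evaluateNode(incoming: IncomingEdge[]): Readiness {
  const bad = (s: SourceState) => s === 'failed' || s === 'skipped';

  // fail_run wins immediately, even if other sources are still running
  if (incoming.some(e => e.on_failure === 'fail_run' && bad(e.sourceState)))
    return 'fail_run';

  // wait until every upstream source reaches a terminal state
  if (incoming.some(e => e.sourceState === 'running')) return 'wait';

  // a failed source on a 'skip' edge skips the node (all such paths must succeed)
  if (incoming.some(e => e.on_failure === 'skip' && bad(e.sourceState)))
    return 'skipped';

  // remaining failures are all on 'continue' edges → run with null outputs
  return 'ready';
}
```

Running the diamond example through this sketch: with B completed and C failed, a skip edge from C yields skipped, a continue edge yields ready, and a fail_run edge fails the run.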

Wrangler Configuration

Add to wrangler.jsonc:

// In durable_objects.bindings:
{ "name": "WORKFLOW", "class_name": "WorkflowDO" }

// In migrations (new tag):
{ "tag": "v3", "new_sqlite_classes": ["WorkflowDO"] }

File: src/dos/Workflow.do.ts with sub-modules in src/dos/workflow/:

dos/
  Workflow.do.ts              # Class definition, RPC surface, alarm loop
  workflow/
    definitions.ts            # Definition CRUD, versioning
    runs.ts                   # Run lifecycle, step state management
    engine.ts                 # DAG traversal, node readiness, side effect dispatch
    scheduling.ts             # Cron evaluation, trigger matching
    nodes.ts                  # Node type implementations (side effect factories)
    events.ts                 # Workflow-scoped event recording (outbox pattern)

SQLite Schema (WorkflowDO)

workflow_definitions

CREATE TABLE IF NOT EXISTS workflow_definitions (
  definition_id TEXT PRIMARY KEY,
  town_id TEXT NOT NULL,
  name TEXT NOT NULL,
  description TEXT DEFAULT '',
  revision INTEGER NOT NULL DEFAULT 1,
  enabled INTEGER NOT NULL DEFAULT 0,           -- 0 = disabled, 1 = enabled
  graph TEXT NOT NULL DEFAULT '{}',              -- JSON: nodes (without positions), edges (with on_failure)
  layout TEXT NOT NULL DEFAULT '{}',             -- JSON: React Flow positions only (visual, non-execution)
  variables_schema TEXT NOT NULL DEFAULT '{}',   -- JSON Schema for run-time variables
  triggers TEXT NOT NULL DEFAULT '[]',           -- JSON array of trigger configs
  schedule TEXT,                                 -- Cron expression (null = no schedule)
  timezone TEXT NOT NULL DEFAULT 'UTC',          -- IANA timezone for cron evaluation (SINGLE source of truth)
  durability TEXT NOT NULL DEFAULT 'standard' CHECK(durability IN ('standard', 'ephemeral')),
  max_concurrent_runs INTEGER NOT NULL DEFAULT 5,
  last_run_id TEXT,
  last_run_status TEXT,
  last_run_at TEXT,                              -- [P1-5] Timestamp of last run start
  created_at TEXT NOT NULL,
  updated_at TEXT NOT NULL
);
CREATE INDEX idx_wf_def_town ON workflow_definitions (town_id);
CREATE INDEX idx_wf_def_enabled ON workflow_definitions (enabled) WHERE enabled = 1;
CREATE INDEX idx_wf_def_schedule ON workflow_definitions (schedule) WHERE schedule IS NOT NULL;

[P1-5]: last_run_at column added.
[P1-7]: graph and layout are separate columns. graph contains execution-relevant data (nodes without positions, edges with on_failure config). layout contains React Flow position data. Layout-only changes don't bump the revision.
[P2-15]: timezone is defined at the definition level only (single source of truth). Removed from trigger config.

workflow_definition_revisions — deferred

[Addresses P1-6]: Revision history is deferred to Phase 3 (not MVP). For now, only the current graph/layout is stored. The revision field tracks the version number. When we add revision history, it will be:

-- DEFERRED: not in MVP or Phase 2
CREATE TABLE IF NOT EXISTS workflow_definition_revisions (
  revision_id TEXT PRIMARY KEY,
  definition_id TEXT NOT NULL,
  revision INTEGER NOT NULL,
  graph TEXT NOT NULL,
  layout TEXT NOT NULL,
  variables_schema TEXT NOT NULL,
  triggers TEXT NOT NULL,
  schedule TEXT,
  created_at TEXT NOT NULL,
  created_by TEXT,                              -- user/agent who made the change
  UNIQUE(definition_id, revision)
);

workflow_runs

CREATE TABLE IF NOT EXISTS workflow_runs (
  run_id TEXT PRIMARY KEY,
  definition_id TEXT NOT NULL,
  definition_revision INTEGER NOT NULL,         -- pinned at run creation
  graph_snapshot TEXT NOT NULL,                  -- full graph at time of run (immutable)
  status TEXT NOT NULL DEFAULT 'pending'
    CHECK(status IN ('pending', 'running', 'completed', 'failed', 'cancelled', 'timed_out')),
  trigger_type TEXT NOT NULL,                   -- 'manual', 'schedule', 'event', 'webhook'
  trigger_data TEXT NOT NULL DEFAULT '{}',       -- JSON: trigger context
  variables TEXT NOT NULL DEFAULT '{}',          -- JSON: resolved runtime variables
  summary TEXT,                                  -- Final output/digest
  error TEXT,                                    -- Error message if failed
  started_at TEXT NOT NULL,
  completed_at TEXT
);
CREATE INDEX idx_wf_run_def ON workflow_runs (definition_id);
CREATE INDEX idx_wf_run_status ON workflow_runs (status) WHERE status IN ('pending', 'running');

workflow_step_runs

CREATE TABLE IF NOT EXISTS workflow_step_runs (
  step_run_id TEXT PRIMARY KEY,
  run_id TEXT NOT NULL,
  node_id TEXT NOT NULL,                        -- reference to graph node
  status TEXT NOT NULL DEFAULT 'pending'
    CHECK(status IN ('pending', 'ready', 'dispatched', 'completed', 'failed', 'skipped', 'timed_out')),
  output TEXT,                                   -- JSON: step output
  error TEXT,
  bead_id TEXT,                                  -- if step created a bead
  dispatch_attempts INTEGER NOT NULL DEFAULT 0,
  last_dispatch_at TEXT,
  started_at TEXT,
  completed_at TEXT
);
CREATE INDEX idx_wf_step_run ON workflow_step_runs (run_id);
CREATE INDEX idx_wf_step_bead ON workflow_step_runs (bead_id) WHERE bead_id IS NOT NULL;
CREATE INDEX idx_wf_step_status ON workflow_step_runs (run_id, status);

workflow_events (WorkflowDO internal outbox)

CREATE TABLE IF NOT EXISTS workflow_events (
  event_id TEXT PRIMARY KEY,
  event_type TEXT NOT NULL,                     -- 'step_completed', 'step_failed', 'notification_received', etc.
  run_id TEXT,
  step_run_id TEXT,
  payload TEXT NOT NULL DEFAULT '{}',
  created_at TEXT NOT NULL,
  processed_at TEXT
);
CREATE INDEX idx_wf_evt_unprocessed ON workflow_events (processed_at) WHERE processed_at IS NULL;

Graph Definition Model

Separate graph from layout

[Addresses P1-7]

The graph column stores execution-relevant structure:

type WorkflowGraph = {
  nodes: WorkflowNode[];    // type, config — NO position data
  edges: WorkflowEdge[];    // source, target, condition, on_failure
};

The layout column stores visual-only data:

type WorkflowLayout = {
  positions: Record<string, { x: number; y: number }>;  // nodeId → position
  viewport?: { x: number; y: number; zoom: number };
};

Layout changes (dragging nodes, zooming) update layout without bumping revision. Graph changes (adding/removing nodes, changing edges, editing node config) bump revision.

Node types

Phase 1 (MVP): 3 node types

  • trigger_manual — manual run trigger
  • action_sling_bead — create a bead on TownDO
  • control_condition — if/else branching

Phase 2: +5 node types

  • trigger_schedule — cron-based trigger
  • trigger_event — event-based trigger (bead closed, convoy landed)
  • action_sling_convoy — create a convoy
  • control_wait_bead — wait for a bead to reach a status
  • control_delay — wait for a duration

Phase 3: +6 node types

  • trigger_webhook — HTTP webhook trigger
  • action_mayor_prompt — LLM inference via Mayor
  • action_http — arbitrary HTTP request
  • control_wait_convoy — wait for convoy to land
  • output_notify — send a notification
  • output_update_bead — update an existing bead

Phase 4: +1 node type

  • action_script — run a script in its own sandboxed container (see security note below)

Node Type Config Shapes

Each node type has a specific config schema validated at save time:

  • trigger_schedule: { cron: string } (timezone is definition-level)
  • trigger_event: { event_type: 'bead_closed' | 'convoy_landed' | 'pr_merged', filter?: Record<string, string> }
  • trigger_webhook: { secret?: string } (URL is auto-generated)
  • trigger_manual: { input_schema?: Record<string, VariableSpec> }
  • action_sling_bead: { rig_id: string, title: string, body?: string, priority?: string, labels?: string[] }
  • action_sling_convoy: { rig_id: string, title: string, beads: ConvoyBeadSpec[] }
  • action_mayor_prompt: { prompt_template: string } — supports {{step.<nodeId>.output}} interpolation
  • action_http: { url: string, method: string, headers?: Record<string,string>, body_template?: string, allowed_domains?: string[] }
  • action_script: { script: string, timeout_ms?: number } — runs in its own container
  • control_wait_bead: { bead_ref: string, target_status: 'closed' | 'failed', timeout_ms?: number }
  • control_wait_convoy: { convoy_ref: string, timeout_ms?: number }
  • control_condition: { expression: string } — evaluates to a boolean, branches via edge conditions
  • control_delay: { duration_ms: number }
  • output_notify: { channel: 'slack' | 'webhook', target: string, message_template: string }
  • output_update_bead: { bead_ref: string, fields: Record<string, unknown> }

Template interpolation

Use {{step.<nodeId>.output.<path>}} syntax for referencing outputs from upstream steps.

[Addresses P2-13]: Template interpolation in HTTP URL and body templates could exfiltrate data. Mitigation (deferred to implementation):

  • action_http node config includes an optional allowed_domains list. If set, the resolved URL must match.
  • Template values are sanitized (no control characters, length limits).
  • Full sandboxing of template interpolation is tracked as a follow-up item.
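A minimal sketch of the interpolation plus the sanitization rules above. The regex, helper name, and the 4 KB cap are assumptions for illustration:

```typescript
// Illustrative interpolation of {{step.<nodeId>.output.<path>}} references.
// Resolved values are stripped of control characters and length-capped,
// per the mitigation notes above.
function interpolate(
  template: string,
  outputs: Record<string, unknown>,   // nodeId → parsed output value
  maxLen = 4096,                      // assumed per-value length limit
): string {
  return template.replace(
    /\{\{step\.([\w-]+)\.output((?:\.[\w-]+)*)\}\}/g,
    (_, nodeId: string, path: string) => {
      let value: unknown = outputs[nodeId];
      for (const key of path.split('.').filter(Boolean)) {
        value = (value as Record<string, unknown> | undefined)?.[key];
      }
      const text = value == null ? '' : String(value);
      // strip control characters and enforce the length cap
      return text.replace(/[\x00-\x1f\x7f]/g, '').slice(0, maxLen);
    },
  );
}
```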

action_script security

[Addresses P2-14]: action_script nodes run in their own short-lived container invocation, NOT in an agent's container. They get a fresh, isolated environment with:

  • Read-only access to the repo
  • No access to agent state or other containers
  • Strict timeout (30s default, configurable up to 5min)
  • Resource limits (memory, CPU)

This node type is deferred to Phase 4. The container strategy will be finalized during implementation.


DAG Execution Engine

Readiness evaluation

On each alarm tick, for each active run:

  1. Query all step_run records for the run
  2. For each pending step, check if all incoming edges' source nodes are in a terminal state (completed, failed, skipped)
  3. Apply on_failure semantics per edge to determine if the step should be ready, skipped, or if the run should fail
  4. Transition ready steps to dispatched (SQL mutation)
  5. Return side effect functions for dispatched steps

Parallel execution

Steps with no dependency between them execute in parallel — they're all transitioned to dispatched in the same alarm tick, and their side effects run concurrently via Promise.allSettled.

Phase 1 (MVP): Linear execution only (no parallel branches). Simplifies the engine significantly.

Phase 2+: Full parallel branch support.

Cycle prevention

  • At save time: Topological sort validation. If the graph has a cycle, the save is rejected with a validation error.
  • At runtime: Max step count guard (1000 steps per run). If exceeded, run is marked timed_out.
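The save-time validation can be a standard Kahn's-algorithm topological sort; if the sort cannot reach every node, the graph has a cycle. Node and edge shapes are illustrative:

```typescript
// Save-time cycle detection via Kahn's algorithm (topological sort).
type Edge = { source: string; target: string };

function hasCycle(nodeIds: string[], edges: Edge[]): boolean {
  const indegree = new Map(nodeIds.map(id => [id, 0]));
  for (const e of edges) indegree.set(e.target, (indegree.get(e.target) ?? 0) + 1);

  // start from nodes with no incoming edges
  const queue = nodeIds.filter(id => indegree.get(id) === 0);
  let visited = 0;
  while (queue.length > 0) {
    const id = queue.shift()!;
    visited++;
    for (const e of edges) {
      if (e.source !== id) continue;
      const d = indegree.get(e.target)! - 1;
      indegree.set(e.target, d);
      if (d === 0) queue.push(e.target);
    }
  }
  // if the topological order didn't reach every node, a cycle exists
  return visited !== nodeIds.length;
}
```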

Run cancellation

When a run is cancelled:

  1. All pending, ready, and dispatched steps are marked skipped
  2. In-flight side effects (already dispatched) will complete but their results are ignored (step is already skipped)
  3. Run status → cancelled

Concurrent run limits

max_concurrent_runs (default: 5) on the definition. If a trigger fires while at the limit, the run is queued with pending status and starts when a slot opens.


Cron Scheduling

Evaluation

On each alarm tick, the engine queries all enabled definitions with a non-null schedule:

SELECT * FROM workflow_definitions
WHERE enabled = 1 AND schedule IS NOT NULL

For each, evaluate the cron expression against the definition's timezone. If the current time matches the cron and last_run_at is before the previous matching tick, start a new run.

Catch-up policy

[Addresses P2-11]: Skip missed ticks. On DO re-wake after eviction, fire at most one catch-up run: only if last_run_at predates the most recent matching tick. Do not backfill every missed tick. This prevents a DO waking up after 4 hours of eviction from firing 240 runs for a per-minute schedule.

function shouldFireCron(schedule: string, timezone: string, lastRunAt: string | null): boolean {
  const now = Date.now();
  const prevTick = getPreviousCronTick(schedule, timezone, now);
  const nextTick = getNextCronTick(schedule, timezone, now);

  // Only fire if we haven't already run for this tick window
  if (lastRunAt && new Date(lastRunAt).getTime() >= prevTick) return false;

  // Only fire if the next tick is in the future (we're not waking up to ancient backlog)
  return nextTick > now;
}

Permissions

[Addresses P1-9]

Follows the resolveTownOwnership pattern:

| Action | Who |
| --- | --- |
| Create / edit / delete definitions | Town owner, org admin |
| Enable / disable definitions | Town owner, org admin |
| Trigger manual runs | Town owner, org admin, org members |
| View definitions and runs | All org members |
| Webhook triggers | Authenticated via webhook secret (per-definition), rate-limited |

Webhook secrets are generated per-definition and stored in the triggers JSON config. The webhook endpoint validates the secret before creating a run.


Billing & Rate Limiting

[Addresses P1-10]

| Limit | Free | Pro | Enterprise |
| --- | --- | --- | --- |
| Workflow definitions per town | 3 | 20 | Unlimited |
| Runs per month | 50 | 1,000 | 10,000 |
| Steps per run | 10 | 50 | 200 |
| LLM tokens per mayor_prompt step | 4K | 16K | 64K |
| LLM token budget per workflow per month | 100K | 1M | 10M |
| Min cron interval | 1 hour | 5 min | 1 min |
| Webhook triggers per minute | 5 | 30 | 100 |
| Concurrent runs per definition | 1 | 5 | 20 |

Rate limiting is enforced at:

  1. Run creation: Check monthly run count before starting
  2. Step dispatch: Check step count per run before transitioning to dispatched
  3. Webhook endpoint: Token bucket rate limiter per definition
  4. Cron evaluation: Min interval check against tier

Tier info is read from the town's billing context (same pattern as existing feature gates).
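Enforcement point 3 could use a token bucket along these lines. The class name and rate parameters are illustrative; real capacities would come from the tier table above:

```typescript
// Minimal token-bucket sketch for per-definition webhook rate limiting.
// Tokens refill continuously; each accepted webhook consumes one token.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,      // e.g. 30 for Pro webhook triggers/min
    private refillPerMs: number,   // tokens added per millisecond
    now = Date.now(),
  ) {
    this.tokens = capacity;
    this.last = now;
  }

  // Returns true if a request may proceed, consuming one token.
  tryTake(now = Date.now()): boolean {
    const elapsed = now - this.last;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```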


Auto-save & Draft Model

[Addresses P2-12]

The visual editor uses a draft model:

  • Auto-save writes to a draft_graph and draft_layout column (or a separate workflow_drafts table). No revision bump, no validation. The editor reads from the draft on load.
  • "Save & Publish" validates the graph (cycle detection, required fields, node config validation), copies draft → graph/layout, bumps revision, and optionally updates the enabled state.
  • If there's no draft (draft columns are null), the editor loads from the published graph/layout.

This prevents half-edited graphs from executing while still giving users a seamless editing experience.


tRPC API

Definition Management

workflows.definitions.list    // { townId } → WorkflowDefinition[]
workflows.definitions.get     // { townId, definitionId } → WorkflowDefinition
workflows.definitions.create  // { townId, name, graph, layout?, triggers?, schedule? } → WorkflowDefinition
workflows.definitions.update  // { townId, definitionId, ...partial } → WorkflowDefinition
workflows.definitions.delete  // { townId, definitionId } → void
workflows.definitions.setEnabled  // { townId, definitionId, enabled } → void
workflows.definitions.validate    // { townId, graph } → ValidationResult
workflows.definitions.saveDraft   // { townId, definitionId, graph, layout } → void
workflows.definitions.publishDraft // { townId, definitionId } → WorkflowDefinition (validates + bumps revision)

Run Management

workflows.runs.list     // { townId, definitionId?, status? } → WorkflowRun[]
workflows.runs.get      // { townId, runId } → WorkflowRun & { steps: StepRun[] }
workflows.runs.trigger  // { townId, definitionId, variables? } → WorkflowRun
workflows.runs.cancel   // { townId, runId } → void

Webhook HTTP endpoint

POST /api/webhooks/workflow/:definitionId
Headers: X-Webhook-Secret: <secret>
Body: JSON payload (passed as trigger_data)
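Validating the secret should avoid an early-exit string comparison. A constant-time sketch (helper name is illustrative; the length check still leaks length, which is acceptable for fixed-length random secrets):

```typescript
// Constant-time comparison of the provided webhook secret against the
// stored per-definition secret: XOR-accumulate so the loop does the same
// work regardless of where the strings differ.
function secretsEqual(provided: string, expected: string): boolean {
  if (provided.length !== expected.length) return false;
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= provided.charCodeAt(i) ^ expected.charCodeAt(i);
  }
  return diff === 0;
}
```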

UI: Phased Approach

Phase 1 (MVP): Form-based editor

4 routes:

  • /towns/:townId/workflows — list of workflow definitions with status badges
  • /towns/:townId/workflows/new — create form (name, description, linear step list)
  • /towns/:townId/workflows/:defId/edit — edit form (same as create, plus enable/disable toggle)
  • /towns/:townId/workflows/:defId/runs — run history list with status, duration, trigger type

The MVP editor is a form-based linear step editor, not React Flow. Users add steps in sequence. Each step has a type selector and config form. This is sufficient for linear workflows and dramatically reduces implementation complexity.

Phase 3: React Flow Visual Editor

Upgrade to a full visual DAG editor:

  • Three-panel layout: node palette (left), canvas (center), inspector (right)
  • Custom node components with status badges
  • Drag-and-drop from palette
  • Edge drawing with on_failure config in the inspector
  • Auto-save → draft, explicit publish
  • Run visualization overlay (step statuses on the graph)

Event Triggers / TownDO Integration

When Gastown events occur, TownDO writes them to the workflow_notifications outbox table (see Cross-DO Communication above). WorkflowDO drains these on each alarm tick and matches against trigger configs.

Event Types Available as Triggers

Using the existing TownEventType enum plus new workflow-specific events:

| Event | Payload | When |
| --- | --- | --- |
| bead_closed | { bead_id, bead_type, rig_id, title } | Any bead reaches closed status |
| bead_failed | { bead_id, bead_type, rig_id, title, error } | Any bead reaches failed status |
| convoy_landed | { convoy_id, title, bead_count } | All beads in a convoy complete |
| convoy_started | { convoy_id, title } | Convoy begins execution |
| pr_merged | { bead_id, rig_id, pr_url } | PR associated with a bead is merged |
| pr_created | { bead_id, rig_id, pr_url } | PR is created for a bead |

Agent Ergonomics / Compatibility Bridge

Preserve gt_mol_current / gt_mol_advance semantics, backed by the new run model:

  • gt_mol_current → query WorkflowDO for the current ready step(s) in the active run
  • gt_mol_advance → mark a step as completed and advance the DAG

These are compatibility shims maintained as plugin tools, NOT structural constraints on the data model. The new API is the primary interface; these exist for agents that already use the molecule protocol.


Implementation Plan

Phase 0: Legacy Cleanup (1 week)

Goal: Mark current molecule internals as disposable legacy.

  • Audit all references to rig_molecules table, createMolecule, advanceMoleculeStep, MoleculeStepper.tsx, FormulaLibrary
  • Classify each as: keep (conceptually), shim (temporarily), or delete
  • Fix the known formula shape mismatch bug: createMolecule() treats formula as array but API sends { steps: [...] }
  • Remove or mark stale schema/docs as superseded
  • Document the canonical mapping (formula→definition, molecule→run, step→step_run) in AGENTS.md or a spec doc

Phase 1: MVP (2 weeks)

[Addresses P1-8]: Scoped to a realistic 2-week deliverable.

Goal: End-to-end workflow execution with manual trigger, linear steps, form-based editor.

Backend:

  • WorkflowDO skeleton (one per town, keyed by townId)
  • workflow_definitions table + CRUD
  • workflow_runs + workflow_step_runs tables
  • workflow_events table (internal outbox)
  • Alarm loop with Phase 0/1/2 pattern
  • Linear DAG engine (no parallel branches)
  • Manual trigger only
  • 3 node types: trigger_manual, action_sling_bead, control_condition
  • tRPC API: definition CRUD + run trigger/list/get/cancel
  • workflow_notifications table in TownDO + write path for bead_closed events (wiring only — event triggers activate in Phase 2)

Frontend:

  • Workflow list page with create button
  • Form-based linear editor (add/remove/reorder steps)
  • Run list with status badges
  • Run detail view (step list with status, output, errors)

Not in Phase 1: Cron, event triggers, webhooks, parallel branches, React Flow, wait nodes, Mayor/HTTP/script nodes, revision history, draft model.

Phase 2: Event Triggers & Scheduling (2-3 weeks)

  • Event trigger support (bead_closed, convoy_landed, pr_merged)
  • Cron scheduling with timezone support + catch-up policy (skip missed ticks)
  • trigger_schedule and trigger_event node types
  • control_wait_bead, control_delay node types
  • action_sling_convoy node type
  • Parallel branch execution in the DAG engine
  • Concurrent run limiting
  • Billing/rate limiting enforcement
  • Permissions enforcement

Phase 3: Visual Editor & Advanced Nodes (3-4 weeks)

  • React Flow visual editor (three-panel layout)
  • Auto-save draft model + publish flow
  • trigger_webhook + webhook HTTP endpoint
  • action_mayor_prompt node type
  • action_http node type (with allowed_domains)
  • output_notify, output_update_bead node types
  • Revision history (workflow_definition_revisions table)
  • Run visualization overlay on the graph

Phase 4: Polish & Templates (1-2 weeks)

  • action_script node type (own container, sandboxed)
  • Workflow templates (pre-built definitions for common patterns)
  • Import/export definitions as JSON
  • Run summary/digest generation
  • Metrics dashboard (runs/day, success rate, avg duration)
  • gt_mol_current / gt_mol_advance compatibility shims

Total realistic estimate: 9-12 weeks (including Phase 0)

[Addresses P1-8]: Previous estimate of 5-6 weeks was too aggressive. The reconciler (simpler scope) took ~3 weeks. This system adds a new DO + DAG engine + editor + cron + cross-DO communication + 15 node types + tRPC API + run visualization.

Future scope (not in this issue)

  • Wisps (ephemeral runs that don't persist)
  • Sub-workflow composition (a node that triggers another workflow)
  • Approval gates (human-in-the-loop nodes)
  • Workflow marketplace / sharing



Critique Findings Tracker

All 15 findings from the Architecture Review comment have been addressed:

| # | Priority | Finding | Resolution | Section |
| --- | --- | --- | --- | --- |
| 1 | P0 | Instance model: one WorkflowDO per town, not per definition | ✅ Rewritten — one WorkflowDO per town, keyed by townId | Architecture Decision |
| 2 | P0 | Event delivery: durable outbox, not lossy RPC | ✅ Rewritten — workflow_notifications outbox in TownDO, drain pattern | Cross-DO Communication |
| 3 | P0 | Alarm loop: async execution, no blocking I/O | ✅ Rewritten — Phase 0/1/2 pattern, side effects via Promise.allSettled, self-notification | Alarm Loop |
| 4 | P0 | Diamond failure: on_failure semantics | ✅ Added — skip, continue, fail_run per edge | Failure Handling |
| 5 | P1 | Schema missing last_run_at | ✅ Added to workflow_definitions schema | SQLite Schema |
| 6 | P1 | No revision history storage | ✅ Acknowledged — deferred to Phase 3, schema sketched | SQLite Schema |
| 7 | P1 | Layout leaks into execution graph | ✅ Separated — graph and layout are distinct columns | Graph Definition Model |
| 8 | P1 | Effort estimates too aggressive | ✅ Revised — 9-12 weeks total, 2-week MVP scoped | Implementation Plan |
| 9 | P1 | Permissions model missing | ✅ Added — follows resolveTownOwnership pattern | Permissions |
| 10 | P1 | Billing/rate limiting missing | ✅ Added — per-tier limits table, enforcement points | Billing & Rate Limiting |
| 11 | P2 | Cron catch-up policy | ✅ Explicit "skip missed ticks" with pseudocode | Cron Scheduling |
| 12 | P2 | Auto-save + validation conflict | ✅ Draft model — auto-save writes draft, publish validates | Auto-save & Draft Model |
| 13 | P2 | Template interpolation security | ✅ Acknowledged — allowed_domains, deferred to impl | Graph Definition Model |
| 14 | P2 | action_script container strategy | ✅ Stated — own container, deferred to Phase 4 | Graph Definition Model |
| 15 | P2 | Timezone in two places | ✅ Fixed — definition-level only, removed from triggers | SQLite Schema |
