Summary
Add a workflow system to Gastown that lets users define multi-step automation pipelines. Workflows can run on a schedule (cron), on events (bead closed, convoy landed, PR merged), on webhooks, or manually. Each node in the workflow is an action — sling a bead, run a convoy, call an API, send a notification, run a script, or talk to the Mayor.
This system is built around the canonical Gastown/Gascity mental model: formula (authored reusable definition) → molecule (durable runtime instance) → step runs (per-step tracked work). The implementation treats the existing in-repo molecule internals (createMolecule, advanceMoleculeStep, rig_molecules table) as disposable legacy and rebuilds from a clean definition/run model.
Architecture Review: This spec addresses all 15 findings from the Architecture Review comment. See the Critique Findings Tracker at the bottom for a mapping of each finding to its resolution.
Motivation
Convoys are great for one-shot batches of related work. But many real-world patterns are recurring or event-driven:
- "Every Monday, scan for dependency updates and create PRs for outdated packages"
- "When a PR is merged to main, run integration tests and deploy to staging"
- "Every night, generate a changelog from the last 24h of merged PRs"
- "When an issue is labeled `gastown`, sling it as a bead automatically"
- "Weekly: audit code for security vulnerabilities, create beads for anything critical"
Workflows turn Gastown from a task executor into an automation platform.
Naming / Canonical Compatibility
| Canonical Term | Cloud Product Label | Data Model Name | Description |
| --- | --- | --- | --- |
| Formula | Workflow Definition | `workflow_definition` | Authored, reusable, versioned graph definition |
| Molecule | Workflow Run | `workflow_run` | Instantiated durable execution of a definition revision |
| Step | Run Step / Task | `step_run` | Per-step runtime state within a run |
| Wisp | Ephemeral Run | (future) | Lightweight run that doesn't persist long-term |
| Digest | Run Summary | `workflow_run.summary` | Final output/result of a completed run |
Important: Avoid bare `workflow` as the sole object name in APIs where we actually mean definition or run — that ambiguity compounds quickly. Use `definition` vs `run` explicitly in the data model and API names.
Dual-label in UI: e.g. "Workflow Definitions (Formulas)", "Run Details (Molecule)". Preserve Gastown aliases in docs/tooltips/API descriptions.
Architecture Decision: One WorkflowDO Per Town
[Addresses P0-1] — Revised from "one WorkflowDO per definition" to "one WorkflowDO per town."
Introduce a new `WorkflowDO` Durable Object namespace: one WorkflowDO instance per town, keyed by `townId`. All workflow definitions, runs, and step states for a town live in a single DO's SQLite database.
Routing: `getWorkflowDOStub(env, townId)` (trivial — same as the TownDO lookup).
Why one-per-town, not one-per-definition?
- Event matching is local: When TownDO writes a durable notification (bead closed, convoy landed), WorkflowDO reads it on the next alarm tick via a local SQLite query over all definitions' trigger configs. No fan-out RPC to N definition-scoped DOs.
- Trivial routing: tRPC handlers just need `townId` to find the right WorkflowDO. One-per-definition would require a lookup table or definition→town mapping on every request.
- SQLite is more than enough: Even a busy town will have dozens of definitions and thousands of run records — well within DO SQLite capacity.
- Single alarm loop: One alarm manages all cron evaluations, event matching, and step progression for the town. No per-definition alarm overhead.
Why not TownDO? The TownDO alarm is already doing too much (#1855 — 47+ ops/tick). Workflows have fundamentally different scheduling needs (cron-based, potentially long-running waits). A dedicated DO keeps concerns separated and lets workflows scale independently.
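A minimal sketch of the routing helper described above, using structural stand-ins for Cloudflare's Durable Object types (the real ones come from `@cloudflare/workers-types`); the shapes and binding name here are illustrative:

```typescript
// Structural stand-ins for Cloudflare's DO types, for illustration only.
type DOId = { name: string };
type DOStub = { id: DOId };
type DONamespace = {
  idFromName(name: string): DOId;
  get(id: DOId): DOStub;
};

// Deterministic routing: the same townId always maps to the same DO instance,
// so no lookup table or definition→town mapping is needed.
function getWorkflowDOStub(env: { WORKFLOW_DO: DONamespace }, townId: string): DOStub {
  return env.WORKFLOW_DO.get(env.WORKFLOW_DO.idFromName(townId));
}
```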
Cross-DO Communication: Durable Outbox Pattern
[Addresses P0-2] — Replaced lossy cross-DO RPC with a durable outbox, mirroring the town_events pattern.
Cross-DO RPC is fire-and-forget. If the call fails, or either DO is evicted between the event and the notification, the workflow step stays waiting forever. We do not use direct RPC for event delivery.
Instead, we use a durable outbox pattern — the same pattern used by town_events for the TownDO reconciler (see events.ts: insertEvent → drainEvents → markProcessed).
How it works
TownDO side — when a workflow-relevant event fires (bead closed, convoy landed, PR merged), TownDO writes a row into a workflow_notifications table in its own SQLite:
```sql
INSERT INTO workflow_notifications (
  notification_id, event_type, town_id, bead_id, convoy_id, payload, created_at, processed_at
) VALUES (?, ?, ?, ?, ?, ?, ?, NULL)
```
This is a synchronous SQL write inside the existing alarm tick — zero latency, zero failure risk. TownDO also re-arms the WorkflowDO alarm to ensure timely processing.
WorkflowDO side — on each alarm tick, the engine drains unprocessed notifications:
```sql
SELECT * FROM workflow_notifications WHERE processed_at IS NULL ORDER BY created_at ASC
```
For each notification, it checks all enabled definitions' trigger configs for a match. On match, it starts a run. After processing, it marks the notification as processed (markProcessed pattern).
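The drain-and-match step could look like the following sketch. The row and trigger shapes are assumptions based on the tables in this spec, and real trigger matching may be richer than an `event_type` check plus exact-match payload filters:

```typescript
// Assumed shapes, simplified from the spec's tables and trigger configs.
type WfNotification = { notification_id: string; event_type: string; payload: Record<string, string> };
type WfTrigger = { event_type: string; filter?: Record<string, string> };
type WfDefinition = { definition_id: string; triggers: WfTrigger[] };

function matchTrigger(n: WfNotification, t: WfTrigger): boolean {
  if (n.event_type !== t.event_type) return false;
  // Every filter key must match the notification payload exactly.
  return Object.entries(t.filter ?? {}).every(([k, v]) => n.payload[k] === v);
}

// Pair each unprocessed notification with the definitions it should start runs for.
function matchNotifications(
  notifications: WfNotification[],
  definitions: WfDefinition[],
): { notification_id: string; definition_id: string }[] {
  const matches: { notification_id: string; definition_id: string }[] = [];
  for (const n of notifications) {
    for (const d of definitions) {
      if (d.triggers.some((t) => matchTrigger(n, t))) {
        matches.push({ notification_id: n.notification_id, definition_id: d.definition_id });
      }
    }
  }
  return matches;
}
```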
Sequence
```
TownDO                                        WorkflowDO
  │                                               │
  ├── bead.closed fires                           │
  ├── INSERT INTO workflow_notifications          │
  ├── workflowStub.ping() ─────────────────────→  │  (best-effort, just re-arms alarm)
  │                                               │
  │                                               ├── alarm tick
  │                                               ├── drain workflow_notifications
  │                                               ├── match event → definition triggers
  │                                               ├── start run(s)
  │                                               ├── mark notification processed
  │                                               └── re-arm alarm
```
The ping() RPC is purely an optimization to reduce latency — it re-arms the WorkflowDO alarm. If it fails, the WorkflowDO's own alarm (ticking every 60s when idle) will pick up the notification on the next cycle. No data is lost.
Notification table (in TownDO's SQLite)
```sql
CREATE TABLE IF NOT EXISTS workflow_notifications (
  notification_id TEXT PRIMARY KEY,
  event_type      TEXT NOT NULL,              -- 'bead_closed', 'convoy_landed', 'pr_merged', etc.
  town_id         TEXT NOT NULL,
  bead_id         TEXT,
  convoy_id       TEXT,
  payload         TEXT NOT NULL DEFAULT '{}', -- JSON: additional context
  created_at      TEXT NOT NULL,
  processed_at    TEXT                        -- NULL = unprocessed
);

CREATE INDEX idx_wf_notif_unprocessed ON workflow_notifications (processed_at) WHERE processed_at IS NULL;
```
Pruning: Processed notifications older than 7 days are cleaned up in TownDO's housekeeping phase (same pattern as pruneOldEvents).
Alarm Loop: Async Execution Model
[Addresses P0-3] — Node execution is async, not blocking. Mirrors TownDO's Phase 1 (SQL) → Phase 2 (side effects) separation.
Cloudflare DOs have a 30-second execution limit per alarm invocation. Node execution (Mayor Prompts, HTTP calls, script runs) must not run synchronously in the alarm tick.
The WorkflowDO alarm follows the same proven pattern as TownDO's reconciler:
Phase 0: Drain notifications
Read unprocessed notifications from TownDO (via shared SQLite or via the notification drain). Match against enabled definitions' triggers. For matches, create new workflow_run records and seed initial step states.
Phase 1: Reconcile — compute ready nodes, apply SQL mutations
Walk all active runs. For each run, evaluate the DAG:
- Identify nodes whose dependencies are satisfied (all upstream nodes completed)
- Transition ready nodes to `dispatched` status (SQL mutation)
- Transition completed wait nodes based on received events
- Collect async side effects as deferred functions (same as `applyAction` returning `(() => Promise<void>) | null`)
All SQL mutations happen synchronously in this phase. No I/O.
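As a sketch of that two-phase split (names like `markDispatched` are illustrative, not the in-repo API): Phase 1 mutates state synchronously and defers the async work as a closure; Phase 2 later awaits everything best-effort:

```typescript
type SideEffect = () => Promise<void>;
type StepRow = { step_run_id: string; status: string };

// Phase 1: synchronous state mutation only — the async work is deferred, not awaited.
function markDispatched(step: StepRow, work: SideEffect, effects: SideEffect[]): void {
  step.status = 'dispatched';
  effects.push(work);
}

// Phase 2: run collected effects best-effort and count outcomes.
async function runPhase2(effects: SideEffect[]): Promise<{ succeeded: number; failed: number }> {
  const results = await Promise.allSettled(effects.map((fn) => fn()));
  const succeeded = results.filter((r) => r.status === 'fulfilled').length;
  return { succeeded, failed: results.length - succeeded };
}
```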
Phase 2: Execute side effects (async, best-effort)
```typescript
// Identical to TownDO's Phase 2 pattern
if (sideEffects.length > 0) {
  const results = await Promise.allSettled(sideEffects.map(fn => fn()));
  for (const r of results) {
    if (r.status === 'fulfilled') metrics.sideEffectsSucceeded++;
    else metrics.sideEffectsFailed++;
  }
}
```
Side effects include:
- `townStub.slingBead(...)` — create a bead on TownDO for `action_sling_bead` nodes
- `fetch(url, ...)` — execute HTTP calls for `action_http` nodes
- `mayorStub.prompt(...)` — invoke Mayor for `action_mayor_prompt` nodes
Phase 3: Housekeeping
Prune old completed runs, prune processed notifications, emit metrics.
Result delivery: self-notification pattern
When a side effect completes, it writes the result back as a workflow event (into WorkflowDO's own workflow_events table) and re-arms the alarm:
```typescript
// Inside a side effect for an action_http node:
async () => {
  try {
    const response = await fetch(url, options);
    const body = await response.text();
    workflowEvents.insertEvent(sql, 'step_completed', {
      run_id: runId,
      step_run_id: stepRunId,
      payload: { status: response.status, body },
    });
  } catch (err) {
    workflowEvents.insertEvent(sql, 'step_failed', {
      run_id: runId,
      step_run_id: stepRunId,
      payload: { error: String(err) },
    });
  }
  // Re-arm alarm so next tick processes the result
  ctx.storage.setAlarm(Date.now() + 100);
};
```
The next alarm tick drains these events, updates step status, and continues DAG traversal. This means:
- No blocking: The alarm tick returns quickly. Side effects run concurrently.
- Durable: Results are written to SQLite. If the DO is evicted mid-execution, the next alarm retries any `dispatched` steps that never wrote a result (with exponential backoff).
- Observable: Every state transition is recorded.
Step status lifecycle
```
pending → ready → dispatched → completed
                             → failed
                             → timed_out

ready → skipped   (upstream failure with on_failure: 'skip')
```
Failure Handling: on_failure Semantics
[Addresses P0-4] — Diamond pattern and configurable failure propagation.
Each edge in the graph can specify an on_failure policy (default: 'skip'):
```typescript
type WorkflowEdge = {
  id: string;
  source: string;  // source node ID
  target: string;  // target node ID
  condition?: EdgeCondition | null;
  on_failure: 'skip' | 'continue' | 'fail_run';
};
```
Semantics:
| Policy | Behavior |
| --- | --- |
| `skip` (default) | If the source node failed, downstream nodes reachable only through this edge are marked skipped. Other paths to the same node are unaffected. |
| `continue` | Downstream nodes execute regardless. The failed step's output is `null`. |
| `fail_run` | The entire run is marked failed immediately. All pending/ready/dispatched steps are cancelled. |
Diamond resolution: In a diamond (A→B, A→C, B→D, C→D), if B completes and C fails:
- Edge C→D has `on_failure: 'skip'` → D is skipped (both paths must succeed)
- Edge C→D has `on_failure: 'continue'` → D executes with C's output as `null`
- Edge C→D has `on_failure: 'fail_run'` → entire run fails
The engine evaluates all incoming edges to a node. A node is ready when:
- All incoming edges with `on_failure: 'skip'` or `'continue'` have their source completed or failed
- No incoming edge with `on_failure: 'fail_run'` has a failed source
- At least one incoming edge has a completed (not failed) source, OR all failed sources have `on_failure: 'continue'`
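These rules can be sketched as a pure function over a node's incoming edges. This is a simplified reading: it treats a skipped source the same as a failed one on `skip` edges, which the spec does not spell out explicitly:

```typescript
type EdgePolicy = 'skip' | 'continue' | 'fail_run';
type SourceStatus = 'pending' | 'completed' | 'failed' | 'skipped';
type IncomingEdge = { on_failure: EdgePolicy; sourceStatus: SourceStatus };
type Readiness = 'wait' | 'ready' | 'skip' | 'fail_run';

function evaluateNode(incoming: IncomingEdge[]): Readiness {
  // A fail_run edge aborts the whole run as soon as its source fails.
  if (incoming.some((e) => e.on_failure === 'fail_run' && e.sourceStatus === 'failed')) return 'fail_run';
  // Wait until every upstream source has reached a terminal state.
  if (incoming.some((e) => e.sourceStatus === 'pending')) return 'wait';
  // A non-completed source on a 'skip' edge skips this node ("both paths must succeed").
  if (incoming.some((e) => e.on_failure === 'skip' && e.sourceStatus !== 'completed')) return 'skip';
  // Remaining failures are all on 'continue' edges: run with those outputs as null.
  return 'ready';
}
```

On the diamond example above: B completed on a `skip` edge plus C failed on a `skip` edge yields `skip`, while C failed on a `continue` edge yields `ready`.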
Wrangler Configuration
Add to wrangler.jsonc:
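The config block itself appears to be missing here. A minimal sketch of what the `wrangler.jsonc` additions would look like; the binding name `WORKFLOW_DO` and the migration tag are assumptions, not taken from the repo:

```jsonc
{
  "durable_objects": {
    "bindings": [
      // Binding name must match whatever getWorkflowDOStub reads from env.
      { "name": "WORKFLOW_DO", "class_name": "WorkflowDO" }
    ]
  },
  "migrations": [
    // "vN" is a placeholder tag; new_sqlite_classes opts the DO into SQLite-backed storage.
    { "tag": "vN", "new_sqlite_classes": ["WorkflowDO"] }
  ]
}
```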
Module layout: `src/dos/Workflow.do.ts` with sub-modules in `src/dos/workflow/`:
```
dos/
  Workflow.do.ts     # Class definition, RPC surface, alarm loop
  workflow/
    definitions.ts   # Definition CRUD, versioning
    runs.ts          # Run lifecycle, step state management
    engine.ts        # DAG traversal, node readiness, side effect dispatch
    scheduling.ts    # Cron evaluation, trigger matching
    nodes.ts         # Node type implementations (side effect factories)
    events.ts        # Workflow-scoped event recording (outbox pattern)
```
SQLite Schema (WorkflowDO)
workflow_definitions
```sql
CREATE TABLE IF NOT EXISTS workflow_definitions (
  definition_id       TEXT PRIMARY KEY,
  town_id             TEXT NOT NULL,
  name                TEXT NOT NULL,
  description         TEXT DEFAULT '',
  revision            INTEGER NOT NULL DEFAULT 1,
  enabled             INTEGER NOT NULL DEFAULT 0,     -- 0 = disabled, 1 = enabled
  graph               TEXT NOT NULL DEFAULT '{}',     -- JSON: nodes (without positions), edges (with on_failure)
  layout              TEXT NOT NULL DEFAULT '{}',     -- JSON: React Flow positions only (visual, non-execution)
  variables_schema    TEXT NOT NULL DEFAULT '{}',     -- JSON Schema for run-time variables
  triggers            TEXT NOT NULL DEFAULT '[]',     -- JSON array of trigger configs
  schedule            TEXT,                           -- Cron expression (null = no schedule)
  timezone            TEXT NOT NULL DEFAULT 'UTC',    -- IANA timezone for cron evaluation (SINGLE source of truth)
  durability          TEXT NOT NULL DEFAULT 'standard' CHECK(durability IN ('standard', 'ephemeral')),
  max_concurrent_runs INTEGER NOT NULL DEFAULT 5,
  last_run_id         TEXT,
  last_run_status     TEXT,
  last_run_at         TEXT,                           -- [P1-5] Timestamp of last run start
  created_at          TEXT NOT NULL,
  updated_at          TEXT NOT NULL
);

CREATE INDEX idx_wf_def_town ON workflow_definitions (town_id);
CREATE INDEX idx_wf_def_enabled ON workflow_definitions (enabled) WHERE enabled = 1;
CREATE INDEX idx_wf_def_schedule ON workflow_definitions (schedule) WHERE schedule IS NOT NULL;
```
[P1-5]: last_run_at column added.
[P1-7]: graph and layout are separate columns. graph contains execution-relevant data (nodes without positions, edges with on_failure config). layout contains React Flow position data. Layout-only changes don't bump the revision.
[P2-15]: timezone is defined at the definition level only (single source of truth). Removed from trigger config.
workflow_definition_revisions — deferred
[Addresses P1-6]: Revision history is deferred to Phase 3 (not MVP). For now, only the current graph/layout is stored. The revision field tracks the version number. When we add revision history, it will be:
```sql
-- DEFERRED: not in MVP or Phase 2
CREATE TABLE IF NOT EXISTS workflow_definition_revisions (
  revision_id      TEXT PRIMARY KEY,
  definition_id    TEXT NOT NULL,
  revision         INTEGER NOT NULL,
  graph            TEXT NOT NULL,
  layout           TEXT NOT NULL,
  variables_schema TEXT NOT NULL,
  triggers         TEXT NOT NULL,
  schedule         TEXT,
  created_at       TEXT NOT NULL,
  created_by       TEXT,             -- user/agent who made the change
  UNIQUE(definition_id, revision)
);
```
workflow_runs
```sql
CREATE TABLE IF NOT EXISTS workflow_runs (
  run_id              TEXT PRIMARY KEY,
  definition_id       TEXT NOT NULL,
  definition_revision INTEGER NOT NULL,            -- pinned at run creation
  graph_snapshot      TEXT NOT NULL,               -- full graph at time of run (immutable)
  status              TEXT NOT NULL DEFAULT 'pending'
    CHECK(status IN ('pending', 'running', 'completed', 'failed', 'cancelled', 'timed_out')),
  trigger_type        TEXT NOT NULL,               -- 'manual', 'schedule', 'event', 'webhook'
  trigger_data        TEXT NOT NULL DEFAULT '{}',  -- JSON: trigger context
  variables           TEXT NOT NULL DEFAULT '{}',  -- JSON: resolved runtime variables
  summary             TEXT,                        -- Final output/digest
  error               TEXT,                        -- Error message if failed
  started_at          TEXT NOT NULL,
  completed_at        TEXT
);

CREATE INDEX idx_wf_run_def ON workflow_runs (definition_id);
CREATE INDEX idx_wf_run_status ON workflow_runs (status) WHERE status IN ('pending', 'running');
```
workflow_step_runs
```sql
CREATE TABLE IF NOT EXISTS workflow_step_runs (
  step_run_id       TEXT PRIMARY KEY,
  run_id            TEXT NOT NULL,
  node_id           TEXT NOT NULL,     -- reference to graph node
  status            TEXT NOT NULL DEFAULT 'pending'
    CHECK(status IN ('pending', 'ready', 'dispatched', 'completed', 'failed', 'skipped', 'timed_out')),
  output            TEXT,              -- JSON: step output
  error             TEXT,
  bead_id           TEXT,              -- if step created a bead
  dispatch_attempts INTEGER NOT NULL DEFAULT 0,
  last_dispatch_at  TEXT,
  started_at        TEXT,
  completed_at      TEXT
);

CREATE INDEX idx_wf_step_run ON workflow_step_runs (run_id);
CREATE INDEX idx_wf_step_bead ON workflow_step_runs (bead_id) WHERE bead_id IS NOT NULL;
CREATE INDEX idx_wf_step_status ON workflow_step_runs (run_id, status);
```
workflow_events (WorkflowDO internal outbox)
```sql
CREATE TABLE IF NOT EXISTS workflow_events (
  event_id    TEXT PRIMARY KEY,
  event_type  TEXT NOT NULL,              -- 'step_completed', 'step_failed', 'notification_received', etc.
  run_id      TEXT,
  step_run_id TEXT,
  payload     TEXT NOT NULL DEFAULT '{}',
  created_at  TEXT NOT NULL,
  processed_at TEXT
);

CREATE INDEX idx_wf_evt_unprocessed ON workflow_events (processed_at) WHERE processed_at IS NULL;
```
Graph Definition Model
Separate graph from layout
[Addresses P1-7]
The graph column stores execution-relevant structure:
```typescript
type WorkflowGraph = {
  nodes: WorkflowNode[]; // type, config — NO position data
  edges: WorkflowEdge[]; // source, target, condition, on_failure
};
```
The layout column stores visual-only data:
```typescript
type WorkflowLayout = {
  positions: Record<string, { x: number; y: number }>; // nodeId → position
  viewport?: { x: number; y: number; zoom: number };
};
```
Layout changes (dragging nodes, zooming) update layout without bumping revision. Graph changes (adding/removing nodes, changing edges, editing node config) bump revision.
Node types
Phase 1 (MVP): 3 node types
- `trigger_manual` — manual run trigger
- `action_sling_bead` — create a bead on TownDO
- `control_condition` — if/else branching

Phase 2: +5 node types
- `trigger_schedule` — cron-based trigger
- `trigger_event` — event-based trigger (bead closed, convoy landed)
- `action_sling_convoy` — create a convoy
- `control_wait_bead` — wait for a bead to reach a status
- `control_delay` — wait for a duration

Phase 3: +7 node types
- `trigger_webhook` — HTTP webhook trigger
- `action_mayor_prompt` — LLM inference via Mayor
- `action_http` — arbitrary HTTP request
- `action_script` — run a script (see security note below)
- `control_wait_convoy` — wait for convoy to land
- `output_notify` — send a notification
- `output_update_bead` — update an existing bead
Node Type Config Shapes
Each node type has a specific config schema validated at save time:
| Node Type | Config Fields |
| --- | --- |
| `trigger_schedule` | `{ cron: string }` (timezone is definition-level) |
| `trigger_event` | `{ event_type: 'bead_closed' \| 'convoy_landed' \| 'pr_merged', filter?: Record<string, string> }` |
| `trigger_webhook` | `{ secret?: string }` (URL is auto-generated) |
| `trigger_manual` | `{ input_schema?: Record<string, VariableSpec> }` |
| `action_sling_bead` | `{ rig_id: string, title: string, body?: string, priority?: string, labels?: string[] }` |
| `action_sling_convoy` | `{ rig_id: string, title: string, beads: ConvoyBeadSpec[] }` |
| `action_mayor_prompt` | `{ prompt_template: string }` — supports `{{step.<nodeId>.output}}` interpolation |
| `action_http` | `{ url: string, method: string, headers?: Record<string,string>, body_template?: string, allowed_domains?: string[] }` |
| `action_script` | `{ script: string, timeout_ms?: number }` — runs in own container |
| `control_wait_bead` | `{ bead_ref: string, target_status: 'closed' \| 'failed', timeout_ms?: number }` |
| `control_wait_convoy` | `{ convoy_ref: string, timeout_ms?: number }` |
| `control_condition` | `{ expression: string }` — evaluates to boolean, branches via edge conditions |
| `control_delay` | `{ duration_ms: number }` |
| `output_notify` | `{ channel: 'slack' \| 'webhook', target: string, message_template: string }` |
| `output_update_bead` | `{ bead_ref: string, fields: Record<string, unknown> }` |
Template interpolation
Use `{{step.<nodeId>.output.<path>}}` syntax for referencing outputs from upstream steps.
[Addresses P2-13]: Template interpolation in HTTP URL and body templates could exfiltrate data. Mitigation (deferred to implementation):
- `action_http` node config includes an optional `allowed_domains` list. If set, the resolved URL must match.
- Template values are sanitized (no control characters, length limits).
- Full sandboxing of template interpolation is tracked as a follow-up item.
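A minimal sketch of the interpolation plus sanitization described above; the regex, the 8 KB cap, and the empty-string fallback for missing outputs are assumptions, not spec'd behavior:

```typescript
const MAX_VALUE_LEN = 8192; // assumed cap, not from the spec

// Resolves {{step.<nodeId>.output.<path>}} against a map of nodeId → output.
function interpolate(template: string, outputs: Record<string, unknown>): string {
  return template.replace(
    /\{\{step\.([\w-]+)\.output((?:\.\w+)*)\}\}/g,
    (_match: string, nodeId: string, path: string) => {
      // Walk the dotted path into the upstream step's output object.
      let value: unknown = outputs[nodeId];
      for (const key of path.split('.').filter(Boolean)) {
        value = (value as Record<string, unknown> | undefined)?.[key];
      }
      const text = value == null ? '' : String(value);
      // Sanitize: strip control characters, cap length.
      return text.replace(/[\x00-\x1f\x7f]/g, '').slice(0, MAX_VALUE_LEN);
    },
  );
}
```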
action_script security
[Addresses P2-14]: action_script nodes run in their own short-lived container invocation, NOT in an agent's container. They get a fresh, isolated environment with:
- Read-only access to the repo
- No access to agent state or other containers
- Strict timeout (30s default, configurable up to 5min)
- Resource limits (memory, CPU)
This node type is deferred to Phase 4. The container strategy will be finalized during implementation.
DAG Execution Engine
Readiness evaluation
On each alarm tick, for each active run:
- Query all `step_run` records for the run
- For each `pending` step, check if all incoming edges' source nodes are in a terminal state (completed, failed, skipped)
- Apply `on_failure` semantics per edge to determine if the step should be ready, skipped, or if the run should fail
- Transition ready steps to `dispatched` (SQL mutation)
- Return side effect functions for dispatched steps
Parallel execution
Steps with no dependency between them execute in parallel — they're all transitioned to dispatched in the same alarm tick, and their side effects run concurrently via Promise.allSettled.
Phase 1 (MVP): Linear execution only (no parallel branches). Simplifies the engine significantly.
Phase 2+: Full parallel branch support.
Cycle prevention
- At save time: Topological sort validation. If the graph has a cycle, the save is rejected with a validation error.
- At runtime: Max step count guard (1000 steps per run). If exceeded, run is marked `timed_out`.
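The save-time check can be Kahn's algorithm over the node/edge shape of the `graph` column; a sketch:

```typescript
type Graph = { nodes: { id: string }[]; edges: { source: string; target: string }[] };

// Kahn's algorithm: repeatedly peel nodes with no remaining incoming edges.
// If anything is left unpeeled, it sits on (or downstream of) a cycle.
function hasCycle(graph: Graph): boolean {
  const indegree = new Map<string, number>();
  for (const n of graph.nodes) indegree.set(n.id, 0);
  for (const e of graph.edges) indegree.set(e.target, (indegree.get(e.target) ?? 0) + 1);

  const queue = graph.nodes.filter((n) => indegree.get(n.id) === 0).map((n) => n.id);
  let visited = 0;
  while (queue.length > 0) {
    const id = queue.shift()!;
    visited++;
    for (const e of graph.edges) {
      if (e.source !== id) continue;
      const d = (indegree.get(e.target) ?? 1) - 1;
      indegree.set(e.target, d);
      if (d === 0) queue.push(e.target);
    }
  }
  return visited < graph.nodes.length;
}
```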
Run cancellation
When a run is cancelled:
- All `pending`, `ready`, and `dispatched` steps are marked `skipped`
- In-flight side effects (already dispatched) will complete but their results are ignored (step is already `skipped`)
- Run status → `cancelled`
Concurrent run limits
max_concurrent_runs (default: 5) on the definition. If a trigger fires while at the limit, the run is queued with pending status and starts when a slot opens.
Cron Scheduling
Evaluation
On each alarm tick, the engine queries all enabled definitions with a non-null schedule:
```sql
SELECT * FROM workflow_definitions
WHERE enabled = 1 AND schedule IS NOT NULL
```
For each, evaluate the cron expression against the definition's timezone. If the current time matches the cron and last_run_at is before the previous matching tick, start a new run.
Catch-up policy
[Addresses P2-11]: Skip missed ticks. On DO re-wake after eviction, only fire if the NEXT cron tick is in the future and last_run_at is before it. Do not backfill. This prevents a DO waking up after 4 hours of eviction from firing 240 runs for a per-minute schedule.
```typescript
function shouldFireCron(schedule: string, timezone: string, lastRunAt: string | null): boolean {
  const now = Date.now();
  const prevTick = getPreviousCronTick(schedule, timezone, now);
  const nextTick = getNextCronTick(schedule, timezone, now);
  // Only fire if we haven't already run for this tick window
  if (lastRunAt && new Date(lastRunAt).getTime() >= prevTick) return false;
  // Only fire if the next tick is in the future (we're not waking up to ancient backlog)
  return nextTick > now;
}
```
Permissions
[Addresses P1-9]
Follows the `resolveTownOwnership` pattern:

| Action | Who |
| --- | --- |
| Create / edit / delete definitions | Town owner, org admin |
| Enable / disable definitions | Town owner, org admin |
| Trigger manual runs | Town owner, org admin, org members |
| View definitions and runs | All org members |
| Webhook triggers | Authenticated via webhook secret (per-definition), rate-limited |
Webhook secrets are generated per-definition and stored in the triggers JSON config. The webhook endpoint validates the secret before creating a run.
Billing & Rate Limiting
[Addresses P1-10]
| Limit | Free | Pro | Enterprise |
| --- | --- | --- | --- |
| Workflow definitions per town | 3 | 20 | Unlimited |
| Runs per month | 50 | 1,000 | 10,000 |
| Steps per run | 10 | 50 | 200 |
| LLM tokens per `mayor_prompt` step | 4K | 16K | 64K |
| LLM token budget per workflow per month | 100K | 1M | 10M |
| Min cron interval | 1 hour | 5 min | 1 min |
| Webhook triggers per minute | 5 | 30 | 100 |
| Concurrent runs per definition | 1 | 5 | 20 |
Rate limiting is enforced at:
- Run creation: Check monthly run count before starting
- Step dispatch: Check step count per run before transitioning to `dispatched`
- Webhook endpoint: Token bucket rate limiter per definition
- Cron evaluation: Min interval check against tier
Tier info is read from the town's billing context (same pattern as existing feature gates).
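A token bucket for the webhook endpoint could be as simple as the following sketch (in-memory only; a real limiter would persist its state, and the capacity/refill values would come from the tier table above, e.g. Free: 5 triggers/min):

```typescript
// Hypothetical per-definition limiter; names and persistence strategy are
// assumptions, not the repo's API.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(private capacity: number, private refillPerMs: number, now: number = Date.now()) {
    this.tokens = capacity;
    this.last = now;
  }

  tryTake(now: number = Date.now()): boolean {
    // Refill in proportion to elapsed time, never beyond capacity.
    this.tokens = Math.min(this.capacity, this.tokens + (now - this.last) * this.refillPerMs);
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```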
Auto-save & Draft Model
[Addresses P2-12]
The visual editor uses a draft model:
- Auto-save writes to a `draft_graph` and `draft_layout` column (or a separate `workflow_drafts` table). No revision bump, no validation. The editor reads from the draft on load.
- "Save & Publish" validates the graph (cycle detection, required fields, node config validation), copies draft → `graph`/`layout`, bumps revision, and optionally updates the `enabled` state.
- If there's no draft (draft columns are null), the editor loads from the published `graph`/`layout`.
This prevents half-edited graphs from executing while still giving users a seamless editing experience.
tRPC API
Definition Management
```
workflows.definitions.list          // { townId } → WorkflowDefinition[]
workflows.definitions.get           // { townId, definitionId } → WorkflowDefinition
workflows.definitions.create        // { townId, name, graph, layout?, triggers?, schedule? } → WorkflowDefinition
workflows.definitions.update        // { townId, definitionId, ...partial } → WorkflowDefinition
workflows.definitions.delete        // { townId, definitionId } → void
workflows.definitions.setEnabled    // { townId, definitionId, enabled } → void
workflows.definitions.validate      // { townId, graph } → ValidationResult
workflows.definitions.saveDraft     // { townId, definitionId, graph, layout } → void
workflows.definitions.publishDraft  // { townId, definitionId } → WorkflowDefinition (validates + bumps revision)
```
Run Management
```
workflows.runs.list     // { townId, definitionId?, status? } → WorkflowRun[]
workflows.runs.get      // { townId, runId } → WorkflowRun & { steps: StepRun[] }
workflows.runs.trigger  // { townId, definitionId, variables? } → WorkflowRun
workflows.runs.cancel   // { townId, runId } → void
```
Webhook HTTP endpoint
```
POST /api/webhooks/workflow/:definitionId
Headers: X-Webhook-Secret: <secret>
Body: JSON payload (passed as trigger_data)
```
UI: Phased Approach
Phase 1 (MVP): Form-based editor
4 routes:
- `/towns/:townId/workflows` — list of workflow definitions with status badges
- `/towns/:townId/workflows/new` — create form (name, description, linear step list)
- `/towns/:townId/workflows/:defId/edit` — edit form (same as create, plus enable/disable toggle)
- `/towns/:townId/workflows/:defId/runs` — run history list with status, duration, trigger type
The MVP editor is a form-based linear step editor, not React Flow. Users add steps in sequence. Each step has a type selector and config form. This is sufficient for linear workflows and dramatically reduces implementation complexity.
Phase 3: React Flow Visual Editor
Upgrade to a full visual DAG editor:
- Three-panel layout: node palette (left), canvas (center), inspector (right)
- Custom node components with status badges
- Drag-and-drop from palette
- Edge drawing with `on_failure` config in the inspector
- Auto-save → draft, explicit publish
- Run visualization overlay (step statuses on the graph)
Event Triggers / TownDO Integration
When Gastown events occur, TownDO writes them to the workflow_notifications outbox table (see Cross-DO Communication above). WorkflowDO drains these on each alarm tick and matches against trigger configs.
Event Types Available as Triggers
Using the existing TownEventType enum plus new workflow-specific events:
| Event | Payload | When |
| --- | --- | --- |
| `bead_closed` | `{ bead_id, bead_type, rig_id, title }` | Any bead reaches closed status |
| `bead_failed` | `{ bead_id, bead_type, rig_id, title, error }` | Any bead reaches failed status |
| `convoy_landed` | `{ convoy_id, title, bead_count }` | All beads in a convoy complete |
| `convoy_started` | `{ convoy_id, title }` | Convoy begins execution |
| `pr_merged` | `{ bead_id, rig_id, pr_url }` | PR associated with a bead is merged |
| `pr_created` | `{ bead_id, rig_id, pr_url }` | PR is created for a bead |
Agent Ergonomics / Compatibility Bridge
Preserve `gt_mol_current` / `gt_mol_advance` semantics, backed by the new run model:
- `gt_mol_current` → query WorkflowDO for the current ready step(s) in the active run
- `gt_mol_advance` → mark a step as completed and advance the DAG
These are compatibility shims maintained as plugin tools, NOT structural constraints on the data model. The new API is the primary interface; these exist for agents that already use the molecule protocol.
Implementation Plan
Phase 0: Legacy Cleanup (1 week)
Goal: Mark current molecule internals as disposable legacy.
Phase 1: MVP (2 weeks)
[Addresses P1-8]: Scoped to a realistic 2-week deliverable.
Goal: End-to-end workflow execution with manual trigger, linear steps, form-based editor.
Backend:
Frontend:
Not in Phase 1: Cron, event triggers, webhooks, parallel branches, React Flow, wait nodes, Mayor/HTTP/script nodes, revision history, draft model.
Phase 2: Event Triggers & Scheduling (2-3 weeks)
Phase 3: Visual Editor & Advanced Nodes (3-4 weeks)
Phase 4: Polish & Templates (1-2 weeks)
Total realistic estimate: 9-12 weeks (including Phase 0)
[Addresses P1-8]: Previous estimate of 5-6 weeks was too aggressive. The reconciler (simpler scope) took ~3 weeks. This system adds a new DO + DAG engine + editor + cron + cross-DO communication + 15 node types + tRPC API + run visualization.
Future scope (not in this issue)
- Wisps (ephemeral runs that don't persist)
- Sub-workflow composition (a node that triggers another workflow)
- Approval gates (human-in-the-loop nodes)
- Workflow marketplace / sharing
References
Critique Findings Tracker
All 15 findings from the Architecture Review comment have been addressed:
| # | Priority | Finding | Resolution | Section |
| --- | --- | --- | --- | --- |
| 1 | P0 | Instance model: one WorkflowDO per town, not per definition | ✅ Rewritten — one WorkflowDO per town, keyed by `townId` | Architecture Decision |
| 2 | P0 | Event delivery: durable outbox, not lossy RPC | ✅ Rewritten — `workflow_notifications` outbox in TownDO, drain pattern | Cross-DO Communication |
| 3 | P0 | Alarm loop: async execution, no blocking I/O | ✅ Rewritten — Phase 0/1/2 pattern, side effects via `Promise.allSettled`, self-notification | Alarm Loop |
| 4 | P0 | Diamond failure: `on_failure` semantics | ✅ Added — `skip`, `continue`, `fail_run` per edge | Failure Handling |
| 5 | P1 | Schema missing `last_run_at` | ✅ Added to `workflow_definitions` schema | SQLite Schema |
| 6 | P1 | No revision history storage | ✅ Acknowledged — deferred to Phase 3, schema sketched | SQLite Schema |
| 7 | P1 | Layout leaks into execution graph | ✅ Separated — `graph` and `layout` are distinct columns | Graph Definition Model |
| 8 | P1 | Effort estimates too aggressive | ✅ Revised — 9-12 weeks total, 2-week MVP scoped | Implementation Plan |
| 9 | P1 | Permissions model missing | ✅ Added — follows `resolveTownOwnership` pattern | Permissions |
| 10 | P1 | Billing/rate limiting missing | ✅ Added — per-tier limits table, enforcement points | Billing & Rate Limiting |
| 11 | P2 | Cron catch-up policy | ✅ Explicit "skip missed ticks" with pseudocode | Cron Scheduling |
| 12 | P2 | Auto-save + validation conflict | ✅ Draft model — auto-save writes draft, publish validates | Auto-save & Draft Model |
| 13 | P2 | Template interpolation security | ✅ Acknowledged — `allowed_domains`, deferred to impl | Graph Definition Model |
| 14 | P2 | `action_script` container strategy | ✅ Stated — own container, deferred to Phase 4 | Graph Definition Model |
| 15 | P2 | Timezone in two places | ✅ Fixed — definition-level only, removed from triggers | SQLite Schema |
Summary
Add a workflow system to Gastown that lets users define multi-step automation pipelines. Workflows can run on a schedule (cron), on events (bead closed, convoy landed, PR merged), on webhooks, or manually. Each node in the workflow is an action — sling a bead, run a convoy, call an API, send a notification, run a script, or talk to the Mayor.
This system is built around the canonical Gastown/Gascity mental model: formula (authored reusable definition) → molecule (durable runtime instance) → step runs (per-step tracked work). The implementation treats the existing in-repo molecule internals (
createMolecule,advanceMoleculeStep,rig_moleculestable) as disposable legacy and rebuilds from a clean definition/run model.Motivation
Convoys are great for one-shot batches of related work. But many real-world patterns are recurring or event-driven:
gastown, sling it as a bead automatically"Workflows turn Gastown from a task executor into an automation platform.
Naming / Canonical Compatibility
workflow_definitionworkflow_runstep_runworkflow_run.summaryImportant: Avoid bare
workflowas the sole object name in APIs where we actually mean definition or run — that ambiguity compounds quickly. Usedefinitionvsrunexplicitly in the data model and API names.Dual-label in UI: e.g. "Workflow Definitions (Formulas)", "Run Details (Molecule)". Preserve Gastown aliases in docs/tooltips/API descriptions.
Architecture Decision: One WorkflowDO Per Town
A new WorkflowDO Durable Object namespace. One WorkflowDO instance per town, keyed by townId. All workflow definitions, runs, and step states for a town live in a single DO's SQLite database.

Why one-per-town, not one-per-definition?
Requests resolve townId to find the right WorkflowDO. One-per-definition would require a lookup table or definition→town mapping on every request.

Why not TownDO? The TownDO alarm is already doing too much (#1855 — 47+ ops/tick). Workflows have fundamentally different scheduling needs (cron-based, potentially long-running waits). A dedicated DO keeps concerns separated and lets workflows scale independently.
Cross-DO Communication: Durable Outbox Pattern
Cross-DO RPC is fire-and-forget. If the call fails, or either DO is evicted between the event and the notification, the workflow step stays waiting forever. We do not use direct RPC for event delivery.

Instead, we use a durable outbox pattern — the same pattern used by town_events for the TownDO reconciler (see events.ts: insertEvent → drainEvents → markProcessed).

How it works
TownDO side — when a workflow-relevant event fires (bead closed, convoy landed, PR merged), TownDO writes a row into a workflow_notifications table in its own SQLite. This is a synchronous SQL write inside the existing alarm tick — zero latency, zero failure risk. TownDO also re-arms the WorkflowDO alarm to ensure timely processing.
WorkflowDO side — on each alarm tick, the engine drains unprocessed notifications:
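A sketch of the drain-and-match step. The TriggerConfig and Definition shapes here are assumptions for illustration; the real trigger configs live in each definition's triggers JSON:

```typescript
// Illustrative types — not the actual schema.
type Notification = { id: number; event_type: string; payload: Record<string, string> };
type TriggerConfig = { event_type: string; filter?: Record<string, string> };
type Definition = { id: string; enabled: boolean; triggers: TriggerConfig[] };

function matchesTrigger(n: Notification, t: TriggerConfig): boolean {
  if (t.event_type !== n.event_type) return false;
  // Every filter key must match the notification payload exactly.
  for (const [k, v] of Object.entries(t.filter ?? {})) {
    if (n.payload[k] !== v) return false;
  }
  return true;
}

// Returns the runs to start plus the notification ids to mark processed.
function drainNotifications(
  notifications: Notification[],
  definitions: Definition[],
): { startRuns: Array<{ definitionId: string; notificationId: number }>; processed: number[] } {
  const startRuns: Array<{ definitionId: string; notificationId: number }> = [];
  for (const n of notifications) {
    for (const d of definitions) {
      if (!d.enabled) continue;
      if (d.triggers.some((t) => matchesTrigger(n, t))) {
        startRuns.push({ definitionId: d.id, notificationId: n.id });
      }
    }
  }
  return { startRuns, processed: notifications.map((n) => n.id) };
}
```

Note that a notification is marked processed even when no definition matches, so unmatched events don't accumulate.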
For each notification, it checks all enabled definitions' trigger configs for a match. On match, it starts a run. After processing, it marks the notification as processed (markProcessed pattern).

Sequence
The ping() RPC is purely an optimization to reduce latency — it re-arms the WorkflowDO alarm. If it fails, the WorkflowDO's own alarm (ticking every 60s when idle) will pick up the notification on the next cycle. No data is lost.

Notification table (in TownDO's SQLite)
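The outbox table might look like the following (column names are assumptions, mirroring the town_events pattern):

```sql
-- Hypothetical DDL; actual column names may differ.
CREATE TABLE IF NOT EXISTS workflow_notifications (
  id           INTEGER PRIMARY KEY AUTOINCREMENT,
  event_type   TEXT NOT NULL,        -- 'bead_closed' | 'convoy_landed' | 'pr_merged' | ...
  payload      TEXT NOT NULL,        -- JSON-encoded event payload
  created_at   INTEGER NOT NULL,     -- epoch ms
  processed_at INTEGER               -- NULL until WorkflowDO drains it
);

-- Partial index keeps the drain query cheap as processed rows accumulate.
CREATE INDEX IF NOT EXISTS idx_wf_notif_unprocessed
  ON workflow_notifications (processed_at) WHERE processed_at IS NULL;
```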
Pruning: Processed notifications older than 7 days are cleaned up in TownDO's housekeeping phase (same pattern as pruneOldEvents).

Alarm Loop: Async Execution Model
Cloudflare DOs have a 30-second execution limit per alarm invocation. Node execution (Mayor Prompts, HTTP calls, script runs) must not run synchronously in the alarm tick.
The WorkflowDO alarm follows the same proven pattern as TownDO's reconciler:
Phase 0: Drain notifications
Read unprocessed notifications from TownDO (via shared SQLite or via the notification drain). Match against enabled definitions' triggers. For matches, create new workflow_run records and seed initial step states.

Phase 1: Reconcile — compute ready nodes, apply SQL mutations
Walk all active runs. For each run, evaluate the DAG:
- transition ready steps to dispatched status (SQL mutation)
- collect deferred side effects (applyAction returning (() => Promise<void>) | null)

All SQL mutations happen synchronously in this phase. No I/O.
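The mutate-then-defer split can be sketched as follows. All names here are illustrative; in the real engine the status change is a SQL write and makeEffect corresponds to applyAction:

```typescript
// Phase 1: synchronous state transitions, side effects only collected.
type StepRun = { nodeId: string; status: 'pending' | 'ready' | 'dispatched' | 'completed' };
type SideEffect = () => Promise<void>;

function reconcileRun(
  steps: StepRun[],
  makeEffect: (nodeId: string) => SideEffect | null,
): SideEffect[] {
  const effects: SideEffect[] = [];
  for (const step of steps) {
    if (step.status !== 'ready') continue;
    step.status = 'dispatched';          // SQL mutation in the real engine — synchronous
    const fx = makeEffect(step.nodeId);  // deferred I/O, executed in Phase 2
    if (fx) effects.push(fx);
  }
  return effects;
}

// Phase 2: best-effort concurrent execution; a rejection never aborts the tick.
async function executeEffects(effects: SideEffect[]): Promise<void> {
  await Promise.allSettled(effects.map((fx) => fx()));
}
```

Because Phase 1 never awaits, a crash between phases leaves steps in dispatched, which the watchdog path can later retry.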
Phase 2: Execute side effects (async, best-effort)
Side effects include:
- townStub.slingBead(...) — create a bead on TownDO for action_sling_bead nodes
- fetch(url, ...) — execute HTTP calls for action_http nodes
- mayorStub.prompt(...) — invoke Mayor for action_mayor_prompt nodes

Phase 3: Housekeeping
Prune old completed runs, prune processed notifications, emit metrics.
Result delivery: self-notification pattern
When a side effect completes, it writes the result back as a workflow event (into WorkflowDO's own workflow_events table) and re-arms the alarm. The next alarm tick drains these events, updates step status, and continues DAG traversal. This means:
- Stuck dispatched steps that never wrote a result are retried (with exponential backoff).

Step status lifecycle
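The lifecycle implied by this section can be sketched as a state machine. The states are the spec's; the transition table is an assumption drawn from the surrounding text (readiness, dispatch, cancellation, wait timeouts):

```typescript
type StepStatus =
  | 'pending'      // waiting on upstream edges
  | 'ready'        // all dependencies satisfied
  | 'dispatched'   // side effect in flight
  | 'completed'
  | 'failed'
  | 'skipped'
  | 'timed_out';

// Terminal states have no outgoing transitions.
const TRANSITIONS: Record<StepStatus, StepStatus[]> = {
  pending: ['ready', 'skipped'],
  ready: ['dispatched', 'skipped'],
  dispatched: ['completed', 'failed', 'timed_out', 'skipped'],
  completed: [],
  failed: [],
  skipped: [],
  timed_out: [],
};

function canTransition(from: StepStatus, to: StepStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```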
Failure Handling: on_failure Semantics

Each edge in the graph can specify an on_failure policy (default: 'skip'). Semantics:

- skip (default): the target step is skipped when the source step fails. Other paths to the same node are unaffected.
- continue: the target step executes, with the failed source's output as null.
- fail_run: the entire run is marked failed immediately. All pending/ready/dispatched steps are cancelled.

Diamond resolution: In a diamond (A→B, A→C, B→D, C→D), if B completes and C fails:
- on_failure: 'skip' → D is skipped (both paths must succeed)
- on_failure: 'continue' → D executes with C's output as null
- on_failure: 'fail_run' → entire run fails

The engine evaluates all incoming edges to a node. A node is ready when:
- all incoming edges with on_failure: 'skip' or 'continue' have their source completed or failed (a failed source on a 'skip' edge marks the node skipped instead)
- no incoming edge with on_failure: 'fail_run' has a failed source
- a failed source contributes null output across an on_failure: 'continue' edge

Wrangler Configuration
Add to wrangler.jsonc:
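A sketch of the additions; the binding name and migration tag are assumptions to be adjusted to the repo's existing conventions:

```jsonc
{
  "durable_objects": {
    "bindings": [
      { "name": "WORKFLOW_DO", "class_name": "WorkflowDO" }
    ]
  },
  "migrations": [
    // new_sqlite_classes gives the DO a SQLite-backed storage backend.
    { "tag": "v-workflow-1", "new_sqlite_classes": ["WorkflowDO"] }
  ]
}
```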
File: src/dos/Workflow.do.ts, with sub-modules in src/dos/workflow/:

SQLite Schema (WorkflowDO)
- workflow_definitions
- workflow_definition_revisions — deferred
- workflow_runs
- workflow_step_runs
- workflow_events (WorkflowDO internal outbox)
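As a hedged sketch of the three core tables (column names are assumptions assembled from this spec — draft columns, definition-level timezone, last_run_at — not final DDL):

```sql
CREATE TABLE workflow_definitions (
  id                  TEXT PRIMARY KEY,
  name                TEXT NOT NULL,
  graph               TEXT NOT NULL,      -- JSON: nodes + edges (execution-relevant)
  layout              TEXT NOT NULL,      -- JSON: visual-only
  draft_graph         TEXT,               -- auto-save target (draft model)
  draft_layout        TEXT,
  revision            INTEGER NOT NULL DEFAULT 1,
  enabled             INTEGER NOT NULL DEFAULT 0,
  schedule            TEXT,               -- cron expression, NULL if unscheduled
  timezone            TEXT,               -- definition-level only (finding 15)
  max_concurrent_runs INTEGER NOT NULL DEFAULT 5,
  last_run_at         INTEGER
);

CREATE TABLE workflow_runs (
  id            TEXT PRIMARY KEY,
  definition_id TEXT NOT NULL REFERENCES workflow_definitions(id),
  revision      INTEGER NOT NULL,         -- definition revision this run executes
  status        TEXT NOT NULL,            -- pending | running | completed | failed | cancelled
  trigger_type  TEXT NOT NULL,            -- manual | schedule | event | webhook
  started_at    INTEGER,
  finished_at   INTEGER
);

CREATE TABLE workflow_step_runs (
  id      TEXT PRIMARY KEY,
  run_id  TEXT NOT NULL REFERENCES workflow_runs(id),
  node_id TEXT NOT NULL,
  status  TEXT NOT NULL,                  -- pending | ready | dispatched | completed | failed | skipped | timed_out
  output  TEXT                            -- JSON result written back via workflow_events
);
```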
Separate graph from layout

The graph column stores execution-relevant structure:

The layout column stores visual-only data:
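Illustrative shapes for the two columns (field names are assumptions; the Phase 1 node-type union is the spec's):

```typescript
// graph: execution-relevant — changing this bumps `revision`.
type NodeType = 'trigger_manual' | 'action_sling_bead' | 'control_condition'; // Phase 1 set
type GraphNode = { id: string; type: NodeType; config: Record<string, unknown> };
type GraphEdge = {
  from: string;
  to: string;
  condition?: string;                              // for control_condition branches
  on_failure?: 'skip' | 'continue' | 'fail_run';   // default 'skip'
};
type Graph = { nodes: GraphNode[]; edges: GraphEdge[] };

// layout: visual-only — editing this never bumps `revision`.
type Layout = { positions: Record<string, { x: number; y: number }>; zoom?: number };
```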
layoutwithout bumpingrevision. Graph changes (adding/removing nodes, changing edges, editing node config) bumprevision.Node types
Phase 1 (MVP): 3 node types
- trigger_manual — manual run trigger
- action_sling_bead — create a bead on TownDO
- control_condition — if/else branching

Phase 2: +5 node types
- trigger_schedule — cron-based trigger
- trigger_event — event-based trigger (bead closed, convoy landed)
- action_sling_convoy — create a convoy
- control_wait_bead — wait for a bead to reach a status
- control_delay — wait for a duration

Phase 3: +7 node types
- trigger_webhook — HTTP webhook trigger
- action_mayor_prompt — LLM inference via Mayor
- action_http — arbitrary HTTP request
- action_script — run a script (see security note below)
- control_wait_convoy — wait for convoy to land
- output_notify — send a notification
- output_update_bead — update an existing bead

Node Type Config Shapes
Each node type has a specific config schema validated at save time:
- trigger_schedule: { cron: string } (timezone is definition-level)
- trigger_event: { event_type: 'bead_closed' | 'convoy_landed' | 'pr_merged', filter?: Record<string, string> }
- trigger_webhook: { secret?: string } (URL is auto-generated)
- trigger_manual: { input_schema?: Record<string, VariableSpec> }
- action_sling_bead: { rig_id: string, title: string, body?: string, priority?: string, labels?: string[] }
- action_sling_convoy: { rig_id: string, title: string, beads: ConvoyBeadSpec[] }
- action_mayor_prompt: { prompt_template: string } — supports {{step.<nodeId>.output}} interpolation
- action_http: { url: string, method: string, headers?: Record<string,string>, body_template?: string, allowed_domains?: string[] }
- action_script: { script: string, timeout_ms?: number } — runs in own container
- control_wait_bead: { bead_ref: string, target_status: 'closed' | 'failed', timeout_ms?: number }
- control_wait_convoy: { convoy_ref: string, timeout_ms?: number }
- control_condition: { expression: string } — evaluates to boolean, branches via edge conditions
- control_delay: { duration_ms: number }
- output_notify: { channel: 'slack' | 'webhook', target: string, message_template: string }
- output_update_bead: { bead_ref: string, fields: Record<string, unknown> }

Template interpolation
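A sketch of how this interpolation could be implemented (hypothetical helper; the real engine may differ, e.g. in escaping rules or error handling for missing paths):

```typescript
// Resolves {{step.<nodeId>.output}} and {{step.<nodeId>.output.<path>}}
// against a map of upstream step outputs. Missing values become ''.
function interpolate(template: string, outputs: Record<string, unknown>): string {
  return template.replace(
    /\{\{step\.([\w-]+)\.output((?:\.[\w-]+)*)\}\}/g,
    (_m: string, nodeId: string, path: string) => {
      let value: unknown = outputs[nodeId];
      for (const key of path.split('.').filter(Boolean)) {
        value = (value as Record<string, unknown> | undefined)?.[key];
      }
      return value === undefined || value === null ? '' : String(value);
    },
  );
}
```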
Use {{step.<nodeId>.output.<path>}} syntax for referencing outputs from upstream steps.

action_script security

DAG Execution Engine
Readiness evaluation
On each alarm tick, for each active run:
1. Load all step_run records for the run
2. For each pending step, check if all incoming edges' source nodes are in a terminal state (completed, failed, skipped)
3. Apply on_failure semantics per edge to determine if the step should be ready, skipped, or if the run should fail
4. Transition ready steps to dispatched (SQL mutation)

Parallel execution
Steps with no dependency between them execute in parallel — they're all transitioned to dispatched in the same alarm tick, and their side effects run concurrently via Promise.allSettled.

Phase 1 (MVP): Linear execution only (no parallel branches). Simplifies the engine significantly.
Phase 2+: Full parallel branch support.
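The per-edge readiness rules (including the diamond cases above) can be sketched as a pure function. Names are illustrative; the real engine operates on step_run rows:

```typescript
type StepState = 'pending' | 'ready' | 'dispatched' | 'completed' | 'failed' | 'skipped';
type OnFailure = 'skip' | 'continue' | 'fail_run';
type Edge = { from: string; to: string; on_failure?: OnFailure };
type Decision = 'ready' | 'wait' | 'skip' | 'fail_run';

function evaluateNode(nodeId: string, edges: Edge[], state: Record<string, StepState>): Decision {
  const incoming = edges.filter((e) => e.to === nodeId);
  const policy = (e: Edge): OnFailure => e.on_failure ?? 'skip';

  // A failed source on a fail_run edge fails the whole run.
  if (incoming.some((e) => state[e.from] === 'failed' && policy(e) === 'fail_run')) return 'fail_run';
  // A failed source on a skip edge, or a skipped source, skips this node.
  if (incoming.some((e) => (state[e.from] === 'failed' && policy(e) === 'skip') || state[e.from] === 'skipped'))
    return 'skip';
  // Any source not yet terminal means we keep waiting.
  if (incoming.some((e) => state[e.from] !== 'completed' && state[e.from] !== 'failed')) return 'wait';
  // Remaining failed sources are all on 'continue' edges: run with null output.
  return 'ready';
}
```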
Cycle prevention
Graphs are validated to be acyclic at publish time; as a runtime backstop, steps exceeding their wait limits are marked timed_out.

Run cancellation
When a run is cancelled:
- All pending, ready, and dispatched steps are marked skipped
- In-flight side effects are not interrupted, but their results are ignored (the step stays skipped)
- The run status becomes cancelled

Concurrent run limits
Each definition has a max_concurrent_runs setting (default: 5). If a trigger fires while at the limit, the run is queued with pending status and starts when a slot opens.

Cron Scheduling
Evaluation
On each alarm tick, the engine queries all enabled definitions with a non-null schedule.

For each, evaluate the cron expression against the definition's timezone. If the current time matches the cron and last_run_at is before the previous matching tick, start a new run.

Catch-up policy
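The skip-missed-ticks policy can be sketched as follows. prevTick is a hypothetical helper returning the most recent cron match at or before now (a real implementation would use a cron library for this):

```typescript
// Returns true if exactly one new run should start now. Ticks missed while
// the DO was idle (or the run queue was full) are skipped, not replayed.
function shouldStartRun(
  now: number,
  lastRunAt: number | null,
  prevTick: (now: number) => number | null,
): boolean {
  const tick = prevTick(now);
  if (tick === null) return false;            // no matching tick yet
  return lastRunAt === null || lastRunAt < tick;
}
```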
Permissions
Follows the resolveTownOwnership pattern:

Webhook secrets are generated per-definition and stored in the triggers JSON config. The webhook endpoint validates the secret before creating a run.

Billing & Rate Limiting
LLM usage is billed per mayor_prompt step.

Rate limiting is enforced at:
- the number of steps transitioned to dispatched per tick

Tier info is read from the town's billing context (same pattern as existing feature gates).
Auto-save & Draft Model
The visual editor uses a draft model:
- Auto-save writes to a draft_graph and draft_layout column (or a separate workflow_drafts table). No revision bump, no validation. The editor reads from the draft on load.
- Publish validates the draft, copies it into graph/layout, bumps revision, and optionally updates the enabled state.
- The engine only ever executes the published graph/layout.

This prevents half-edited graphs from executing while still giving users a seamless editing experience.
tRPC API
Definition Management
Run Management
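As a hedged sketch of what the two procedure groups could expose, written as plain TypeScript interfaces (all names and input shapes are assumptions, not the final tRPC API; note the explicit definition/run split per the naming guidance above):

```typescript
interface WorkflowDefinitionApi {
  list(input: { townId: string }): Promise<Array<{ id: string; name: string; enabled: boolean }>>;
  create(input: { townId: string; name: string; graph: unknown }): Promise<{ id: string }>;
  saveDraft(input: { townId: string; definitionId: string; draftGraph: unknown }): Promise<void>; // auto-save
  publish(input: { townId: string; definitionId: string }): Promise<{ revision: number }>;        // validate + bump
  setEnabled(input: { townId: string; definitionId: string; enabled: boolean }): Promise<void>;
}

interface WorkflowRunApi {
  start(input: { townId: string; definitionId: string; params?: unknown }): Promise<{ runId: string }>;
  list(input: { townId: string; definitionId: string }): Promise<Array<{ id: string; status: string }>>;
  get(input: { townId: string; runId: string }): Promise<{ steps: Array<{ nodeId: string; status: string }> }>;
  cancel(input: { townId: string; runId: string }): Promise<void>;
}
```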
Webhook HTTP endpoint
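The secret check on the webhook endpoint might look like this (illustrative; node:crypto requires nodejs_compat on Workers, and a real handler would also rate-limit before comparing):

```typescript
import { timingSafeEqual } from 'node:crypto';

// Constant-time comparison avoids leaking the secret via timing differences.
function validateWebhookSecret(provided: string, expected: string): boolean {
  const a = Buffer.from(provided);
  const b = Buffer.from(expected);
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```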
UI: Phased Approach
Phase 1 (MVP): Form-based editor
4 routes:
- /towns/:townId/workflows — list of workflow definitions with status badges
- /towns/:townId/workflows/new — create form (name, description, linear step list)
- /towns/:townId/workflows/:defId/edit — edit form (same as create, plus enable/disable toggle)
- /towns/:townId/workflows/:defId/runs — run history list with status, duration, trigger type

The MVP editor is a form-based linear step editor, not React Flow. Users add steps in sequence. Each step has a type selector and config form. This is sufficient for linear workflows and dramatically reduces implementation complexity.
Phase 3: React Flow Visual Editor
Upgrade to a full visual DAG editor:
- on_failure config in the inspector
When Gastown events occur, TownDO writes them to the workflow_notifications outbox table (see Cross-DO Communication above). WorkflowDO drains these on each alarm tick and matches against trigger configs.

Event Types Available as Triggers
Using the existing TownEventType enum plus new workflow-specific events:

- bead_closed — { bead_id, bead_type, rig_id, title } — fires when a bead reaches closed status
- bead_failed — { bead_id, bead_type, rig_id, title, error } — fires when a bead reaches failed status
- convoy_landed — { convoy_id, title, bead_count }
- convoy_started — { convoy_id, title }
- pr_merged — { bead_id, rig_id, pr_url }
- pr_created — { bead_id, rig_id, pr_url }

Agent Ergonomics / Compatibility Bridge
Preserve gt_mol_current / gt_mol_advance semantics, backed by the new run model:

- gt_mol_current → query WorkflowDO for the current ready step(s) in the active run
- gt_mol_advance → mark a step as completed and advance the DAG

These are compatibility shims maintained as plugin tools, NOT structural constraints on the data model. The new API is the primary interface; these exist for agents that already use the molecule protocol.
Implementation Plan
Phase 0: Legacy Cleanup (1 week)
Goal: Mark current molecule internals as disposable legacy.
- Mark as legacy: the rig_molecules table, createMolecule, advanceMoleculeStep, MoleculeStepper.tsx, FormulaLibrary
- Known bug: createMolecule() treats formula as an array but the API sends { steps: [...] }

Phase 1: MVP (2 weeks)
Goal: End-to-end workflow execution with manual trigger, linear steps, form-based editor.
Backend:
- WorkflowDO namespace (one per town, keyed by townId)
- workflow_definitions table + CRUD
- workflow_runs + workflow_step_runs tables
- workflow_events table (internal outbox)
- Node types: trigger_manual, action_sling_bead, control_condition
- workflow_notifications table in TownDO + write path for bead_closed events (wiring only — event triggers activate in Phase 2)

Frontend:
Not in Phase 1: Cron, event triggers, webhooks, parallel branches, React Flow, wait nodes, Mayor/HTTP/script nodes, revision history, draft model.
Phase 2: Event Triggers & Scheduling (2-3 weeks)
- trigger_schedule and trigger_event node types
- control_wait_bead, control_delay node types
- action_sling_convoy node type

Phase 3: Visual Editor & Advanced Nodes (3-4 weeks)
- trigger_webhook + webhook HTTP endpoint
- action_mayor_prompt node type
- action_http node type (with allowed_domains)
- output_notify, output_update_bead node types
- Revision history (workflow_definition_revisions table)

Phase 4: Polish & Templates (1-2 weeks)
- action_script node type (own container, sandboxed)
- gt_mol_current / gt_mol_advance compatibility shims

Total realistic estimate: 9-12 weeks (including Phase 0)
Future scope (not in this issue)
References
- reconciliation-spec.md
- src/dos/town/events.ts
- src/dos/Town.do.ts (Phase 0-2 pattern)
- src/dos/town/actions.ts (SQL mutations → deferred side effects)
- src/dos/town/reconciler.ts (read-only state → Action[] → side effects)

Critique Findings Tracker
All 15 findings from the Architecture Review comment have been addressed:
- One WorkflowDO per town, keyed by townId
- workflow_notifications outbox in TownDO, drain pattern
- Promise.allSettled, self-notification
- on_failure semantics: skip, continue, fail_run per edge
- last_run_at
- workflow_definitions schema
- graph and layout are distinct columns
- resolveTownOwnership pattern
- allowed_domains, deferred to impl
- action_script container strategy