YAML Based Broker SDK workflows #436

Merged
khaliqgant merged 27 commits into main from sdk-workflows on Feb 18, 2026
Conversation

@khaliqgant (Collaborator) commented on Feb 18, 2026

khaliqgant and others added 17 commits February 17, 2026 21:59
…low definitions

Three major additions to the workflows spec:

1. Reflection Protocol — event-driven reflection inspired by the Generative
   Agents paper (Park et al., 2023). Importance-weighted message accumulation
   triggers focal point generation, synthesis, and course correction.
   Includes ReflectionEngine implementation, REFLECT message protocol,
   and per-pattern reflection behavior.

2. Trajectory Integration — formal integration with the agent-trajectories
   SDK (v0.4.0). Workflows auto-record messages, reflections, and decisions
   as trajectory events. Auto-generates retrospectives on completion.
   Enables cross-workflow learning and compliance/attribution.

3. YAML Workflow Definitions — portable YAML schema for defining workflows,
   compatible with relay-cloud's relay.yaml (PR #94). Supports template
   variables, DAG-based step parallelism, built-in templates, and
   progressive configuration (one-liner to full custom).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
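The PR description doesn't show the YAML schema itself; as a rough TypeScript sketch of the kind of definition and template-variable interpolation described above (all field names are illustrative, not the SDK's actual schema):

```typescript
// Illustrative sketch only: the real schema lives in the SDK and
// relay-cloud's relay.yaml, and may use different field names.
interface WorkflowStepDef {
  name: string;
  dependsOn?: string[];          // DAG edges enabling step parallelism
  prompt: string;                // may contain {{variable}} placeholders
}

interface WorkflowDef {
  name: string;
  template?: string;             // built-in template to extend
  vars?: Record<string, string>; // template variables
  steps: WorkflowStepDef[];
}

// Resolve {{variable}} placeholders; unknown variables are left intact
// so a later configuration layer can still fill them in.
function interpolate(text: string, vars: Record<string, string>): string {
  return text.replace(/\{\{(\w+)\}\}/g, (_, key) => vars[key] ?? `{{${key}}}`);
}
```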
New patterns (6-10):
- handoff: dynamic routing with circuit breaker (max hops)
- cascade: cost-aware LLM escalation (cheap → capable)
- dag: directed acyclic graph with parallel execution
- debate: adversarial refinement with structured rounds + judge
- hierarchical: multi-level delegation tree (lead → coordinators → workers)

New primitives required:
- DAG Scheduler (topological sort, parallel dispatch, join tracking)
- Handoff Controller (active agent tracking, context transfer)
- Round Manager (debate rounds, turn order, convergence detection)
- Confidence Parser (extract [confidence=X.X] from DONE messages)
- Tree Validator (structural validation, sub-team computation)

New message protocol signals:
- HANDOFF, CONFIDENCE, ARGUMENT, CONCEDE, VERDICT, TEAM_DONE

Includes pattern × primitive matrix showing what each pattern needs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
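Of the primitives listed above, the DAG Scheduler is the most load-bearing. A minimal sketch of its topological batching, assuming a simple node/deps shape (this is not the SDK's actual implementation):

```typescript
// Group DAG nodes into batches: each batch contains every node whose
// dependencies have all completed, so a batch can be dispatched in parallel.
type DagNode = { id: string; deps: string[] };

function executionBatches(nodes: DagNode[]): string[][] {
  const done = new Set<string>();
  const remaining = new Map(nodes.map(n => [n.id, n]));
  const batches: string[][] = [];
  while (remaining.size > 0) {
    const ready = [...remaining.values()]
      .filter(n => n.deps.every(d => done.has(d)))
      .map(n => n.id);
    // No ready nodes but work remains: the graph is not acyclic.
    if (ready.length === 0) throw new Error('cycle detected in DAG');
    for (const id of ready) { done.add(id); remaining.delete(id); }
    batches.push(ready);
  }
  return batches;
}
```

Join tracking then amounts to awaiting each batch before computing the next one.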
Decision framework and reference for fan-out, pipeline, hub-spoke,
consensus, mesh, handoff, cascade, dag, debate, and hierarchical
patterns. Includes reflection protocol, YAML workflow definitions,
and common mistakes guide.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
DAG-based execution plan with 9 nodes covering shared types, DB migration,
workflow runner, swarm coordinator, templates, API endpoints, CLI commands,
dashboard panel, and integration tests. Uses broker SDK for agent lifecycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tor script

Adds stigmergic state store, agent pool manager, auction engine, branch
pruner, and gossip disseminator to WORKFLOWS_SPEC.md (Phase 5). These
bring coverage from 67% to 88% of the 42 swarm techniques catalogued
from multi-agent orchestration literature.

Also adds executable broker SDK script (scripts/run-swarm-implementation.ts)
that uses a DAG pattern to coordinate 9 work nodes implementing relay-cloud
PR #94, with dependency-aware parallel execution and convention injection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Script fixes:
- Use Promise.allSettled instead of Promise.race for batch execution
- Add --resume support with state persistence to .relay/swarm-impl-state.json
- Propagate failures to downstream nodes immediately (mark as "blocked")
- Add readFirst field to DAG nodes so agents read existing code first
- Require detailed DONE messages with type signatures and file paths
- Add resolved guard to prevent double-resolution in polling loop
- Add "blocked" status to NodeResult for better reporting

Skill updates:
- Add "DAG Executor Pitfalls" section with 6 common implementation mistakes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
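The first two script fixes can be sketched together: run each batch with Promise.allSettled so one failure doesn't abandon its siblings, then immediately mark downstream nodes as blocked. The function shapes here are simplified stand-ins for the script's actual types:

```typescript
// Execute one DAG batch; allSettled (unlike race) waits for every node,
// and failures propagate to downstream nodes as "blocked".
async function runBatch(
  batch: string[],
  run: (id: string) => Promise<void>,
  downstream: (id: string) => string[],
  status: Map<string, 'done' | 'failed' | 'blocked'>,
): Promise<void> {
  const results = await Promise.allSettled(batch.map(id => run(id)));
  results.forEach((r, i) => {
    const id = batch[i];
    if (r.status === 'fulfilled') {
      status.set(id, 'done');
    } else {
      status.set(id, 'failed');
      for (const dep of downstream(id)) status.set(dep, 'blocked');
    }
  });
}
```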
Updated spawn/send/release/logs commands to match actual CLI syntax
(positional args, not --flag format). Verified with --dry-run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Import AgentRelayClient, getLogs, and BrokerEvent directly from the
broker SDK sub-paths (client, logs, protocol) which avoid the
@relaycast/sdk transitive dependency. Replaces all execSync calls
with proper SDK methods: spawnPty, release, listAgents, onEvent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The AgentRelayClient expects the Rust broker binary which has
init --name --channels for protocol mode. The Node.js CLI binary
has a different init command (setup wizard). Built Rust binary with
cargo build and pointed binaryPath to target/debug/agent-relay.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Relaycast API returns 409 when creating a workspace with a name
that already exists. Without cached credentials the broker can't
recover. Use a timestamped broker name to ensure uniqueness.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…olling

The Rust broker doesn't write worker-logs/ files — that's a Node.js
CLI feature. Switch watchForDone to use broker events:
- worker_stream: accumulate PTY output chunks, scan for DONE/ERROR
- relay_inbound: relay messages from agents
- agent_exited: detect agent termination

Remove unused getLogs import.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of parsing PTY output for DONE signals (which matched the
prompt template text), agents now:
1. Do their work
2. Send a relay message with "DONE: <summary>" to the workflow channel
3. Exit naturally

The orchestrator watches for:
- relay_inbound: captures DONE/ERROR summaries for downstream deps
- agent_exited: definitive completion signal (code 0 = success)

Removed all "DONE: <detailed summary>" template text from task
prompts to prevent false positives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
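The watcher described here might look like the following sketch. The event names (relay_inbound, agent_exited) follow the commit text, but the payload shapes are assumptions, and a later commit revises this approach because PTY agents can't send relay messages:

```typescript
// Assumed event shapes; the real broker SDK payloads may differ.
type BrokerEvent =
  | { type: 'relay_inbound'; from: string; body: string }
  | { type: 'agent_exited'; agent: string; code: number };

// DONE messages carry the summary; agent_exited is the definitive signal.
function makeWatcher(
  onDone: (agent: string, summary?: string) => void,
  onFail: (agent: string) => void,
): (ev: BrokerEvent) => void {
  const summaries = new Map<string, string>();
  return ev => {
    if (ev.type === 'relay_inbound') {
      const m = ev.body.match(/^DONE:\s*(.*)/s);
      if (m) summaries.set(ev.from, m[1]);
    } else if (ev.type === 'agent_exited') {
      if (ev.code === 0) onDone(ev.agent, summaries.get(ev.agent));
      else onFail(ev.agent);
    }
  };
}
```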
Spawned PTY agents don't have MCP relay tools, so they can't send
relay messages. Instead, agents now write their summary to
.relay/summaries/{nodeId}.md before exiting. The orchestrator waits
for agent_exited, then reads the summary file for downstream deps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
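The summary-file handoff reduces to a small read after the exit event fires. A sketch, assuming the `.relay/summaries/{nodeId}.md` layout described above:

```typescript
import { readFile } from 'node:fs/promises';
import * as path from 'node:path';

// After agent_exited fires for a node, read the summary the agent wrote
// before exiting; downstream dependencies receive this text as context.
async function readNodeSummary(
  nodeId: string,
  root = '.relay/summaries',
): Promise<string | undefined> {
  try {
    return await readFile(path.join(root, `${nodeId}.md`), 'utf8');
  } catch {
    return undefined; // agent exited without writing a summary
  }
}
```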
New pitfalls from running the swarm implementation script:
- PTY prompt echo matching signal keywords (false DONE completion)
- Assuming agent capabilities (PTY agents lack MCP tools)
- Rust broker vs Node.js CLI binary confusion
- Log polling assumes Node.js daemon (Rust broker doesn't write logs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- runner.ts: executeStep now throws after marking step failed, enabling
  fail-fast/continue error strategies to trigger via Promise.allSettled
- cli/index.ts: runScriptFile now only catches ENOENT errors, properly
  propagating script execution failures instead of trying next runner

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Similar to Wrangler's telemetry.md, this document explains:
- What data is collected and why
- What is explicitly NOT collected
- How to opt out (CLI, env var, config file)
- How to view telemetry events for debugging

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@devin-ai-integration bot (Contributor) left a comment:
Devin Review found 1 new potential issue.

View 25 additional findings in Devin Review.

Comment on lines +604 to +608
if (strategy === 'fail-fast') {
  // Mark all pending downstream steps as skipped
  await this.markDownstreamSkipped(step.name, workflow.steps, stepStates, runId);
  throw new Error(`Step "${step.name}" failed: ${error}`);
}
🟡 fail-fast with parallel step failures leaves downstream steps of subsequent failures in 'pending' state

When multiple steps run in parallel and more than one fails under the fail-fast strategy, only the first failed step's downstream dependents are marked as skipped. The loop throws immediately after processing the first failure, so downstream steps of the second (and subsequent) failed steps remain in pending state instead of skipped.

Root Cause and Impact

In executeSteps at packages/sdk-ts/src/workflows/runner.ts:593-614, the results of Promise.allSettled are iterated. When the first rejected result is encountered with fail-fast strategy, markDownstreamSkipped is called for that step, and then an error is thrown at line 607. This means subsequent rejected results in the same batch are never processed — their downstream steps are never marked as skipped.

For example, if steps A and B run in parallel and both fail:

  1. Step A's failure is processed: A's downstream steps are marked skipped, then throw
  2. Step B's failure is never processed in this loop (B itself is already marked failed by executeStep)
  3. Step B's downstream steps remain in pending state in the DB

The run correctly ends in failed status (via the catch block at line 437), but the step state in the database is inconsistent — some steps that should be skipped are left as pending. This affects any UI or API that reads step states to show workflow progress, and it affects resume() which would attempt to re-run those pending steps even though their upstream dependency failed.

Prompt for agents

In packages/sdk-ts/src/workflows/runner.ts, in the executeSteps method around lines 593-614, the fail-fast strategy throws immediately after the first rejected result, skipping processing of subsequent rejected results in the same batch. To fix this, process ALL rejected results before throwing. Specifically, change the loop so that it:

  1. Iterates through all results and marks each failed step and its downstream as skipped
  2. Collects the first error
  3. After the loop, throws the collected error

This ensures all downstream steps of all failed parallel steps are properly marked as skipped before the throw.
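The loop shape that suggestion describes can be sketched as follows; markDownstreamSkipped and the step/result shapes are simplified stand-ins for the runner's actual types:

```typescript
// Process every rejected result in the batch before throwing, so all
// failed parallel steps get their downstream dependents marked skipped.
async function handleBatchResults(
  steps: { name: string }[],
  results: PromiseSettledResult<void>[],
  markDownstreamSkipped: (step: string) => Promise<void>,
): Promise<void> {
  let firstError: unknown;
  for (let i = 0; i < results.length; i++) {
    const r = results[i];
    if (r.status === 'rejected') {
      await markDownstreamSkipped(steps[i].name);
      // Remember only the first error; keep processing the rest.
      firstError ??= new Error(`Step "${steps[i].name}" failed: ${r.reason}`);
    }
  }
  if (firstError) throw firstError;
}
```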

@khaliqgant khaliqgant merged commit cf26336 into main Feb 18, 2026
37 checks passed
@khaliqgant khaliqgant deleted the sdk-workflows branch February 18, 2026 18:56
