Skip to content

Add structured agent result callbacks#932

Open
willwashburn wants to merge 7 commits into
mainfrom
codex/structured-agent-results
Open

Add structured agent result callbacks#932
willwashburn wants to merge 7 commits into
mainfrom
codex/structured-agent-results

Conversation

@willwashburn
Copy link
Copy Markdown
Member

Summary

  • Add per-spawn structured result contracts to the TypeScript SDK, including Zod/parser validation, agent.waitForResult(), per-spawn onResult, and global relay.onAgentResult.
  • Add broker support for per-agent callback tokens, /api/agent-result, agent_result SDK events, and Relaycast MCP env injection for the result tool.
  • Add the MCP submit_result tool plus docs and focused tests.

Testing

  • cargo check
  • cargo test -p agent-relay-broker agent_result_route
  • cargo test -p agent-relay-broker relaycast_server_config_includes_agent_result_vars
  • npm --prefix packages/sdk run check
  • npx tsc --noEmit
  • npx vitest run src/cli/relaycast-mcp.startup.test.ts
  • cd packages/sdk && npx vitest run src/__tests__/orchestration-upgrades.test.ts -t "structured result|waitForResult|agentResultSchema" src/__tests__/spawn-token.test.ts

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds broker-managed per-spawn callback tokens and a public /api/agent-result endpoint; injects callback URL/token and a submit_result MCP tool into workers; extends TypeScript SDK with typed result schemas, validation, waitForResult/onResult callbacks; updates tests, docs, and trajectory records.

Changes

Structured Agent Results

Layer / File(s) Summary
Trajectory metadata and index
.trajectories/completed/*, .trajectories/index.json
Adds completed trajectory records for structured results work and updates the trajectories index timestamp.
Wire protocol and type contracts
crates/broker/src/protocol.rs, crates/broker/src/types.rs, packages/sdk/src/protocol.ts, packages/sdk/src/types.ts
Adds BrokerEvent::AgentResult, defines AgentResultMcpConfig and SDK JsonSchema/agentResultSchema types to align broker and SDK contracts.
Broker listen API and result submission endpoint
crates/broker/src/listen_api.rs
Parses agent_result_schema on Spawn (aliases supported), adds SubmitAgentResult request variant, and registers public /api/agent-result POST endpoint that authenticates via callback token and forwards submissions to the broker; tests added.
Broker runtime token management and lifecycle
crates/broker/src/runtime/*.rs
BrokerRuntime gains agent_result_tokens; spawn allocates tokens mapped to agent names, spawn failures remove tokens, SubmitAgentResult validates tokens/name and emits AgentResult events, and release/maintenance paths prune tokens.
MCP snippets and Relaycast config plumbing
crates/broker/src/snippets.rs, crates/broker/src/relaycast/mod.rs
Threads AgentResultMcpConfig through MCP config builders; adds apply_agent_result_env and TOML-escaping helpers; updates provider/CLI config emission to inject AGENT_RELAY_RESULT_* env vars where appropriate and tests provider-specific persistence/omission behavior.
Worker spawn wiring and env injection
crates/broker/src/worker.rs
WorkerRegistry::spawn accepts optional agent_result, builds result-aware MCP args, and injects agent_result.env_pairs() into worker processes when present.
Relaycast MCP submit_result tool & CLI wiring
src/cli/relaycast-mcp.ts, src/cli/relaycast-mcp.startup.test.ts
Reads callback config from env, registers submit_result MCP tool that POSTs { agent, data, final, metadata } to the callback URL with bearer auth and timeout, returns parsed structured payload; test verifies POST and response handling.
Supervisor restart state propagation
crates/broker/src/supervisor.rs
Persist optional agent_result in RestartState and PendingRestart so restarts preserve result config; unit tests adapted to new argument.
SDK types and client payload forwarding
packages/sdk/src/types.ts, packages/sdk/src/client.ts, packages/sdk/package.json
Adds JsonSchema and agentResultSchema spawn fields; client forwards agentResultSchema to broker; adds zod/zod-to-json-schema deps for schema conversion.
SDK relay core: contract registration and spawn flows
packages/sdk/src/relay.ts
Introduce AgentResult types and generics for spawn APIs, prepare JSON schema from Zod, register per-agent result contracts, expose onAgentResult hook and per-agent waitForResult(timeoutMs?), validate incoming agent_result events, store results, and resolve/reject waiters; lifecycle cleanup added.
SDK tests and docs
packages/sdk/src/__tests__/*, web/content/docs/*
Adds Vitest cases validating schema forwarding and waitForResult/onResult behavior; updates unit mocks; documents "Structured results", Agent.waitForResult(...), and relay.addListener('agentResult', ...).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • khaliqgant

"I nibble on schemas, tidy the trail,
Tokens and callbacks wrapped in a tail.
Spawned agents now speak in structured song,
WaitForResult—I'll celebrate all day long!" 🐇✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add structured agent result callbacks' clearly and concisely describes the main feature addition (structured result callbacks) across the SDK, broker, and MCP layers.
Description check ✅ Passed The PR description includes a Summary section detailing the three main components (SDK, broker, MCP) and a Testing section listing verification steps. However, it lacks a Test Plan section with checkboxes and Screenshots section as specified in the template.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/structured-agent-results

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 839d0cda5c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +305 to +308
worker_relay_key,
rst.skip_relay_prompt,
None,
None,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve result callback config on restart

When a supervised agent is restarted, the spawn call drops the structured-result configuration by passing None for the agent_result argument, so restarted workers no longer get the submit_result MCP wiring even if the original spawn requested a result contract. This breaks waitForResult(...) for crash-and-restart scenarios because the broker still expects a structured result path, but the restarted process has no callback token/schema injected.

Useful? React with 👍 / 👎.

Comment thread packages/sdk/src/relay.ts Outdated
return new Promise<AgentResult<unknown>>((resolve, reject) => {
let timer: ReturnType<typeof setTimeout> | undefined;
const token = ++this.resultResolverSeq;
this.resultResolvers.set(name, {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Resolve all concurrent waitForResult callers

waitForAgentResult stores only one resolver per agent name, so a second agent.waitForResult() call overwrites the first pending resolver. In that case, the first promise can hang indefinitely (especially with no timeout) even after a valid agent_result event arrives, because only the most recent waiter is resolved.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 6 additional findings in Devin Review.

Open in Devin Review

Comment thread crates/broker/src/worker.rs Outdated
Comment on lines 167 to 168
if skip_relay_prompt && agent_result.is_none() {
return Ok(Vec::new());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 skip_relay_prompt=true is silently overridden when agent_result is specified, injecting the full relay MCP

When a caller sets skip_relay_prompt: true (documented as "skip injecting the relay MCP configuration … saving tokens") but also provides an agent_result schema, the early-return guard at crates/broker/src/worker.rs:167 no longer fires because agent_result.is_some(). This causes configure_relaycast_mcp_with_result to inject the entire relaycast MCP server config — messaging tools, channel tools, programmability tools, and the system prompt — not just the submit_result tool. The user's explicit opt-out of relay messaging is silently ignored, defeating the token-saving purpose of the flag.

Downstream impact

The configure_relaycast_mcp_with_result call at crates/broker/src/snippets.rs:704-715 builds the full MCP server configuration including all relay tooling. For Claude spawns this adds --mcp-config with every relaycast tool; for Codex it emits dozens of --config mcp_servers.relaycast.env.* args. A caller who set skipRelayPrompt: true to run a lightweight sub-task (e.g. a quick code review) and only wanted the structured result tool will instead get the full relay prompt and tool surface, spending significant extra tokens.

Prompt for agents
In crates/broker/src/worker.rs, the build_mcp_args method at line 167 checks `skip_relay_prompt && agent_result.is_none()` to decide whether to skip MCP config injection. When both skip_relay_prompt is true and agent_result is Some, it falls through to configure_relaycast_mcp_with_result which injects the FULL relaycast MCP server — all messaging, channel, and programmability tools — not just the submit_result tool.

The fix should ensure that when skip_relay_prompt is true and agent_result is Some, only the agent-result-related environment variables are injected (AGENT_RELAY_RESULT_URL, AGENT_RELAY_RESULT_TOKEN, AGENT_RELAY_RESULT_SCHEMA) without configuring the full relaycast MCP server. The env vars are already set on the child process at worker.rs:392-396, so the MCP server will pick them up if it is started through other means. One approach is to configure a minimal MCP config that only includes the result env vars when skip_relay_prompt is true, or to let the env injection at worker.rs:392 handle it and keep the early return for the MCP args.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/broker/src/runtime/maintenance.rs (1)

300-309: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Preserve agent-result callback config across supervised restarts.

At Line 308, restart always passes None for agent_result, so restarted workers lose result callback env/tool wiring and can no longer submit structured results.

Suggested direction
                 match workers
                     .spawn(
                         rst.spec.clone(),
                         rst.parent.clone(),
                         None,
                         worker_relay_key,
                         rst.skip_relay_prompt,
                         None,
-                        None,
+                        rst.agent_result.clone(),
                     )
                     .await

This requires storing agent_result in restart state (rst) when the worker is first spawned.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/broker/src/runtime/maintenance.rs` around lines 300 - 309, The restart
path always passes None for agent_result to workers.spawn, causing restarted
workers to lose their result-callback wiring; update the restart state (rst) to
store the original agent_result when the worker is first spawned and then pass
rst.agent_result (instead of None) into workers.spawn during supervised restarts
so the env/tool callback configuration is preserved; ensure any places that
construct or clone rst include the new agent_result field and that workers.spawn
continues to accept the agent_result parameter.
packages/sdk/src/relay.ts (2)

1306-1322: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

shutdown() leaves structured-result waiters and caches behind.

waitForResult() promises can remain pending forever after shutdown, and result maps are never cleared.

Proposed fix
   this.outputListeners.clear();
+  for (const entry of this.resultResolvers.values()) {
+    entry.reject(new Error('AgentRelay shutdown before structured result was submitted'));
+  }
+  this.resultResolvers.clear();
+  this.resultContracts.clear();
+  this.lastAgentResults.clear();
   for (const entry of this.exitResolvers.values()) {
     entry.resolve('released');
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sdk/src/relay.ts` around lines 1306 - 1322, The shutdown() method
currently clears many agent-related maps but does not resolve or clear
structured-result waiters and result caches used by waitForResult(), leaving
promises pending and maps populated; update shutdown() to iterate over the
structured-result waiters (e.g., the map used by waitForResult()) and call each
resolver with an appropriate termination value or error, then clear the waiters
map and the corresponding result cache(s) so no pending promises remain and no
stale results persist (refer to the waitForResult() implementation and the
result cache variable names to locate and clear the correct structures).

1944-1955: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Idempotent agent_not_found release path skips structured-result cleanup.

This branch removes core agent state but leaves result contracts/resolvers untouched, so waitForResult() callers can hang.

Proposed fix
           if (error instanceof AgentRelayProtocolError && error.code === 'agent_not_found') {
             relay.exitedAgents.add(name);
             relay.readyAgents.delete(name);
             relay.messageReadyAgents.delete(name);
             relay.idleAgents.delete(name);
             relay.knownAgents.delete(name);
             relay.outputListeners.delete(name);
+            relay.resultContracts.delete(name);
+            relay.lastAgentResults.delete(name);
+            relay.resultResolvers.get(name)?.reject(
+              new Error(`Agent '${name}' was released before submitting a result`)
+            );
+            relay.resultResolvers.delete(name);
             relay.exitResolvers.get(name)?.resolve('released');
             relay.exitResolvers.delete(name);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sdk/src/relay.ts` around lines 1944 - 1955, The agent_not_found
branch currently removes agent state but fails to clear structured-result
bookkeeping, which can leave waitForResult callers hanging; update the branch in
the error instanceof AgentRelayProtocolError && error.code === 'agent_not_found'
block (the same area that touches relay.exitedAgents, relay.readyAgents,
relay.exitResolvers, relay.idleResolvers and calls relay.invokeLifecycleHook) to
also: resolve and delete any pending result resolvers/waiters for the agent
(e.g., relay.resultResolvers.get(name)?.resolve(...) and
relay.resultResolvers.delete(name)), remove any result contracts/metadata (e.g.,
relay.resultContracts.delete(name) or equivalent), and delete any result
listeners/queues (e.g., relay.resultListeners.delete(name)), ensuring all
structured-result collections are cleaned up the same way other state maps are.
🧹 Nitpick comments (1)
crates/broker/src/protocol.rs (1)

271-279: ⚡ Quick win

Add a dedicated wire round-trip test for agent_result.

This new event shape introduces a renamed field (final) and optional metadata; a focused serde round-trip test here would prevent silent contract drift.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/broker/src/protocol.rs` around lines 271 - 279, Add a focused serde
round-trip unit test for the AgentResult struct: serialize an AgentResult
instance (populating name, result_id, data, final_result true/false, and both
Some(metadata) and None cases) to JSON and then deserialize it back, asserting
equality and that the "final" field is correctly mapped to final_result and that
metadata is optional; target the AgentResult type and its fields (name,
result_id, data, final_result, metadata) so any wire-format/serde rename
regressions are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.trajectories/completed/2026-05/traj_ceo5q9bh2od3.json:
- Around line 28-33: The trajectory event payload duplicates the decision text
in the "content" field and in "raw.question"/"raw.chosen" (e.g., "... plus MCP
tool: ... plus MCP tool"); update the code that assembles the trajectory payload
(where "content", "raw.question", and "raw.chosen" are set) to avoid repeating
the same phrase twice—either use the canonical string once (prefer raw.chosen or
raw.question) or run a small dedupe/normalize step that collapses immediately
repeated substrings (detect "X ... : X ..." patterns or consecutive identical
phrases) before assigning to "content" so the event payload contains the
decision text only once.

In @.trajectories/index.json:
- Line 1133: The index entry currently contains an absolute temp-machine path in
the "path" field (e.g.
"/private/tmp/relay-structured-agent-results/.trajectories/completed/2026-05/traj_ceo5q9bh2od3.json"),
which should be replaced with a portable value; update the producer that writes
the index so the "path" value is a repo-relative canonical path like
".trajectories/completed/2026-05/traj_ceo5q9bh2od3.json" or a stable trajectory
ID reference, and ensure any code that reads the index resolves that relative
path (or looks up by ID) instead of relying on absolute temp directories—locate
the code that sets the "path" property in the index writer and change it
accordingly.

In `@crates/broker/src/runtime/init.rs`:
- Around line 341-344: The AGENT_RELAY_RESULT_URL is currently built from
cmd.api_bind which can be an unconnectable host like "0.0.0.0"; change the
construction so it uses a connectable/advertised host instead (e.g., a new
config value or fallback to loopback) rather than cmd.api_bind. Locate where
AGENT_RELAY_RESULT_URL is formatted (the tuple using "AGENT_RELAY_RESULT_URL",
format!("http://{}:{}/api/agent-result", cmd.api_bind, actual_port)) and replace
cmd.api_bind with the appropriate advertised host variable or a safe fallback
(such as "127.0.0.1") so the resulting URL is reachable by workers; ensure this
advertised host is configurable and documented.

In `@packages/sdk/src/protocol.ts`:
- Around line 431-438: The agent_result type's metadata field is too narrow
(currently metadata?: Record<string, unknown> | null) and should accept any JSON
value; update the agent_result definition in packages/sdk/src/protocol.ts so
metadata is typed as unknown (e.g., metadata?: unknown or metadata?: unknown |
null) to match the broker wire contract and ensure the SDK accepts strings,
arrays, numbers, objects, etc.; keep the rest of the shape
(kind/name/result_id/data/final) unchanged.

In `@packages/sdk/src/relay.ts`:
- Around line 1863-1895: waitForAgentResult can register a resolver for an agent
that no longer exists and thus never settle; before creating and storing a
resolver (using this.resultResolverSeq and this.resultResolvers) add a fast-path
check for whether the agent is still tracked or active and, if not, immediately
reject the promise (instead of registering the resolver). Implement this by
checking the appropriate agent-tracking state you already maintain (for example
an agents/agentStates/activeAgents collection) and returning Promise.reject(new
Error(...)) when the agent is unknown/ended; otherwise proceed to use
this.resultResolverSeq, set this.resultResolvers.set(name, { resolve, reject,
token }) and the timeout logic as currently implemented.

In `@src/cli/relaycast-mcp.ts`:
- Around line 180-193: The POST in submit_result currently uses
fetch(config.url, ...) with no timeout; add an AbortController, pass its signal
to fetch, and set a timer (e.g., configurable default like 10s) that calls
controller.abort() to enforce a timeout; after fetch completes clear the timer
and catch the AbortError to handle/log a timeout case instead of letting the
call hang. Use the existing identifiers (submit_result, config.url,
config.token, config.agentName) so the signal is passed into the fetch options
and the timer is cleared on success/failure.

---

Outside diff comments:
In `@crates/broker/src/runtime/maintenance.rs`:
- Around line 300-309: The restart path always passes None for agent_result to
workers.spawn, causing restarted workers to lose their result-callback wiring;
update the restart state (rst) to store the original agent_result when the
worker is first spawned and then pass rst.agent_result (instead of None) into
workers.spawn during supervised restarts so the env/tool callback configuration
is preserved; ensure any places that construct or clone rst include the new
agent_result field and that workers.spawn continues to accept the agent_result
parameter.

In `@packages/sdk/src/relay.ts`:
- Around line 1306-1322: The shutdown() method currently clears many
agent-related maps but does not resolve or clear structured-result waiters and
result caches used by waitForResult(), leaving promises pending and maps
populated; update shutdown() to iterate over the structured-result waiters
(e.g., the map used by waitForResult()) and call each resolver with an
appropriate termination value or error, then clear the waiters map and the
corresponding result cache(s) so no pending promises remain and no stale results
persist (refer to the waitForResult() implementation and the result cache
variable names to locate and clear the correct structures).
- Around line 1944-1955: The agent_not_found branch currently removes agent
state but fails to clear structured-result bookkeeping, which can leave
waitForResult callers hanging; update the branch in the error instanceof
AgentRelayProtocolError && error.code === 'agent_not_found' block (the same area
that touches relay.exitedAgents, relay.readyAgents, relay.exitResolvers,
relay.idleResolvers and calls relay.invokeLifecycleHook) to also: resolve and
delete any pending result resolvers/waiters for the agent (e.g.,
relay.resultResolvers.get(name)?.resolve(...) and
relay.resultResolvers.delete(name)), remove any result contracts/metadata (e.g.,
relay.resultContracts.delete(name) or equivalent), and delete any result
listeners/queues (e.g., relay.resultListeners.delete(name)), ensuring all
structured-result collections are cleaned up the same way other state maps are.

---

Nitpick comments:
In `@crates/broker/src/protocol.rs`:
- Around line 271-279: Add a focused serde round-trip unit test for the
AgentResult struct: serialize an AgentResult instance (populating name,
result_id, data, final_result true/false, and both Some(metadata) and None
cases) to JSON and then deserialize it back, asserting equality and that the
"final" field is correctly mapped to final_result and that metadata is optional;
target the AgentResult type and its fields (name, result_id, data, final_result,
metadata) so any wire-format/serde rename regressions are caught.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 130fdc06-7d6e-49dd-8330-e86fd8477993

📥 Commits

Reviewing files that changed from the base of the PR and between db037b0 and 839d0cd.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (26)
  • .trajectories/completed/2026-05/traj_ceo5q9bh2od3.json
  • .trajectories/completed/2026-05/traj_ceo5q9bh2od3.md
  • .trajectories/index.json
  • crates/broker/src/listen_api.rs
  • crates/broker/src/protocol.rs
  • crates/broker/src/relaycast/mod.rs
  • crates/broker/src/runtime/api.rs
  • crates/broker/src/runtime/event_loop.rs
  • crates/broker/src/runtime/init.rs
  • crates/broker/src/runtime/maintenance.rs
  • crates/broker/src/runtime/mod.rs
  • crates/broker/src/runtime/relaycast_events.rs
  • crates/broker/src/snippets.rs
  • crates/broker/src/types.rs
  • crates/broker/src/worker.rs
  • packages/sdk/package.json
  • packages/sdk/src/__tests__/orchestration-upgrades.test.ts
  • packages/sdk/src/__tests__/unit.test.ts
  • packages/sdk/src/client.ts
  • packages/sdk/src/protocol.ts
  • packages/sdk/src/relay.ts
  • packages/sdk/src/types.ts
  • src/cli/relaycast-mcp.startup.test.ts
  • src/cli/relaycast-mcp.ts
  • web/content/docs/spawning-an-agent.mdx
  • web/content/docs/typescript-sdk.mdx

Comment on lines +28 to +33
"content": "Implemented structured agent results as a broker-mediated callback token plus MCP tool: Implemented structured agent results as a broker-mediated callback token plus MCP tool",
"raw": {
"question": "Implemented structured agent results as a broker-mediated callback token plus MCP tool",
"chosen": "Implemented structured agent results as a broker-mediated callback token plus MCP tool",
"alternatives": [],
"reasoning": "The SDK can declare a result contract at spawn time, the broker mints a per-agent callback token, and the injected Relaycast MCP server exposes submit_result without requiring the spawned agent to know broker credentials."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove duplicated decision text in trajectory event payload.

The decision text is repeated ("... plus MCP tool: ... plus MCP tool"), which makes generated history noisy and harder to scan.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.trajectories/completed/2026-05/traj_ceo5q9bh2od3.json around lines 28 - 33,
The trajectory event payload duplicates the decision text in the "content" field
and in "raw.question"/"raw.chosen" (e.g., "... plus MCP tool: ... plus MCP
tool"); update the code that assembles the trajectory payload (where "content",
"raw.question", and "raw.chosen" are set) to avoid repeating the same phrase
twice—either use the canonical string once (prefer raw.chosen or raw.question)
or run a small dedupe/normalize step that collapses immediately repeated
substrings (detect "X ... : X ..." patterns or consecutive identical phrases)
before assigning to "content" so the event payload contains the decision text
only once.

Comment thread .trajectories/index.json
"status": "completed",
"startedAt": "2026-05-20T21:24:17.929Z",
"completedAt": "2026-05-20T21:43:51.936Z",
"path": "/private/tmp/relay-structured-agent-results/.trajectories/completed/2026-05/traj_ceo5q9bh2od3.json"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid temp-machine absolute paths in trajectory index entries.

Storing /private/tmp/... makes index metadata non-portable and brittle across environments. Prefer a repo-relative canonical path (for example, .trajectories/completed/...) or a stable trajectory ID reference.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.trajectories/index.json at line 1133, The index entry currently contains an
absolute temp-machine path in the "path" field (e.g.
"/private/tmp/relay-structured-agent-results/.trajectories/completed/2026-05/traj_ceo5q9bh2od3.json"),
which should be replaced with a portable value; update the producer that writes
the index so the "path" value is a repo-relative canonical path like
".trajectories/completed/2026-05/traj_ceo5q9bh2od3.json" or a stable trajectory
ID reference, and ensure any code that reads the index resolves that relative
path (or looks up by ID) instead of relying on absolute temp directories—locate
the code that sets the "path" property in the index writer and change it
accordingly.

Comment thread crates/broker/src/runtime/init.rs
Comment thread packages/sdk/src/protocol.ts
Comment thread packages/sdk/src/relay.ts
Comment thread src/cli/relaycast-mcp.ts Outdated
- broker: preserve agent_result config across supervised restarts
  (plumb through Supervisor::register, RestartState, PendingRestart)
- broker: respect skip_relay_prompt even when agent_result is set; the
  callback env vars still reach the worker via worker.rs:392
- broker: derive callback URL from a connectable host (fall back to
  loopback when api_bind is 0.0.0.0/::)
- broker: add serde round-trip tests for AgentResult event covering
  the final field rename and optional metadata
- sdk: widen agent_result.metadata to unknown to match the wire contract
  (in both BrokerEvent and the user-facing AgentResultMeta)
- sdk: support multiple concurrent waitForResult() callers per agent
- sdk: reject waitForResult() immediately when the agent is unknown and
  no cached result exists
- sdk: clear structured-result state on shutdown() and on agent_not_found
  release, rejecting any pending waiters
- relaycast-mcp: add a configurable AbortController timeout (default 10s)
  to the submit_result callback POST
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/sdk/src/relay.ts (1)

1861-1874: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t satisfy waitForResult() with non-final updates.

This still resolves all waiters for every validated agent_result, including final: false. A partial progress update will therefore satisfy current waiters and populate the cache, so later waitForResult() calls return non-final data immediately. Keep onAgentResult/onResult for streaming updates, but only cache and resolve the waiter path once the final result arrives.

Suggested fix
-    this.lastAgentResults.set(name, result);
     this.onAgentResult?.(result);
     if (contract?.onResult) {
       Promise.resolve(contract.onResult(result.data, result)).catch((error) => {
         console.warn(`[AgentRelay] result("${name}") onResult hook threw`, error);
       });
     }
-    for (const waiter of this.takeResultResolvers(name)) waiter.resolve(result);
+    if (result.final) {
+      this.lastAgentResults.set(name, result);
+      for (const waiter of this.takeResultResolvers(name)) waiter.resolve(result);
+    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sdk/src/relay.ts` around lines 1861 - 1874, The current loop
resolves and caches every validated agent_result (including partial updates),
which causes waitForResult() to be satisfied with non-final data; update the
logic in the method that currently calls takeResultResolvers(name), sets
lastAgentResults.set(name, result), and invokes onAgentResult/contract.onResult
so that onAgentResult and contract.onResult are still called for every validated
result (keep streaming), but only when result.final === true do you set
this.lastAgentResults.set(name, result) and iterate
this.takeResultResolvers(name) to resolve waiters (leave the existing
validation-rejection path intact); ensure you reference the same symbols
(takeResultResolvers, lastAgentResults, onAgentResult, contract.onResult) and
only move the cache+resolve block behind a final-result check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/broker/src/runtime/init.rs`:
- Around line 338-347: The current normalization of cmd.api_bind into
callback_host forces IPv4 loopback and leaves raw IPv6 literals unbracketed,
breaking v6-only listeners and producing invalid URLs; update the logic in
init.rs (the callback_host determination and the AGENT_RELAY_RESULT_URL
formatting where it uses callback_host and actual_port) to: detect the bound
socket family (use the socket or binding info available in the init/runtime
setup to choose IPv4 vs IPv6 loopback), set loopback to "127.0.0.1" for IPv4 or
"::1" for IPv6, and when formatting the URL wrap IPv6 addresses in brackets
(e.g., "[::1]") before appending :{actual_port}; ensure the new value is used in
the worker_env entry for "AGENT_RELAY_RESULT_URL" so callbacks work for v4, v6
and wildcard binds.

---

Outside diff comments:
In `@packages/sdk/src/relay.ts`:
- Around line 1861-1874: The current loop resolves and caches every validated
agent_result (including partial updates), which causes waitForResult() to be
satisfied with non-final data; update the logic in the method that currently
calls takeResultResolvers(name), sets lastAgentResults.set(name, result), and
invokes onAgentResult/contract.onResult so that onAgentResult and
contract.onResult are still called for every validated result (keep streaming),
but only when result.final === true do you set this.lastAgentResults.set(name,
result) and iterate this.takeResultResolvers(name) to resolve waiters (leave the
existing validation-rejection path intact); ensure you reference the same
symbols (takeResultResolvers, lastAgentResults, onAgentResult,
contract.onResult) and only move the cache+resolve block behind a final-result
check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9bb574b2-9321-4c4e-bcf8-cc6f494da2fb

📥 Commits

Reviewing files that changed from the base of the PR and between 839d0cd and c8299b7.

📒 Files selected for processing (8)
  • crates/broker/src/protocol.rs
  • crates/broker/src/runtime/init.rs
  • crates/broker/src/runtime/maintenance.rs
  • crates/broker/src/supervisor.rs
  • crates/broker/src/worker.rs
  • packages/sdk/src/protocol.ts
  • packages/sdk/src/relay.ts
  • src/cli/relaycast-mcp.ts

Comment on lines +338 to +347
let callback_host = match cmd.api_bind.as_str() {
"0.0.0.0" | "::" | "[::]" | "" => "127.0.0.1",
other => other,
};
let mut worker_env = vec![
("RELAY_BASE_URL".to_string(), http_base.clone()),
("RELAY_API_KEY".to_string(), relay_workspace_key.clone()),
(
"AGENT_RELAY_RESULT_URL".to_string(),
format!("http://{}:{}/api/agent-result", callback_host, actual_port),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve IPv6 reachability in the callback URL.

This normalization hard-codes IPv4 loopback for IPv6 wildcard binds and passes bare IPv6 literals through unchanged. That makes callbacks fail on v6-only listeners and can emit invalid authorities like http://::1:PORT/.... Please derive the loopback from the bound socket family and bracket IPv6 hosts before formatting the URL.

Suggested fix
-    let callback_host = match cmd.api_bind.as_str() {
-        "0.0.0.0" | "::" | "[::]" | "" => "127.0.0.1",
-        other => other,
-    };
+    let callback_host = match cmd.api_bind.as_str() {
+        "0.0.0.0" | "" => "127.0.0.1".to_string(),
+        "::" | "[::]" => "::1".to_string(),
+        other => other.trim_matches(&['[', ']'][..]).to_string(),
+    };
+    let callback_authority = if callback_host.contains(':') {
+        format!("[{}]", callback_host)
+    } else {
+        callback_host.clone()
+    };
@@
         (
             "AGENT_RELAY_RESULT_URL".to_string(),
-            format!("http://{}:{}/api/agent-result", callback_host, actual_port),
+            format!("http://{}:{}/api/agent-result", callback_authority, actual_port),
         ),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/broker/src/runtime/init.rs` around lines 338 - 347, The current
normalization of cmd.api_bind into callback_host forces IPv4 loopback and leaves
raw IPv6 literals unbracketed, breaking v6-only listeners and producing invalid
URLs; update the logic in init.rs (the callback_host determination and the
AGENT_RELAY_RESULT_URL formatting where it uses callback_host and actual_port)
to: detect the bound socket family (use the socket or binding info available in
the init/runtime setup to choose IPv4 vs IPv6 loopback), set loopback to
"127.0.0.1" for IPv4 or "::1" for IPv6, and when formatting the URL wrap IPv6
addresses in brackets (e.g., "[::1]") before appending :{actual_port}; ensure
the new value is used in the worker_env entry for "AGENT_RELAY_RESULT_URL" so
callbacks work for v4, v6 and wildcard binds.

@willwashburn
Copy link
Copy Markdown
Member Author

Copy link
Copy Markdown
Contributor

Copilot AI commented May 21, 2026

@copilot please fix CI issues with clippy in https://github.com/AgentWorkforce/relay/actions/runs/26227502004/job/77178422852

Fixed in 5b4005e. I investigated the linked CI job logs, reproduced the clippy failures locally, and applied a minimal fix so the clippy run now passes (cargo clippy -- -D warnings).

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/broker/src/worker.rs (1)

168-175: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

skip_relay_prompt drops the callback config for some MCP backends.

This early return skips all backend-specific injection, but later command.env(...) only reaches the worker process. In this PR, Claude/Cursor are already treated elsewhere as not reliably forwarding parent env into MCP subprocesses, so a preconfigured relaycast MCP will miss AGENT_RELAY_RESULT_* and submit_result stops working under skip_relay_prompt=true. Consider rejecting this combination for affected CLIs or still injecting just the result-callback env block.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/broker/src/worker.rs` around lines 168 - 175, The early return on
skip_relay_prompt currently bypasses all backend-specific env injection and
drops the AGENT_RELAY_RESULT_* callback config (used by submit_result), which
breaks MCP backends that need those vars (e.g., Claude/Cursor). Change the logic
around skip_relay_prompt in this function so that you do not return immediately:
either (A) still inject only the result-callback env block (the same env keys
written to command.env) even when skip_relay_prompt is true, or (B) detect
affected CLIs/backends (Claude/Cursor) and reject the combination by returning
an Err explaining that skip_relay_prompt cannot be used with those backends;
reference the skip_relay_prompt flag, the AGENT_RELAY_RESULT_* env keys,
submit_result behavior, and the command.env injection helper to locate where to
apply the change.
crates/broker/src/snippets.rs (1)

865-913: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Per-spawn result tokens should not go through shared MCP config surfaces.

agent_result is per agent, but these branches push it into shared state: Gemini/Droid's mcp add relaycast, project-level opencode.json, and .cursor/mcp.json. The last spawn wins, so concurrent agents can overwrite each other's callback token/schema, and the token also persists beyond the worker lifetime. This needs an ephemeral per-process transport, or at minimum per-agent server names plus cleanup, before merge.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/broker/src/snippets.rs` around lines 865 - 913, The code is persisting
the per-agent agent_result into shared MCP config surfaces
(configure_gemini_droid_mcp, ensure_opencode_config_with_result,
ensure_cursor_mcp_config), causing cross-agent overwrites and token leakage;
change these branches so they do not write agent_result into project/global
files: instead create an ephemeral per-process transport (e.g., temporary mcp
config files with unique names tied to agent_name or a UUID and ensure cleanup),
or pass agent_result only via in-process channels/CLI env vars to the spawned
process, or append a per-agent unique server name and remove it on process exit;
update calls to configure_gemini_droid_mcp, ensure_opencode_config_with_result,
and ensure_cursor_mcp_config to accept a flag/alternate path for ephemeral
configs or to omit persisting agent_result entirely and implement cleanup logic
to delete any temp configs after spawn.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/broker/src/snippets.rs`:
- Around line 865-913: The code is persisting the per-agent agent_result into
shared MCP config surfaces (configure_gemini_droid_mcp,
ensure_opencode_config_with_result, ensure_cursor_mcp_config), causing
cross-agent overwrites and token leakage; change these branches so they do not
write agent_result into project/global files: instead create an ephemeral
per-process transport (e.g., temporary mcp config files with unique names tied
to agent_name or a UUID and ensure cleanup), or pass agent_result only via
in-process channels/CLI env vars to the spawned process, or append a per-agent
unique server name and remove it on process exit; update calls to
configure_gemini_droid_mcp, ensure_opencode_config_with_result, and
ensure_cursor_mcp_config to accept a flag/alternate path for ephemeral configs
or to omit persisting agent_result entirely and implement cleanup logic to
delete any temp configs after spawn.

In `@crates/broker/src/worker.rs`:
- Around line 168-175: The early return on skip_relay_prompt currently bypasses
all backend-specific env injection and drops the AGENT_RELAY_RESULT_* callback
config (used by submit_result), which breaks MCP backends that need those vars
(e.g., Claude/Cursor). Change the logic around skip_relay_prompt in this
function so that you do not return immediately: either (A) still inject only the
result-callback env block (the same env keys written to command.env) even when
skip_relay_prompt is true, or (B) detect affected CLIs/backends (Claude/Cursor)
and reject the combination by returning an Err explaining that skip_relay_prompt
cannot be used with those backends; reference the skip_relay_prompt flag, the
AGENT_RELAY_RESULT_* env keys, submit_result behavior, and the command.env
injection helper to locate where to apply the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 5f8bed12-a345-489b-a4be-58589c0609f3

📥 Commits

Reviewing files that changed from the base of the PR and between 6dea769 and 5b4005e.

📒 Files selected for processing (3)
  • crates/broker/src/listen_api.rs
  • crates/broker/src/snippets.rs
  • crates/broker/src/worker.rs

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 33 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="crates/broker/src/listen_api.rs">

<violation number="1" location="crates/broker/src/listen_api.rs:807">
P2: Bearer token parsing is case-sensitive, so valid `authorization: bearer ...` headers are incorrectly rejected.</violation>
</file>

<file name="packages/sdk/src/relay.ts">

<violation number="1" location="packages/sdk/src/relay.ts:1878">
P2: `resetAgentLifecycleState` drops pending `waitForResult()` waiters without rejecting them, which can leave callers hanging when an agent name is reused.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

headers
.get("authorization")
.and_then(|value| value.to_str().ok())
.and_then(|value| value.strip_prefix("Bearer "))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Bearer token parsing is case-sensitive, so valid authorization: bearer ... headers are incorrectly rejected.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At crates/broker/src/listen_api.rs, line 807:

<comment>Bearer token parsing is case-sensitive, so valid `authorization: bearer ...` headers are incorrectly rejected.</comment>

<file context>
@@ -755,6 +789,84 @@ async fn listen_api_threads(
+            headers
+                .get("authorization")
+                .and_then(|value| value.to_str().ok())
+                .and_then(|value| value.strip_prefix("Bearer "))
+                .map(str::trim)
+                .filter(|value| !value.is_empty())
</file context>
Suggested change
.and_then(|value| value.strip_prefix("Bearer "))
.and_then(|value| {
let mut parts = value.splitn(2, ' ');
match (parts.next(), parts.next()) {
(Some(scheme), Some(token)) if scheme.eq_ignore_ascii_case("bearer") => {
Some(token)
}
_ => None,
}
})

Comment thread packages/sdk/src/relay.ts
private takeResultResolvers(name: string): AgentResultResolver[] {
const waiters = this.resultResolvers.get(name);
if (!waiters || waiters.length === 0) return [];
this.resultResolvers.delete(name);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: resetAgentLifecycleState drops pending waitForResult() waiters without rejecting them, which can leave callers hanging when an agent name is reused.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/sdk/src/relay.ts, line 1878:

<comment>`resetAgentLifecycleState` drops pending `waitForResult()` waiters without rejecting them, which can leave callers hanging when an agent name is reused.</comment>

<file context>
@@ -1709,14 +1810,146 @@ export class AgentRelay {
+  private takeResultResolvers(name: string): AgentResultResolver[] {
+    const waiters = this.resultResolvers.get(name);
+    if (!waiters || waiters.length === 0) return [];
+    this.resultResolvers.delete(name);
+    return waiters;
+  }
</file context>
Suggested change
this.resultResolvers.delete(name);
for (const waiter of this.takeResultResolvers(name)) {
waiter.reject(new Error(`Agent '${name}' lifecycle reset before submitting a structured result`));
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants