fix: MCP tools unavailable for agents spawned via agent_add #591
khaliqgant merged 21 commits into main
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… types Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…colors
- Replace plain console.log progress in cli.ts with listr2 task list
- Per-step spinners show owner, retry, nudge, force-release, and review events
- chalk colors: cyan for timestamps, green/red/yellow for status, dim for metadata
- logRunSummary() and broker stderr use chalk for visual hierarchy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…flows Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…steps Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dering Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… path
- cli.ts: installOutputFilter() in runWithListr so YAML workflows also suppress [broker]/[workflow HH:MM] noise during listr rendering
- cli.ts: done.catch()/workflowDone.catch() guards for fast-failing steps
- listr-renderer.ts: workflowDone.catch() guard for instant run:failed
- listr-renderer.ts: add renderer.unmount() to JSDoc example
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Output filter: use regex .test() instead of startsWith/^ anchor so
chalk-colored [broker] and [workflow HH:MM] lines are properly suppressed
- Resume mode: add event listener for step progress reporting
- GHA workflow: fix grep exit code with || true, use env var instead of
raw ${{ }} interpolation (script injection), use npx tsx instead of
non-existent 'run' subcommand, only validate/dry-run YAML files
- Workflow: fix incorrect CJS assumption (SDK is ESM), add final
type-check gate after review step
- Add chalk and listr2 to root package.json (Build & Validate requires them)
- Dynamic import listr2 so SDK loads on Node 18 (styleText not available)
- Show steps skipped without prior start event in listr output
- Remove unused ListrType import
* fix: detect claude CLI with inline args for MCP injection
* fix: extract executable from cli string in gemini/droid mcp setup
  When cli contains inline args (e.g. 'gemini --model foo'), Command::new(cli) fails because it treats the entire string as an executable path. Now extract just the binary via shlex::split before passing to Command::new and manual_cmd.
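The inline-args failure described in this commit can be illustrated outside the Rust broker. The sketch below is TypeScript, not the project's code: `splitCommand` and `executableOf` are hypothetical names, and the tokenizer is a simplified stand-in for `shlex::split` (it handles single and double quotes only, with no backslash escapes and no error on an unterminated quote).

```typescript
// Simplified, shlex-like tokenizer (assumption: quotes only, no escapes).
function splitCommand(cli: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let quote: string | null = null; // the quote character we are inside, if any
  let started = false; // true once the current token has begun

  for (const ch of cli) {
    if (quote !== null) {
      // Inside quotes: everything is literal until the matching quote.
      if (ch === quote) quote = null;
      else current += ch;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      started = true; // an empty quoted string still counts as a token
    } else if (/\s/.test(ch)) {
      if (started) {
        tokens.push(current);
        current = "";
        started = false;
      }
    } else {
      current += ch;
      started = true;
    }
  }
  if (started) tokens.push(current);
  return tokens;
}

// Hypothetical helper: the first token is the binary to execute.
function executableOf(cli: string): string | undefined {
  return splitCommand(cli)[0];
}

console.log(executableOf("gemini --model foo"));
console.log(splitCommand("'/opt/my tools/gemini' --model foo"));
```

Passing only the first token to the process launcher, and the remaining tokens as arguments, avoids treating the whole string as a path.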
* bump versions
* fix: refresh lockfile for relaycast sdk 1.0.0 bump
* fix: bump gemini relay extension to relaycast mcp 1.0.0
Two root causes prevented agents spawned via the Relaycast API (agent_add MCP tool) from loading MCP tools:
1. Claude: --strict-mcp-config blocked .mcp.json loading. Removed it so --mcp-config is additive: it only passes relaycast config while Claude loads user MCP servers from .mcp.json independently.
2. All CLIs: The WS AgentSpawnRequested handler had no agent pre-registration. The AgentSpawnRequestedPayload struct doesn't include a token field, so relaycast_ws_spawn_token() always returned None. Added broker-side register_agent_token() calls (matching the SDK spawn_agent path) to both WS spawn handlers.
Tests:
- New integration test (agent-spawns-agent.test.ts) exercises the exact agent_add flow for claude, codex, and gemini
- Updated unit tests and e2e tests for new --mcp-config behavior
- All 220 lib + 8 e2e Rust tests pass
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
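The first root cause comes down to which flags accompany the inline config. The following TypeScript sketch is illustrative only and is not the broker's Rust code: `buildClaudeMcpArgs`, the `McpServerConfig` shape, the `relaycast-mcp` command, and the `RELAY_API_KEY` variable name are all assumptions made for the example. Only the `--mcp-config` and `--strict-mcp-config` flags come from the commit message.

```typescript
// Hypothetical shape of one MCP server entry (assumption for illustration).
interface McpServerConfig {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds the MCP-related CLI arguments for a Claude spawn.
// With strict=false the inline config is additive: Claude can still load
// servers from the user's .mcp.json on its own.
function buildClaudeMcpArgs(relaycast: McpServerConfig, strict: boolean): string[] {
  const inlineConfig = JSON.stringify({ mcpServers: { relaycast } });
  const args = ["--mcp-config", inlineConfig];
  if (strict) {
    // Pre-fix behavior: tells Claude to ignore every other MCP source.
    args.push("--strict-mcp-config");
  }
  return args;
}

const server: McpServerConfig = {
  command: "relaycast-mcp", // placeholder command name
  args: [],
  env: { RELAY_API_KEY: "example-key" }, // placeholder variable name and value
};

console.log(buildClaudeMcpArgs(server, false));
```

With `strict` set to `false`, a failure of the inline server no longer leaves the agent with zero MCP tools, because the servers from `.mcp.json` are still loaded.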
Resolve package manifest/lockfile conflicts, fix workflow validation flag order, and queue listr renderer tasks until lazy init completes.
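Queuing work until a lazily loaded renderer is ready is a small, general pattern. The sketch below shows one way it might look. `LazyRenderer` and its `run` method are hypothetical names, and the loader is passed in as a function so the example does not depend on listr2 being installed.

```typescript
// Buffers tasks until an asynchronously loaded renderer becomes available.
class LazyRenderer<R> {
  private renderer: R | undefined;
  private pending: Array<(r: R) => void> = [];
  readonly ready: Promise<void>;

  constructor(load: () => Promise<R>) {
    // Kick off loading immediately; flush the queue in order once it resolves.
    this.ready = load().then((r) => {
      this.renderer = r;
      const queued = this.pending;
      this.pending = [];
      for (const task of queued) task(r);
    });
  }

  // Runs the task now if the renderer is loaded, otherwise queues it.
  run(task: (r: R) => void): void {
    if (this.renderer !== undefined) task(this.renderer);
    else this.pending.push(task);
  }
}

// Usage with a stand-in "renderer" that just records events.
const events: string[] = [];
const lazy = new LazyRenderer<string[]>(async () => events);
lazy.run((r) => r.push("step:start")); // queued: the loader has not resolved yet
lazy.ready.then(() => lazy.run((r) => r.push("step:done"))); // runs immediately
```

In real use the loader would be something like `() => import("listr2")`, which keeps the import off the module's top level so the package can still load on runtimes where the dependency cannot be evaluated.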
      return;
    }
  }
  if (/\[broker\]/.test(str) || /\[workflow\s+\d{2}:\d{2}\]/.test(str)) return;
🟡 Output filter regex cannot match chalk-colored [workflow HH:MM] lines due to interleaved ANSI escape codes
The installOutputFilter() function in both cli.ts:77 and listr-renderer.ts:18 uses the regex /\[workflow\s+\d{2}:\d{2}\]/ to suppress noisy [workflow HH:MM] timing lines while the listr2 renderer owns the terminal. However, runner.ts:994 now formats these lines using three separate chalk.dim.cyan() calls:
console.log(`${chalk.dim.cyan('[workflow')} ${chalk.dim.cyan(ts)}${chalk.dim.cyan(']')} ${msg}`);
Each chalk call wraps its text in ANSI open/close escape sequences (e.g., \x1b[2m\x1b[36m[workflow\x1b[39m\x1b[22m). The regex expects [workflow to be immediately followed by \s+ (whitespace), but the actual string has ANSI reset codes (\x1b[39m\x1b[22m) between [workflow and the space character. Since \x1b is not a whitespace character, the \s+ quantifier fails and the regex never matches. As a result, all workflow timing lines leak through the filter into the listr2 output, creating a cluttered or broken progress display and directly undermining the PR's goal of polished CLI output.
Prompt for agents
Fix the output filter in both packages/sdk/src/workflows/cli.ts (line 77) and packages/sdk/src/workflows/listr-renderer.ts (line 18) to strip ANSI escape codes before testing the regex, OR change runner.ts line 994 to wrap the entire `[workflow HH:MM]` prefix in a single chalk call so the literal text remains contiguous.
Option A (preferred — fix runner.ts:994):
Change from:
console.log(`${chalk.dim.cyan('[workflow')} ${chalk.dim.cyan(ts)}${chalk.dim.cyan(']')} ${msg}`);
To:
console.log(`${chalk.dim.cyan(`[workflow ${ts}]`)} ${msg}`);
This keeps `[workflow 00:05]` as contiguous text inside a single chalk call, so the existing regex matches.
Option B (fix the filters in cli.ts:77 and listr-renderer.ts:18):
Strip ANSI codes from `str` before testing:
const plain = str.replace(/\x1b\[[0-9;]*m/g, '');
if (/\[broker\]/.test(plain) || /\[workflow\s+\d{2}:\d{2}\]/.test(plain)) return;
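Both options can be checked without installing chalk. The sketch below hand-builds the escape sequences quoted in this comment. `dimCyan` is a hypothetical stand-in for `chalk.dim.cyan()` that assumes colors are enabled, and `stripAnsi` is the same replacement shown in Option B.

```typescript
// Stand-in for chalk.dim.cyan(): dim (2) + cyan (36) on, then 39 + 22 to reset.
const OPEN = "\x1b[2m\x1b[36m";
const CLOSE = "\x1b[39m\x1b[22m";
const dimCyan = (s: string): string => `${OPEN}${s}${CLOSE}`;

const workflowLine = /\[workflow\s+\d{2}:\d{2}\]/;
const stripAnsi = (s: string): string => s.replace(/\x1b\[[0-9;]*m/g, "");

const ts = "00:05";
const msg = "step started";

// Current runner.ts formatting: three separate styled fragments.
const fragmented = `${dimCyan("[workflow")} ${dimCyan(ts)}${dimCyan("]")} ${msg}`;
// Option A: one styled fragment, so the prefix stays contiguous.
const contiguous = `${dimCyan(`[workflow ${ts}]`)} ${msg}`;

console.log(workflowLine.test(fragmented)); // false: reset codes sit before the space
console.log(workflowLine.test(contiguous)); // true
console.log(workflowLine.test(stripAnsi(fragmented))); // true (Option B)
```

Option B is the more defensive of the two, since it keeps working if another caller later styles the prefix in pieces.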
PR #591 added a synchronous register_agent_token() HTTP call with a 15s timeout in the WS event loop before spawning agents. This blocked the event loop and delayed Codex agent spawns by up to 15s (on top of the existing 25s boot marker timeout), causing apparent spawn failures.
Reduce the timeout to 3s so the spawn proceeds quickly. On timeout or failure, the agent self-registers via its MCP server (pre-#591 behavior).
Also adds ~/.local/bin, ~/.opencode/bin, ~/.claude/local to the fallback PATH in pty.rs so CLIs installed in user-local directories are found.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
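The "try briefly, then fall back" behavior described here can be sketched as a bounded wait. This is a TypeScript illustration, not the broker's Rust implementation. `withTimeout` is a hypothetical helper, and an `undefined` result stands for "no token was obtained, let the agent self-register".

```typescript
// Resolves with the work's value, or undefined if it fails or exceeds `ms`.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T | undefined> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<undefined>((resolve) => {
    timer = setTimeout(() => resolve(undefined), ms);
  });
  try {
    // A rejected registration is treated the same as a timeout.
    return await Promise.race([work.catch(() => undefined), timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Usage: a 3s budget for a (hypothetical) registration call.
async function spawnAgent(register: () => Promise<string>): Promise<string> {
  const token = await withTimeout(register(), 3000);
  return token !== undefined
    ? "spawn with pre-registered token"
    : "spawn without token; agent self-registers";
}
```

Note that this only bounds how long the caller waits. It does not cancel the underlying request, which would need an abort mechanism of its own.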
* fix: reduce WS spawn pre-registration timeout from 15s to 3s
  PR #591 added a synchronous register_agent_token() HTTP call with a 15s timeout in the WS event loop before spawning agents. This blocked the event loop and delayed Codex agent spawns by up to 15s (on top of the existing 25s boot marker timeout), causing apparent spawn failures.
  Reduce the timeout to 3s so the spawn proceeds quickly. On timeout or failure, the agent self-registers via its MCP server (pre-#591 behavior).
  Also adds ~/.local/bin, ~/.opencode/bin, ~/.claude/local to the fallback PATH in pty.rs so CLIs installed in user-local directories are found.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: skip pre-registration for Claude agents (self-registers via MCP)
  Claude bakes the API key into --mcp-config JSON and self-registers reliably, so the blocking HTTP registration call is unnecessary. Non-Claude CLIs still get a 3s registration attempt since they need the token injected into their CLI args at spawn time.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Devin review — CLI arg parsing and dedup-after-spawn
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: seed dedup before spawn with cleanup on failure
  Issue 1: Keep dedup seeding before spawn (so WS echoes during spawn are deduplicated) but remove the dedup entry if spawn fails, preventing failed spawns from blocking retries for the 5-minute dedup window. Adds DedupCache::remove() and remove_local_spawn_control_dedup().
  Issue 2: Already fixed in prior commit (parse_cli_command before normalize_cli_name for is_claude check).
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: preserve dedup entries when spawn fails with already-exists
  When a second spawn request for an already-running agent fails with "already exists", we must not remove the dedup entry from the first successful spawn. Doing so would allow WebSocket echoes through.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
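The dedup rules from the last two commits (seed before spawn, remove on failure, but keep the entry on "already exists") can be sketched as follows. This is a TypeScript illustration with a synchronous `spawn` callback for brevity. The `DedupCache` here is a minimal stand-in for the Rust type of the same name, and `spawnWithDedup` is a hypothetical wrapper.

```typescript
// Minimal TTL-based dedup cache (stand-in for the broker's Rust DedupCache).
class DedupCache {
  private readonly expiry = new Map<string, number>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns true if the key was newly seeded, false if a live entry existed.
  insert(key: string, now: number = Date.now()): boolean {
    const expiresAt = this.expiry.get(key);
    if (expiresAt !== undefined && expiresAt > now) return false;
    this.expiry.set(key, now + this.ttlMs);
    return true;
  }

  has(key: string, now: number = Date.now()): boolean {
    const expiresAt = this.expiry.get(key);
    return expiresAt !== undefined && expiresAt > now;
  }

  remove(key: string): void {
    this.expiry.delete(key);
  }
}

// Seeds dedup before spawning so echoes arriving mid-spawn are ignored.
function spawnWithDedup(cache: DedupCache, key: string, spawn: () => void): boolean {
  cache.insert(key);
  try {
    spawn();
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // "already exists" means an earlier spawn succeeded and owns the entry,
    // so it must stay. Any other failure should not block a retry.
    if (!message.includes("already exists")) cache.remove(key);
    return false;
  }
}

const cache = new DedupCache(5 * 60 * 1000); // 5-minute window, as in the commit
spawnWithDedup(cache, "worker-1", () => {});
```

Matching on the error message is a simplification for the example. A typed error or error code would be a sturdier way to distinguish the already-exists case.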
Summary
- Removed `--strict-mcp-config` from `--mcp-config` injection. The flag blocked `.mcp.json` loading, preventing MCP tools from being discovered. `--mcp-config` is now additive: it only passes relaycast config while Claude loads user MCP servers from `.mcp.json` independently.
- Added agent pre-registration to the WS `AgentSpawnRequested` handler. The `AgentSpawnRequestedPayload` struct doesn't include a `token` field, so `relaycast_ws_spawn_token()` always returned `None`. Without a pre-registered token, the MCP server failed to authenticate at startup. Now the broker calls `register_agent_token()` before spawning (matching the SDK `spawn_agent` path that already worked).

Root Cause
Two independent issues combined to break MCP for agents spawned via `agent_add`:
For Claude: `--strict-mcp-config` told Claude "only use this inline config, ignore `.mcp.json`". If the inline MCP server failed to start for any reason, no MCP tools were available at all.
For all CLIs: The WS spawn path (used by `agent_add`) never pre-registered agents with the Relaycast API. The SDK `spawn_agent` path called `http.register_agent_token()` before spawning, giving the MCP server a valid token. The WS path skipped this, expecting the WS event to carry a token, but `AgentSpawnRequestedPayload` has no token field.

Test plan
- All 220 lib + 8 e2e Rust tests pass (`cargo test`)
- New integration test `agent-spawns-agent.test.ts` exercises the exact `agent_add` flow
- … `gemini` CLI on PATH)
- `mcp-injection` tests still pass (SDK `spawnPty` path unaffected)

🤖 Generated with Claude Code