
feat(sdk-py): add direct spawn/message API#473

Merged
khaliqgant merged 9 commits into main from feat/python-sdk-spawn-api-only
Mar 2, 2026

khaliqgant commented Mar 2, 2026

Summary

Adds AgentRelay facade, AgentRelayClient, Agent handles, and Models constants to the Python SDK so it can spawn agents and exchange messages directly via the broker binary — matching the TypeScript SDK's API.

New files:

  • protocol.py, client.py, relay.py, models.py in packages/sdk-py/src/agent_relay/

Features:

  • Backward compatible: workflow builder (workflow(), fan_out(), pipeline(), dag()) still works unchanged
  • Auto-downloads broker binary on first use
  • Event hooks for messages, agent lifecycle events

Target usage

import asyncio

from agent_relay import AgentRelay, Models

async def main():
    relay = AgentRelay(channels=["GTM"])
    # Print every message received on the relay's channels
    relay.on_message_received = lambda msg: print(f"[{msg.from_name}]: {msg.text}")

    await relay.claude.spawn(name="Analyst", model=Models.Claude.OPUS, channels=["GTM"], task="Analyze")
    await relay.codex.spawn(name="Coder", model=Models.Codex.GPT_5_3_CODEX, channels=["GTM"], task="Build")

    # Wait until both agents report ready, then shut the relay down
    await asyncio.gather(
        relay.wait_for_agent_ready("Analyst"),
        relay.wait_for_agent_ready("Coder"),
    )
    await relay.shutdown()

asyncio.run(main())

Split from PR #472

This PR contains only the Python SDK changes, split out from #472 as requested. The openclaw-relaycast package will be in a separate PR.

Test plan

  • All existing workflow builder tests pass
  • New imports verified working
  • Integration test with live broker binary

🤖 Generated with Claude Code



khaliqgant and others added 6 commits March 2, 2026 15:36
Add AgentRelay facade, AgentRelayClient, Agent handles, and Models
constants so the Python SDK can spawn agents and exchange messages
directly via the broker binary, matching the TypeScript SDK's API.

- protocol.py: wire protocol types (AgentSpec, ProtocolEnvelope, etc.)
- client.py: async broker subprocess client with JSON protocol
- relay.py: high-level facade with event hooks and agent spawners
- models.py: Claude, Codex, Gemini model constants
- Updated __init__.py with new primary exports + backward compat
- Legacy agent_relay/ shim re-exports from src/agent_relay/
- Version bump to 0.3.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Use agent.name (from broker response) instead of input name for
  state cleanup in spawn() and spawn_and_wait()
- Auto-download broker binary from GitHub releases on first use if
  not found at ~/.agent-relay/bin/ or on PATH. Includes platform
  detection, macOS codesigning, and binary verification.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
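The platform-detection step mentioned in the commit above could look roughly like the following. This is a minimal sketch, not the SDK's code: the asset naming scheme, the binary file name `agent-relay-broker`, and both function names are assumptions made for illustration. Only the `~/.agent-relay/bin/` directory comes from the commit message.

```python
import platform
from pathlib import Path

# Hypothetical naming scheme; the real release asset names are not given
# in this PR and may differ.
_OS_NAMES = {"Darwin": "darwin", "Linux": "linux"}
_ARCH_NAMES = {"x86_64": "x64", "amd64": "x64", "arm64": "arm64", "aarch64": "arm64"}


def broker_asset_name(system: str, machine: str) -> str:
    """Map an OS/architecture pair to an assumed release asset name."""
    try:
        os_name = _OS_NAMES[system]
        arch = _ARCH_NAMES[machine.lower()]
    except KeyError:
        raise RuntimeError(f"unsupported platform: {system}/{machine}") from None
    return f"agent-relay-broker-{os_name}-{arch}"


def default_install_path(home: Path) -> Path:
    """Install location under ~/.agent-relay/bin/ (file name is assumed)."""
    return home / ".agent-relay" / "bin" / "agent-relay-broker"
```

A caller would pass `platform.system()` and `platform.machine()` to `broker_asset_name` and `Path.home()` to `default_install_path`.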
Add runtime auto-download fallback to resolveDefaultBinaryPath().
If the broker binary isn't found at any of the existing locations
(Cargo build, bundled npm, ~/.agent-relay/bin/, PATH), automatically
download it from GitHub releases. Matches the Python SDK behavior.

Includes platform detection, macOS codesigning, and binary
verification. Falls back to a helpful manual install message on
failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Lead with the new AgentRelay API showing Claude + Codex collaboration,
move workflow builder to an advanced section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix concurrent wait_for_exit/wait_for_idle overwrite bug: use list of
  futures per agent so multiple callers all resolve correctly
- Remove dead exception handler in send_message: client already catches
  unsupported_operation, check result dict sentinel instead
- Skip on_message_sent hook for unsupported operations
- Fix version placeholder to match latest published version (3.0.2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
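The "list of futures per agent" fix described in the commit above can be illustrated with a small sketch. The class and method names here are hypothetical and do not reflect the actual internals of relay.py. The point is only that storing one future per agent lets a second caller overwrite the first, while a list lets every caller resolve.

```python
import asyncio
from collections import defaultdict


class ExitWaiters:
    """Tracks every caller waiting on a given agent, not just the latest."""

    def __init__(self) -> None:
        self._waiters: dict[str, list[asyncio.Future]] = defaultdict(list)

    def wait_for_exit(self, name: str) -> asyncio.Future:
        # Must be called from a running event loop.
        fut = asyncio.get_running_loop().create_future()
        self._waiters[name].append(fut)
        return fut

    def notify_exit(self, name: str, code: int) -> None:
        # Resolve all pending waiters for this agent and forget them.
        for fut in self._waiters.pop(name, []):
            if not fut.done():
                fut.set_result(code)


async def demo() -> list[int]:
    waiters = ExitWaiters()
    first = waiters.wait_for_exit("Analyst")
    second = waiters.wait_for_exit("Analyst")
    waiters.notify_exit("Analyst", 0)
    return await asyncio.gather(first, second)
```

Running `asyncio.run(demo())` resolves both waiters with the same exit code instead of leaving the first one pending.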
- Add Python install/examples to introduction and quickstart using CodeGroup tabs
- Create full Python SDK reference page (reference/sdk-py.mdx)
- Add sdk-py to mint.json navigation
- Cross-link between TypeScript and Python SDK reference pages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

try {
fs.mkdirSync(installDir, { recursive: true });
execSync(`curl -fsSL "${downloadUrl}" -o "${targetPath}"`, {

Check warning

Code scanning / CodeQL: Indirect uncontrolled command line (Medium)

This command depends on an unsanitized environment variable.

Copilot Autofix

AI about 2 months ago

General approach: Avoid passing untrusted or environment-derived values into a shell-processed command string. Use child_process.execFileSync (or spawnSync) with an argument array instead of execSync with a concatenated string. That way, targetPath and downloadUrl become plain arguments to curl, not subject to shell parsing.

Best specific fix here:

  • Replace the execSync call that runs curl with execFileSync('curl', [...]), passing downloadUrl and targetPath as separate arguments.
  • Similarly, replace the codesign invocation on macOS with execFileSync('codesign', [...]).
  • Replace the verification step execSync(`"${targetPath}" --help`, ...) with execFileSync(targetPath, ['--help'], ...).
  • Keep timeouts and stdio options as close as possible to the original behavior.
  • To implement this, we only need to:
    • Import execFileSync from node:child_process alongside the existing imports.
    • Update the three execSync usages inside installBrokerBinary to execFileSync with argument arrays.

All changes are confined to packages/sdk/src/client.ts in the shown regions.
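The same principle applies to the Python SDK's own download path: pass the program and its arguments as a list so that no shell ever parses them. The sketch below uses only the standard library's `subprocess.run`; the helper name `run_without_shell` and the hostile sample value are illustrative assumptions, not code from this PR.

```python
import subprocess
import sys


def run_without_shell(program: str, *args: str, timeout: float = 10.0) -> str:
    """Run a program with an argument list; no shell parses the arguments."""
    result = subprocess.run(
        [program, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


# A path-like value containing shell metacharacters is passed through verbatim
# as a single argument instead of being interpreted.
hostile = 'broker"; echo injected; "'
echoed = run_without_shell(
    sys.executable, "-c", "import sys; print(sys.argv[1])", hostile
)
```

Here the child process prints the hostile string back unchanged, which is the behavior the `execFileSync` change aims for on the TypeScript side.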

Suggested changeset 1
packages/sdk/src/client.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/packages/sdk/src/client.ts b/packages/sdk/src/client.ts
--- a/packages/sdk/src/client.ts
+++ b/packages/sdk/src/client.ts
@@ -1,5 +1,5 @@
 import { once } from 'node:events';
-import { execSync, spawn, type ChildProcessWithoutNullStreams } from 'node:child_process';
+import { execFileSync, execSync, spawn, type ChildProcessWithoutNullStreams } from 'node:child_process';
 import { createInterface, type Interface as ReadlineInterface } from 'node:readline';
 import fs from 'node:fs';
 import os from 'node:os';
@@ -695,7 +695,7 @@
 
   try {
     fs.mkdirSync(installDir, { recursive: true });
-    execSync(`curl -fsSL "${downloadUrl}" -o "${targetPath}"`, {
+    execFileSync('curl', ['-fsSL', downloadUrl, '-o', targetPath], {
       timeout: 60_000,
       stdio: ['pipe', 'pipe', 'pipe'],
     });
@@ -704,7 +704,7 @@
     // macOS: re-sign to avoid Gatekeeper issues
     if (process.platform === 'darwin') {
       try {
-        execSync(`codesign --force --sign - "${targetPath}"`, {
+        execFileSync('codesign', ['--force', '--sign', '-', targetPath], {
           timeout: 10_000,
           stdio: ['pipe', 'pipe', 'pipe'],
         });
@@ -714,7 +714,10 @@
     }
 
     // Verify
-    execSync(`"${targetPath}" --help`, { timeout: 10_000, stdio: ['pipe', 'pipe', 'pipe'] });
+    execFileSync(targetPath, ['--help'], {
+      timeout: 10_000,
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
   } catch (err) {
     try { fs.unlinkSync(targetPath); } catch { /* ignore */ }
     const message = err instanceof Error ? err.message : String(err);
EOF
// macOS: re-sign to avoid Gatekeeper issues
if (process.platform === 'darwin') {
try {
execSync(`codesign --force --sign - "${targetPath}"`, {

Check warning

Code scanning / CodeQL: Indirect uncontrolled command line (Medium)

This command depends on an unsanitized environment variable.

Copilot Autofix

AI about 2 months ago

In general, the safest way to fix this kind of issue is to avoid spawning a shell when executing external commands, and instead use APIs that accept the program name and its arguments as separate parameters (for Node.js, child_process.execFile / execFileSync or spawn). This way, any untrusted strings are passed directly as arguments to the target program and are not interpreted by a shell, eliminating shell injection concerns.

For this specific code, installBrokerBinary already imports execSync from node:child_process and uses it to run curl, codesign, and the newly installed broker binary. The only flagged command is the codesign invocation on macOS:

execSync(`codesign --force --sign - "${targetPath}"`, {
  timeout: 10_000,
  stdio: ['pipe', 'pipe', 'pipe'],
});

The minimal, behavior-preserving fix is to replace this with a call to execFileSync and pass the arguments as an array: ['--force', '--sign', '-', targetPath]. This avoids using a shell entirely, so any contents of targetPath (even if influenced by environment variables) are treated purely as a path argument by codesign. To do this:

  1. Update the import from node:child_process at the top of packages/sdk/src/client.ts to include execFileSync alongside the existing execSync and spawn.
  2. Replace the execSync invocation within the macOS codesign block with execFileSync('codesign', ['--force', '--sign', '-', targetPath], { ... }), preserving the existing timeout and stdio options.

No other behavior changes are needed, and the rest of the function can remain untouched. This single change will also address any additional alert variants that refer to the same sink.

Suggested changeset 1
packages/sdk/src/client.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/packages/sdk/src/client.ts b/packages/sdk/src/client.ts
--- a/packages/sdk/src/client.ts
+++ b/packages/sdk/src/client.ts
@@ -1,5 +1,5 @@
 import { once } from 'node:events';
-import { execSync, spawn, type ChildProcessWithoutNullStreams } from 'node:child_process';
+import { execSync, execFileSync, spawn, type ChildProcessWithoutNullStreams } from 'node:child_process';
 import { createInterface, type Interface as ReadlineInterface } from 'node:readline';
 import fs from 'node:fs';
 import os from 'node:os';
@@ -704,7 +704,7 @@
     // macOS: re-sign to avoid Gatekeeper issues
     if (process.platform === 'darwin') {
       try {
-        execSync(`codesign --force --sign - "${targetPath}"`, {
+        execFileSync('codesign', ['--force', '--sign', '-', targetPath], {
           timeout: 10_000,
           stdio: ['pipe', 'pipe', 'pipe'],
         });
EOF
}

// Verify
execSync(`"${targetPath}" --help`, { timeout: 10_000, stdio: ['pipe', 'pipe', 'pipe'] });

Check warning

Code scanning / CodeQL: Indirect uncontrolled command line (Medium)

This command depends on an unsanitized environment variable.

Copilot Autofix

AI about 2 months ago

General approach: Avoid passing any value derived from untrusted input (including environment variables) into a shell command string. Prefer child_process.execFileSync (or spawn with shell: false) so arguments are passed as an array and not interpreted by a shell. This removes shell metacharacter interpretation regardless of the content of targetPath.

Best concrete fix here: change the verification step on line 717 from execSync with a constructed command string to execFileSync with targetPath as the executable and '--help' as an argument. That way, even if HOME/USERPROFILE are maliciously set, targetPath is treated purely as an executable path, not parsed by a shell.

Details:

  • In packages/sdk/src/client.ts, add execFileSync to the existing import from node:child_process.
  • Replace:
    execSync(`"${targetPath}" --help`, { timeout: 10_000, stdio: ['pipe', 'pipe', 'pipe'] });
    with:
    execFileSync(targetPath, ['--help'], { timeout: 10_000, stdio: ['pipe', 'pipe', 'pipe'] });
  • This keeps behavior the same (run the just-installed broker with --help to verify it works) but avoids shell interpretation.
  • No additional helper methods or complex sanitization are needed.

Suggested changeset 1
packages/sdk/src/client.ts

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/packages/sdk/src/client.ts b/packages/sdk/src/client.ts
--- a/packages/sdk/src/client.ts
+++ b/packages/sdk/src/client.ts
@@ -1,5 +1,5 @@
 import { once } from 'node:events';
-import { execSync, spawn, type ChildProcessWithoutNullStreams } from 'node:child_process';
+import { execSync, execFileSync, spawn, type ChildProcessWithoutNullStreams } from 'node:child_process';
 import { createInterface, type Interface as ReadlineInterface } from 'node:readline';
 import fs from 'node:fs';
 import os from 'node:os';
@@ -714,7 +714,7 @@
     }
 
     // Verify
-    execSync(`"${targetPath}" --help`, { timeout: 10_000, stdio: ['pipe', 'pipe', 'pipe'] });
+    execFileSync(targetPath, ['--help'], { timeout: 10_000, stdio: ['pipe', 'pipe', 'pipe'] });
   } catch (err) {
     try { fs.unlinkSync(targetPath); } catch { /* ignore */ }
     const message = err instanceof Error ? err.message : String(err);
EOF
devin-ai-integration[bot]

This comment was marked as resolved.

willwashburn previously approved these changes Mar 2, 2026
Address Devin review comment - the Python SDK's wait_for_agent_message
was missing handling for the agent_released event. If an agent was
released before sending its first message, the method would hang until
timeout instead of immediately rejecting with a clear error.

This brings parity with the TypeScript SDK which handles all three
cases: relay_inbound, agent_exited, and agent_released.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
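The three cases named in the commit above (relay_inbound, agent_exited, agent_released) can be sketched as a single function that settles a pending future. The event names come from the commit message; the dictionary shape of an event, the `AgentGoneError` class, and the function names are assumptions made for this illustration.

```python
import asyncio


class AgentGoneError(RuntimeError):
    """Hypothetical error raised when the agent can no longer send a message."""


def settle_on_event(fut: asyncio.Future, agent: str, event: dict) -> None:
    """Resolve or reject a pending wait based on one broker event."""
    if fut.done():
        return
    kind = event.get("kind")
    if kind == "relay_inbound" and event.get("from") == agent:
        fut.set_result(event)
    elif kind in ("agent_exited", "agent_released") and event.get("name") == agent:
        # Reject immediately instead of letting the caller hang until timeout.
        fut.set_exception(AgentGoneError(f"{agent}: {kind} before first message"))


async def demo(events: list[dict]) -> str:
    fut = asyncio.get_running_loop().create_future()
    for event in events:
        settle_on_event(fut, "Analyst", event)
    try:
        await asyncio.wait_for(fut, timeout=0.1)
        return "message"
    except AgentGoneError:
        return "gone"
    except asyncio.TimeoutError:
        return "timeout"
```

Without the agent_released branch, a release event would fall through and the caller would only learn of the problem when the timeout fires.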
devin-ai-integration[bot]

This comment was marked as resolved.

Agent Relay and others added 2 commits March 2, 2026 16:42
Address Devin review comment - when user provides a custom env dict
to AgentRelay(env={...}), RELAY_API_KEY from os.environ was not being
injected. This would silently break Relaycast workspace connection.

Now matches TypeScript SDK behavior at packages/sdk/src/relay.ts:790-806
which explicitly handles injecting RELAY_API_KEY into custom env dicts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
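The env handling described in the commit above might be sketched as follows. The function name `build_broker_env` is hypothetical, and the rule that an explicitly supplied RELAY_API_KEY takes precedence over the process environment is an assumption; the commit only states that the key is injected from os.environ when a custom env dict is given.

```python
import os
from typing import Mapping, Optional


def build_broker_env(
    custom_env: Optional[Mapping[str, str]],
    environ: Mapping[str, str] = os.environ,
) -> dict[str, str]:
    """Return the environment to pass to the broker subprocess."""
    if custom_env is None:
        return dict(environ)  # no custom env: inherit everything
    env = dict(custom_env)
    # Assumed precedence: a key set explicitly by the caller is kept as-is.
    if "RELAY_API_KEY" not in env and "RELAY_API_KEY" in environ:
        env["RELAY_API_KEY"] = environ["RELAY_API_KEY"]
    return env
```

Passing `environ` as a parameter keeps the function easy to test without touching the real process environment.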
Changed `pip install agent-relay` to `pip install agent-relay-sdk`
to match the package name in pyproject.toml.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 new potential issue.

View 13 additional findings in Devin Review.


Comment on lines +407 to +408
for listener in self._event_listeners:
    listener(event)

🔴 Uncaught exception in event listener kills the stdout reader, silently breaking all event processing

If any event hook callback (e.g., on_message_received, on_agent_ready) raises an exception, it propagates through _handle_stdout_line and out of _read_stdout, terminating the reader coroutine. After this, no more broker events or responses are processed, causing all pending and future requests to hang until their timeouts.

Root Cause and Impact

In client.py:407, event listeners are called in a bare loop with no try/except:

for listener in self._event_listeners:
    listener(event)

The listeners include the relay's _wire_events handler (relay.py:636), which calls user-provided hooks like self.on_message_received(msg) at relay.py:653. If the user's hook raises (e.g., relay.on_message_received = lambda msg: 1/0), the exception propagates up through _handle_stdout_line at client.py:388, out of the while True loop in _read_stdout at client.py:367-373, killing the reader task.

Since the reader task is fire-and-forget (asyncio.create_task at client.py:345), the exception is silently lost ("Task exception was never retrieved"). All subsequent stdout from the broker is ignored. Pending requests in self._pending hang until their individual timeouts. The broker process continues running but the SDK is effectively dead.

The TypeScript equivalent uses Node.js readline's on('line', ...) event handler (client.ts:403-414), where an exception in a handler doesn't prevent subsequent line events from being processed — making the TS version naturally more resilient to callback errors.

Impact: A single user callback error permanently breaks the SDK instance, with no visible error or recovery path.

Suggested change
-for listener in self._event_listeners:
-    listener(event)
+for listener in self._event_listeners:
+    try:
+        listener(event)
+    except Exception:
+        pass  # Don't let listener errors kill the reader
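A variant of the suggested change that logs the failure instead of discarding it silently could look like the sketch below. This is an illustration only, not what was merged: the `dispatch` function, the logger name, and the returned failure count are all assumptions.

```python
import logging
from typing import Callable

logger = logging.getLogger("agent_relay.sketch")  # hypothetical logger name


def dispatch(event: dict, listeners: list[Callable[[dict], None]]) -> int:
    """Call every listener; return how many raised."""
    failures = 0
    for listener in listeners:
        try:
            listener(event)
        except Exception:
            failures += 1
            # Record the traceback but keep delivering to later listeners.
            logger.exception("event listener raised; continuing")
    return failures


seen: list[dict] = []
failed = dispatch({"kind": "relay_inbound"}, [lambda e: 1 / 0, seen.append])
```

The first listener raises ZeroDivisionError, yet the second still receives the event, so the reader loop that calls `dispatch` keeps running.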

@khaliqgant khaliqgant merged commit cbc896f into main Mar 2, 2026
33 checks passed
@khaliqgant khaliqgant deleted the feat/python-sdk-spawn-api-only branch March 2, 2026 18:18

3 participants