feat(inference): add claude_agent_sdk provider (Claude CLI subprocess)#2746
Conversation
… -p subprocess Adds a new provider type that spawns `claude -p` as a subprocess and collects NDJSON output, letting users route OpenHuman inference through their Claude plan's Agent SDK credit (Pro/Max/Team) instead of the Anthropic HTTP API. - Config: `claude_agent_sdk` table with enabled, binary, default_model, max_budget_usd fields (disabled by default, no config.toml migration needed) - Provider: `ClaudeAgentSdkProvider` implementing `Provider` via subprocess NDJSON protocol (`--output-format stream-json`) - Factory: `claude_agent_sdk` and `claude_agent_sdk:<model>` provider strings - Doctor: warns if enabled but binary not found on PATH - Tests: construction, config defaults, factory dispatch for all three forms
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughWalkthroughThis PR adds a Claude Agent SDK subprocess provider: configuration schema, stream-json protocol types, a subprocess-based provider implementation, factory integration for ChangesClaude Agent SDK Provider Feature
Sequence Diagram(s)sequenceDiagram
participant Caller
participant ClaudeAgentSdkProvider
participant Claude_Binary
participant Stdout
Caller->>ClaudeAgentSdkProvider: chat_with_system(system, message, model)
ClaudeAgentSdkProvider->>Claude_Binary: spawn with -p --output-format stream-json
ClaudeAgentSdkProvider->>Claude_Binary: write prompt+message to stdin
Claude_Binary->>Stdout: stream SdkMessage NDJSON lines
loop Parse NDJSON lines
Stdout->>ClaudeAgentSdkProvider: SdkMessage(Text/Result/Error)
ClaudeAgentSdkProvider->>ClaudeAgentSdkProvider: accumulate text, capture result/error
end
Claude_Binary-->>ClaudeAgentSdkProvider: exit
ClaudeAgentSdkProvider->>Caller: return accumulated String or error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
PR Review — feat(inference): add claude_agent_sdk provider (Claude CLI subprocess) (#2479)Status: ✅ Rust Quality + Type Check passing. CI in progress (Build/E2E/Coverage). What this PR doesAdds a new Code quality
Tests (5 total)
Minor observation
No blocking issues. Recommend merge. |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/doctor/core.rs`:
- Around line 997-1036: Add grep-friendly structured logs to
check_claude_agent_sdk: log an entry message before probing (e.g.,
tracing::debug!("probe:claude_agent_sdk:entry binary={}", sdk.binary)), log just
before running the external command (include command and env), log branch
outcomes on success (tracing::info!("probe:claude_agent_sdk:ok binary={}
version={}", sdk.binary, version)) and on failure
(tracing::warn!("probe:claude_agent_sdk:warn binary={} status={:?} stdout={}
stderr={} err={:?}", sdk.binary, output.status,
String::from_utf8_lossy(&output.stdout),
String::from_utf8_lossy(&output.stderr), err_option)), and log an exit message
(e.g., tracing::debug!("probe:claude_agent_sdk:exit binary={} result={}",
sdk.binary, "ok"/"warn")). Use the existing check_claude_agent_sdk function and
emit stderr/Err details when cmd.output() errors so diagnostics include
external-call output and error information.
- Around line 1013-1034: The current match arm collapses execution errors and
non-zero exits into one warning; split the Ok(_) | Err(_) into two arms: keep
the Ok(output) if output.status.success() branch as-is, add a new Ok(output)
(non-success) branch that pushes DiagnosticItem::warn for "claude_agent_sdk"
including the configured sdk.binary, the output.status (exit code), and a short
preview of stderr (trimmed to one line) using a stable grep-friendly prefix, and
change the Err(e) arm to push DiagnosticItem::warn that includes the sdk.binary
and the actual error string (e) from cmd.output(); reference cmd.output(), the
local output variable, DiagnosticItem::warn, and sdk.binary when making these
changes.
In `@src/openhuman/inference/provider/claude_agent_sdk/subprocess.rs`:
- Around line 71-73: The subprocess spawn and read/wait paths (the call to
cmd.spawn() -> variable child and subsequent waits/reads such as
child.wait_with_output or reading child.stdout/stderr) are currently unbounded
and can hang; wrap those async/blocking operations with a timeout (e.g.,
tokio::time::timeout) using a configurable duration (add a field like
config.timeout or use a sensible default), and on timeout ensure you kill/abort
the child process (child.kill() / child.start_kill()) and return a contextual
error via with_context so callers know it timed out; apply this pattern to the
spawn + wait_with_output path and any manual stdout/stderr read loops referenced
around the child variable and their corresponding functions so every child
wait/read is bounded and cleans up the process on timeout.
- Around line 54-56: The child process is created with stderr piped but never
read, which can block the subprocess; modify the code that builds and spawns the
Command (the block that sets .stdout(...).stderr(...).stdin(...)) and the
subsequent handling of the spawned Child (variable likely named `child`) to
concurrently drain `child.stderr` (and `child.stdout` if not already) — spawn a
background task/thread or tokio task that reads stderr to completion and
collects or logs the output (preserve it for error reporting) so the pipe never
fills; apply the same concurrent-drain pattern to the other subprocess
creation/handling code in the 75-132 region to prevent stalls and retain failure
context.
- Line 90: The tracing::trace! call that logs the raw NDJSON line
(tracing::trace!("[claude_agent_sdk] ndjson line: {}", line);) must not emit
full payloads; replace it with a redacted log that never prints the raw `line`
content. Implement or call a helper like redact_ndjson(line) to mask fields such
as prompt/input/response/user/content (or else only log safe metadata like byte
length, a hash, or a fixed prefix), and apply the same change to the other trace
sites mentioned (the similar logs around lines 122-125) so no raw NDJSON or PII
is ever written to logs. Ensure the helper is used consistently wherever `line`
is logged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 88be17bf-45ac-4575-8e9b-bc67bc65924d
📒 Files selected for processing (10)
src/openhuman/config/schema/claude_agent_sdk.rssrc/openhuman/config/schema/mod.rssrc/openhuman/config/schema/types.rssrc/openhuman/doctor/core.rssrc/openhuman/inference/provider/claude_agent_sdk/mod.rssrc/openhuman/inference/provider/claude_agent_sdk/protocol.rssrc/openhuman/inference/provider/claude_agent_sdk/subprocess.rssrc/openhuman/inference/provider/factory.rssrc/openhuman/inference/provider/factory_test.rssrc/openhuman/inference/provider/mod.rs
…k provider - doctor/core.rs: add structured probe logs (entry/exec/ok/warn/exit) with grep-friendly prefixes; split Ok(_)|Err(_) into separate arms so non-zero exit shows status+stderr preview and spawn error shows the actual error - subprocess.rs: drain stderr concurrently in a tokio task to prevent pipe stalls; wrap stdout read loop in tokio::time::timeout(120s) and child.wait() in timeout(30s) with kill on expiry; redact raw NDJSON payloads — log line_len instead of raw content to avoid leaking PII/secrets
graycyrus
left a comment
There was a problem hiding this comment.
hey @M3gA-Mind — the code looks really solid here. all 5 CodeRabbit findings (probe logging, timeout guards, stderr draining, error differentiation, NDJSON redaction) were properly fixed in b0eac2d, and the implementation is clean:
What's working well:
- Subprocess handling is tight: concurrent stderr drain prevents pipe stalls, 120s read + 30s wait timeouts prevent indefinite hangs, stderr captured for error context
- Security: no shell injection vectors, NDJSON payloads redacted in logs (no PII leakage), process cleanup on timeout
- Architecture is additive-only (zero breaking risk), disabled by default, no config migration needed
- Test coverage covers factory dispatch, config defaults, and model routing
Blocker: CI is currently failing on test / Rust Core Tests (Windows — secrets ACL) (unclear if related to your changes — might be flaky). once that clears, i'll come back and approve this. the code itself is ready.
let me know if you need anything to unblock the Windows test.
Summary
claude_agent_sdkprovider type that routes OpenHuman inference through theclaude -psubprocess, consuming the user's Claude plan Agent SDK credit (Pro: $20/mo, Max: $200/mo, Team: $100/seat/mo) instead of the Anthropic HTTP API. This is especially useful starting June 15, 2026 when Agent SDK credit becomes separate from API key billing.[claude_agent_sdk]withenabled,binary,default_model, andmax_budget_usdfields — disabled by default, requires no migration."claude_agent_sdk"(usesdefault_model) and"claude_agent_sdk:<model>"(uses specified model). Set any workload's*_providerconfig field to these strings to route that workload through the CLI.enabled = truebut the binary is not found on PATH.Closes
Fixes #2479
Implementation
Files added:
src/openhuman/config/schema/claude_agent_sdk.rs—ClaudeAgentSdkConfigstructsrc/openhuman/inference/provider/claude_agent_sdk/mod.rs— module rootsrc/openhuman/inference/provider/claude_agent_sdk/protocol.rs— NDJSON wire types for--output-format stream-jsonsrc/openhuman/inference/provider/claude_agent_sdk/subprocess.rs—ClaudeAgentSdkProviderimplementingProvidertraitFiles modified:
src/openhuman/config/schema/{mod,types}.rs— addclaude_agent_sdkconfig field toConfigsrc/openhuman/inference/provider/{mod,factory,factory_test}.rs— factory dispatch + testssrc/openhuman/doctor/core.rs— binary-availability checkTest plan
cargo check -p openhuman --testspasses (verified ✓)cargo fmt --all -- --checkpasses (verified ✓)provider_constructs_with_default_config,config_default_disabled, three factory dispatch testschat_provider = "claude_agent_sdk"in config, run a chat turn, confirm subprocess is spawned and response is returnedclaude_agent_sdk.enabled = truewithoutclaudeon PATH, run doctor, confirm Warn diagnostic appearsSummary by CodeRabbit
New Features
Diagnostics
Tests