
feat(inference): add claude_agent_sdk provider (Claude CLI subprocess) #2746

Merged
M3gA-Mind merged 5 commits into tinyhumansai:main from M3gA-Mind:feat/claude-agent-sdk-provider
May 27, 2026

Conversation

@M3gA-Mind
Contributor

@M3gA-Mind M3gA-Mind commented May 27, 2026

Summary

  • Adds a new claude_agent_sdk provider type that routes OpenHuman inference through the claude -p subprocess, consuming the user's Claude plan Agent SDK credit (Pro: $20/mo, Max: $200/mo, Team: $100/seat/mo) instead of the Anthropic HTTP API. This is especially useful starting June 15, 2026 when Agent SDK credit becomes separate from API key billing.
  • New config table [claude_agent_sdk] with enabled, binary, default_model, and max_budget_usd fields — disabled by default, requires no migration.
  • New provider strings: "claude_agent_sdk" (uses default_model) and "claude_agent_sdk:<model>" (uses specified model). Set any workload's *_provider config field to these strings to route that workload through the CLI.
  • Doctor check warns when enabled = true but the binary is not found on PATH.
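
The two provider-string forms above can be sketched as a small resolver. This is a minimal illustration, not the PR's factory code: the function name `resolve_claude_agent_sdk_model` is hypothetical, and rejecting an empty model suffix is an assumption.

```rust
/// Hypothetical helper (not from the PR): maps a provider string to the model
/// it would select, or `None` if the string is not a claude_agent_sdk form.
pub fn resolve_claude_agent_sdk_model<'a>(
    provider: &'a str,
    default_model: &'a str,
) -> Option<&'a str> {
    // Bare form: fall back to the configured default model.
    if provider == "claude_agent_sdk" {
        return Some(default_model);
    }
    // Suffixed form: "claude_agent_sdk:<model>".
    match provider.strip_prefix("claude_agent_sdk:") {
        // Assumption: an empty suffix is treated as invalid.
        Some(model) if !model.is_empty() => Some(model),
        _ => None,
    }
}
```

With a `default_model` of `claude-sonnet-4-6`, the bare string resolves to that default, while a suffixed string resolves to whatever follows the colon.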

Closes

Fixes #2479

Implementation

Files added:

  • src/openhuman/config/schema/claude_agent_sdk.rs: ClaudeAgentSdkConfig struct
  • src/openhuman/inference/provider/claude_agent_sdk/mod.rs: module root
  • src/openhuman/inference/provider/claude_agent_sdk/protocol.rs: NDJSON wire types for --output-format stream-json
  • src/openhuman/inference/provider/claude_agent_sdk/subprocess.rs: ClaudeAgentSdkProvider implementing Provider trait

Files modified:

  • src/openhuman/config/schema/{mod,types}.rs — add claude_agent_sdk config field to Config
  • src/openhuman/inference/provider/{mod,factory,factory_test}.rs — factory dispatch + tests
  • src/openhuman/doctor/core.rs — binary-availability check

Test plan

  • cargo check -p openhuman --tests passes (verified ✓)
  • cargo fmt --all -- --check passes (verified ✓)
  • Unit tests: provider_constructs_with_default_config, config_default_disabled, three factory dispatch tests
  • Manual: set chat_provider = "claude_agent_sdk" in config, run a chat turn, confirm subprocess is spawned and response is returned
  • Manual: set claude_agent_sdk.enabled = true without claude on PATH, run doctor, confirm Warn diagnostic appears

Summary by CodeRabbit

  • New Features

    • Added Claude Agent SDK as a new inference provider that routes requests via the Claude CLI subprocess.
    • Added user-facing configuration: enable/disable toggle, custom binary path, default model selection, and optional budget limit.
  • Diagnostics

    • Added a health check to detect the Claude CLI and report its version or warn on failures.
  • Tests

    • Added unit tests for provider configuration and model-selection routing.

Review Change Stack

… -p subprocess

Adds a new provider type that spawns `claude -p` as a subprocess and
collects NDJSON output, letting users route OpenHuman inference through
their Claude plan's Agent SDK credit (Pro/Max/Team) instead of the
Anthropic HTTP API.

- Config: `claude_agent_sdk` table with enabled, binary, default_model,
  max_budget_usd fields (disabled by default, no config.toml migration needed)
- Provider: `ClaudeAgentSdkProvider` implementing `Provider` via subprocess
  NDJSON protocol (`--output-format stream-json`)
- Factory: `claude_agent_sdk` and `claude_agent_sdk:<model>` provider strings
- Doctor: warns if enabled but binary not found on PATH
- Tests: construction, config defaults, factory dispatch for all three forms
@M3gA-Mind M3gA-Mind requested a review from a team May 27, 2026 09:20
@coderabbitai
Contributor

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 154deba5-5db8-481d-8832-df18e9daa813

📥 Commits

Reviewing files that changed from the base of the PR and between a81534d and 201d703.

📒 Files selected for processing (3)
  • src/openhuman/doctor/core.rs
  • src/openhuman/inference/provider/claude_agent_sdk/subprocess.rs
  • src/openhuman/inference/provider/factory_test.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/openhuman/inference/provider/factory_test.rs
  • src/openhuman/doctor/core.rs
  • src/openhuman/inference/provider/claude_agent_sdk/subprocess.rs

📝 Walkthrough

Walkthrough

This PR adds a Claude Agent SDK subprocess provider: configuration schema, stream-json protocol types, a subprocess-based provider implementation, factory integration for claude_agent_sdk[:<model>], and a doctor probe that checks the configured claude binary.

Changes

Claude Agent SDK Provider Feature

| Layer / File(s) | Summary |
| --- | --- |
| Configuration schema and integration<br>src/openhuman/config/schema/claude_agent_sdk.rs, src/openhuman/config/schema/mod.rs, src/openhuman/config/schema/types.rs | ClaudeAgentSdkConfig struct defines enabled, binary, default_model, and optional max_budget_usd fields with serde defaults. Integrated into top-level Config with serde defaulting annotation and Default impl. |
| Protocol models and provider implementation<br>src/openhuman/inference/provider/claude_agent_sdk/mod.rs, src/openhuman/inference/provider/claude_agent_sdk/protocol.rs, src/openhuman/inference/provider/claude_agent_sdk/subprocess.rs | SdkMessage and SdkError types deserialize Claude's stream-json NDJSON. ClaudeAgentSdkProvider spawns the configured binary with -p --output-format stream-json, streams and parses NDJSON lines, accumulates text/result or returns errors, supports optional system prompts, default-model fallback, and conditional budget flags. Includes unit tests for defaults. |
| Factory provider string parsing and construction<br>src/openhuman/inference/provider/mod.rs, src/openhuman/inference/provider/factory.rs, src/openhuman/inference/provider/factory_test.rs | Provider factory recognizes claude_agent_sdk and claude_agent_sdk:<model> forms, exports grammar constants, constructs ClaudeAgentSdkProvider with resolved model, and updates error message valid-forms. Tests cover bare form, model suffix override, and config-based default model. |
| Doctor health check for Claude binary<br>src/openhuman/doctor/core.rs | check_claude_agent_sdk runs <binary> --version when enabled and emits Ok with detected version on success or Warn with guidance on failure. |

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant ClaudeAgentSdkProvider
  participant Claude_Binary
  participant Stdout
  Caller->>ClaudeAgentSdkProvider: chat_with_system(system, message, model)
  ClaudeAgentSdkProvider->>Claude_Binary: spawn with -p --output-format stream-json
  ClaudeAgentSdkProvider->>Claude_Binary: write prompt+message to stdin
  Claude_Binary->>Stdout: stream SdkMessage NDJSON lines
  loop Parse NDJSON lines
    Stdout->>ClaudeAgentSdkProvider: SdkMessage(Text/Result/Error)
    ClaudeAgentSdkProvider->>ClaudeAgentSdkProvider: accumulate text, capture result/error
  end
  Claude_Binary-->>ClaudeAgentSdkProvider: exit
  ClaudeAgentSdkProvider->>Caller: return accumulated String or error
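
The "write prompt+message to stdin" step in the diagram can be illustrated with a small prompt builder. This is a sketch under stated assumptions: the function name `build_prompt` is hypothetical, and the exact newline layout around the inline system block is a guess, since the PR discussion only confirms that the system prompt is injected inline between `[SYSTEM]` and `[/SYSTEM]` markers.

```rust
/// Hypothetical helper (not from the PR): combines an optional system prompt
/// and the user message into the single text written to the subprocess stdin.
pub fn build_prompt(system: Option<&str>, message: &str) -> String {
    match system {
        // Assumption: the system text is wrapped on its own lines and
        // separated from the message by a blank line.
        Some(s) if !s.trim().is_empty() => {
            format!("[SYSTEM]\n{}\n[/SYSTEM]\n\n{}", s, message)
        }
        // No system prompt (or only whitespace): send the message unchanged.
        _ => message.to_string(),
    }
}
```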

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2489: Modifies provider factory parsing/dispatch logic similar to this PR's changes to create_chat_provider_from_string.

Suggested reviewers

  • laith-max
  • senamakel
  • graycyrus

Poem

A rabbit built a tunnel deep,
To Claude's agent SDK to leap,
It spawns a subprocess, listens to streams,
NDJSON whispers and model dreams,
Hopping home with credits bright. 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: adding a new claude_agent_sdk provider that routes inference through the Claude CLI subprocess. |
| Linked Issues check | ✅ Passed | The PR successfully implements the core requirements from issue #2479: adds claude_agent_sdk provider, spawns claude -p subprocess, handles NDJSON protocol parsing, supports config with enabled/binary/default_model/max_budget_usd fields, includes doctor checks, and implements provider factory dispatch. |
| Out of Scope Changes check | ✅ Passed | All changes align with the scope defined in issue #2479. The PR adds the new provider integration and related config/doctor/factory plumbing without touching migrations, other providers, or composio integrations. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@M3gA-Mind
Contributor Author

PR Review — feat(inference): add claude_agent_sdk provider (Claude CLI subprocess) (#2479)

Status: ✅ Rust Quality + Type Check passing. CI in progress (Build/E2E/Coverage).

What this PR does

Adds a new claude_agent_sdk provider type that routes OpenHuman inference through the claude -p subprocess, consuming the user's Claude plan Agent SDK credit instead of the Anthropic HTTP API. Disabled by default; no migration required.

Code quality

claude_agent_sdk.rs (config)

  • ✅ #[serde(default)] on all fields — safe forward-compat for new TOML keys
  • ✅ Sensible defaults: enabled = false, binary = "claude", default_model = "claude-sonnet-4-6", max_budget_usd = None
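
For reference, the defaults listed above could look roughly like the following. This is a dependency-free sketch, not the actual schema source: the real struct carries serde attributes that are omitted here, the `Sketch` suffix marks the type as illustrative, and `f64` for `max_budget_usd` is an assumption.

```rust
/// Illustrative stand-in for the PR's config struct (serde attributes omitted).
#[derive(Debug, Clone, PartialEq)]
pub struct ClaudeAgentSdkConfigSketch {
    /// Provider is off unless the user opts in.
    pub enabled: bool,
    /// Name or path of the CLI binary to spawn.
    pub binary: String,
    /// Model used when the provider string carries no `:<model>` suffix.
    pub default_model: String,
    /// Optional spend cap; the numeric type is an assumption.
    pub max_budget_usd: Option<f64>,
}

impl Default for ClaudeAgentSdkConfigSketch {
    fn default() -> Self {
        Self {
            enabled: false,
            binary: "claude".to_string(),
            default_model: "claude-sonnet-4-6".to_string(),
            max_budget_usd: None,
        }
    }
}
```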

subprocess.rs (provider)

  • ✅ System prompt injected inline as [SYSTEM]...[/SYSTEM] — correct workaround for claude -p having no separate system flag
  • ✅ NDJSON stream parsing handles Text, Result, Error, #[serde(other)] Unknown — forward-compatible with new message types
  • ✅ tracing::warn! on unparseable NDJSON lines rather than hard failure — production resilience
  • ✅ Prefers SdkMessage::Result.result over joined Text parts when both present
  • ✅ Bails correctly on is_error = true with the result/error message
  • ✅ Non-zero exit + empty output → bail (prevents silent failures)
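
The accumulation rules in this list can be sketched over already-parsed messages. This is a simplified illustration rather than the code in subprocess.rs: the enum below is a hand-written stand-in for the PR's serde-derived `SdkMessage`, `collect_reply` is a hypothetical name, and joining text parts by plain concatenation is an assumption.

```rust
/// Simplified stand-in for the PR's `SdkMessage` (no serde, assumed fields).
#[derive(Debug, Clone, PartialEq)]
pub enum SdkMessage {
    Text(String),
    Result { result: String, is_error: bool },
    Error(String),
    Unknown,
}

/// Hypothetical helper: folds parsed messages into one reply string.
pub fn collect_reply(messages: &[SdkMessage], exit_ok: bool) -> Result<String, String> {
    let mut parts: Vec<&str> = Vec::new();
    let mut final_result: Option<&str> = None;
    for msg in messages {
        match msg {
            SdkMessage::Text(t) => parts.push(t.as_str()),
            // A result flagged as an error aborts with its message.
            SdkMessage::Result { result, is_error: true } => return Err(result.clone()),
            SdkMessage::Result { result, is_error: false } => final_result = Some(result.as_str()),
            SdkMessage::Error(e) => return Err(e.clone()),
            // Unrecognized message types are skipped for forward compatibility.
            SdkMessage::Unknown => {}
        }
    }
    // Prefer the final result over the joined text parts when both exist.
    if let Some(r) = final_result {
        return Ok(r.to_string());
    }
    let joined = parts.concat();
    // Non-zero exit with nothing collected is reported instead of returning "".
    if joined.is_empty() && !exit_ok {
        return Err("subprocess exited with an error and produced no output".to_string());
    }
    Ok(joined)
}
```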

doctor/core.rs

  • ✅ Only runs check when sdk.enabled = true — no overhead for disabled users
  • ✅ CREATE_NO_WINDOW on Windows prevents ghost console windows
  • ✅ Graceful warn (not error) when binary not found — correct severity
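
A rough sketch of a `--version` probe that always degrades to a warning is shown below. It is not the code in doctor/core.rs: `ProbeOutcome`, `Diagnostic`, `probe_version`, and `classify` are hypothetical names, the message wording is invented for illustration, and the Windows `CREATE_NO_WINDOW` flag is left out to keep the example portable.

```rust
use std::process::Command;

/// Hypothetical result of running `<binary> --version`.
#[derive(Debug, PartialEq)]
pub enum ProbeOutcome {
    Success { version: String },
    NonZeroExit { code: Option<i32>, stderr_preview: String },
    SpawnError(String),
}

/// Hypothetical stand-in for the doctor's diagnostic item.
#[derive(Debug, PartialEq)]
pub enum Diagnostic {
    Ok(String),
    Warn(String),
}

/// Runs the binary with `--version` and records which of three cases occurred.
pub fn probe_version(binary: &str) -> ProbeOutcome {
    match Command::new(binary).arg("--version").output() {
        Ok(out) if out.status.success() => ProbeOutcome::Success {
            version: String::from_utf8_lossy(&out.stdout).trim().to_string(),
        },
        // The process ran but failed: keep the exit code and first stderr line.
        Ok(out) => ProbeOutcome::NonZeroExit {
            code: out.status.code(),
            stderr_preview: String::from_utf8_lossy(&out.stderr)
                .lines()
                .next()
                .unwrap_or("")
                .trim()
                .to_string(),
        },
        // The process could not be started at all (for example, not on PATH).
        Err(e) => ProbeOutcome::SpawnError(e.to_string()),
    }
}

/// Maps an outcome to Ok or Warn; no outcome is treated as a hard error.
pub fn classify(binary: &str, outcome: &ProbeOutcome) -> Diagnostic {
    match outcome {
        ProbeOutcome::Success { version } => {
            Diagnostic::Ok(format!("claude_agent_sdk: {} ok ({})", binary, version))
        }
        ProbeOutcome::NonZeroExit { code, stderr_preview } => Diagnostic::Warn(format!(
            "claude_agent_sdk: {} exited with {:?}: {}",
            binary, code, stderr_preview
        )),
        ProbeOutcome::SpawnError(err) => {
            Diagnostic::Warn(format!("claude_agent_sdk: could not run {}: {}", binary, err))
        }
    }
}
```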

factory.rs

  • ✅ "claude_agent_sdk" and "claude_agent_sdk:<model>" both handled before the generic colon-split path
  • ✅ Error message updated to list both new provider forms
  • ✅ config.claude_agent_sdk.clone() — provider gets a snapshot of config at construction time

Tests (5 total)

  • provider_constructs_with_default_config, config_default_disabled — struct smoke tests
  • claude_agent_sdk_bare_provider_string_uses_default_model, _with_model_suffix, _with_custom_default_model_in_config — factory dispatch coverage

Minor observation

max_budget_usd is in the config but the subprocess command construction (visible in the diff) doesn't wire it to a --budget flag. If this is intentional (budget enforcement deferred), it's worth a TODO comment. Not a blocker.

No blocking issues. Recommend merge.

@coderabbitai coderabbitai Bot added feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. labels May 27, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/doctor/core.rs`:
- Around line 997-1036: Add grep-friendly structured logs to
check_claude_agent_sdk: log an entry message before probing (e.g.,
tracing::debug!("probe:claude_agent_sdk:entry binary={}", sdk.binary)), log just
before running the external command (include command and env), log branch
outcomes on success (tracing::info!("probe:claude_agent_sdk:ok binary={}
version={}", sdk.binary, version)) and on failure
(tracing::warn!("probe:claude_agent_sdk:warn binary={} status={:?} stdout={}
stderr={} err={:?}", sdk.binary, output.status,
String::from_utf8_lossy(&output.stdout),
String::from_utf8_lossy(&output.stderr), err_option)), and log an exit message
(e.g., tracing::debug!("probe:claude_agent_sdk:exit binary={} result={}",
sdk.binary, "ok"/"warn")). Use the existing check_claude_agent_sdk function and
emit stderr/Err details when cmd.output() errors so diagnostics include
external-call output and error information.
- Around line 1013-1034: The current match arm collapses execution errors and
non-zero exits into one warning; split the Ok(_) | Err(_) into two arms: keep
the Ok(output) if output.status.success() branch as-is, add a new Ok(output)
(non-success) branch that pushes DiagnosticItem::warn for "claude_agent_sdk"
including the configured sdk.binary, the output.status (exit code), and a short
preview of stderr (trimmed to one line) using a stable grep-friendly prefix, and
change the Err(e) arm to push DiagnosticItem::warn that includes the sdk.binary
and the actual error string (e) from cmd.output(); reference cmd.output(), the
local output variable, DiagnosticItem::warn, and sdk.binary when making these
changes.

In `@src/openhuman/inference/provider/claude_agent_sdk/subprocess.rs`:
- Around line 71-73: The subprocess spawn and read/wait paths (the call to
cmd.spawn() -> variable child and subsequent waits/reads such as
child.wait_with_output or reading child.stdout/stderr) are currently unbounded
and can hang; wrap those async/blocking operations with a timeout (e.g.,
tokio::time::timeout) using a configurable duration (add a field like
config.timeout or use a sensible default), and on timeout ensure you kill/abort
the child process (child.kill() / child.start_kill()) and return a contextual
error via with_context so callers know it timed out; apply this pattern to the
spawn + wait_with_output path and any manual stdout/stderr read loops referenced
around the child variable and their corresponding functions so every child
wait/read is bounded and cleans up the process on timeout.
- Around line 54-56: The child process is created with stderr piped but never
read, which can block the subprocess; modify the code that builds and spawns the
Command (the block that sets .stdout(...).stderr(...).stdin(...)) and the
subsequent handling of the spawned Child (variable likely named `child`) to
concurrently drain `child.stderr` (and `child.stdout` if not already) — spawn a
background task/thread or tokio task that reads stderr to completion and
collects or logs the output (preserve it for error reporting) so the pipe never
fills; apply the same concurrent-drain pattern to the other subprocess
creation/handling code in the 75-132 region to prevent stalls and retain failure
context.
- Line 90: The tracing::trace! call that logs the raw NDJSON line
(tracing::trace!("[claude_agent_sdk] ndjson line: {}", line);) must not emit
full payloads; replace it with a redacted log that never prints the raw `line`
content. Implement or call a helper like redact_ndjson(line) to mask fields such
as prompt/input/response/user/content (or else only log safe metadata like byte
length, a hash, or a fixed prefix), and apply the same change to the other trace
sites mentioned (the similar logs around lines 122-125) so no raw NDJSON or PII
is ever written to logs. Ensure the helper is used consistently wherever `line`
is logged.
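
One way to satisfy the redaction request is to log only safe metadata about each line. The sketch below assumes a hypothetical helper name, `redacted_line_summary`, and an illustrative message format; the follow-up commit describes logging line_len instead of the raw content, which is the idea shown here.

```rust
/// Hypothetical helper: describes an NDJSON line for logging without
/// including any of its content (only the byte length is reported).
pub fn redacted_line_summary(line: &str) -> String {
    format!("[claude_agent_sdk] ndjson line_len={}", line.len())
}
```

A call site would pass the returned summary to the trace macro instead of the raw line.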

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 88be17bf-45ac-4575-8e9b-bc67bc65924d

📥 Commits

Reviewing files that changed from the base of the PR and between abbde49 and a81534d.

📒 Files selected for processing (10)
  • src/openhuman/config/schema/claude_agent_sdk.rs
  • src/openhuman/config/schema/mod.rs
  • src/openhuman/config/schema/types.rs
  • src/openhuman/doctor/core.rs
  • src/openhuman/inference/provider/claude_agent_sdk/mod.rs
  • src/openhuman/inference/provider/claude_agent_sdk/protocol.rs
  • src/openhuman/inference/provider/claude_agent_sdk/subprocess.rs
  • src/openhuman/inference/provider/factory.rs
  • src/openhuman/inference/provider/factory_test.rs
  • src/openhuman/inference/provider/mod.rs

M3gA-Mind added 2 commits May 27, 2026 16:34
…k provider

- doctor/core.rs: add structured probe logs (entry/exec/ok/warn/exit) with
  grep-friendly prefixes; split Ok(_)|Err(_) into separate arms so non-zero
  exit shows status+stderr preview and spawn error shows the actual error
- subprocess.rs: drain stderr concurrently in a tokio task to prevent pipe
  stalls; wrap stdout read loop in tokio::time::timeout(120s) and child.wait()
  in timeout(30s) with kill on expiry; redact raw NDJSON payloads — log
  line_len instead of raw content to avoid leaking PII/secrets
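
The bounded wait with kill on expiry can be illustrated without an async runtime. Note that this is a substitute technique: the commit uses tokio::time::timeout, whereas the sketch below polls `Child::try_wait` from the standard library against a deadline. The function name `wait_with_deadline` and the 10 ms poll interval are assumptions.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: waits for `child` to exit, killing it if `timeout`
/// elapses first. Returns a `TimedOut` error when the child had to be killed.
pub fn wait_with_deadline(child: &mut Child, timeout: Duration) -> io::Result<ExitStatus> {
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking check: `Some(status)` means the child has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(status);
        }
        if Instant::now() >= deadline {
            // Best-effort cleanup: ignore errors from a child that just exited.
            let _ = child.kill();
            // Reap the process so it does not linger as a zombie.
            let _ = child.wait();
            return Err(io::Error::new(
                io::ErrorKind::TimedOut,
                "subprocess did not exit before the deadline",
            ));
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

Polling keeps the example dependency-free; in an async provider, wrapping the wait future in a timeout avoids the sleep loop entirely.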
Contributor

@graycyrus graycyrus left a comment


hey @M3gA-Mind — the code looks really solid here. all 5 CodeRabbit findings (probe logging, timeout guards, stderr draining, error differentiation, NDJSON redaction) were properly fixed in b0eac2d, and the implementation is clean:

What's working well:

  • Subprocess handling is tight: concurrent stderr drain prevents pipe stalls, 120s read + 30s wait timeouts prevent indefinite hangs, stderr captured for error context
  • Security: no shell injection vectors, NDJSON payloads redacted in logs (no PII leakage), process cleanup on timeout
  • Architecture is additive-only (zero breaking risk), disabled by default, no config migration needed
  • Test coverage covers factory dispatch, config defaults, and model routing

Blocker: CI is currently failing on test / Rust Core Tests (Windows — secrets ACL) (unclear if related to your changes — might be flaky). once that clears, i'll come back and approve this. the code itself is ready.

let me know if you need anything to unblock the Windows test.

@M3gA-Mind M3gA-Mind merged commit cde5f65 into tinyhumansai:main May 27, 2026
32 of 33 checks passed

Labels

feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure.


Development

Successfully merging this pull request may close these issues.

Feature: Claude Agent SDK provider for subscription-backed credit

2 participants