feat(agent): subprocess-driven Claude Code, replacing API-key requirement (PR C) #10
Merged
WiktorStarczewski merged 2 commits into main on Apr 25, 2026
…ment (PR C)

Phase 2 originally shipped via the Anthropic Managed-Agents API (PRs #7 + #8), which is API-key-only. The constraint that surfaced post-merge: both peers and consumers in the actual user base have Claude Code subscriptions (Pro / Max / Team), not API keys. The shipped Phase-2 tools were therefore unusable for the real users.

This PR replaces the SDK path with a subprocess driver around `claude --print`. The peer's existing OAuth credentials authenticate every call; subscription quota pays the bill. No code changes for consumers (Wiktor's side); breaking CLI-flag changes for peers.

Internals
---------

* internal/agent/cli.go (new): subprocess driver. Builds three argv variants (OneShot, first-turn-of-conv, resumed-turn) with --print --output-format json --allowed-tools "Read Glob Grep" on every call. Parses Claude Code's JSON result for stop_reason + usage; replays the session JSONL via the existing internal/transcript package to extract per-tool-call detail (the JSON payload has no tool_calls[] field).
* internal/agent/conversations.go (slimmed): convID is now the Claude Code session UUID. StartConversation no longer hits an upstream service; it allocates a UUID, records the system_prompt, returns. EndConversation marks the local slot ended without deleting anything; the session JSONL stays on disk so Phase-1 read_session can still surface it. Idle reaper is local-only.
* internal/agent/sdk.go, loop.go, customtools.go (deleted): the Managed-Agents wrapper, event-loop translation layer, and hand-rolled Read/Glob/Grep handlers all go away. Claude Code's native tools replace the custom dispatch; the JSON output replaces the event stream.
* go.mod: drops github.com/anthropics/anthropic-sdk-go and its transitive deps (jsonparser, tidwall/*, wk8/go-ordered-map, ...). Static binary supply-chain footprint shrinks meaningfully.

Security
--------

Two-leg defense for the read-only allowlist:

1. --allowed-tools "Read Glob Grep" is hardcoded on every invocation; Claude Code itself enforces it. Asserted by the argv-shape unit test.
2. After every call, the JSONL replay scans for tool_use blocks outside the allowlist. Any disallowed name flips StopReason to error / disallowed_tool. Defense-in-depth against future-Claude-Code drift / corrupted builds.

ANTHROPIC_API_KEY in the operator's env is stripped from the subprocess env by default: Claude Code's auth precedence is ANTHROPIC_API_KEY > apiKeyHelper > OAuth/keychain, and a stale env var would silently redirect billing. --agent-keep-env-key opts back in.

CLI changes (breaking)
----------------------

Removed (hard-fail with a friendly upgrade-notes pointer):

* --agent-api-key-env
* --agent-base-url
* --agent-model

Kept, reinterpreted:

* --agent-default-max-tokens: soft budget (system-prompt nudge); the CLI doesn't expose a token cap.

Added:

* --agent-claude-bin <path>
* --agent-keep-env-key

Unchanged: --enable-agent, --agent-default-max-tool-calls, --agent-default-timeout-seconds, --agent-log-path, --max-conversations, --conversation-idle-timeout.

Version bumped 0.1.0-dev → 0.3.0-dev (skipping 0.2 since PR B forgot to bump).

Test coverage
-------------

Aggregate race-mode coverage 92.9% on a clean run (fluctuates 88-93% with the race detector's goroutine-timing sensitivity). Coverage gate is 90% with cache:false on setup-go.

Plan: ~/.claude/plans/hearsay-phase-2-subprocess.md (six review rounds: 3 → 4 → 1 → 2 → 0 → 0 issues).
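The default env handling described under Security can be sketched roughly as follows. This is an illustration only, not the code in internal/agent/cli.go: `stripAPIKey` and its `keep` parameter are hypothetical names standing in for whatever `--agent-keep-env-key` toggles.

```go
package main

import (
	"fmt"
	"strings"
)

// stripAPIKey is a hypothetical helper. It returns a copy of env with any
// ANTHROPIC_API_KEY entry removed, unless keep is true (assumed to mirror
// the --agent-keep-env-key opt-in).
func stripAPIKey(env []string, keep bool) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		if !keep && strings.HasPrefix(kv, "ANTHROPIC_API_KEY=") {
			continue // drop the key so OAuth/keychain auth is used instead
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	// Example environment; the values are made up for illustration.
	env := []string{"HOME=/home/peer", "ANTHROPIC_API_KEY=sk-stale", "PATH=/usr/bin"}
	fmt.Println(stripAPIKey(env, false)) // default: key removed
	fmt.Println(stripAPIKey(env, true))  // opt-in: key kept
}
```

In a driver like the one described, the filtered slice would presumably be assigned to `exec.Cmd.Env` before the subprocess starts, so the operator's own shell environment stays untouched.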
On Linux distributions where /bin/sh is dash (Debian, Ubuntu), the shell does NOT forward SIGTERM to a child sleep / claude process. CI's TestRunClaude_TimeoutKillsSubprocess saw the subprocess run the full 5s sleep. This is a real bug in the production path on Linux too, since claude itself spawns helper processes that wouldn't otherwise get the cancel signal.

Fix: set Setpgid on the subprocess so it leads its own process group; on cancel, send SIGTERM to the whole group via syscall.Kill(-pgid, SIGTERM) instead of just the leader.

cli_unix.go and cli_other.go build-tag the platform-specific calls. macOS keeps working (Setpgid is a no-op for already-isolated procs); Windows compiles to a stub since hearsay doesn't ship Windows builds.
Summary
Phase 2 originally shipped via the Anthropic Managed-Agents API (#7 + #8), which is API-key-only. The constraint that surfaced post-merge: both peers and consumers in the actual user base have Claude Code subscriptions (Pro / Max / Team), not API keys. The shipped Phase-2 tools were therefore unusable for the real users.
This PR replaces the SDK path with a subprocess driver around `claude --print`. The peer's existing OAuth credentials authenticate every call; subscription quota pays the bill. No code changes for consumers (Wiktor's side; the MCP wire shape is unchanged); breaking CLI-flag changes for peers (with a hard-fail + upgrade-notes pointer to make migration loud).

Internals
* `internal/agent/cli.go` (new): subprocess driver. Three argv variants (OneShot, first-turn-of-conv, resumed-turn) all share `--print --output-format json --allowed-tools "Read Glob Grep"`. Parses Claude Code's JSON result for `stop_reason` + `usage`; replays the session JSONL via the existing `internal/transcript` package to extract per-tool-call detail (the JSON payload has no `tool_calls[]` field; verified by running the actual CLI).
* `internal/agent/conversations.go` (slimmed): convID is now the Claude Code session UUID. `StartConversation` no longer hits an upstream service. `EndConversation` marks the local slot ended without deleting anything; the JSONL stays on disk so Phase-1 `read_session` can still surface it. Idle reaper is local-only.
* `internal/agent/sdk.go` / `loop.go` / `customtools.go` (deleted): the Managed-Agents wrapper, event-loop translation, and hand-rolled `Read`/`Glob`/`Grep` handlers all go away. Claude Code's native tools replace the custom dispatch; the JSON output replaces the event stream.
* `go.mod`: drops `github.com/anthropics/anthropic-sdk-go` and its transitive deps (`jsonparser`, `tidwall/*`, `wk8/go-ordered-map`, ...). Static-binary supply-chain footprint shrinks.

Security
Two-leg defense for the read-only allowlist:
1. `--allowed-tools "Read Glob Grep"` is hardcoded on every invocation. Claude Code itself enforces it. Asserted by the argv-shape unit test.
2. After every call, the JSONL replay scans for `tool_use` blocks outside the allowlist. Any disallowed name flips `StopReason` to `error` / `disallowed_tool`.

`ANTHROPIC_API_KEY` in the operator's env is stripped from the subprocess env by default: Claude Code's auth precedence is `ANTHROPIC_API_KEY > apiKeyHelper > OAuth/keychain`, and a stale env var would silently redirect billing. `--agent-keep-env-key` opts back in for the team-API-account / CI cases.

CLI changes (breaking, peer side only)
Removed (hard-fail with an upgrade-notes pointer):

* `--agent-api-key-env`
* `--agent-base-url`
* `--agent-model`

Kept, reinterpreted:

* `--agent-default-max-tokens`

Added:

* `--agent-claude-bin`
* `--agent-keep-env-key`

Unchanged: `--enable-agent`, `--agent-default-max-tool-calls`, `--agent-default-timeout-seconds`, `--agent-log-path`, `--max-conversations`, `--conversation-idle-timeout`

Version bumped `0.1.0-dev` → `0.3.0-dev` (skipping `0.2.0-dev` since PR B forgot to bump).

Test plan
* `go build ./... && go vet ./...`: clean.
* `go test ./...`: green across all 9 packages.
* `go test -race -coverpkg=./... -coverprofile=…`: aggregate 92.9% on a clean run (race-mode timing causes some flicker; with CI's `cache: false` the result should be deterministic).
* `cli_test.go`: three argv variants asserted; `--allowed-tools` value hardcoded and verified per call; `--bare` and `--no-session-persistence` actively rejected.
* Each `subtype` + `stop_reason` combination is mapped to the right `StopReason`; a future `max_turns_exceeded` defensively maps to `max_tool_calls`.
* `tool_use.name == "Bash"` → `stopReason:"error", errorSummary:"disallowed_tool"`. Read + Grep allowed.
* `MaxToolCalls` cap: 5 `tool_use` blocks vs `MaxToolCalls=2` → `stopReason:"max_tool_calls"`.
* Subprocess timeout → `stopReason:"timeout"`, no zombie.
* `ErrClaudeMissing`: binary deleted between New and OneShot → typed `ErrorSummary:"claude_missing"` (no substring guessing; bypasses `classifyErrorMsg`).
* `ANTHROPIC_API_KEY` stripped by default, kept under `--agent-keep-env-key`; the `EnvAPIKeyHandling` audit field correctly tags each path.
* `main_test.go`: refuses to start without claude on PATH; accepts override path; hard-fails on each removed flag with the right message.
* `hearsay_e2e_test.go`: `TestE2E_AgentRefusesStartWithoutClaudeBin` (replaces the API-key version), `TestE2E_AgentToolPresentWhenEnabled` and the three conversation tests pass with a fake `claude` shell script on PATH.
* `hearsay --enable-agent` against a real Claude Code login; `mcp__wiktor__ask_peer_claude("read this repo's go.mod")`; verify the JSONL appears in `~/.claude/projects/` and `list_sessions` surfaces it.

Plan

`~/.claude/plans/hearsay-phase-2-subprocess.md` (six review rounds: 3 → 4 → 1 → 2 → 0 → 0 issues).
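The two post-call checks from the test plan (disallowed tool, `MaxToolCalls` cap) can be sketched as a single pass over the tool names recovered from the JSONL replay. This is a rough illustration under assumptions, not the PR's implementation: `checkToolCalls` is a hypothetical name, the ordering of the two checks is a guess, and treating a non-positive cap as "no cap" is also an assumption.

```go
package main

import "fmt"

// allowedTools mirrors the hardcoded --allowed-tools value "Read Glob Grep".
var allowedTools = map[string]bool{"Read": true, "Glob": true, "Grep": true}

// checkToolCalls (hypothetical) inspects the tool_use names from one call.
// It returns a stop reason and error summary to override the result with,
// or two empty strings when nothing needs overriding.
func checkToolCalls(names []string, maxToolCalls int) (stopReason, errorSummary string) {
	for _, n := range names {
		if !allowedTools[n] {
			// Assumption: a disallowed tool takes precedence over the cap.
			return "error", "disallowed_tool"
		}
	}
	if maxToolCalls > 0 && len(names) > maxToolCalls {
		return "max_tool_calls", ""
	}
	return "", ""
}

func main() {
	cases := []struct {
		names []string
		max   int
	}{
		{[]string{"Read", "Bash"}, 10},                     // disallowed tool
		{[]string{"Read", "Read", "Read", "Read", "Read"}, 2}, // over the cap
		{[]string{"Read", "Grep"}, 2},                      // clean
	}
	for _, c := range cases {
		reason, summary := checkToolCalls(c.names, c.max)
		fmt.Printf("%v max=%d -> stopReason=%q errorSummary=%q\n", c.names, c.max, reason, summary)
	}
}
```

Keeping the check separate from argv construction is what makes it a second leg: it would still fire if the `--allowed-tools` flag were ignored by a future or corrupted Claude Code build.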