agents: add Claude Code adapter with coop messaging + git collaboration #50
Merged
Wraps the official @anthropic-ai/claude-code CLI as an AgentRunner so CooperBench can evaluate Claude Code on solo and coop settings. The CLI runs inside the task's Docker container; the adapter installs it with npm at runtime, invokes it in --print --output-format=stream-json mode, parses the terminating result event for cost/tokens, and walks the session JSONL for the message trajectory. Patches are harvested from /workspace/repo/patch.txt, the same convention v2 already uses.

Coop support: when agents has 2+ entries and comm_url is set, copies an in-container Python helper that wraps Redis send/recv plus shell-wrapper scripts (coop-send, coop-recv, coop-broadcast, coop-peek, coop-agents) that Claude can invoke via Bash. The coop variant of the prompt documents these tools. Redis URLs of the form redis://localhost:... are rewritten to redis://host.docker.internal:... and the container is launched with --add-host=host.docker.internal:host-gateway so the in-container helper can reach the host Redis daemon.

Git collaboration: when git_enabled is true, the adapter configures the team remote against the shared cooperbench-git server, creates an agent-named branch, and pushes initial state. The coop+git prompt section teaches the team remote name, the partner-branch shape, and the fetch/merge/push workflow so Claude can integrate peer changes before submitting patch.txt.

Credentials resolve in order: ANTHROPIC_API_KEY env var, then CLAUDE_CODE_OAUTH_TOKEN env var, then ~/.claude/.credentials.json accessToken (subscription-based usage for hosts already logged in via claude login).
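
The credential resolution order above could be sketched roughly as follows. This is an illustration, not the adapter's actual code: the function names are hypothetical, the JSON layout of .credentials.json is not specified in this PR (so the sketch searches the whole document for an accessToken key), and forwarding a file-derived token as CLAUDE_CODE_OAUTH_TOKEN is an assumption.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def _find_access_token(node: object) -> Optional[str]:
    """Depth-first search for a non-empty string stored under "accessToken".

    Assumption: the exact nesting of the credentials file is unknown here,
    so the key is looked up at any depth of nested JSON objects.
    """
    if isinstance(node, dict):
        token = node.get("accessToken")
        if isinstance(token, str) and token:
            return token
        for value in node.values():
            found = _find_access_token(value)
            if found:
                return found
    return None


def resolve_credentials(
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[Path] = None,
) -> Optional[tuple[str, str]]:
    """Return (env var name, secret) for the first available credential, else None."""
    env = os.environ if env is None else env
    if credentials_path is None:
        credentials_path = Path.home() / ".claude" / ".credentials.json"

    # 1. and 2.: environment variables, in priority order.
    for name in ("ANTHROPIC_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return name, value

    # 3.: token written by `claude login`; a missing or invalid file is not an error.
    try:
        data = json.loads(credentials_path.read_text())
    except (OSError, json.JSONDecodeError):
        return None
    token = _find_access_token(data)
    # Assumption: a file-derived token is passed on as an OAuth token.
    return ("CLAUDE_CODE_OAUTH_TOKEN", token) if token else None
```

Taking env and credentials_path as parameters keeps the lookup testable without touching the real environment or home directory.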
End-to-end runs on dottxt_ai_outlines_task/1371 features 1+2:
- solo f1: Submitted, $0.53, 211-line patch
- coop without git: 1/2 features pass (classic merge-failure mode)
- coop with --git: 2/2 features pass

Tests: 45 unit tests covering stream-json + session-JSONL parsers, registry, credential resolution, comm URL rewriting, sent-log parsing, prompt branching for solo/coop/coop+git, git setup command shape, and adapter wiring (joins network, runs git setup, skips when disabled).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
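
For readers who want a concrete picture of the comm URL rewriting mentioned in the commit message, a minimal sketch is shown here. The function name rewrite_comm_url is hypothetical, and only the localhost hostname named in the description is handled; the real adapter may cover more cases.

```python
from urllib.parse import urlsplit, urlunsplit

# Host alias that the container resolves to the Docker host when launched with
# --add-host=host.docker.internal:host-gateway (flag taken from the PR text).
DOCKER_HOST_ALIAS = "host.docker.internal"


def rewrite_comm_url(url: str, docker_host: str = DOCKER_HOST_ALIAS) -> str:
    """Point a localhost Redis URL at the Docker host; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.hostname != "localhost":
        return url

    # Rebuild the network location, preserving any userinfo and port.
    userinfo, sep, _ = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{sep}{docker_host}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    print(rewrite_comm_url("redis://localhost:6379/0"))
```

Parsing the URL instead of doing a plain string replace avoids touching a database path or password that happens to contain the word localhost.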
akhatua2 (Collaborator) approved these changes on May 16, 2026 and left a comment:
Looks great. Thanks Hao!
Summary
- Adds a claude_code agent adapter wrapping the official @anthropic-ai/claude-code CLI; supports solo, coop (Redis messaging), and coop+git (shared cooperbench-git remote) settings.
- End-to-end runs on dottxt_ai_outlines_task/1371 features 1+2: coop+git scores 2/2 features pass.

How it works
- Installs claude-code inside the task container at runtime (npm).
- Invokes claude --print --output-format=stream-json --permission-mode=bypassPermissions; parses the terminating result event for cost/tokens and walks $CLAUDE_CONFIG_DIR/projects/*/.../*.jsonl for messages.
- Harvests patches from /workspace/repo/patch.txt (same convention v2 uses).
- Coop: provides the coop-send / coop-recv / coop-broadcast / coop-peek / coop-agents shell commands, rewrites localhost → host.docker.internal in the comm URL, and launches the container with --add-host=host.docker.internal:host-gateway.
- Git: joins the cooperbench docker network, configures the team remote against the shared git server, creates an agent-named branch, and pushes initial state. The prompt teaches the fetch/merge/push workflow.
- Credentials resolve in order: ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN → ~/.claude/.credentials.json (subscription-based usage).

End-to-end smoke results
-f 1 … --git … team and rebuilt patch.txt from merged tree

Test plan
- ruff check, ruff format --check, mypy, and pytest (all green locally: 204 passed, 63 integration skipped).
- uv run cooperbench run -a claude_code -m claude-sonnet-4-5 -r dottxt_ai_outlines_task -t 1371 -f 1,2 --setting coop --git --backend docker followed by uv run cooperbench eval.
- ANTHROPIC_API_KEY (or CLAUDE_CODE_OAUTH_TOKEN, or a ~/.claude/.credentials.json produced by claude login) is present on the host before running.

Follow-ups (not in this PR)
- Expose --max-turns, --effort, and other Claude Code knobs as agent-config flags.

🤖 Generated with Claude Code
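
As a rough illustration of the result-event parsing described under "How it works", the sketch below scans stream-json output line by line and keeps the last event whose type is result. The RunUsage dataclass and parse_result_event are hypothetical names, and the field names total_cost_usd, usage.input_tokens, and usage.output_tokens are assumptions about the event schema, not something this PR states.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RunUsage:
    cost_usd: float
    input_tokens: int
    output_tokens: int


def parse_result_event(lines: Iterable[str]) -> Optional[RunUsage]:
    """Extract cost and token counts from the terminating result event, if any."""
    result = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise interleaved on stdout
        if isinstance(event, dict) and event.get("type") == "result":
            result = event  # keep the last result event seen

    if result is None:
        return None

    # Assumed field names; missing values fall back to zero.
    usage = result.get("usage") or {}
    return RunUsage(
        cost_usd=float(result.get("total_cost_usd") or 0.0),
        input_tokens=int(usage.get("input_tokens") or 0),
        output_tokens=int(usage.get("output_tokens") or 0),
    )
```

Returning None when no result event is present lets a caller distinguish a run that never finished from one that finished at zero cost.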