agents: add Claude Code adapter with coop messaging + git collaboration by ProKil · Pull Request #50 · cooperbench/CooperBench

ProKil · 2026-05-15T23:07:41Z

Summary

New claude_code agent adapter wrapping the official @anthropic-ai/claude-code CLI; supports solo, coop (Redis messaging), and coop+git (shared cooperbench-git remote) settings.
45 unit tests covering stream-json + session-JSONL parsing, credential resolution, comm URL rewriting, prompt branching, and git setup.
End-to-end verified on dottxt_ai_outlines_task/1371, features 1+2: coop+git passes 2/2 features.

How it works

Installs claude-code inside the task container at runtime (npm).
Invokes claude --print --output-format=stream-json --permission-mode=bypassPermissions; parses the terminating result event for cost/tokens and walks $CLAUDE_CONFIG_DIR/projects/*/.../*.jsonl for messages.
Patch is harvested from /workspace/repo/patch.txt (same convention v2 uses).
Coop: copies an in-container Python messaging helper that wraps Redis, exposes it as coop-send/coop-recv/coop-broadcast/coop-peek/coop-agents shell commands, rewrites localhost→host.docker.internal in the comm URL, and launches the container with --add-host=host.docker.internal:host-gateway.
Coop+git: joins the cooperbench docker network, configures team remote against the shared git server, creates an agent-named branch, and pushes initial state. The prompt teaches the fetch/merge/push workflow.
Credentials: resolves in order ANTHROPIC_API_KEY → CLAUDE_CODE_OAUTH_TOKEN → ~/.claude/.credentials.json (subscription-based usage).
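
As a rough illustration of the result-event parsing step above, the sketch below scans stream-json output (one JSON object per line) for the last event whose type is result and extracts cost and token counts. The function name extract_result and the field names total_cost_usd, usage, input_tokens, and output_tokens are assumptions made for illustration; they are not taken from this PR's code.

```python
from __future__ import annotations

import json
from typing import Any


def extract_result(stream_text: str) -> dict[str, Any]:
    """Return cost/token stats from the last `result` event, or zeros if absent.

    ASSUMPTION: the field names used here (total_cost_usd, usage.input_tokens,
    usage.output_tokens) are illustrative and not confirmed by this PR.
    """
    result: dict[str, Any] | None = None
    for line in stream_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise mixed into the stream
        if isinstance(event, dict) and event.get("type") == "result":
            result = event  # keep the last match: the terminating event
    if result is None:
        return {"cost_usd": 0.0, "input_tokens": 0, "output_tokens": 0}
    usage = result.get("usage") or {}
    return {
        "cost_usd": float(result.get("total_cost_usd") or 0.0),
        "input_tokens": int(usage.get("input_tokens") or 0),
        "output_tokens": int(usage.get("output_tokens") or 0),
    }
```

Returning zeros when no result event is found keeps a crashed or truncated run from raising during stats collection; a real adapter might prefer to surface that case as an error instead.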

End-to-end smoke results

| Run | Features | Pass rate | Notes |
|---|---|---|---|
| solo `-f 1` | 1 | n/a (eval needs pair) | $0.53, 211-line diff |
| coop (no git) | 1, 2 | 1/2 | classic merge-failure: agent2's patch fails to apply on agent1's restructured file |
| coop `--git` | 1, 2 | 2/2 | each agent fetched/merged peer's branch via `team` and rebuilt `patch.txt` from merged tree |
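
For readers unfamiliar with the workflow behind the coop `--git` row, here is a minimal sketch of the git commands involved, built as argv lists so they can be inspected before being run in a container. The function names, the remote_url parameter, and the branch names are hypothetical; the PR states only that the remote is called team, that each agent gets an agent-named branch, and that the prompt teaches a fetch/merge/push workflow.

```python
from __future__ import annotations


def git_setup_commands(remote_url: str, agent_branch: str) -> list[list[str]]:
    """Register the shared remote, create the agent's branch, push initial state.

    ASSUMPTION: remote_url and agent_branch are supplied by the caller; the
    actual URL and branch naming used by the adapter are not shown in this PR.
    """
    return [
        ["git", "remote", "add", "team", remote_url],
        ["git", "checkout", "-b", agent_branch],
        ["git", "push", "-u", "team", agent_branch],
    ]


def git_sync_commands(own_branch: str, partner_branch: str) -> list[list[str]]:
    """Fetch the peer's work, merge it, and publish the merged branch."""
    return [
        ["git", "fetch", "team"],
        # --no-edit avoids opening an editor for the merge commit message.
        ["git", "merge", "--no-edit", f"team/{partner_branch}"],
        ["git", "push", "team", own_branch],
    ]
```

Each list can be passed to subprocess.run(cmd, check=True, cwd=repo_dir) or joined for a docker exec call. Merge conflicts would still need to be resolved by the agent before the push succeeds.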

Test plan

CI runs ruff check, ruff format --check, mypy, and pytest (all green locally — 204 passed, 63 integration skipped).
Manual: uv run cooperbench run -a claude_code -m claude-sonnet-4-5 -r dottxt_ai_outlines_task -t 1371 -f 1,2 --setting coop --git --backend docker followed by uv run cooperbench eval.
Confirm ANTHROPIC_API_KEY (or CLAUDE_CODE_OAUTH_TOKEN, or a ~/.claude/.credentials.json produced by claude login) is present on the host before running.
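
A pre-flight check for the credential requirement above could look like the following sketch, which follows the resolution order given in this PR. The function name resolve_credential is hypothetical, and the layout of .credentials.json is an assumption: the PR mentions only an accessToken value, so the sketch looks for that key at the top level or one level down.

```python
from __future__ import annotations

import json
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_credential(
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Optional[Tuple[str, str]]:
    """Return (source, secret) for the first credential found, else None."""
    # Steps 1 and 2: environment variables, in the order listed in the PR.
    for name in ("ANTHROPIC_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return name, value

    # Step 3: the credentials file left behind by `claude login`.
    path = (home or Path.home()) / ".claude" / ".credentials.json"
    try:
        data = json.loads(path.read_text())
    except (OSError, json.JSONDecodeError):
        return None
    if not isinstance(data, dict):
        return None
    # ASSUMPTION: accessToken sits at the top level or inside one nested object.
    candidates = [data] + [v for v in data.values() if isinstance(v, dict)]
    for obj in candidates:
        token = obj.get("accessToken")
        if isinstance(token, str) and token:
            return ".credentials.json", token
    return None
```

Calling resolve_credential() with no arguments on the host and checking for None before starting a run gives an early failure instead of an authentication error inside the container.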

Follow-ups (not in this PR)

MCP-based collaboration tools (instead of shell-wrappers) for a more polished surface.
--max-turns, --effort, and other Claude Code knobs as agent-config flags.

🤖 Generated with Claude Code

Wraps the official @anthropic-ai/claude-code CLI as an AgentRunner so CooperBench can evaluate Claude Code on solo and coop settings. The CLI runs inside the task's Docker container; the adapter installs it with npm at runtime, invokes it in --print --output-format=stream-json mode, parses the terminating result event for cost/tokens, and walks the session JSONL for the message trajectory. Patches are harvested from /workspace/repo/patch.txt, the same convention v2 already uses.

Coop support: when agents has 2+ entries and comm_url is set, copies an in-container Python helper that wraps Redis send/recv plus shell-wrapper scripts (coop-send, coop-recv, coop-broadcast, coop-peek, coop-agents) that Claude can invoke via Bash. The coop variant of the prompt documents these tools. Redis URLs of the form redis://localhost:... are rewritten to redis://host.docker.internal:... and the container is launched with --add-host=host.docker.internal:host-gateway so the in-container helper can reach the host Redis daemon.

Git collaboration: when git_enabled is true, the adapter configures the team remote against the shared cooperbench-git server, creates an agent-named branch, and pushes initial state. The coop+git prompt section teaches the team remote name, the partner-branch shape, and the fetch/merge/push workflow so Claude can integrate peer changes before submitting patch.txt.

Credentials resolve in order: ANTHROPIC_API_KEY env var, then CLAUDE_CODE_OAUTH_TOKEN env var, then ~/.claude/.credentials.json accessToken (subscription-based usage for hosts already logged in via claude login).
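
The Redis URL rewrite described in the coop paragraph can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the adapter's code: the function name rewrite_comm_url is hypothetical, and only the literal host localhost is handled because that is the only case the commit message describes.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_comm_url(url: str, container_host: str = "host.docker.internal") -> str:
    """Point a localhost comm URL at the Docker host; leave other URLs untouched."""
    parts = urlsplit(url)
    if parts.hostname != "localhost":
        return url
    # Rebuild the netloc, preserving any userinfo ("user:pw@") and the port.
    userinfo, sep, _ = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{sep}{container_host}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(rewrite_comm_url("redis://localhost:6379/0"))
# redis://host.docker.internal:6379/0
```

Parsing the URL rather than doing a plain string replace avoids touching a path or password that happens to contain the word localhost. The rewritten name resolves inside the container only because of the --add-host=host.docker.internal:host-gateway flag mentioned above.
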
End-to-end runs on dottxt_ai_outlines_task/1371 features 1+2:
- solo f1: Submitted, $0.53, 211-line patch
- coop without git: 1/2 features pass (classic merge-failure mode)
- coop with --git: 2/2 features pass

Tests: 45 unit tests covering stream-json + session-JSONL parsers, registry, credential resolution, comm URL rewriting, sent-log parsing, prompt branching for solo/coop/coop+git, git setup command shape, and adapter wiring (joins network, runs git setup, skips when disabled).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

akhatua2

Looks great. Thanks Hao!

ProKil requested a review from akhatua2 May 15, 2026 23:14

akhatua2 approved these changes May 16, 2026

ProKil merged commit 44b4a7a into main May 16, 2026
3 checks passed

ProKil deleted the claude-code-adapter branch May 16, 2026 04:35

ProKil mentioned this pull request May 16, 2026

agents/codex: add Codex adapter; lift shared coop bits into _coop #51

Merged

3 tasks

ProKil merged 1 commit into main from claude-code-adapter
