Adversarial review for Claude Code coding sessions.
After Claude finishes a task, debate forks the session for one or
more critic agents (Codex by default), runs a multi-round
cross-examination per critic, applies any concessions the proposer
makes, and surfaces only the unresolved disputes for human attention.
Each critic picks its own attack topic in round 1 (security, perf,
internal-consistency, evidence-gap, ...); later critics are told
which topics are taken and pick something else. No debate content
ever lands in the root Claude session - debate happens in branched
forks off the root.
Design and per-component contracts live under specs/ - start at specs/README.md for the index. Release-cut evidence (probe outcomes, smoke recordings) is committed to release-notes-v0.0.1.md.
curl -fsSL https://raw.githubusercontent.com/latere-ai/debate/main/install.sh | shThe script detects your OS / arch, fetches the latest release tarball
from
github.com/latere-ai/debate/releases,
verifies the sha256 checksum, installs debate to /usr/local/bin
(via sudo if needed), and runs debate install-hook --scope user.
Knobs:
DEBATE_VERSION=v0.0.1-rc2 # pin a specific tag; default: latest
DEBATE_PREFIX=$HOME/.local # binary lands at $DEBATE_PREFIX/bin
DEBATE_NO_HOOK=1 # skip the install-hook stepFrom source (requires Go 1.26+):
go install latere.ai/x/debate/cmd/debate@latest
debate install-hook --scope userinstall-hook merges the verbose-format Stop hook entry into
~/.claude/settings.json (or ./.claude/settings.json with --scope project). The entry is <path-to>/debate hook — a built-in
subcommand of the binary, so no separate shell script needs to be on
PATH. uninstall-hook removes it.
With the Stop hook installed, debate runs automatically when claude finishes responding. You see one stdout line; the summary lives on disk:
$ # ...claude does its thing...
[debate] 2 unresolved; see /repo/.debate/sessions/20260506T140905Z-q3a9f1/summary.md
$ cat .debate/sessions/*/summary.md
# Debate review - terminated: steady-state
## Headline (most contested unresolved)
- [security/api.go:88] SQL injection via unparameterized LIKE
- Critic: framework auto-escape doesn't cover LIKE patterns
- Proposer: parameterized via SQLAlchemy
- **Stake**: GET /search?q=%' OR 1=1--
- Contention: 3 (re-attacked: true)
## Resolved (5)
- [conceded] Off-by-one in pagination → fixed at api.go:42
...
## Stats
critic-found-bug rate: 5/8 attacks led to a fix
debate cost: 38k tokens, 6 rounds, 4 critics
Trivial diffs (under --changed-lines-min, default 10) short-circuit
in milliseconds. No session folder, just one kind:"skipped" line in
.debate/log.jsonl.
For CI gating, scripted batch runs, or out-of-band review, invoke manually:
debate \
--session-id <root-claude-session-id> \
--side-count 4 \
--max-turn 6Each of the four critics picks its own topic in R1; the orchestrator
passes prior critics' topics to each later critic as anti-duplication
signal. debate --help lists every flag. Exit codes: 0 clean,
1 unresolved leaves, 130 interrupted, 100s pre-flight failure.
user's claude session (the "root")
│
│ --fork-session (per critic)
┌───────────┼───────────┐
▼ ▼ ▼
fork-1 fork-2 … fork-N
┌──────┐ ┌──────┐ ┌──────┐
│ pro- │ <══════> │ ... │ │ ... │
│poser │ rounds │ │ │ │
│clone │ │ │ │ │
└──┬───┘ └──────┘ └──────┘
│
▼ writes round files to disk
.debate/sessions/<id>/forks/critic-i/rounds/r{1,2,3,…}-{critic,proposer}.md
│
▼
summary.md (contention-scored headline + leaves)
Five load-bearing pieces (full design in spec 01):
- Forked debate, no debate content in root. Each critic gets its
own claude fork via
--fork-session. The user's root transcript never sees a debate turn. Probe-confirmed against claude 2.1.131: a no-output Stop hook produces zerohook_*attachments, so the invariant holds across modes including the Stop-hook path. See specs/28. - Verbatim channel. Critic output reaches the proposer-clone as a
plain user turn pointing at a file:
Some comments at @<path>. Please resolve or respond.No skill, slash-command, or plugin-template wrapping that would distort the proposer's normal defense behavior. - Self-declared topics. Each critic chooses its own attack topic in R1 (security, perf, internal-consistency, evidence-gap, ...) and later critics are told which topics are already claimed so they pick something else. No fixed catalog. The debate-theoretic property - one competent honest player suffices for soundness - means a lazy critic on one topic doesn't break the others.
- Persisted ledger. Every attack carries a stable id (
c<critic>-<seq>), every transition is appended toattacks.jsonl. Headlines are picked by a pure contention score (rounds_survived + (1 if re-attacked)) - no LLM judging at this layer. - Best-effort critic isolation. v0 enforces "artifact + task
only" by critic system prompt and
codex --sandbox read-only, not OS isolation; strict per-fork sandbox dirs are v1.
- agents-byzantine-tolerance
- research repo studying multi-agent Byzantine fault tolerance, including spec 07 / Adversarial Debate, the architecture this tool productizes. Specs 08–13 explore protocol variants (compute asymmetry, recursive sub-debate, stochastic systems, PCP-style leaves, Prover-Estimator, DQC scaling). None drive v0; each one's empirical result could license a specific change here if it goes a particular way - see specs/README.md §Related research for the conditional mapping.
- Irving, Christiano & Amodei, AI Safety via Debate (2018) - one agent proposes, another finds flaws, a judge inspects only the single disputed claim that decides the debate. The complexity-theoretic intuition (debate ≈ PSPACE under optimal play) motivates the architecture.
- Brown-Cohen, Irving & Piliouras, Scalable AI Safety via Doubly-Efficient Debate (2023) - extends 2018 to stochastic systems and proves soundness under unbounded compute asymmetry between the players. The formal license for applying debate to LLMs at all.