agon

Adversarial review for Claude Code coding sessions.

After Claude finishes a task, agon forks the session for one or more critic agents (Codex by default), runs a multi-round cross-examination per critic, applies any concessions the proposer makes, and surfaces only the unresolved disputes for human attention. Each critic picks its own attack topic in round 1 (security, perf, internal-consistency, evidence-gap, ...); later critics are told which topics are taken and pick something else. No agon content ever lands in the root Claude session - agon happens in branched forks off the root.

Design and per-component contracts live under specs/ - start at specs/README.md for the index. Release-cut evidence (probe outcomes, smoke recordings) is committed to release-notes-v0.0.1.md.

Installation

curl -fsSL https://raw.githubusercontent.com/latere-ai/agon/main/install.sh | sh

The script detects your OS / arch, fetches the latest release tarball from github.com/latere-ai/agon/releases, verifies the sha256 checksum, and installs agon to /usr/local/bin (via sudo if needed). Knobs:

AGON_VERSION=v0.0.1-rc2  # pin a specific tag; default: latest
AGON_PREFIX=$HOME/.local # binary lands at $AGON_PREFIX/bin

From source (requires Go 1.26+):

go install latere.ai/x/agon/cmd/agon@latest

Example usage

agon is a deliberate CLI: run it yourself after a coding session (see "Run it on demand" below). You see one stdout line; the summary lives on disk:

$ # ...claude does its thing, then you run agon...
$ agon
[agon] 2 unresolved; see /repo/.agon/sessions/20260506T140905Z-q3a9f1/summary.md

$ cat .agon/sessions/*/summary.md
# Agon review - terminated: steady-state

## Headline (most contested unresolved)
- [security/api.go:88] SQL injection via unparameterized LIKE
  - Critic: framework auto-escape doesn't cover LIKE patterns
  - Proposer: parameterized via SQLAlchemy
  - **Stake**: GET /search?q=%' OR 1=1--
  - Contention: 3 (re-attacked: true)

## Resolved (5)
- [conceded] Off-by-one in pagination → fixed at api.go:42
...

## Stats
critic-found-bug rate: 5/8 attacks led to a fix
agon cost: 38k tokens, 6 rounds, 4 critics

Trivial diffs (under --changed-lines-min, default 10) short-circuit in milliseconds. No session folder is created; the only trace is a single kind:"skipped" line in .agon/log.jsonl.
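The short-circuit described above can be sketched as follows. This is a minimal illustration, not agon's actual code: the function names and the changed_lines field are assumptions, and only the kind:"skipped" value and the default threshold of 10 come from this README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// defaultChangedLinesMin mirrors the documented default of --changed-lines-min.
const defaultChangedLinesMin = 10

// skipRecord is a hypothetical shape for the log line. Only the
// kind:"skipped" field is documented; changed_lines is an assumed extra.
type skipRecord struct {
	Kind         string `json:"kind"`
	ChangedLines int    `json:"changed_lines"`
}

// ShouldSkip reports whether a diff falls under the threshold
// ("under" is read here as strictly less than).
func ShouldSkip(changedLines, threshold int) bool {
	return changedLines < threshold
}

// SkipLine renders the single JSONL line recorded for a skipped run.
func SkipLine(changedLines int) (string, error) {
	b, err := json.Marshal(skipRecord{Kind: "skipped", ChangedLines: changedLines})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	for _, n := range []int{3, 10, 42} {
		if !ShouldSkip(n, defaultChangedLinesMin) {
			fmt.Printf("%d changed lines: run the full review\n", n)
			continue
		}
		line, err := SkipLine(n)
		if err != nil {
			fmt.Println("encode error:", err)
			continue
		}
		fmt.Println(line)
	}
}
```

Keeping the check as a pure function of the changed-line count is what makes the skip path cheap: nothing is forked and no session directory is touched before the decision is made.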

Run it on demand

There is no in-editor manual trigger: a slash command, skill, or UserPromptSubmit sentinel would each mutate the root transcript (probed; see specs/36-probe-userpromptsubmit-manual-trigger.md). The way to trigger on demand while keeping the transcript byte-identical is to run agon yourself in a terminal. It touches the live session only via --fork-session, so your Claude Code transcript is untouched, exactly as under the Stop hook. This is also the path for CI gating, scripted batch runs, and reviewing a saved session:

agon \
  --session-id <root-claude-session-id> \
  --side-count 4 \
  --max-turn 6

A shell alias keeps it one short command away (agon resolves the latest session for the cwd when --session-id is omitted):

alias agon-attack='agon --side-count 4 --max-turn 6'

Each of the four critics picks its own topic in R1; the orchestrator passes prior critics' topics to each later critic as an anti-duplication signal. agon --help lists every flag. Exit codes: 0 clean, 1 unresolved leaves, 130 interrupted, 100s pre-flight failure.
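For CI gating, the exit codes above can be mapped to outcomes along these lines. This is a sketch under assumptions: the Outcome type and Classify function are hypothetical names, and "100s" is read here as the range 100 to 199 with 130 checked first, which the README does not spell out.

```go
package main

import "fmt"

// Outcome is a hypothetical label for agon's documented exit codes.
type Outcome string

const (
	Clean       Outcome = "clean"
	Unresolved  Outcome = "unresolved leaves"
	Interrupted Outcome = "interrupted"
	PreFlight   Outcome = "pre-flight failure"
	Unknown     Outcome = "unknown"
)

// Classify maps a process exit code to an Outcome.
// Assumption: "100s" means 100..199, and 130 takes precedence
// because it is listed separately as "interrupted".
func Classify(code int) Outcome {
	switch {
	case code == 0:
		return Clean
	case code == 1:
		return Unresolved
	case code == 130:
		return Interrupted
	case code >= 100 && code < 200:
		return PreFlight
	default:
		return Unknown
	}
}

func main() {
	for _, code := range []int{0, 1, 101, 130, 2} {
		fmt.Printf("exit %d: %s\n", code, Classify(code))
	}
}
```

In a real wrapper the code would typically come from `(*exec.ExitError).ExitCode()` after running agon with os/exec; a gate might fail the build on anything other than Clean, or tolerate Unresolved and fail only on pre-flight errors.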

Design architecture

                  user's claude session (the "root")
                            │
                            │  --fork-session (per critic)
                ┌───────────┼───────────┐
                ▼           ▼           ▼
          fork-1            fork-2 …    fork-N
          ┌──────┐          ┌──────┐    ┌──────┐
          │ pro- │ <══════> │ ...  │    │ ...  │
          │poser │  rounds  │      │    │      │
          │clone │          │      │    │      │
          └──┬───┘          └──────┘    └──────┘
             │
             ▼ writes round files to disk
   .agon/sessions/<id>/forks/critic-i/rounds/r{1,2,3,…}-{critic,proposer}.md
                                            │
                                            ▼
                                    summary.md  (contention-scored headline + leaves)
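The on-disk layout in the diagram can be expressed as a small path builder. This is only a sketch: RoundPath and Role are hypothetical names, and the 1-based critic index is an assumption drawn from the fork-1 … fork-N labels.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Role names the two sides that write a round file.
type Role string

const (
	RoleCritic   Role = "critic"
	RoleProposer Role = "proposer"
)

// RoundPath builds
// <repo>/.agon/sessions/<id>/forks/critic-<i>/rounds/r<round>-<role>.md.
// The critic index is assumed to start at 1.
func RoundPath(repoRoot, sessionID string, critic, round int, role Role) string {
	return filepath.Join(
		repoRoot, ".agon", "sessions", sessionID,
		"forks", fmt.Sprintf("critic-%d", critic),
		"rounds", fmt.Sprintf("r%d-%s.md", round, role),
	)
}

func main() {
	// Session id borrowed from the example output earlier in this README.
	id := "20260506T140905Z-q3a9f1"
	fmt.Println(RoundPath("/repo", id, 1, 1, RoleCritic))
	fmt.Println(RoundPath("/repo", id, 1, 1, RoleProposer))
}
```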

Five load-bearing pieces (full design in spec 01):

1. Forked agon, no agon content in root. Each critic gets its own claude fork via --fork-session. agon runs as a separate process and touches the live session only through that fork, writing results to disk; the user's root transcript never sees an agon turn.
2. Verbatim channel. Critic output reaches the proposer-clone as a plain user turn pointing at a file: "Some comments at @<path>. Please resolve or respond." There is no skill, slash-command, or plugin-template wrapping that would distort the proposer's normal defense behavior.
3. Self-declared topics. Each critic chooses its own attack topic in R1 (security, perf, internal-consistency, evidence-gap, ...), and later critics are told which topics are already claimed so they pick something else. There is no fixed catalog. The debate-theoretic property - one competent honest player suffices for soundness - means a lazy critic on one topic doesn't break the others.
4. Persisted ledger. Every attack carries a stable id (c<critic>-<seq>), and every transition is appended to attacks.jsonl. Headlines are picked by a pure contention score (rounds_survived + (1 if re-attacked)), with no LLM judging at this layer.
5. Best-effort critic isolation. v0 enforces "artifact + task only" through the critic system prompt and codex --sandbox read-only, not OS isolation; strict per-fork sandbox dirs are deferred to v1.
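The persisted ledger and contention score can be illustrated with a short sketch. The Attack struct, its JSON field names, and the function names are assumptions made for illustration; only the id format c<critic>-<seq>, the append-only JSONL idea, and the formula rounds_survived + (1 if re-attacked) come from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"sort"
)

// Attack is a hypothetical ledger record; the real attacks.jsonl
// schema may differ.
type Attack struct {
	ID             string `json:"id"`
	Topic          string `json:"topic"`
	RoundsSurvived int    `json:"rounds_survived"`
	ReAttacked     bool   `json:"re_attacked"`
}

// AttackID formats the stable id c<critic>-<seq>.
func AttackID(critic, seq int) string {
	return fmt.Sprintf("c%d-%d", critic, seq)
}

// Contention implements rounds_survived + (1 if re-attacked).
func (a Attack) Contention() int {
	score := a.RoundsSurvived
	if a.ReAttacked {
		score++
	}
	return score
}

// AppendTransition writes one JSON object followed by a newline,
// which is the JSONL framing an append-only ledger needs.
func AppendTransition(w io.Writer, a Attack) error {
	return json.NewEncoder(w).Encode(a)
}

// RankByContention returns a copy sorted by descending score.
// The sort is stable, so ties keep their ledger order.
func RankByContention(attacks []Attack) []Attack {
	ranked := append([]Attack(nil), attacks...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return ranked[i].Contention() > ranked[j].Contention()
	})
	return ranked
}

func main() {
	attacks := []Attack{
		{ID: AttackID(2, 1), Topic: "perf", RoundsSurvived: 1},
		{ID: AttackID(1, 1), Topic: "security", RoundsSurvived: 2, ReAttacked: true},
	}
	for _, a := range attacks {
		if err := AppendTransition(os.Stdout, a); err != nil {
			fmt.Fprintln(os.Stderr, "ledger write failed:", err)
			return
		}
	}
	top := RankByContention(attacks)[0]
	fmt.Printf("headline: %s (contention %d)\n", top.ID, top.Contention())
}
```

Because the score is plain arithmetic over ledger fields, the headline order is reproducible from attacks.jsonl alone, which is the point of keeping LLM judgment out of this layer.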

Related work

agents-byzantine-tolerance - research repo studying multi-agent Byzantine fault tolerance, including spec 07 / Adversarial Debate, the architecture this tool productizes. Specs 08–13 explore protocol variants (compute asymmetry, recursive sub-debate, stochastic systems, PCP-style leaves, Prover-Estimator, DQC scaling). None drive v0; each one's empirical result could license a specific change here if it goes a particular way - see specs/README.md §Related research for the conditional mapping.
Irving, Christiano & Amodei, AI Safety via Debate (2018) - one agent proposes, another finds flaws, a judge inspects only the single disputed claim that decides the debate. The complexity-theoretic intuition (debate ≈ PSPACE under optimal play) motivates the architecture.
Brown-Cohen, Irving & Piliouras, Scalable AI Safety via Doubly-Efficient Debate (2023) - extends the 2018 framework to stochastic systems and proves soundness under unbounded compute asymmetry between the players. This is the formal license for applying debate to LLMs at all.
