v2.1.0 — /review augmentable with agent-toolkit's evaluator

@alexherrero alexherrero released this 14 May 02:01
· 381 commits to main since this release

Additive minor — no breaking changes. The /review phase spec gains a new optional §3b "Optional: evaluator augmentation (agent-toolkit)" documenting how to dispatch the evaluator sub-agent (shipped in agent-toolkit v0.6.0) alongside the existing adversarial-reviewer flow.

The two reviewers are complementary, not competing:

| | adversarial-reviewer (§3) | evaluator (§3b, agent-toolkit) |
|---|---|---|
| Framing | "the code contains bugs, find them" | "did this satisfy the rubric?" |
| Output | failing test / file:line defect / NO ISSUES FOUND | PASS / NEEDS_WORK + per-rubric-item PASS/FAIL |
| Input | the artifact + PLAN.md task | the artifact + an explicit rubric |
| Best when | rubric is loose; you want defect surfacing | rubric is precise; you want binary judgment |
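
To make the evaluator column concrete, here is a minimal Python sketch of a prompt with ARTIFACT: and RUBRIC: sections and a parser for a PASS / NEEDS_WORK verdict with per-item results. The exact prompt layout and the line format of the reply are assumptions made for illustration; the authoritative shapes are in §3b of harness/phases/04-review.md. The names Verdict, build_evaluator_prompt, and parse_evaluator_reply are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Verdict:
    overall: str                               # "PASS" or "NEEDS_WORK"
    items: dict = field(default_factory=dict)  # rubric item number -> "PASS" / "FAIL"


def build_evaluator_prompt(artifact: str, rubric: list[str]) -> str:
    """Assemble a prompt with ARTIFACT: and RUBRIC: sections (layout assumed)."""
    rubric_lines = "\n".join(f"{i}. {item}" for i, item in enumerate(rubric, 1))
    return f"ARTIFACT:\n{artifact}\n\nRUBRIC:\n{rubric_lines}\n"


# Assumed reply format: the first non-empty line is the overall verdict,
# followed by one "<n>. PASS" or "<n>. FAIL" line per rubric item.
# The real evaluator output may be laid out differently.
ITEM_RE = re.compile(r"^(\d+)\.\s*(PASS|FAIL)\b")


def parse_evaluator_reply(reply: str) -> Verdict:
    """Parse the assumed reply format into a Verdict."""
    lines = [ln.strip() for ln in reply.splitlines() if ln.strip()]
    if not lines or lines[0] not in ("PASS", "NEEDS_WORK"):
        raise ValueError("reply does not start with PASS or NEEDS_WORK")
    verdict = Verdict(overall=lines[0])
    for ln in lines[1:]:
        m = ITEM_RE.match(ln)
        if m:
            verdict.items[int(m.group(1))] = m.group(2)
    return verdict
```

Under these assumptions, each FAIL item can be carried into the review output as a finding, next to whatever the adversarial-reviewer reports.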

Both can run in the same /review session — their outputs combine. The harness still works standalone without the toolkit installed: §3b graceful-skips when agent-toolkit is absent.
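
The graceful skip can be pictured as a presence check before dispatch. The sketch below is an assumption about how a wrapper might decide which reviewers to run, not the harness's actual logic. It looks for an evaluator entry under the three directories that the agent-toolkit installer is described as populating; the `evaluator*` naming and both function names are hypothetical.

```python
from pathlib import Path

# Directories the agent-toolkit installer is described as populating.
EVALUATOR_DIRS = (".claude/agents", ".agent/skills", ".gemini/agents")


def evaluator_installed(project_root: str) -> bool:
    """Return True if any entry named 'evaluator*' exists in a known directory.

    The 'evaluator*' naming is an assumption made for this sketch.
    """
    root = Path(project_root)
    for d in EVALUATOR_DIRS:
        candidate_dir = root / d
        if candidate_dir.is_dir() and any(candidate_dir.glob("evaluator*")):
            return True
    return False


def reviewers_to_run(project_root: str) -> list[str]:
    """adversarial-reviewer always runs; evaluator is added only when present."""
    reviewers = ["adversarial-reviewer"]
    if evaluator_installed(project_root):
        reviewers.append("evaluator")
    return reviewers
```

Keeping adversarial-reviewer unconditional mirrors the point that the harness works standalone; the evaluator is strictly an add-on.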

Highlights

  • New §3b in harness/phases/04-review.md — 54-line section between §3a Reconcile and §4 Validate format. Covers when to add/skip evaluator dispatch, the ARTIFACT: + RUBRIC: prompt shape, the PASS / NEEDS_WORK output shape, treat-as-finding semantics, and a side-by-side comparison table.
  • check-references.py rename: EXTERNAL_SKILLS → EXTERNAL_CUSTOMIZATIONS. The exclusion now applies to both the DISPATCH_AGENT_RE and INVOKE_SKILL_RE regexes. Each entry carries an inline comment naming its agent-toolkit home.
  • No adapter edits. All three review wrappers already reference harness/phases/04-review.md exactly once; §3b inherits via the existing canonical-reference pattern.
  • No new harness ADR. The evaluator design decision (read-only allowlist, caller-supplied inline rubric, coexist not replace, structured output) lives in agent-toolkit ADR 0002 since the customization itself lives there.
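
The renamed exclusion in check-references.py can be sketched roughly as follows. Only the names EXTERNAL_CUSTOMIZATIONS, DISPATCH_AGENT_RE, and INVOKE_SKILL_RE come from these notes. The regex patterns, the set's contents, the known_local parameter, and the unresolved_references helper are assumptions for illustration and will differ from the real script.

```python
import re

# Customizations that live outside the harness; each entry notes its home.
EXTERNAL_CUSTOMIZATIONS = {
    "evaluator",  # agent-toolkit (sub-agent)
}

# Hypothetical patterns: the real regexes in check-references.py are not shown here.
DISPATCH_AGENT_RE = re.compile(r"dispatch the `([\w-]+)` sub-agent")
INVOKE_SKILL_RE = re.compile(r"invoke the `([\w-]+)` skill")


def unresolved_references(text: str, known_local: set[str]) -> list[str]:
    """Return referenced names that are neither local nor externally provided."""
    missing = []
    for pattern in (DISPATCH_AGENT_RE, INVOKE_SKILL_RE):
        for name in pattern.findall(text):
            # The same exclusion set is consulted for both regexes.
            if name in EXTERNAL_CUSTOMIZATIONS or name in known_local:
                continue
            missing.append(name)
    return missing
```

Sharing one set across both patterns means a name provided by agent-toolkit is not flagged as a dangling reference whether it is dispatched as an agent or invoked as a skill.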

Migration

No migration needed — purely additive. To enable §3b's evaluator dispatch in your projects, install agent-toolkit v0.6.0 alongside the harness:

# If you don't already have agent-toolkit cloned as a sibling:
gh repo clone alexherrero/agent-toolkit ../agent-toolkit

# Install into your project (lands evaluator at .claude/agents/, .agent/skills/, .gemini/agents/):
bash ../agent-toolkit/install.sh /path/to/your-project

Without the toolkit, /review continues to satisfy the phase contract using adversarial-reviewer alone.

See CHANGELOG.md for the full v2.1.0 entry.