v1.11.0
14 Mar 21:01
AV-31 sprint: 373 manually-reviewed traces across 12 model providers.

Loop detector promoted to first hard CI gate (P=0.986 [0.960], R=1.000 [0.982] on 370-trace expanded corpus).

- Bundle av31_reviewed corpus (289 traces) for CI validation
- Update ci_backtest.py to load av31 alongside synth + real corpora
- Version bump 1.9.0 → 1.10.0
- CHANGELOG with full validation metrics and gap analysis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
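
For readers unfamiliar with metric-based CI gates, here is a minimal sketch of how a precision/recall gate of this kind could work. It is not taken from ci_backtest.py. It assumes the bracketed figures are lower confidence bounds and uses a Wilson score interval for them, which the release does not confirm. The function names, thresholds, and counts are all hypothetical.

```python
import math


def wilson_lower_bound(successes: int, total: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion.

    Assumption: z = 1.96 (roughly a 95% two-sided interval).
    """
    if total == 0:
        return 0.0
    p = successes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


def gate_passes(tp: int, fp: int, fn: int,
                min_precision_lb: float, min_recall_lb: float) -> bool:
    """Return True when both lower bounds clear their (hypothetical) thresholds."""
    precision_lb = wilson_lower_bound(tp, tp + fp)  # precision = TP / (TP + FP)
    recall_lb = wilson_lower_bound(tp, tp + fn)     # recall = TP / (TP + FN)
    return precision_lb >= min_precision_lb and recall_lb >= min_recall_lb


if __name__ == "__main__":
    # Hypothetical confusion counts, not the AV-31 numbers.
    ok = gate_passes(tp=100, fp=0, fn=0, min_precision_lb=0.9, min_recall_lb=0.9)
    print("gate passed" if ok else "gate failed")
```

Gating on the lower bound rather than the point estimate means a small corpus with a perfect score does not pass automatically; a CI job would typically exit non-zero when `gate_passes` returns False.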