v1.5.0 — Head-to-head benchmark comparison
AgentArmor v1.5.0 ships the head-to-head benchmark comparison infrastructure: the first honest comparison of AgentArmor against established safety classifiers (LlamaGuard 3 and OpenAI Moderation) across six industry datasets, with per-sample verdicts and bootstrap confidence intervals.
See BENCHMARKS_HEAD_TO_HEAD.md for the results and RUNBOOK.md for operations.
Highlights
- Head-to-head runner: sequential, resumable comparison with per-sample verdicts, bootstrap F1 / MCC / balanced-accuracy, adapter + config drift detection on resume, structured run.jsonl event log
- Taxonomy applicability rubric with ensure_complete() CI gate: methodologically defensible (baseline, dataset) verdicts
- BaselineChecker ABC migration: score(text) -> float contract with legacy auto-bridge and DeprecationWarning
- Secret allow-list in config loader: rejects *_API_KEY / *_TOKEN / *_SECRET fields
- JSON summary schema with additive-minor / major-bump semver enforcement
- Deterministic markdown generator with byte-identical regeneration
- Vendor-drift canary with 20 committed neutral samples + abort-on-delta pre-publish check
- Operations runbook with 7 numbered procedures (including setup, key rotation, resume, publishing, rollback, and canary failure)
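To make the bootstrap metrics concrete, here is a minimal sketch of a seeded percentile bootstrap for F1 over per-sample verdicts. It is an illustration only, not the runner's actual code: the function names `f1_score` and `bootstrap_f1_ci`, the default of 1000 resamples, and the use of the standard library's `random.Random` (rather than the pinned numpy) are all assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1; returns 0.0 when there are no true or predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f1_ci(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    n_resamples: int = 1000,  # hypothetical default
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) for F1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    rng = random.Random(seed)  # fixed seed makes the resamples reproducible
    n = len(y_true)
    scores: List[float] = []
    for _ in range(n_resamples):
        # Resample per-sample verdicts with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    # Simple rank-based percentile bounds on the sorted resample scores.
    lo = scores[int((alpha / 2) * (n_resamples - 1))]
    hi = scores[int((1 - alpha / 2) * (n_resamples - 1))]
    return f1_score(y_true, y_pred), lo, hi
```

Because the generator is seeded, calling `bootstrap_f1_ci` twice with the same inputs and seed returns the same interval, which is the property a byte-identical report depends on.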
Policy
- No paper-number fallback: a failing baseline yields a blank cell, never a number cited from a prior paper
- raw_response: null in committed per-sample JSONL; --keep-raw-responses writes raw responses to gitignored paths only
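The two policies above can be sketched in a few lines. This is a hedged illustration, not the project's implementation: `format_cell` and `committed_record` are hypothetical helper names, and the three-decimal formatting is an assumption.

```python
import json
from typing import Any, Dict, Optional


def format_cell(score: Optional[float]) -> str:
    """Render a metric cell; a failed baseline (None) stays blank."""
    # No fallback to a published number: missing means empty.
    return "" if score is None else f"{score:.3f}"


def committed_record(record: Dict[str, Any]) -> str:
    """Serialize a per-sample record for the committed JSONL file."""
    scrubbed = dict(record)  # shallow copy so the caller's record is untouched
    scrubbed["raw_response"] = None  # serialized as JSON null
    return json.dumps(scrubbed, sort_keys=True)
```

Sorting keys in `json.dumps` keeps the committed lines stable across runs, which fits the deterministic-output goal stated in the highlights.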
Pinned
- numpy>=1.26,<2.0 for bootstrap determinism
- PyYAML>=6.0,<7.0 for config loader
- Optional head_to_head_llamaguard extra pulls llama-cpp-python>=0.2.0,<0.4.0
Also shipped in the 1.2.0 → 1.5.0 gap (previously unreleased)
- Explain Mode v2 (1.4.0): structured trace recording; agentarmor.last_trace() shows which shields ran, what each decided, and why
- Semantic Drift Detector (1.3.0): embedding-based multi-turn conversation trajectory tracker
- Pricing API (1.3.0): register_pricing() for custom model entries; added o3, o4-mini, claude-opus-4-6, claude-sonnet-4-6, gemini-2.5-pro, gemini-2.5-flash
- Strict mode + demo_attacks() (1.3.x): catches typo'd kwargs at init() time; runs ~21 synthetic attacks through your active config
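As a rough illustration of what catching typo'd kwargs at initialization can look like, the sketch below validates keyword arguments against a known set and suggests the closest match. It does not reflect AgentArmor's real signature: `strict_init` and the option names in `KNOWN_OPTIONS` are hypothetical, chosen only for this example.

```python
import difflib
from typing import Any, Dict

# Hypothetical option names; the real init() options may differ.
KNOWN_OPTIONS = {"budget", "shields", "strict"}


def strict_init(**kwargs: Any) -> Dict[str, Any]:
    """Reject unknown keyword arguments instead of silently ignoring them."""
    unknown = sorted(set(kwargs) - KNOWN_OPTIONS)
    if unknown:
        hints = []
        for name in unknown:
            # Suggest the nearest known option, if any is similar enough.
            close = difflib.get_close_matches(name, sorted(KNOWN_OPTIONS), n=1)
            suffix = f" (did you mean {close[0]!r}?)" if close else ""
            hints.append(f"{name!r}{suffix}")
        raise TypeError("unknown option(s): " + ", ".join(hints))
    return dict(kwargs)
```

Failing at call time means a misspelled option surfaces immediately rather than as a shield that quietly never runs.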
Full changelog: CHANGELOG.md