Release v1.5.0 — Head-to-head benchmark comparison · ankitlade12/AgentArmor

AgentArmor v1.5.0 ships the head-to-head benchmark comparison infrastructure: the first honest comparison of AgentArmor against established safety classifiers (LlamaGuard 3 and OpenAI Moderation) across six industry datasets, reporting per-sample verdicts and bootstrap confidence intervals.

See BENCHMARKS_HEAD_TO_HEAD.md for the results and RUNBOOK.md for operations.

Highlights

Head-to-head runner — sequential, resumable comparison with per-sample verdicts, bootstrap F1 / MCC / balanced-accuracy, adapter + config drift detection on resume, structured run.jsonl event log
Taxonomy applicability rubric with ensure_complete() CI gate — methodologically defensible (baseline, dataset) verdicts
BaselineChecker ABC migration — score(text) -> float contract with legacy auto-bridge and DeprecationWarning
Secret allow-list in config loader — rejects *_API_KEY / *_TOKEN / *_SECRET fields
JSON summary schema with additive-minor / major-bump semver enforcement
Deterministic markdown generator with byte-identical regeneration
Vendor-drift canary with 20 committed neutral samples + abort-on-delta pre-publish check
Operations runbook with 7 numbered procedures (including setup, key rotation, resume, publishing, rollback, and canary failure)
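
As an illustration of the bootstrap confidence intervals mentioned above, here is a minimal percentile-bootstrap sketch for F1 using NumPy. It is not AgentArmor's runner: the function names `f1_score` and `bootstrap_ci`, the resample count, and the fixed seed are all assumptions chosen for the example. Seeding the generator is what makes repeated runs reproducible.

```python
import numpy as np


def f1_score(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Binary F1 from 0/1 label arrays; returns 0.0 when undefined."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(y_true, y_pred, metric=f1_score,
                 n_resamples=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) via the percentile bootstrap.

    Hypothetical helper: resamples (label, verdict) pairs with replacement
    and takes the alpha/2 and 1 - alpha/2 quantiles of the metric.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    rng = np.random.default_rng(seed)  # fixed seed -> reproducible interval
    n = len(y_true)
    stats = np.empty(n_resamples)
    for i in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # sample indices with replacement
        stats[i] = metric(y_true[idx], y_pred[idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return metric(y_true, y_pred), float(lower), float(upper)


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    verdicts = [1, 1, 0, 0, 0, 1, 1, 0]
    print(bootstrap_ci(labels, verdicts))
```

Because the interval is computed from per-sample verdicts, the same stored verdicts can be reused to bootstrap other metrics (for example MCC or balanced accuracy) by passing a different `metric` callable.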

Policy

No paper-number fallback — a failing baseline yields a blank cell, never a prior-paper-cited number
raw_response: null in committed per-sample JSONL; --keep-raw-responses writes raw responses to gitignored files only
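
The no-fallback rule can be pictured with a small rendering sketch. This is an assumption about how such a policy might look in code, not the project's markdown generator; `render_cell` and `render_row` are hypothetical names. A missing score is represented as `None` and rendered as an empty cell, with no code path that substitutes a cited number.

```python
from typing import Optional, Sequence


def render_cell(score: Optional[float]) -> str:
    """Format a score to three decimals; a failed baseline (None) stays blank."""
    return "" if score is None else f"{score:.3f}"


def render_row(dataset: str, scores: Sequence[Optional[float]]) -> str:
    """Build one markdown table row for a dataset (hypothetical layout)."""
    cells = [dataset] + [render_cell(s) for s in scores]
    return "| " + " | ".join(cells) + " |"


if __name__ == "__main__":
    # Second baseline failed on this dataset, so its cell is left empty.
    print(render_row("dataset_a", [0.912, None]))
```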

Pinned

numpy>=1.26,<2.0 for bootstrap determinism
PyYAML>=6.0,<7.0 for config loader
Optional head_to_head_llamaguard extra pulls llama-cpp-python>=0.2.0,<0.4.0

Also shipped in the 1.2.0 → 1.5.0 gap (previously unreleased)

Explain Mode v2 (1.4.0) — structured trace recording; agentarmor.last_trace() shows which shields ran, what each decided, and why
Semantic Drift Detector (1.3.0) — embedding-based multi-turn conversation trajectory tracker
Pricing API (1.3.0) — register_pricing() for custom model entries; added o3, o4-mini, claude-opus-4-6, claude-sonnet-4-6, gemini-2.5-pro, gemini-2.5-flash
Strict mode + demo_attacks() (1.3.x) — catches typo'd kwargs at init() time; runs ~21 synthetic attacks through your active config

Full changelog: CHANGELOG.md
