v0.5.2 - Industrial-grade Benchmark & OWASP Agentic Top 10 Full Coverage

HeadyZhang released this 05 Feb 02:13

965999e

Agent-Audit v0.5.2

Highlights

Complete OWASP Agentic Top 10 coverage (ASI-01 through ASI-10) with 45+ detection rules
Agent-Vuln-Bench v1.0: 12 KNOWN CVEs + 6 WILD patterns + 2 NOISE projects
Ground Truth v2.2: 81 samples, 218 vulnerability annotations
Industrial-grade metrics: Precision 98.51%, Recall 100%, F1-Score 99.25%
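
As a quick sanity check on the reported metrics, F1 is the harmonic mean of precision and recall. The snippet below is a minimal sketch that recomputes it from the two published figures; the function name `f1_score` is illustrative and is not taken from the Agent-Audit codebase.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both expressed in the range 0..1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the highlights above.
reported_f1 = f1_score(0.9851, 1.0)
print(f"{reported_f1:.2%}")  # prints 99.25%
```

The recomputed value agrees with the published F1-Score of 99.25%.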

New Detection Rules (v0.5.x Series)

AGENT-043: Daemon privilege escalation (launchctl, systemctl, pm2)
AGENT-044: Sudoers NOPASSWD configuration
AGENT-045: Browser automation without sandbox
AGENT-046: System credential store access (Keychain, gnome-keyring)
AGENT-047: Subprocess execution without sandbox
AGENT-048: Extension permission boundaries

v0.5.2 Micro-Patch

AGENT-043: Tightened daemon detection (now excludes pkill, kill, nohup)
AGENT-046: Credential store finding deduplication
AGENT-047: Extended safe command list (macOS utilities, text processing)
Risk Score v2: New formula with natural log scaling
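
To illustrate the kind of exclusion the AGENT-043 change describes, here is a hypothetical sketch of a command check that flags daemon managers but skips command lines led by an excluded program. The function name, both sets, and the matching logic are assumptions made for illustration only; the actual rule implementation in Agent-Audit may work quite differently.

```python
import shlex

# Hypothetical lists, derived only from the command names in these notes.
DAEMON_COMMANDS = {"launchctl", "systemctl", "pm2"}
EXCLUDED_COMMANDS = {"pkill", "kill", "nohup"}


def flags_daemon_escalation(command_line: str) -> bool:
    """Return True if the command line appears to invoke a daemon manager.

    Assumption: a command whose leading program is in EXCLUDED_COMMANDS is
    never flagged, even if a daemon manager name appears among its arguments.
    """
    tokens = shlex.split(command_line)
    if not tokens:
        return False
    # Compare basenames so that /usr/bin/systemctl matches systemctl.
    programs = [token.rsplit("/", 1)[-1] for token in tokens]
    if programs[0] in EXCLUDED_COMMANDS:
        return False
    return any(program in DAEMON_COMMANDS for program in programs)
```

Under these assumptions, `sudo systemctl restart nginx` would be flagged, while `pkill -f systemctl` would not, since the daemon manager name is only an argument to an excluded command.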

Benchmark Infrastructure

precision_recall.py: Per-ASI recall metrics
quality_gates_v2.yaml: Layer 1/2 thresholds
Agent-Vuln-Bench harness with SWE-bench style evaluation
Multi-tool comparison support (vs Bandit, Semgrep)
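
Per-ASI recall can be thought of as recall computed separately for each category of annotated vulnerability. The following is a minimal sketch of that idea, not the contents of precision_recall.py; the function name, the data layout, and the sample IDs are all hypothetical.

```python
from collections import defaultdict


def per_category_recall(ground_truth, detected):
    """Compute recall per category.

    ground_truth: iterable of (sample_id, category) annotations.
    detected: set of (sample_id, category) pairs reported by the scanner.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for annotation in ground_truth:
        _, category = annotation
        totals[category] += 1
        if annotation in detected:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Hypothetical annotations and findings, for illustration only.
truth = [("s1", "ASI-01"), ("s2", "ASI-01"), ("s3", "ASI-02")]
found = {("s1", "ASI-01"), ("s3", "ASI-02")}
recall_by_asi = per_category_recall(truth, found)
print(recall_by_asi)  # prints {'ASI-01': 0.5, 'ASI-02': 1.0}
```

Reporting recall per category rather than as a single aggregate makes it easier to spot a category that is silently under-detected.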

Tests

716 tests passing
Full ASI category coverage validation

Installation
