v0.5.2 - Industrial-grade Benchmark & OWASP Agentic Top 10 Full Coverage

@HeadyZhang HeadyZhang released this 05 Feb 02:13
· 58 commits to master since this release

Agent-Audit v0.5.2

Highlights

  • Complete OWASP Agentic Top 10 coverage (ASI-01 through ASI-10) with 45+ detection rules
  • Agent-Vuln-Bench v1.0: 12 KNOWN CVEs + 6 WILD patterns + 2 NOISE projects
  • Ground Truth v2.2: 81 samples, 218 vulnerability annotations
  • Industrial-grade metrics: Precision 98.51%, Recall 100%, F1-Score 99.25%
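
The reported F1-Score follows from the reported precision and recall, since F1 is their harmonic mean. A minimal sketch of that arithmetic (the function name `f1_score` is illustrative and not taken from the Agent-Audit codebase):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing was found or expected
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the highlights above.
f1 = f1_score(0.9851, 1.0)
print(f"{f1:.2%}")  # prints 99.25%
```

With recall at 100%, F1 is limited only by precision, which is why the score sits between the two reported values.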

New Detection Rules (v0.5.x Series)

  • AGENT-043: Daemon privilege escalation (launchctl, systemctl, pm2)
  • AGENT-044: Sudoers NOPASSWD configuration
  • AGENT-045: Browser automation without sandbox
  • AGENT-046: System credential store access (Keychain, gnome-keyring)
  • AGENT-047: Subprocess execution without sandbox
  • AGENT-048: Extension permission boundaries
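
To illustrate the kind of pattern a rule like AGENT-044 targets, here is a rough sketch of a check for `NOPASSWD` entries in sudoers-style text. It is an assumption-laden simplification, not the project's actual rule: the regex and the helper name `find_nopasswd_lines` are hypothetical, and real sudoers syntax (line continuations, includes, aliases) is not handled.

```python
import re

# Hypothetical pattern: a non-comment line that contains the NOPASSWD tag.
NOPASSWD_RE = re.compile(r"^\s*[^#\s].*\bNOPASSWD\s*:")


def find_nopasswd_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for non-comment lines granting NOPASSWD."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if NOPASSWD_RE.match(line):
            findings.append((number, line.strip()))
    return findings


sample = """\
# deploy ALL=(ALL) NOPASSWD: ALL
alice ALL=(ALL) ALL
deploy ALL=(ALL) NOPASSWD: /usr/bin/systemctl
"""
findings = find_nopasswd_lines(sample)
print(findings)
```

Only the third line of the sample is reported: the first is commented out and the second still requires a password.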

v0.5.2 Micro-Patch

  • AGENT-043: tightened daemon detection (excludes pkill, kill, nohup)
  • AGENT-046: credential store deduplication
  • AGENT-047: extended safe command list (macOS utilities, text processing)
  • Risk Score v2 formula with natural log scaling

Benchmark Infrastructure

  • precision_recall.py: Per-ASI recall metrics
  • quality_gates_v2.yaml: Layer 1/2 thresholds
  • Agent-Vuln-Bench harness with SWE-bench style evaluation
  • Multi-tool comparison support (vs Bandit, Semgrep)
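
Per-ASI recall means computing recall separately for each OWASP Agentic category rather than once over the whole ground truth. The sketch below shows the idea only. It is not the contents of `precision_recall.py`, and the input shape (pairs of category and a detected flag) and the function name `per_category_recall` are assumptions made for illustration.

```python
from collections import defaultdict


def per_category_recall(annotations):
    """Compute recall per category from (category, detected) pairs.

    Each pair stands for one ground-truth vulnerability annotation and
    whether the scanner reported it (hypothetical data shape).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for category, detected in annotations:
        totals[category] += 1
        if detected:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Made-up annotations, not taken from the real ground-truth set.
sample = [("ASI-01", True), ("ASI-01", False), ("ASI-02", True)]
recall = per_category_recall(sample)
print(recall)  # {'ASI-01': 0.5, 'ASI-02': 1.0}
```

Reporting recall per category makes it visible when a high aggregate figure hides a weak category.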

Tests

  • 716 tests passing
  • Full ASI category coverage validation

Installation