v0.16.0 — Competitive Benchmarking & FP Optimization

@HeadyZhang HeadyZhang released this 20 Feb 22:40
· 33 commits to master since this release

Highlights

  • 94.6% recall on Agent-Vuln-Bench (19 samples, 3 vulnerability categories)
  • 3.2x recall advantage over Bandit (29.7%), 3.5x over Semgrep (27.0%)
  • 100% Set B coverage — only tool that audits MCP configurations
  • 79% AGENT-034 false positive reduction via qualified name matching
  • 10/10 OWASP Agentic Top 10 categories covered
  • 1142 unit tests passing, zero regressions
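
The recall multiples quoted above can be checked from the reported percentages alone. A minimal sketch, using only the figures listed in the highlights (the dictionary keys and the helper name are illustrative):

```python
# Recall figures (percent) as reported in the highlights above.
recall = {"this release": 94.6, "Bandit": 29.7, "Semgrep": 27.0}


def advantage(ours: float, theirs: float) -> float:
    """Ratio of two recall percentages, rounded to one decimal place."""
    return round(ours / theirs, 1)


bandit_ratio = advantage(recall["this release"], recall["Bandit"])
semgrep_ratio = advantage(recall["this release"], recall["Semgrep"])
print(f"vs Bandit: {bandit_ratio}x, vs Semgrep: {semgrep_ratio}x")
```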

Changes

Bug Fixes — AGENT-034 False Positive Reduction

  • Qualified name matching: re.compile(), asyncio.run(), platform.system() no longer trigger AGENT-034 false positives
  • Framework path suppression: tool_no_input_validation is suppressed for framework-internal code
  • Extended safe tool patterns: 31 additional safe function patterns
  • Boundary confidence multiplier: more accurate confidence scoring
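
To illustrate why qualified name matching removes this class of false positive, here is a minimal sketch built on Python's standard ast module. It is not the project's implementation: the pattern sets and the function names (DANGEROUS_QUALIFIED, DANGEROUS_BARE, qualified_name, find_calls) are hypothetical and chosen only for the example.

```python
import ast
from typing import List, Optional

# Hypothetical pattern sets for illustration; the real rule's patterns
# are not listed in these release notes.
DANGEROUS_QUALIFIED = {"subprocess.run", "os.system"}
DANGEROUS_BARE = {"run", "system", "compile"}


def qualified_name(node: ast.AST) -> Optional[str]:
    """Return the dotted name of a Name/Attribute chain, e.g. 'asyncio.run'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on a subscript or on another call's result


def find_calls(source: str, match_bare: bool) -> List[str]:
    """Collect calls flagged by bare-name or by qualified-name matching."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = qualified_name(node.func)
        if name is None:
            continue
        if match_bare:
            flagged = name.rsplit(".", 1)[-1] in DANGEROUS_BARE
        else:
            flagged = name in DANGEROUS_QUALIFIED
        if flagged:
            hits.append(name)
    return hits


SAMPLE = """
import re, asyncio, platform, subprocess
re.compile("a+")
asyncio.run(main())
platform.system()
subprocess.run(cmd, shell=True)
"""

bare_hits = find_calls(SAMPLE, match_bare=True)
qualified_hits = find_calls(SAMPLE, match_bare=False)
```

With bare-name matching all four module-level calls in SAMPLE are flagged; with qualified matching only subprocess.run remains. The sketch does not resolve import aliases (for example, import subprocess as sp), which a production rule would need to handle.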

Layer 2 Scan Results

Target                     Findings   Delta
T6 openai-agents-python    25         -36%
T7 adk-python              40         -27%
T8 agentscope              10         -50%
T9 crewAI                  155        -31%