v0.16.0 — Competitive Benchmarking & FP Optimization

HeadyZhang released this 20 Feb 22:40

9469382

Highlights

94.6% recall on Agent-Vuln-Bench (19 samples, 3 vulnerability categories)
3.2x recall advantage over Bandit (29.7%), 3.5x over Semgrep (27.0%)
100% Set B coverage — only tool that audits MCP configurations
79% AGENT-034 false positive reduction via qualified name matching
10/10 OWASP Agentic Top 10 categories covered
1142 unit tests passing, zero regressions
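
As a quick sanity check on the recall multipliers quoted above, the following sketch recomputes them from the percentages in the highlights. The dictionary keys and the `recall_ratio` helper are illustrative names, not part of the project.

```python
# Recall percentages exactly as quoted in the highlights.
RECALL = {"this release": 94.6, "Bandit": 29.7, "Semgrep": 27.0}


def recall_ratio(ours: float, theirs: float) -> float:
    """Return how many times higher `ours` is than `theirs`, to one decimal."""
    return round(ours / theirs, 1)


if __name__ == "__main__":
    for tool in ("Bandit", "Semgrep"):
        ratio = recall_ratio(RECALL["this release"], RECALL[tool])
        print(f"{tool}: {ratio}x")
```

Both ratios round to the figures stated in the highlights (3.2x and 3.5x).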

Changes

Bug Fixes — AGENT-034 False Positive Reduction

Qualified name matching — calls such as re.compile(), asyncio.run(), and platform.system() no longer trigger false positives
Framework path suppression — tool_no_input_validation findings are now suppressed for framework-internal code
Extended safe tool patterns — 31 additional safe function patterns
Boundary confidence multiplier — More accurate confidence scoring
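
The idea behind qualified name matching can be sketched as follows. This is a minimal illustration, not the project's actual rule engine: the two name sets and all function names are hypothetical, and the sketch assumes the earlier false positives came from matching only the final attribute of a call (for example `compile`, `run`, `system`).

```python
import ast
from typing import List, Optional

# Hypothetical rule data for illustration only.
RISKY_BARE_NAMES = {"compile", "run", "system"}
RISKY_QUALIFIED_NAMES = {"os.system", "subprocess.run"}


def qualified_name(node: ast.expr) -> Optional[str]:
    """Rebuild a dotted name such as 'os.system' from a call target."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on a subscript or another call's result


def call_names(source: str) -> List[str]:
    """Collect the dotted names of all resolvable calls in `source`."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = qualified_name(node.func)
            if name is not None:
                names.append(name)
    return names


def flag_by_bare_name(source: str) -> List[str]:
    """Naive matching: look only at the last segment of the name."""
    return [n for n in call_names(source)
            if n.rsplit(".", 1)[-1] in RISKY_BARE_NAMES]


def flag_by_qualified_name(source: str) -> List[str]:
    """Stricter matching: the whole dotted name must be listed."""
    return [n for n in call_names(source) if n in RISKY_QUALIFIED_NAMES]


if __name__ == "__main__":
    sample = (
        "import os, re, platform\n"
        "re.compile('x')\n"
        "platform.system()\n"
        "os.system('ls')\n"
    )
    print(sorted(flag_by_bare_name(sample)))
    print(sorted(flag_by_qualified_name(sample)))
```

With the sample input, the bare-name matcher flags all three calls, while the qualified matcher flags only `os.system`. The sketch does not resolve import aliases (`import os as o`) or `from ... import` forms, which a real scanner would need to handle.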

Layer 2 Scan Results

Target	Findings	Delta
T6 openai-agents-python	25	-36%
T7 adk-python	40	-27%
T8 agentscope	10	-50%
T9 crewAI	155	-31%
