Performance refactor: eliminate duplication, zero-copy hot paths, clean dead code by ksek87 · Pull Request #31 · ksek87/fuzzd

ksek87 · 2026-05-14T03:03:32Z

Summary

Eliminated Pattern struct duplication — struct Pattern and the Aho-Corasick scan loop were defined identically in both description.rs and response.rs; extracted to a single Pattern + scan_with_automaton() in fuzzer/mod.rs, consumed via super:: in both scanners
Moved extract_snippet to utils.rs per project guidelines (shared pure utilities belong there)
Zero-copy corpus_refs — changed Finding::corpus_refs from Vec<&'static str> (one heap allocation per finding) to &'static [&'static str] (pointer copy); eliminates an allocation on every scan hot-path
Zero-allocation all_findings() — changed Observer::all_findings() return type from Vec<&Finding> (collects on every call) to impl Iterator<Item = &Finding>; has_blocking_findings() calls .any() directly on the iterator
Removed empty stub modules — deleted src/analyzer/mod.rs and src/reporter/mod.rs (single-line comments serving no purpose) and removed their mod declarations from main.rs
Replaced file-wide #[allow(dead_code)] blanket suppressions with targeted #[allow(dead_code)] on the specific test-helper items in production-facing files; infrastructure modules pending audit CLI wiring keep a module-level suppression with an explanatory comment

Test plan

cargo fmt --check passes
cargo clippy -- -D warnings passes (zero warnings)
cargo test — 120/120 tests pass
No regressions in description scanner, response scanner, or observer tests

https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

…alidation milestones Adds three new planned milestones: - v0.7: Semantic/embedding detection layer to close Message Hijacking and Privacy Leakage gaps - v0.8: Tool output/response analysis to detect server-side poisoning via CallToolResult - v1.0: LLM-in-the-loop live attack validation measuring actual exploit success rate Renumbers downstream milestones accordingly (chain fuzzer → v0.9, reporter → v1.1, protocol fuzzer → v1.2) and adds milestone detail section explaining the rationale for each new stage. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

…, Python SDK, package scanning) Adds five new planned milestones targeting friction points in how teams actually build and adopt agentic tooling: - v0.9: SARIF output + GitHub Security tab integration (moved earlier; multiplies reach) - v0.10: GitHub Action published to Marketplace (one-line CI integration) - v0.11: Package-level scanning (--package @scope/mcp-server, zero friction pre-adoption) - v0.12: Python SDK + LangChain/LlamaIndex/AutoGen/LangGraph framework adapters - v0.13: npx wrapper (no binary install for JS/TS teams) Renumbers chain fuzzer to v0.14 and live validation to v1.0. Adds milestone detail for each new stage. Updates Contributing section to reference current milestone names. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

Notes that v0.12 will use PyO3 + maturin to compile the Rust core as a native Python extension module, with a thin Python adapter for framework conversions. Same CI pipeline feeds binary and wheel releases with no code duplication. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

LLM-in-the-loop is model benchmarking (what MCPTox already does), not a fuzzer feature — it would break determinism and add API cost to CI. Replaced with a fuzzd validate subcommand: a thin reporting layer over the existing DescriptionScanner that accepts labelled fixtures and outputs detection/FP rates. No new detection code; replaces bench/run.sh as a first-class CLI surface. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

…rds, and 5 citations Detection signals (+2, 11→13): - message_hijacking: recipient substitution, BCC injection, proxy number framing (Invariant Labs WhatsApp PoC; Postmark npm real-world incident) - unicode_obfuscation: U+200B/200C/200D/FEFF zero-width character injection to hide instructions from human reviewers (Noma Security; arxiv 2601.17549) Patterns (+16, 86→102) across both new signals plus: - /.env, .cursor/mcp.json added to credential_reference - "provide the contents of", "send to email" added to exfiltration_mechanism Corpus records (+4, 23→27): - TPA-018: MCP-UPD parasitic toolchain .env exfiltration (arxiv 2509.06572) - TPA-019: Trivial Trojans cross-tool SSH key theft (arxiv 2507.19880) - TPA-020: Message hijacking via recipient substitution (Invariant Labs / Postmark) - TPA-021: Unicode zero-width character obfuscation (Noma Security) Citations (+5): MCP-UPD [^9], Trivial Trojans [^10], When MCP Servers Attack [^11], Breaking the Protocol [^12], Noma Security invisible characters [^13] All 103 tests pass. clippy and rustfmt clean. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

…ijacking 46.7%) Re-ran full MCPTox benchmark (485 tools) against current binary. Message Hijacking improved 40.0% → 46.7% from new recipient/BCC patterns. Updated signal distribution table to reflect 102 patterns across 13 signals. Notes semantic detection layer (v0.7) as the path to closing remaining Privacy Leakage and Message Hijacking gaps. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

…tion ResponseScanner (src/fuzzer/response.rs): - New Signal::EmbeddedInstruction for response-phase injection patterns - 20 patterns across 2 signals (EmbeddedInstruction, HtmlInjectionTag): classic prompt injection ("ignore previous instructions"), indirect pre-response injection ("before responding to the user"), cross-tool injection ("you must now call"), model-specific tags (<|system|>, <<sys>>, \nSYSTEM:, \nASSISTANT:), HTML/XML injection tags shared with description scanner - Reuses extract_snippet() and AhoCorasick from existing infrastructure - Only Text content blocks scanned; image/resource content skipped - 10 tests covering all detection categories and dedup invariant Observer (src/runner/observer.rs): - Observer<T> wraps Harness<T>; intercepts call_tool() results - Logs ObservedCall{tool_name, findings} per invocation - all_findings() flattens log; has_blocking_findings() checks ≥ High severity - 5 tests using MockTransport — no real network or child processes - 15 signal variants total (was 14) 120 tests passing. clippy and rustfmt clean. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

…free all_findings - Extract shared Pattern struct and scan_with_automaton() to fuzzer/mod.rs, removing identical definitions from description.rs and response.rs - Move extract_snippet from description.rs to utils.rs per project guidelines - Change Finding::corpus_refs Vec<&'static str> → &'static [&'static str] (one fewer heap allocation per finding on every scan hot-path) - Change Observer::all_findings() to return impl Iterator<Item = &Finding> instead of collecting into a Vec (zero allocation per call) - Simplify has_blocking_findings() to call .any() directly on the iterator - Remove empty analyzer/mod.rs and reporter/mod.rs stub files - Replace file-level #![allow(dead_code)] blanket suppressions with targeted #[allow(dead_code)] on specific test-helper items in production-facing files; infrastructure pending audit CLI wiring keeps module-level suppression with comment All 120 tests pass; cargo clippy -D warnings clean. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

…n_severity lazy - Corpus::embedded() now parses all 27 JSON records exactly once per process using OnceLock<Vec<AttackRecord>>; subsequent calls clone the cached vec instead of re-running serde_json::from_str 27 times (measurable in tests where corpus.embedded() is called 5 times per test run) - by_category() and by_min_severity() now return impl Iterator<Item = &AttackRecord> instead of collecting into Vec — zero allocation per filter call; callers that need a Vec can still .collect(), callers that need count use .count() - Update corpus filter tests to use .count() and .next().is_some() https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk

claude added 8 commits May 14, 2026 03:03

ksek87 changed the title ~~Expand roadmap with semantic detection, response analysis, and live validation~~ Performance refactor: eliminate duplication, zero-copy hot paths, clean dead code May 17, 2026

ksek87 merged commit 0477cc8 into main May 23, 2026
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance refactor: eliminate duplication, zero-copy hot paths, clean dead code#31

Performance refactor: eliminate duplication, zero-copy hot paths, clean dead code#31
ksek87 merged 9 commits into
mainfrom
claude/plan-fuzzd-project-88AMD

ksek87 commented May 14, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ksek87 commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ksek87 commented May 14, 2026 •

edited

Loading