Performance refactor: eliminate duplication, zero-copy hot paths, clean dead code#31
Merged
Merged
Conversation
…alidation milestones Adds three new planned milestones: - v0.7: Semantic/embedding detection layer to close Message Hijacking and Privacy Leakage gaps - v0.8: Tool output/response analysis to detect server-side poisoning via CallToolResult - v1.0: LLM-in-the-loop live attack validation measuring actual exploit success rate Renumbers downstream milestones accordingly (chain fuzzer → v0.9, reporter → v1.1, protocol fuzzer → v1.2) and adds milestone detail section explaining the rationale for each new stage. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…, Python SDK, package scanning) Adds five new planned milestones targeting friction points in how teams actually build and adopt agentic tooling: - v0.9: SARIF output + GitHub Security tab integration (moved earlier; multiplies reach) - v0.10: GitHub Action published to Marketplace (one-line CI integration) - v0.11: Package-level scanning (--package @scope/mcp-server, zero friction pre-adoption) - v0.12: Python SDK + LangChain/LlamaIndex/AutoGen/LangGraph framework adapters - v0.13: npx wrapper (no binary install for JS/TS teams) Renumbers chain fuzzer to v0.14 and live validation to v1.0. Adds milestone detail for each new stage. Updates Contributing section to reference current milestone names. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
Notes that v0.12 will use PyO3 + maturin to compile the Rust core as a native Python extension module, with a thin Python adapter for framework conversions. Same CI pipeline feeds binary and wheel releases with no code duplication. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
LLM-in-the-loop is model benchmarking (what MCPTox already does), not a fuzzer feature — it would break determinism and add API cost to CI. Replaced with a fuzzd validate subcommand: a thin reporting layer over the existing DescriptionScanner that accepts labelled fixtures and outputs detection/FP rates. No new detection code; replaces bench/run.sh as a first-class CLI surface. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…rds, and 5 citations Detection signals (+2, 11→13): - message_hijacking: recipient substitution, BCC injection, proxy number framing (Invariant Labs WhatsApp PoC; Postmark npm real-world incident) - unicode_obfuscation: U+200B/200C/200D/FEFF zero-width character injection to hide instructions from human reviewers (Noma Security; arxiv 2601.17549) Patterns (+16, 86→102) across both new signals plus: - /.env, .cursor/mcp.json added to credential_reference - "provide the contents of", "send to email" added to exfiltration_mechanism Corpus records (+4, 23→27): - TPA-018: MCP-UPD parasitic toolchain .env exfiltration (arxiv 2509.06572) - TPA-019: Trivial Trojans cross-tool SSH key theft (arxiv 2507.19880) - TPA-020: Message hijacking via recipient substitution (Invariant Labs / Postmark) - TPA-021: Unicode zero-width character obfuscation (Noma Security) Citations (+5): MCP-UPD [^9], Trivial Trojans [^10], When MCP Servers Attack [^11], Breaking the Protocol [^12], Noma Security invisible characters [^13] All 103 tests pass. clippy and rustfmt clean. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…ijacking 46.7%) Re-ran full MCPTox benchmark (485 tools) against current binary. Message Hijacking improved 40.0% → 46.7% from new recipient/BCC patterns. Updated signal distribution table to reflect 102 patterns across 13 signals. Notes semantic detection layer (v0.7) as the path to closing remaining Privacy Leakage and Message Hijacking gaps. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…tion
ResponseScanner (src/fuzzer/response.rs):
- New Signal::EmbeddedInstruction for response-phase injection patterns
- 20 patterns across 2 signals (EmbeddedInstruction, HtmlInjectionTag):
classic prompt injection ("ignore previous instructions"),
indirect pre-response injection ("before responding to the user"),
cross-tool injection ("you must now call"),
model-specific tags (<|system|>, <<sys>>, \nSYSTEM:, \nASSISTANT:),
HTML/XML injection tags shared with description scanner
- Reuses extract_snippet() and AhoCorasick from existing infrastructure
- Only Text content blocks scanned; image/resource content skipped
- 10 tests covering all detection categories and dedup invariant
Observer (src/runner/observer.rs):
- Observer<T> wraps Harness<T>; intercepts call_tool() results
- Logs ObservedCall{tool_name, findings} per invocation
- all_findings() flattens log; has_blocking_findings() checks ≥ High severity
- 5 tests using MockTransport — no real network or child processes
- 15 signal variants total (was 14)
120 tests passing. clippy and rustfmt clean.
https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…free all_findings - Extract shared Pattern struct and scan_with_automaton() to fuzzer/mod.rs, removing identical definitions from description.rs and response.rs - Move extract_snippet from description.rs to utils.rs per project guidelines - Change Finding::corpus_refs Vec<&'static str> → &'static [&'static str] (one fewer heap allocation per finding on every scan hot-path) - Change Observer::all_findings() to return impl Iterator<Item = &Finding> instead of collecting into a Vec (zero allocation per call) - Simplify has_blocking_findings() to call .any() directly on the iterator - Remove empty analyzer/mod.rs and reporter/mod.rs stub files - Replace file-level #![allow(dead_code)] blanket suppressions with targeted #[allow(dead_code)] on specific test-helper items in production-facing files; infrastructure pending audit CLI wiring keeps module-level suppression with comment All 120 tests pass; cargo clippy -D warnings clean. https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
…n_severity lazy - Corpus::embedded() now parses all 27 JSON records exactly once per process using OnceLock<Vec<AttackRecord>>; subsequent calls clone the cached vec instead of re-running serde_json::from_str 27 times (measurable in tests where corpus.embedded() is called 5 times per test run) - by_category() and by_min_severity() now return impl Iterator<Item = &AttackRecord> instead of collecting into Vec — zero allocation per filter call; callers that need a Vec can still .collect(), callers that need count use .count() - Update corpus filter tests to use .count() and .next().is_some() https://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
struct Patternand the Aho-Corasick scan loop were defined identically in bothdescription.rsandresponse.rs; extracted to a singlePattern+scan_with_automaton()infuzzer/mod.rs, consumed viasuper::in both scannersextract_snippettoutils.rsper project guidelines (shared pure utilities belong there)corpus_refs— changedFinding::corpus_refsfromVec<&'static str>(one heap allocation per finding) to&'static [&'static str](pointer copy); eliminates an allocation on every scan hot-pathall_findings()— changedObserver::all_findings()return type fromVec<&Finding>(collects on every call) toimpl Iterator<Item = &Finding>;has_blocking_findings()calls.any()directly on the iteratorsrc/analyzer/mod.rsandsrc/reporter/mod.rs(single-line comments serving no purpose) and removed theirmoddeclarations frommain.rs#[allow(dead_code)]blanket suppressions with targeted#[allow(dead_code)]on the specific test-helper items in production-facing files; infrastructure modules pending audit CLI wiring keep a module-level suppression with an explanatory commentTest plan
cargo fmt --checkpassescargo clippy -- -D warningspasses (zero warnings)cargo test— 120/120 tests passhttps://claude.ai/code/session_014T1x8ZiDbJcVvkZBfP91nk