▶ Demo
- Feature deep-dive: the agent running live, with a real on-screen self-correction (plaso unavailable, so it adapts to mft_timeline; fault_injection=0): https://youtu.be/jw6etogNzhY
- 4-minute showcase overview: https://youtu.be/4RQnVden6L8
See it self-correct — the #1 judged capability
A genuine tool failure handled with no human and no injected/staged error (fault_injection=0).
Live in the Claude Code terminal — plaso_parse is unavailable (log2timeline.py not on PATH), so the agent says so and adapts to mft_timeline:
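The adaptation described above can be pictured as an availability check followed by a fallback. The following is a minimal sketch, not the agent's actual logic: the lane table, the function name `pick_timeline_tool`, and the assumption that mft_timeline needs no external binary are all illustrative. Only the tool names and the `log2timeline.py` dependency come from the text.

```python
import shutil
from typing import Callable, Optional

# Hypothetical lane table: (tool name, external binary it needs or None).
# That mft_timeline has no external dependency is an assumption of this sketch.
TIMELINE_LANES = [
    ("plaso_parse", "log2timeline.py"),
    ("mft_timeline", None),
]


def pick_timeline_tool(which: Callable[[str], Optional[str]] = shutil.which):
    """Return (chosen_tool, skipped); skipped records each unavailable tool and why."""
    skipped = []
    for tool, binary in TIMELINE_LANES:
        if binary is None or which(binary) is not None:
            return tool, skipped
        skipped.append({"tool": tool, "reason": f"{binary} not on PATH"})
    return None, skipped


# Simulate a machine where nothing is on PATH.
tool, skipped = pick_timeline_tool(which=lambda name: None)
print(tool, skipped)
```

Returning the skipped tools together with the reason keeps the failure visible to whatever records the run, rather than hiding it behind the fallback.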
Same event in the hash-chained audit log — real failure → course_correction (narrow / continue other lanes) → heartbeat escalation → honest partial verdict:
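For readers unfamiliar with hash chaining, here is a minimal sketch of the general technique. It is not the project's audit.jsonl schema: the field names (`prev_hash`, `payload`, `hash`), the all-zero genesis value, and the use of SHA-256 over canonical JSON are assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first record's prev_hash


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous record's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> dict:
    """Append a payload, linking it to the hash of the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": record_hash(prev_hash, payload),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every link; an edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != record_hash(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_record(chain, {"event": "tool_call", "tool": "plaso_parse", "status": "unavailable"})
append_record(chain, {"event": "course_correction", "next_tool": "mft_timeline"})
print(verify_chain(chain))
```

Because each hash covers the previous one, changing an early record invalidates every record after it. Truncating the tail is not caught by this sketch; detecting that requires storing the final hash somewhere separate.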
What it found — with custody you can verify offline
NIST CFReDS Hacking Case (SCHARDT.dd): SUSPICIOUS, 27 tool-cited findings across 6 artifact classes; every finding traces to its tool call via scripts/trace-finding, and manifest_verify.overall = true.
| Tamper-evident chain of custody | Reconstructed attack timeline |
|---|---|
| (image) | (image) |
Architecture
The whole workflow — every boundary is crossed only through a typed, read-only tool whose output is hash-chained into the custody log. Evidence vault -> SIFT tool subprocesses -> two typed MCP servers -> Claude Code agent loop -> cryptographic custody -> presentation:
The code architecture maps the same pipeline onto the repository:
- Entrypoints: scripts/
- Agent loop: governed by agent-config/
- .mcp.json surface: the product servers findevil-mcp (31 Rust tools) and findevil-agent-mcp (12 Python tools), 43 audit-chained tools in total, plus the n8n / playwright / puppeteer / qmd convenience servers, which never emit findings
- SIFT DFIR subprocess tools
- Read-only evidence vault
- Hash-chained custody chain: audit.jsonl -> manifest_finalize -> manifest_verify
- Outputs: verdict.json, coverage_manifest.json, REPORT.{html,pdf}, and the apps/web SSE dashboard
A documentation-only release. VERDICT's public docs are rewritten to read as a shipped product: a product-first README and reader-facing docs, with the accuracy/anti-overclaim doctrine preserved verbatim. No code, test, CI, or runtime behavior changed since v0.1.4.
Highlights
README — full rewrite to a product-first voice
- Action-first flow: Install & run -> What you get -> See it run -> How it works -> Capabilities -> Accuracy & scope -> Getting started -> Repository layout -> Documentation -> License.
- Removed competition/judge framing and superlatives; collapsed dense caption blocks to one-line captions; trimmed repeated claims to a single statement each.
- One understated origin credit line in the footer.
Reader docs — de-hackathon pass (structure unchanged)
INSTALL.md, QUICKSTART.md, docs/architecture.md, docs/DATASET.md, docs/verdict-semantics.md, docs/false-positives.md, docs/cryptographic-attestation.md, docs/index.md: judge/submission references reframed; SANS SIFT VM kept as the reference forensic environment.
Preserved verbatim
- The accuracy/anti-overclaim doctrine: coverage_manifest language, verdict-word scoping (SUSPICIOUS / INDETERMINATE / NO_EVIL), the CONFIRMED > INFERRED > HYPOTHESIS hierarchy, and the >=2-artifact-class execution rule.
- All images/GIFs and every CI-guarded link/path.
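As a rough illustration of the hierarchy and the class-count rule named above, the sketch below encodes the ordering as an enum and counts distinct artifact classes. The notes do not spell out the rule's exact semantics, so the reading used here (at least two distinct artifact classes must have had a tool actually execute) is an assumption, as are the names `EvidenceLevel`, `meets_class_rule`, and the `artifact_class` / `executed` fields.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Ordering taken from the notes: CONFIRMED > INFERRED > HYPOTHESIS."""
    HYPOTHESIS = 1
    INFERRED = 2
    CONFIRMED = 3


def distinct_artifact_classes(tool_runs) -> int:
    """Count artifact classes with at least one executed tool (assumed fields)."""
    return len({run["artifact_class"] for run in tool_runs if run["executed"]})


def meets_class_rule(tool_runs, minimum: int = 2) -> bool:
    """Assumed reading of the >=2-artifact-class execution rule."""
    return distinct_artifact_classes(tool_runs) >= minimum


runs = [
    {"artifact_class": "MFT", "executed": True},
    {"artifact_class": "timeline", "executed": False},  # tool unavailable
]
print(meets_class_rule(runs))
```

See docs/verdict-semantics.md for the authoritative wording; this block only shows how an ordered scale and a coverage threshold can be checked mechanically.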
Verification
All gates passed on the release commit: L0 Static (incl. docs cross-references intact + tool-count), L1 Unit+Build, L2 SIFT-lite, Amendment A2 invariants, and the aggregate CI Required gate. run-all-smokes.sh: 21 passed / 0 failed.
Install
Prebuilt, checksum-verified findevil-mcp binaries are attached for Linux x86_64/aarch64 and macOS x86_64/aarch64. scripts/install.sh fetches them with FINDEVIL_MCP_PREBUILT=1 FINDEVIL_MCP_VERSION=v0.1.5 (verified against SHA256SUMS); otherwise it builds from source. See INSTALL.md / QUICKSTART.md.
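To check a downloaded binary against SHA256SUMS by hand, something like the following sketch would work. It assumes SHA256SUMS uses the standard `sha256sum` output format (`<hex digest>  <filename>` per line). The helper names are illustrative, and the sketch is not a description of what scripts/install.sh does internally.

```python
import hashlib
from pathlib import Path


def parse_sha256sums(text: str) -> dict:
    """Parse sha256sum-style lines into {filename: lowercase hex digest}."""
    digests = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(maxsplit=1)
        # A leading '*' marks binary mode in sha256sum output; drop it.
        digests[name.lstrip("*")] = digest.lower()
    return digests


def sha256_file(path: Path) -> str:
    """Stream the file in 1 MiB chunks so large binaries are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path: Path, sums_text: str) -> bool:
    """True only if the file is listed in the sums text and its digest matches."""
    expected = parse_sha256sums(sums_text).get(path.name)
    return expected is not None and sha256_file(path) == expected
```

A call would look like `verify_download(Path("<downloaded binary>"), Path("SHA256SUMS").read_text())`, with the actual artifact file name taken from the release page. A file that is missing from the sums list is treated as a failure rather than silently accepted.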
Changelog (since v0.1.4)
- docs: rewrite README + de-hackathon reader docs to release-ready voice (#36)
Full diff: v0.1.4...v0.1.5
Since v0.1.5 — evidence & demo (docs/demo-tooling only; merged to master)
Real, verified scripts/verdict runs committed as compact evidence (each traces clean with scripts/trace-finding; manifest_verify.overall = true):
- NIST CFReDS Hacking Case disk (SCHARDT.dd): SUSPICIOUS, 27 findings across 6 artifact classes (custody, disk/filesystem, MFT, prefetch, registry, timeline); plaso_parse genuinely unavailable → organic course_correction → timeline sealed PARTIAL, run continued. (docs/release-evidence/nist-schardt-disk-*)
- ~18 GB memory image (Volatility): vol_pslist/psscan/psxview/malfind; honest INDETERMINATE, malfind held at HYPOTHESIS. (docs/release-evidence/memory-volatility-summary.json)
- Stage Two evidence map: per-criterion artifact + one-line verify command (docs/release-evidence/stage-two-evidence.md).
- Full accuracy report (false positives / missed artifacts / hallucinated claims / evidence integrity) and an organic self-correction trace (fault_injection=0).
- Feature deep-dive film rebuilt: George / ElevenLabs v3 narration + a real interactive Claude Code TUI self-correction clip (see Demo above).
No runtime, tool, or CI behavior changed; these are documentation, evidence, and demo-tooling additions.





