
VERDICT v0.1.5

Released by github-actions on 15 Jun 04:27 · 15 commits to master since this release · 6ce82b6

▶ Demo

See it self-correct — the #1 judged capability

A genuine tool failure, handled with no human intervention and no injected or staged error (fault_injection=0).

Live in the Claude Code terminal: plaso_parse is unavailable (log2timeline.py not on PATH), so the agent says so and adapts to mft_timeline:

Live Claude Code TUI self-correction

Same event in the hash-chained audit log — real failure → course_correction (narrow / continue other lanes) → heartbeat escalation → honest partial verdict:

Self-correction in the audit trace

What it found — with custody you can verify offline

NIST CFReDS Hacking Case (SCHARDT.dd): SUSPICIOUS, 27 tool-cited findings across 6 artifact classes; every finding traces to its tool call via scripts/trace-finding, and manifest_verify.overall = true.

Tamper-evident chain of custody | Reconstructed attack timeline
Chain of custody | Attack timeline

Architecture

The whole workflow — every boundary is crossed only through a typed, read-only tool whose output is hash-chained into the custody log. Evidence vault -> SIFT tool subprocesses -> two typed MCP servers -> Claude Code agent loop -> cryptographic custody -> presentation:

VERDICT architecture and chain of custody

The code architecture maps the same pipeline onto the repository:

  • Entrypoints: scripts/
  • Agent loop: governed by agent-config/
  • .mcp.json surface: the product servers findevil-mcp (31 Rust tools) and findevil-agent-mcp (12 Python tools), 43 audit-chained tools in total, plus the n8n / playwright / puppeteer / qmd convenience servers, which never emit findings
  • SIFT DFIR subprocess tools
  • Read-only evidence vault
  • Hash-chained custody chain: audit.jsonl -> manifest_finalize -> manifest_verify
  • Outputs: verdict.json, coverage_manifest.json, REPORT.{html,pdf}, and the apps/web SSE dashboard

VERDICT code architecture
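
To make the idea of a hash-chained custody log concrete, here is a minimal Python sketch of the general technique. It is not VERDICT's implementation: the record fields (`prev_hash`, `payload`, `hash`), the all-zero genesis value, and the function names are assumptions chosen for illustration, and the real audit.jsonl schema and the manifest_finalize / manifest_verify logic may differ.

```python
import hashlib
import json

# Assumed starting value for the chain; the real log may seed it differently.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a record whose hash commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    record = {"prev_hash": prev, "payload": payload, "hash": entry_hash(prev, payload)}
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited, removed, or reordered record breaks the chain."""
    prev = GENESIS
    for record in chain:
        if record["prev_hash"] != prev:
            return False
        if record["hash"] != entry_hash(prev, record["payload"]):
            return False
        prev = record["hash"]
    return True


# Hypothetical entries loosely modelled on the self-correction event described above.
chain = []
append_entry(chain, {"tool": "plaso_parse", "status": "unavailable"})
append_entry(chain, {"tool": "mft_timeline", "status": "ok"})
```

Because each record's hash covers the previous record's hash, changing any earlier entry after the fact invalidates every later link, which is what allows the log to be checked offline.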


A documentation-only release. VERDICT's public docs are rewritten to read as a shipped product: a product-first README and reader-facing docs, with the accuracy/anti-overclaim doctrine preserved verbatim. No code, test, CI, or runtime behavior changed since v0.1.4.

Highlights

README — full rewrite to a product-first voice

  • Action-first flow: Install & run -> What you get -> See it run -> How it works -> Capabilities -> Accuracy & scope -> Getting started -> Repository layout -> Documentation -> License.
  • Removed competition/judge framing and superlatives; collapsed dense caption blocks to one-line captions; trimmed repeated claims to a single statement each.
  • One understated origin credit line in the footer.

Reader docs — de-hackathon pass (structure unchanged)

  • INSTALL.md, QUICKSTART.md, docs/architecture.md, docs/DATASET.md, docs/verdict-semantics.md, docs/false-positives.md, docs/cryptographic-attestation.md, docs/index.md: judge/submission references reframed; SANS SIFT VM kept as the reference forensic environment.

Preserved verbatim

  • The accuracy/anti-overclaim doctrine: coverage_manifest language, verdict-word scoping (SUSPICIOUS / INDETERMINATE / NO_EVIL), the CONFIRMED > INFERRED > HYPOTHESIS hierarchy, and the >=2-artifact-class execution rule.
  • All images/GIFs and every CI-guarded link/path.
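
As a rough illustration of how an ordered confidence hierarchy and a minimum-evidence rule can be encoded, here is a small Python sketch. The interpretation is an assumption, not taken from the VERDICT docs: it reads the >=2-artifact-class rule as "an execution claim supported by fewer than two distinct artifact classes is capped at HYPOTHESIS". The names `Confidence` and `cap_execution_confidence` are hypothetical.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Ordered so that CONFIRMED > INFERRED > HYPOTHESIS compares naturally."""
    HYPOTHESIS = 1
    INFERRED = 2
    CONFIRMED = 3


def cap_execution_confidence(claimed: Confidence, artifact_classes: set) -> Confidence:
    """Assumed rule: with fewer than two distinct artifact classes,
    an execution claim cannot rise above HYPOTHESIS."""
    if len(artifact_classes) >= 2:
        return claimed
    return min(claimed, Confidence.HYPOTHESIS)


# A claim backed by a single artifact class is held at HYPOTHESIS.
single = cap_execution_confidence(Confidence.CONFIRMED, {"prefetch"})
# The same claim backed by two classes keeps its claimed level.
double = cap_execution_confidence(Confidence.CONFIRMED, {"prefetch", "registry"})
```

Using an `IntEnum` keeps the ordering explicit in one place, so downstream code can compare levels with ordinary operators instead of string matching.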

Verification

All gates passed on the release commit: L0 Static (incl. docs cross-references intact + tool-count), L1 Unit+Build, L2 SIFT-lite, Amendment A2 invariants, and the aggregate CI Required gate. run-all-smokes.sh: 21 passed / 0 failed.

Install

Prebuilt, checksum-verified findevil-mcp binaries are attached for Linux x86_64/aarch64 and macOS x86_64/aarch64. scripts/install.sh fetches them with FINDEVIL_MCP_PREBUILT=1 FINDEVIL_MCP_VERSION=v0.1.5 (verified against SHA256SUMS); otherwise it builds from source. See INSTALL.md / QUICKSTART.md.
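
For readers who want to see what checksum verification against a SHA256SUMS file involves, the following Python sketch shows the general technique. It is not what scripts/install.sh actually runs: it assumes the common `sha256sum` line format (`<hex digest>  <filename>`), and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_sha256sums(text: str) -> dict:
    """Parse sha256sum-style lines into {filename: digest}."""
    sums = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(maxsplit=1)
        # A leading '*' marks binary mode in sha256sum output; drop it.
        sums[name.lstrip("*")] = digest.lower()
    return sums


def sha256_of(path: Path) -> str:
    """Stream the file in 1 MiB chunks so large binaries need little memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path: Path, sums_text: str) -> bool:
    """True only if the file is listed and its digest matches the listed one."""
    expected = parse_sha256sums(sums_text).get(path.name)
    return expected is not None and sha256_of(path) == expected
```

A file that is missing from the checksum list is treated as a failure rather than skipped, so an unlisted binary is never accepted silently.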

Changelog (since v0.1.4)

  • docs: rewrite README + de-hackathon reader docs to release-ready voice (#36)

Full diff: v0.1.4...v0.1.5


Since v0.1.5 — evidence & demo (docs/demo-tooling only; merged to master)

Real, verified scripts/verdict runs committed as compact evidence (each traces clean with scripts/trace-finding; manifest_verify.overall = true):

  • NIST CFReDS Hacking Case disk (SCHARDT.dd): SUSPICIOUS, 27 findings across 6 artifact classes (custody, disk/filesystem, MFT, prefetch, registry, timeline); plaso_parse genuinely unavailable → organic course_correction → timeline sealed PARTIAL, run continued. (docs/release-evidence/nist-schardt-disk-*)
  • ~18 GB memory image (Volatility): vol_pslist/psscan/psxview/malfind; honest INDETERMINATE, malfind held at HYPOTHESIS. (docs/release-evidence/memory-volatility-summary.json)
  • Stage Two evidence map: per-criterion artifact + one-line verify command (docs/release-evidence/stage-two-evidence.md).
  • Full accuracy report (false positives / missed artifacts / hallucinated claims / evidence integrity) and an organic self-correction trace (fault_injection=0).
  • Feature deep-dive film rebuilt: George / ElevenLabs v3 narration + a real interactive Claude Code TUI self-correction clip (see Demo above).

No runtime, tool, or CI behavior changed; these are documentation, evidence, and demo-tooling additions.