Phase 1

Juwon1405 edited this page Jun 15, 2026 · 17 revisions

Phase 1 — Agentic DFIR (current SANS submission) ⭐

This page is the operator's-eye summary of Phase 1 — what is shipping for SANS FIND EVIL! 2026.

For the full roadmap context (Phases 2 / 3 / 4 too), see Roadmap.


In one sentence

Phase 1 is the agentic DFIR layer — the autonomous reasoning loop that takes a single forensic case end-to-end with an architecturally enforced read-only boundary, an audit chain that survives reboot, and a contradiction handler that cannot be smoothed over.


What ships in Phase 1

The architectural guarantees (cannot be loosened by any future Phase)

  • The 73-tool typed forensic function surface on the MCP wire (48 native pure-Python + 25 SIFT adapters). Anything outside the surface (execute_shell, write_file, mount, eval) raises ToolNotFound regardless of prompt content. Asserted by the bypass suite.
  • SHA-256 chained audit log. Every MCP call is hashed and chained to the previous entry, so tampering with any entry breaks the chain. A 1000-entry chain written by 50 threads × 20 calls each verifies as concurrency-safe (v0.4.1 fix).
  • Path sandbox. _safe_resolve rejects ../, null bytes, absolute escapes, paths >1024 chars.
  • Contradiction enforcement. dart-corr flags contradictions between artifacts as UNRESOLVED. The serializer rejects findings that ignore unresolved contradictions.
  • Audit-id citation. Every finding cites the audit ID of the call that produced it. Serializer rejects findings without one.
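
The hash chaining and locking described in the audit-log bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's dart_audit code: the AuditChainSketch class, its field names, the all-zero genesis hash, and the tool name example_tool are all hypothetical.

```python
import hashlib
import json
import threading

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


class AuditChainSketch:
    """Append-only log in which each entry's hash covers the previous hash."""

    def __init__(self):
        self._entries = []
        self._lock = threading.Lock()  # serializes read-previous-then-append

    @staticmethod
    def _digest(prev_hash, record):
        # Canonical JSON so the same record always hashes to the same value.
        payload = json.dumps(record, sort_keys=True, default=str)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, tool, args):
        # Without the lock, two threads could read the same previous hash
        # and fork the chain.
        with self._lock:
            prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
            record = {"seq": len(self._entries), "tool": tool, "args": args}
            entry = dict(record, prev=prev_hash, hash=self._digest(prev_hash, record))
            self._entries.append(entry)
            return entry["hash"]

    def verify(self):
        # Recompute every hash from the genesis value; any edit breaks a link.
        prev_hash = GENESIS
        for entry in self._entries:
            record = {key: entry[key] for key in ("seq", "tool", "args")}
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, record):
                return False
            prev_hash = entry["hash"]
        return True


def hammer(chain, threads=50, calls=20):
    """Append from many threads at once, mirroring the 50 x 20 check."""
    def worker(thread_id):
        for call_id in range(calls):
            chain.append("example_tool", {"thread": thread_id, "call": call_id})

    pool = [threading.Thread(target=worker, args=(n,)) for n in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
```

Calling hammer(chain) on a fresh chain and then chain.verify() exercises the same shape of test as the 50 × 20 concurrency check. Editing any stored entry afterwards should make verify() return False.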

These five guarantees are the load-bearing architecture for all four phases. Phase 2 / 3 / 4 are extensions, not replacements.
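
The path-sandbox guarantee can be illustrated in the same spirit. The sketch below is not the project's _safe_resolve; the function name safe_resolve_sketch, the SandboxViolation exception, and the evidence_root parameter are assumptions for the example. It assumes POSIX-style paths and only shows one way the four listed rejections (../, null bytes, absolute escapes, paths over 1024 characters) might be enforced.

```python
from pathlib import Path

MAX_PATH_CHARS = 1024  # limit taken from the guarantee text


class SandboxViolation(ValueError):
    """Hypothetical error type raised for any rejected path."""


def safe_resolve_sketch(evidence_root, user_path):
    """Return an absolute path inside evidence_root, or raise SandboxViolation."""
    if len(user_path) > MAX_PATH_CHARS:
        raise SandboxViolation("path longer than 1024 characters")
    if "\x00" in user_path:
        raise SandboxViolation("null byte in path")

    root = Path(evidence_root).resolve()
    # Joining with an absolute user_path discards root entirely, and "../"
    # segments collapse during resolve(); both cases fail the containment
    # check below.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise SandboxViolation("path escapes the evidence root")
    return candidate
```

Checking containment after resolution, rather than pattern-matching on the raw string, means symlinks and mixed ../ sequences are judged by where they actually point.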

The cross-platform forensic surface

| OS | Coverage |
| --- | --- |
| Windows | EVTX, MFT, AmCache, Prefetch, ShimCache, Shellbags, USB history, Registry, Scheduled Tasks, Kerberos, Windows logons |
| Linux (v0.4, 2026-04-30) | auditd, systemd-journal, bash history, /etc/passwd, web access logs, Unix auth logs |
| macOS (v0.4, 2026-04-30) | unified log, launchd plists, bash history |
| Memory + Network | process tree, open sockets, credential signals |

MITRE ATT&CK coverage is broad: scoped detection rules cover 10 of the 12 in-scope enterprise tactics. The two tactics deferred to Phase 2 are TA0011 (Command and Control) and TA0009 (Collection). detect_dns_tunneling (added in v0.6.1) contributes DNS-tunneling C2 indicators (Iodine/dnscat2 signatures plus Shannon-entropy and per-domain volume heuristics), but full C2 coverage needs end-to-end PCAP, and Collection has parsers but no scoped detection rule yet.
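
To make the Shannon-entropy and per-domain volume heuristics concrete, here is a minimal sketch. It is not the detect_dns_tunneling implementation: the function names, the thresholds (3.5 bits per character, 100 queries), and the naive "last two labels" guess at the registered domain are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def flag_suspicious_domains(queries, entropy_threshold=3.5, volume_threshold=100):
    """Group queried names by domain and flag the ones that look tunnel-like.

    queries: iterable of fully qualified domain names.
    Returns {domain: [reasons]} for domains that trip either heuristic.
    Both thresholds are hypothetical defaults, not tuned values.
    """
    stats = {}
    for name in queries:
        labels = name.rstrip(".").lower().split(".")
        domain = ".".join(labels[-2:])    # naive registered-domain guess
        subdomain = "".join(labels[:-2])  # everything left of the domain
        entry = stats.setdefault(domain, {"count": 0, "max_entropy": 0.0})
        entry["count"] += 1
        entry["max_entropy"] = max(entry["max_entropy"], shannon_entropy(subdomain))

    flagged = {}
    for domain, entry in stats.items():
        reasons = []
        if entry["max_entropy"] > entropy_threshold:
            reasons.append("high-entropy subdomain")
        if entry["count"] > volume_threshold:
            reasons.append("high query volume")
        if reasons:
            flagged[domain] = reasons
    return flagged
```

Encoded payloads in subdomain labels tend to use many distinct characters, which pushes entropy up, while a tunnel also generates far more queries per domain than ordinary browsing. A production rule would likely use a public-suffix list instead of the two-label guess.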

Methodology — three playbook versions

| Playbook | Lines | Status |
| --- | --- | --- |
| senior-analyst-v1.yaml | 128 | Quick-demo baseline |
| senior-analyst-v2.yaml | 845 | Methodology baseline (Mandiant + Bianco + Diamond + 25 references) |
| senior-analyst-v3.yaml | 1182 | Default. Industrialization release: adds Palantir ADS + MaGMa UCF + TaHiTI hunt cycle + Bianco HMM. 42 references. |

See dart-playbook for the deep dive.

Empirical validation

Documentation

  • 26-page wiki (the one you are reading)
  • The Memex Bet — frames Agentic-DART in the lineage from Vannevar Bush 1945 → Karpathy 2026 → Agentic-DART
  • About the name — what DART means and why it expands cleanly
  • Threat model — what we defend against and what we explicitly do NOT defend against
  • 4-minute SANS demo video (mock-screencast pre-cut shipped; live screencast in flight per #14)

What remains for Phase 1 (closing 2026-06-15)

| Item | Status | Issue |
| --- | --- | --- |
| Live screencast on SANS SIFT v22.04 | 🟡 In progress | #14 |
| Devpost submission click (T-2 = 2026-06-13) | 🟡 Scheduled | #15 |
| Ali Hadi Memory Forensic Challenge #1 accuracy | 🟡 In progress | #16 |
| NIST CFReDS Hacking Case re-measure (post T1070.006 tightening) | ⏰ TODO | #1, #17 |
| Digital Corpora M57 Patents accuracy | ⏰ TODO | #18 |

After 2026-06-15, Phase 1 is closed. Bug fixes only on main. Architectural changes go to a Phase 2 branch.


What Phase 1 explicitly does NOT do

These are intentional omissions, deferred by design — Phase 1 ships a tight, defensible architecture rather than a sprawling feature surface.

| Capability | Phase | Why deferred |
| --- | --- | --- |
| Live response (kill / quarantine / block) | Phase 3 | Read-only Phase 1 cannot grow response without breaking the architectural guarantee. Response gets a separate armed MCP server with a different audit chain and human-in-the-loop confirmation. |
| Sigma rule synthesis from observed evidence | Phase 2 | The dart-synth package is scoped but unimplemented. Tracked in #10. |
| Cloud DFIR (CloudTrail / GuardDuty) | Phase 2 | analyze_aws_cloudtrail is scoped. Tracked in #11. |
| Volatility-style memory plugin coverage | Phase 2 | Memory is currently used for process-tree + sockets only. Full memory forensics is a separate engineering project. |
| Auto-execute YAML playbooks (no Python phase scaffold) | Phase 2 | YAML is read by the agent today; execution still goes through hardcoded Python phases. Auto-execution tracked in #34. |
| Enterprise multi-host orchestration | Phase 3 | Phase 1 is single-host offline. Multi-host is a Phase 3 dart-responder concern. |

Versions shipped during Phase 1

| Date | Version | Highlight |
| --- | --- | --- |
| 2026-04-28 | v0.3 | Initial 31-function MCP surface, 17 tests passing |
| 2026-04-29 | v0.3.1 | dart-corr correlation engine GA |
| 2026-04-30 | v0.4 | Linux + macOS expansion → 35 native functions, 20 tests passing |
| 2026-05-02 | v0.5 | SIFT Workstation tool adapter layer → 60 functions (35 native + 25 SIFT), 22 tests passing |
| 2026-04-30 | Playbook v2 | 845-line methodology release |
| 2026-05-01 | Playbook v3 | Industrialization release: Palantir ADS + MaGMa + TaHiTI + HMM |
| 2026-05-01 | Playbook v3 patch | Yamato Security external references added to v3 (no separate v3.1 file; refs merged into senior-analyst-v3.yaml) |
| 2026-04-30 | v0.4.1 | Audit chain race condition fix (threading.Lock()) |
| 2026-05-03 | v0.5.1 | Evergreen visuals + full-surface QA pass (counts removed from PNG identity) |
| 2026-05-03 | v0.5.2 | Defensive runtime guards + 3 regression tests → 31 tests passing. dart_audit JSON default=str consistency, dart_agent._report() early-exit guard, correlate_timeline SQL-injection hardening |
| 2026-05-08 | v0.5.4 | First external benchmark: NIST CFReDS Hacking Case integrated as case-08. parse_registry_hive shipped, recall 0.10/0.40 → 0.50/0.80 (strict/lenient) on 10 sampled findings |
| 2026-05-10 | v0.6.0 | Supply-chain IOC sweep functions (litellm PyPI 2026-03, npm typosquat, preinstall hook abuse, credential file access) + agentic-dart-collector-adapter |
| 2026-05-14 | v0.6.1 | macOS QuarantineV2 (T1204 download provenance) + Linux cron enumeration with attacker-pattern flagging (T1053.003) + DNS tunneling detection (T1071.004/T1568.002/T1572); opens TA0011 Command and Control |
| 2026-05-16 | v0.7.0 | case-11 supply-chain → ADCS ESC8 → DCSync → Golden Ticket (12 findings) + every canonical evidence_root file enriched to native forensic-tool dump fidelity (EvtxECmd / Zeek conn.log / MFTECmd / PECmd / SBECmd / RECmd / Hindsight / systemd-journald / auditd / FSEventsParser / log show). 11 cases / 99 findings total |
| 2026-05-16 | v0.7.1 | Linux DFIR triplet (parse_linux_text_log + parse_linux_shell_history; parse_linux_cron_jobs already in v0.6.1). case-09 ground-truth function-name reconciliation. 47 native + 25 SIFT = 72 typed MCP tools, then-current suite passing, 32 of 36 expected functions implemented (89% coverage) |
| 2026-06-10 | v1.0.x | Schema-validated MCP calls, Plaso derived-storage isolation via DART_DERIVED_ROOT, measured case-01 benchmark rows only, and current full-suite QA. |

Where to go next

If you are reading this Phase 1 page and want to understand:
