Release v1.1.0 — Stable release (SANS FIND EVIL! 2026) · Juwon1405/agentic-dart

Agentic-DART is an autonomous DFIR agent on the SANS SIFT Workstation. It runs a senior-analyst reasoning loop over a custom MCP server of 73 typed, read-only forensic tools (48 native pure-Python + 25 SIFT adapters) and produces a courtroom-traceable report. Evidence integrity is enforced by the shape of the system — destructive operations (execute_shell, write_file, mount) simply do not exist on the wire — not by asking the model to behave.
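
To make the "enforced by shape" idea concrete, here is a minimal sketch of a registry that can only dispatch tools that were explicitly registered. It is not the project's actual server code: `ReadOnlyRegistry`, `Tool`, and `list_directory` are hypothetical names chosen for illustration. The point is that a destructive name such as `execute_shell` is rejected because it was never registered, not because a policy check blocks it.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Tool:
    """A named, read-only operation (hypothetical shape, for illustration)."""
    name: str
    handler: Callable[..., dict]


class ReadOnlyRegistry:
    """Dispatches only registered tools; anything else is unreachable."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, name: str, handler: Callable[..., dict]) -> None:
        self._tools[name] = Tool(name, handler)

    def names(self) -> List[str]:
        return sorted(self._tools)

    def call(self, name: str, **kwargs) -> dict:
        tool = self._tools.get(name)
        if tool is None:
            # No deny-list is consulted: the name simply has no handler.
            raise LookupError(f"unknown tool: {name}")
        return tool.handler(**kwargs)


def list_directory(path: str) -> dict:
    """Hypothetical read-only tool: lists entries without modifying anything."""
    return {"path": path, "entries": sorted(os.listdir(path))}


registry = ReadOnlyRegistry()
registry.register("list_directory", list_directory)
```

With this shape, `registry.call("execute_shell", cmd="rm -rf /")` raises `LookupError` regardless of what the model asks for, since no such handler exists to be called.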

This is the first genuinely stable release, verified end-to-end from a clean clone.

Why 1.1.0 supersedes everything before it

Earlier tags that claimed "stable" did not actually run clean in a fresh environment. 1.1.0 is the result of a full correctness pass — install, benchmark, scoring, external disk-image handling, and the test suite all fixed and re-verified. The prior 1.0.2 "stable" tag has been removed to avoid confusion.

Tests are green from anywhere. 156 tests pass. The Phase-2 placeholder suite is now explicitly skipped (not failed) wherever it's collected, so pytest is clean whether you run it from the repo root or any subdirectory.
No version is pinned in docs or tests. The release number lives in pyproject.toml only; READMEs, the wiki, the site, and the version test were all genericized, so a future bump touches one file.
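
One way a version test can stay generic is to read the number from pyproject.toml instead of hard-coding it. The sketch below assumes the version is declared under the standard `[project]` table; the function name `read_project_version` and the sample text are illustrative, not taken from the repository.

```python
try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    # Python 3.10: assumes the third-party "tomli" backport is installed.
    import tomli as tomllib


def read_project_version(pyproject_text: str) -> str:
    """Return the version declared under [project] in pyproject.toml text."""
    data = tomllib.loads(pyproject_text)
    return data["project"]["version"]


# Illustrative content only; the real file has more fields.
SAMPLE = '''
[project]
name = "agentic-dart"
version = "1.1.0"
'''

print(read_project_version(SAMPLE))
```

A test written this way compares the package's reported version against the parsed value, so a future bump does not require editing the test.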

Highlights

73 typed read-only MCP tools — 48 native forensic functions + 25 SIFT adapters (Volatility 3, MFTECmd, EvtxECmd, PECmd, RECmd, AmcacheParser, YARA, Plaso), plus a versioned Sigma detection-rule matcher.
11 case studies / 99 ground-truth findings across two tiers: 8 internal self-evaluation cases (ready evidence) + 3 external full-disk public images (NIST CFReDS Hacking Case, Ali Hadi DFIR Challenge #1, Digital Corpora M57).
External is a first-class tier. Full-disk images are adapted via ewfmount + mmls + tsk_recover (partition-offset aware) into an evidence tree, then analyzed. Run the tiers as separate processes — scripts.eval.demo / scripts.eval.self / scripts.eval.external — each independently debuggable; an append-only docs/benchmarks/HISTORY.md records every self/external run.
Linux-only host, hardened installer — refuses to run under sudo, stages the full toolchain, verifies it, and offers to fetch the external images at the end.
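
The partition-offset step in the external tier can be sketched as follows. This is not the project's adapter code: it only illustrates parsing `mmls` output for partition start sectors and building a `tsk_recover` command that passes the offset with `-o`. The helper names, the sample partition table, and the `/mnt/ewf/ewf1` path (the raw device that `ewfmount` exposes under its mount point) are assumptions for illustration.

```python
import re
from typing import List, NamedTuple


class Partition(NamedTuple):
    slot: str
    start_sector: int
    length: int
    description: str


# Matches mmls data rows such as:
# 002:  000:000   0000000063   0009510479   0009510417   NTFS / exFAT (0x07)
_ROW = re.compile(r"^\d+:\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*)$")


def parse_mmls(output: str) -> List[Partition]:
    """Extract allocated partitions, skipping metadata and unallocated rows."""
    partitions = []
    for line in output.splitlines():
        match = _ROW.match(line.strip())
        if not match:
            continue  # header or blank line
        slot, start, _end, length, description = match.groups()
        if slot == "Meta" or slot.startswith("-"):
            continue  # partition table entry or unallocated space
        partitions.append(
            Partition(slot, int(start), int(length), description.strip())
        )
    return partitions


def tsk_recover_argv(image: str, part: Partition, out_dir: str) -> List[str]:
    """Build a tsk_recover command: -e recovers all files, -o is the sector offset."""
    return ["tsk_recover", "-e", "-o", str(part.start_sector), image, out_dir]


SAMPLE_MMLS = """DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000000062   0000000063   Unallocated
002:  000:000   0000000063   0009510479   0009510417   NTFS / exFAT (0x07)
"""

parts = parse_mmls(SAMPLE_MMLS)
commands = [tsk_recover_argv("/mnt/ewf/ewf1", p, f"evidence/p{i}") for i, p in enumerate(parts)]
```

Building the argument list separately from running it keeps the offset logic testable without a disk image present.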

Requirements / dependencies

Host OS — Linux only. Verified on the SANS SIFT Workstation (Ubuntu 22.04); RHEL / Rocky / AlmaLinux 8+ and Fedora work via dnf/yum. macOS and Windows are not supported as the host — the Plaso / libyal toolchain does not build cleanly there. Default shell is bash.

Requirement	Version	Verified
Python	3.10+ (CI: 3.10 – 3.13)	3.10, 3.12
OS	Ubuntu 22.04 (SANS SIFT) primary; RHEL/Rocky/Alma 8+, Fedora	SIFT
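
A quick preflight check for these two requirements might look like the sketch below. It is an illustration only, not part of scripts/install.sh; `host_problems` is a hypothetical helper name.

```python
import platform
import sys
from typing import List, Sequence


def host_problems(system: str, version: Sequence[int]) -> List[str]:
    """Return human-readable reasons the host does not meet the requirements."""
    problems = []
    if system != "Linux":
        problems.append(f"unsupported host OS: {system} (Linux only)")
    if tuple(version[:2]) < (3, 10):
        found = ".".join(str(n) for n in version[:2])
        problems.append(f"Python {found} is too old (3.10+ required)")
    return problems


# Check the machine this script runs on.
for problem in host_problems(platform.system(), sys.version_info):
    print(problem)
```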

Python libraries (lower bounds; installed by scripts/install.sh):

Library	Minimum	Role
`anthropic`	≥ 0.40	Claude API client (live mode)
`mcp`	≥ 1.0	MCP client/server transport
`duckdb`	≥ 1.5.3, < 2.0	in-memory correlation store
`python-registry`	≥ 1.3	Windows registry hive parsing
`PyYAML`	≥ 6.0	playbook / Sigma rule loading
`requests`	≥ 2.25	dataset download (benchmarks)
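
To confirm that an environment satisfies these bounds, a check along the following lines could be used. The minimums are copied from the table above; the comparison is a deliberately simple numeric one (it ignores pre-release and local version suffixes, so it is not a full PEP 440 implementation), and the helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Dict, Optional, Tuple

MINIMUMS = {
    "anthropic": "0.40",
    "mcp": "1.0",
    "duckdb": "1.5.3",
    "python-registry": "1.3",
    "PyYAML": "6.0",
    "requests": "2.25",
}
UPPER_BOUNDS = {"duckdb": "2.0"}  # exclusive upper bound


def as_tuple(version: str) -> Tuple[int, ...]:
    """Convert '1.5.3' to (1, 5, 3); stops at the first non-numeric piece."""
    numbers = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        numbers.append(int(match.group()))
    while len(numbers) < 3:
        numbers.append(0)  # pad so '1' and '1.0' compare equal
    return tuple(numbers)


def in_range(installed: str, minimum: str, below: Optional[str] = None) -> bool:
    """True if minimum <= installed, and installed < below when given."""
    value = as_tuple(installed)
    if value < as_tuple(minimum):
        return False
    return below is None or value < as_tuple(below)


def check_installed() -> Dict[str, str]:
    """Report each library as missing, ok, or out of range."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = in_range(installed, minimum, UPPER_BOUNDS.get(name))
        report[name] = f"{installed} ({'ok' if ok else 'out of range'})"
    return report
```

Calling `check_installed()` in the project's virtual environment returns a dictionary keyed by library name, which is convenient to print when diagnosing a failed install.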

External forensic tools (staged by the installer; SIFT ships most): sleuthkit (mmls, tsk_recover), ewfmount (ewf-tools / libewf), Volatility 3, Plaso (log2timeline.py, psort.py), EZ Tools, YARA, Velociraptor.
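
Whether the staged binaries are actually on `PATH` can be checked with a few lines of Python. The list below contains only the executable names given above; names for the remaining tools vary by install method and are left out rather than guessed. `missing_tools` is a hypothetical helper.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names taken from the list above.
REQUIRED = ["mmls", "tsk_recover", "ewfmount", "log2timeline.py", "psort.py"]


def missing_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names that cannot be resolved on PATH."""
    return [name for name in names if which(name) is None]


for name in missing_tools(REQUIRED):
    print(f"not found on PATH: {name}")
```

The `which` parameter exists so the lookup can be replaced in tests without touching the real environment.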

Install

git clone https://github.com/Juwon1405/agentic-dart.git
cd agentic-dart
bash scripts/install.sh          # Linux only; refuses sudo. Offers to fetch external images (~13 GB).
export ANTHROPIC_API_KEY='sk-ant-...'
python3 -m scripts.eval.demo                                           # deterministic, no key
python3 -m scripts.eval.self     --models claude-haiku-4-5-20251001   # 8 bundled cases
python3 -m scripts.eval.external --models claude-haiku-4-5-20251001   # public disk images
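
The same three runs can be driven from Python as separate processes, which keeps each tier independently debuggable. This is a sketch under the assumption that the module names and the `--models` flag behave as in the commands above; `tier_argv` and `run_tier` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
from typing import List, Optional

TIERS = ("demo", "self", "external")


def tier_argv(
    tier: str,
    model: Optional[str] = None,
    python: str = sys.executable,
) -> List[str]:
    """Build the command line for one evaluation tier."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier}")
    argv = [python, "-m", f"scripts.eval.{tier}"]
    if model is not None:
        argv += ["--models", model]
    return argv


def run_tier(tier: str, model: Optional[str] = None) -> int:
    """Run one tier in its own process and return its exit code."""
    return subprocess.run(tier_argv(tier, model), check=False).returncode
```

For example, `run_tier("demo")` needs no API key, while `run_tier("self", "claude-haiku-4-5-20251001")` expects `ANTHROPIC_API_KEY` to be set in the environment and must be started from the repository root so that `scripts.eval` is importable.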

License: MIT. SANS FIND EVIL! 2026 submission.
