Operator guide

Juwon1405 edited this page Jun 14, 2026 · 10 revisions

This page is for the DFIR engineer who wants to run dart-agent on a real case folder, not the hackathon demo. If you just want to see the agent work, the README's Quick start is faster.


Prerequisites

| | Minimum | Recommended |
|---|---|---|
| OS | Linux or macOS, Python 3.10+ | SANS SIFT Workstation v22.04 |
| RAM | 4 GB | 16 GB (correlation against multi-million-row MFT timelines) |
| Disk | 5 GB free for evidence + audit | SSD, 50 GB+ |
| Network | None for deterministic mode | Outbound HTTPS for live mode (Claude API) |

The deterministic demo path does not require Anthropic API access. Live mode requires the ANTHROPIC_API_KEY environment variable to be set.


One-time setup

git clone https://github.com/Juwon1405/agentic-dart.git
cd agentic-dart
bash scripts/install.sh

install.sh installs dart_audit, dart_mcp, dart_corr, and dart_agent in editable mode into the current interpreter — it does not create or require a virtualenv. If you want isolation on a shared SIFT VM (optional), activate one first and run the installer from it:

python3 -m venv .venv
source .venv/bin/activate
bash scripts/install.sh          # installs into the activated venv

Verify the install:

python3 -c "from dart_mcp import list_tools; print(len(list_tools()))"
# Should print: 72

Run the test suite:

export PYTHONPATH="$PWD/dart_audit/src:$PWD/dart_mcp/src:$PWD/dart_agent/src:$PWD/dart_corr/src"
python3 -m pytest tests/ dart_corr/tests/

Mounting your case as read-only

This is the most important step in the operator workflow. The agent does not enforce read-only access itself; it trusts the OS-level mount to do so.

From an E01 / disk image

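# An E01 cannot be loop-mounted directly: expose it as a raw image first
# (for example with ewfmount, which presents it as <mountpoint>/ewf1).
sudo mkdir -p /mnt/case-evidence
sudo mount -o ro,loop /path/to/case.dd /mnt/case-evidence
export DART_EVIDENCE_ROOT=/mnt/case-evidence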

From an extracted directory

sudo mkdir -p /mnt/case-evidence
sudo mount --bind -o ro /path/to/extracted /mnt/case-evidence
export DART_EVIDENCE_ROOT=/mnt/case-evidence

From an artifact-collector ZIP

If you collected with yushin-mac-artifact-collector or a similar triage tool, extract first, then bind-mount as above.

Verifying the mount

mount | grep case-evidence
# Should show 'ro' in the options

touch /mnt/case-evidence/test 2>&1
# Should fail with: "Read-only file system"

If the touch succeeds, stop. The mount is not read-only and the architectural guarantee does not hold for this run.
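
The same check can be made without attempting a write at all. The following is a minimal sketch, not part of dart-agent: the helper name is_read_only_mount is hypothetical, and it assumes a Unix host where os.statvfs reports mount flags. It complements the touch test rather than replacing it.

```python
import os
import tempfile


def is_read_only_mount(path: str) -> bool:
    """Return True if the filesystem containing `path` is mounted read-only."""
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)


# Falls back to the temp directory so the sketch runs without a mounted case.
root = os.environ.get("DART_EVIDENCE_ROOT", tempfile.gettempdir())
print(f"{root}: read-only = {is_read_only_mount(root)}")
```

If this prints read-only = False for your evidence root, treat it the same way as a successful touch and stop.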


Running the agent

Deterministic mode (no API key)

# Evidence root is set via env var (not a CLI flag)
export DART_EVIDENCE_ROOT=/mnt/case-evidence

python3 -m dart_agent --case CASE-2026-001 \
                     --out ./out/case-2026-001 \
                     --max-iterations 25

The deterministic mode runs the senior-analyst loop using the bundled playbook and does not call any external service. It is suitable for CI, repeatability checks, and for environments where network egress is forbidden.

Live mode (real Claude API)

Live mode connects an actual LLM to the typed MCP surface. The MCP boundary still applies — the model can only call functions that exist on the surface.

export ANTHROPIC_API_KEY=sk-ant-...
claude mcp add agentic-dart -s user -- python3 -m dart_mcp.server_stdio

Then in your Claude Code session:

/mcp call agentic-dart get_amcache
/mcp call agentic-dart parse_prefetch --target chrome.exe

See docs/live-mode.md for the full integration including how to run the agent loop on top of live MCP rather than the deterministic stub.


Reading the output

A completed run produces three artifacts:

reports/<case>.md           Final hypothesis, MITRE chain, citations
audit/<case>.jsonl          SHA-256 chained step-by-step trace
dart-corr/<case>.duckdb     Correlation database for post-hoc queries

Verifying the audit chain

python3 -m dart_audit verify audit/<case>.jsonl

This re-hashes every entry and checks the chain. Tampering with any entry — by the agent, by the operator, by anyone — will fail verification.
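
To see why editing any single entry breaks verification, here is a toy hash chain. It is only an illustration of the general technique: the field names (prev, payload, hash), the all-zero genesis value, and the canonical JSON encoding are assumptions made for this sketch, not the on-disk format that dart_audit writes.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list[dict], payload: dict) -> None:
    """Add an entry whose hash covers both its payload and the previous hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(chain: list[dict]) -> bool:
    """Recompute every hash in order; any mismatch means the chain was altered."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


chain: list[dict] = []
append(chain, {"step": 1, "tool": "get_amcache"})
append(chain, {"step": 2, "tool": "parse_prefetch"})
intact = verify(chain)

chain[0]["payload"]["tool"] = "edited"  # simulate tampering with an early entry
tampered = verify(chain)
print(intact, tampered)
```

Because each hash covers the previous one, an attacker would have to recompute every later entry to hide a change, which is why a log written by anything other than the original writer fails verification.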

Tracing a finding back to evidence

python3 -m dart_audit trace audit/<case>.jsonl F-013

F-013 is the finding ID from reports/<case>.md. The trace command walks the audit chain backward from the finding to the underlying MCP calls, which include the file path and byte offset of the source artifact.
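
Conceptually, the backward walk is a graph traversal from the finding to the tool calls it depends on. The sketch below uses an invented in-memory schema (the kind and parents keys, and the S-7, C-2, and C-5 identifiers, are all hypothetical) purely to illustrate the idea; it does not reflect how dart_audit stores or links entries.

```python
def trace(entries: dict[str, dict], finding_id: str) -> list[dict]:
    """Collect the tool-call entries reachable backward from a finding."""
    calls: list[dict] = []
    seen: set[str] = set()
    stack = [finding_id]
    while stack:
        entry_id = stack.pop()
        if entry_id in seen:
            continue  # an entry can support several later steps; visit it once
        seen.add(entry_id)
        entry = entries[entry_id]
        if entry["kind"] == "mcp_call":
            calls.append(entry)
        stack.extend(entry.get("parents", []))
    return calls


# Hypothetical audit entries: a finding, one reasoning step, two tool calls.
entries = {
    "F-013": {"kind": "finding", "parents": ["S-7"]},
    "S-7": {"kind": "reasoning", "parents": ["C-2", "C-5"]},
    "C-2": {"kind": "mcp_call", "tool": "get_amcache", "offset": 0},
    "C-5": {"kind": "mcp_call", "tool": "parse_prefetch", "offset": 4096},
}
sources = trace(entries, "F-013")
print(sorted(call["tool"] for call in sources))
```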

Querying the correlation DB

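import duckdb
# Open read-only so post-hoc queries cannot alter the correlation database
con = duckdb.connect("dart-corr/<case>.duckdb", read_only=True)
rows = con.execute("SELECT * FROM unresolved_contradictions").fetchall()
print(rows)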

UNRESOLVED rows are the reasoning forks the agent had to handle. Reviewing them is the fastest way to gauge whether the agent's final verdict is sound.


Common operational issues

| Symptom | Likely cause | Fix |
|---|---|---|
| ToolNotFound: 'parse_X' | Function not registered (typo or __init__ bug) | python3 -c "from dart_mcp import list_tools; print([t['name'] for t in list_tools()])" |
| EvidenceRootEscape exception | Path arg tried to leave the evidence tree | Check the offending tool call's input — likely a .. or absolute path |
| Audit chain verify fails | The audit log was edited (or written by a non-dart_audit writer) | Re-run; do not edit audit.jsonl by hand |
| Agent loops at max-iterations without convergence | Hypothesis is too underspecified for the typed tools | Increase --max-iterations, or add a more specific case header in the playbook |
| dart-corr returns no contradictions on a known dirty case | Time-proximity threshold too tight | Tune dart_corr/correlation-rules.yaml |
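
For the EvidenceRootEscape row, it helps to see what a containment check typically looks like. This is a generic sketch of the technique, assuming Python 3.9+ for Path.is_relative_to; the function resolve_inside and the exception class EvidenceRootEscapeSketch are stand-ins defined here, not the code or exception that dart_mcp actually uses.

```python
import tempfile
from pathlib import Path


class EvidenceRootEscapeSketch(Exception):
    """Local stand-in for illustration; not the exception dart_mcp raises."""


def resolve_inside(root: str, requested: str) -> Path:
    """Resolve `requested` under `root`, rejecting anything that lands outside it."""
    root_path = Path(root).resolve()
    # Joining an absolute path discards the root, and resolve() collapses "..",
    # so both escape routes show up as a path outside root_path.
    candidate = (root_path / requested).resolve()
    if not candidate.is_relative_to(root_path):
        raise EvidenceRootEscapeSketch(f"{requested!r} escapes {root_path}")
    return candidate


root = tempfile.mkdtemp()  # stands in for the evidence root
ok = resolve_inside(root, "Windows/Prefetch/example.pf")

blocked = []
for bad in ("../outside", "/etc/passwd"):
    try:
        resolve_inside(root, bad)
    except EvidenceRootEscapeSketch:
        blocked.append(bad)
print(ok.name, blocked)
```

Both rejected inputs match the causes named in the table: a .. component and an absolute path.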

Performance notes from the field

  • A SIFT VM with 8 GB RAM completes the bundled IP-KVM case in ~14 seconds in deterministic mode and ~90 seconds in live mode with Claude 3.7 Sonnet.
  • Large MFT correlations (5M+ rows) finish in 3-6 seconds with DuckDB if the host has an SSD. On an HDD, expect roughly 10x longer.
  • Memory overhead is dominated by the parsed MFT held as a DataFrame: ~600 MB for a 5M-row MFT. Free it immediately after dart-corr ingests it (del df; gc.collect()).

What the agent will not do for you

  • It will not modify, quarantine, or block anything. Phase 3 (agentic SOC) will introduce supervised response, but the current surface is read-only.
  • It will not contact external services in deterministic mode: no TI lookups, no IP-WHOIS, no VirusTotal queries. If you want enrichment, feed audit/<case>.jsonl to your own enrichment tooling.
  • It will not give you a confident verdict on out-of-corpus cases. The accuracy report is calibrated against three datasets. Anything outside those is reported with whatever the loop converges on, which may be wrong. Treat low-confidence verdicts as low-confidence.
