Live mode
dart-agent ships in two modes: deterministic (scripted policy, no API key needed) and live (real Claude API connected to dart-mcp over JSON-RPC stdio). This page documents live mode end-to-end.
| | Deterministic | Live |
|---|---|---|
| LLM | None (scripted policy) | Claude (default claude-haiku-4-5) |
| API key required | No | ANTHROPIC_API_KEY |
| Use case | CI, reproducibility, air-gapped runs | Real DFIR work, judgment-heavy cases |
| Network egress | None | Outbound HTTPS to api.anthropic.com |
| Architectural guarantees | Same | Same |
The architectural guarantees (read-only MCP boundary, audit chain, contradiction enforcement) are identical across modes. The only difference is who picks the next call: the YAML playbook policy, or Claude.
git clone https://github.com/Juwon1405/agentic-dart.git
cd agentic-dart
bash scripts/install.sh
export ANTHROPIC_API_KEY="sk-ant-..."
export DART_EVIDENCE_ROOT=/mnt/case-evidence
To register dart-mcp with Claude Code (so you can run it interactively):
claude mcp add agentic-dart -s user -- python3 -m dart_mcp.server
Then in your Claude Code session:
/mcp call agentic-dart get_amcache --hive_path AmCache.hve
/mcp call agentic-dart parse_prefetch --target chrome.exe
# Evidence root is set via env var (not a CLI flag)
export DART_EVIDENCE_ROOT=/mnt/case-evidence
python3 -m dart_agent \
--case CASE-2026-001 \
--out ./out/case-2026-001 \
--mode live \
--max-iterations 25
(Add --dry-run to use a scripted mock Claude with no API key, which is useful for CI.)
The agent:
- Spawns `dart-mcp` as a subprocess with stdio piped.
- Performs the JSON-RPC `initialize` handshake.
- Calls `tools/list`. Claude sees exactly 72 typed forensic functions (47 native + 25 SIFT adapters), nothing more.
- Loops:
  - Sends the current state + hypothesis to Claude as a `messages.create` request with `tools=[...]` set to the advertised surface.
  - Claude returns a `tool_use` block selecting one tool + arguments.
  - The agent forwards the call to `dart-mcp` over stdio.
  - The output goes into the audit chain and back to Claude as a `tool_result` message.
  - `dart-corr` runs on the new state. Contradictions force hypothesis revision.
- Stops at confidence ≥ 0.90, max iterations, or when Claude emits no further `tool_use`.
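The control flow above can be sketched in a few lines. This is a minimal illustration, not the dart-agent implementation: `ToolCall`, `LoopState`, `run_loop`, and the four injected callables are hypothetical names, and the stubs in `demo()` stand in for Claude, dart-mcp, the audit chain, and dart-corr.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    arguments: dict


@dataclass
class LoopState:
    confidence: float = 0.0
    history: list = field(default_factory=list)  # (ToolCall, result) pairs


def run_loop(pick_next, call_tool, audit, correlate,
             max_iterations=25, confidence_threshold=0.90):
    """Drive the select -> call -> audit -> correlate cycle until a stop condition."""
    state = LoopState()
    for _ in range(max_iterations):
        call = pick_next(state)      # stand-in for Claude's tool_use block
        if call is None:
            return state, "no_tool_use"
        result = call_tool(call)     # stand-in for the dart-mcp stdio round trip
        audit(call, result)          # logged before the result is consumed
        state.history.append((call, result))
        correlate(state)             # stand-in for dart-corr; may update confidence
        if state.confidence >= confidence_threshold:
            return state, "confidence"
    return state, "max_iterations"


def demo():
    """Run the loop with a two-step scripted picker and in-memory stubs."""
    script = [
        ToolCall("get_amcache", {"hive_path": "AmCache.hve"}),
        ToolCall("parse_prefetch", {"target": "chrome.exe"}),
    ]
    audit_log = []

    def pick_next(state):
        step = len(state.history)
        return script[step] if step < len(script) else None

    def call_tool(call):
        return {"tool": call.name, "rows": []}

    def audit(call, result):
        audit_log.append(call.name)

    def correlate(state):
        # Toy confidence model: each completed call adds 0.5.
        state.confidence = 0.5 * len(state.history)

    state, reason = run_loop(pick_next, call_tool, audit, correlate)
    return state, reason, audit_log
```

Injecting the picker as a callable mirrors the split between modes: a scripted policy and a Claude-backed picker can share the same loop, which is why the stop conditions and the audit step do not depend on who chooses the next call.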
Can:
- Choose any of the typed MCP functions on the surface (native + SIFT adapters)
- Pass any schema-valid arguments
- Reason about the output and pick the next call
Cannot:
- Call functions not on the surface: this raises `ToolNotFound` at the wire boundary, not at the agent
- Modify evidence: no function on the surface can write
- Bypass the audit log: the agent runs `audit.log()` after every result, before the result is consumed
- Ignore `UNRESOLVED` contradictions: `dart-corr` runs after every step and the serializer refuses to emit findings while contradictions are open
This is the architectural guarantee made concrete: a fully jailbroken model is still bounded by the surface.
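To illustrate what rejecting a call "at the wire boundary" could look like, here is a minimal sketch of a server-side dispatcher for JSON-RPC 2.0 `tools/call` requests. The `SURFACE` registry, the `get_amcache` stub, the error code, and the error message layout are all assumptions made for this example; the real dart-mcp server may structure them differently.

```python
import json


def get_amcache(hive_path: str) -> dict:
    """Hypothetical read-only handler; returns a placeholder result."""
    return {"hive_path": hive_path, "entries": []}


# Hypothetical registry: only names listed here are callable over the wire.
SURFACE = {"get_amcache": get_amcache}


def handle_tools_call(request: dict) -> dict:
    """Answer one JSON-RPC 2.0 tools/call request against the registered surface."""
    params = request.get("params", {})
    name = params.get("name")
    handler = SURFACE.get(name)
    if handler is None:
        # The refusal happens in the server, so it holds no matter what the
        # client (or the model driving it) asked for.
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32602, "message": f"ToolNotFound: {name}"},
        }
    result = handler(**params.get("arguments", {}))
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "result": {"content": [{"type": "text", "text": json.dumps(result)}]},
    }
```

Because the lookup is a plain dictionary membership test on the server side, a model that asks for an unregistered name gets an error response and nothing else is executed.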
tests/test_live_mcp.py runs end-to-end tests against the real MCP stdio server (with a scripted "mock Claude" that picks tools deterministically). No API key required:
python3 tests/test_live_mcp.py
The four assertions:
- Initialize handshake completes
- `tools/list` advertises the full typed MCP surface (native + SIFT adapters)
- Calling a non-registered function returns `ToolNotFound` over the wire
- The full loop produces a chain-verified audit log
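For readers unfamiliar with what "chain-verified" means, the sketch below shows one common way to build and check a hash-chained log. It assumes SHA-256 and a simple `record`/`prev`/`hash` entry layout; the field names, the genesis value, and the helper functions are hypothetical and not taken from dart-audit.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "record": record,
        "prev": prev_hash,
        "hash": _digest(prev_hash, record),
    })


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Each entry commits to everything before it, so changing one logged tool call after the fact invalidates that entry and every later one.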
A single iteration of the live loop consumes ~5K-15K tokens depending on artifact size (mostly tool output sent back to Claude). The bundled IP-KVM case completes in ~5 iterations. The default claude-haiku-4-5 runs at zero per-call cost on a Claude Code OAuth subscription; a typical real-case run (15-25 iterations) on claude-sonnet-4-6 (--model, higher fidelity) costs roughly $0.50-$1.50 at current pay-as-you-go pricing. Costs are logged in the audit chain (token count per call).
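The token range follows from simple multiplication, sketched below. The per-million-token rate in the example is a placeholder chosen for illustration, not a quoted price, and the function applies one blended rate to all tokens rather than separating input from output. Check current pricing before relying on any figure.

```python
def estimate_run(iterations: int, tokens_per_iteration: int,
                 usd_per_million_tokens: float) -> tuple:
    """Return (total_tokens, estimated_usd) for a run at a single blended rate."""
    total_tokens = iterations * tokens_per_iteration
    return total_tokens, total_tokens * usd_per_million_tokens / 1_000_000


# Bounds from the figures above: 15-25 iterations at ~5K-15K tokens each.
PLACEHOLDER_RATE = 3.0  # hypothetical USD per million tokens, for illustration only
low_tokens, low_usd = estimate_run(15, 5_000, PLACEHOLDER_RATE)
high_tokens, high_usd = estimate_run(25, 15_000, PLACEHOLDER_RATE)
```

With these inputs a run spans 75,000 to 375,000 tokens. The per-call token counts in the audit chain can be used in place of the estimates for an actual run.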
For air-gapped or cost-sensitive environments, deterministic mode handles the same case classes the playbook covers, with no external dependency.
| Symptom | Likely cause | Fix |
|---|---|---|
| `ANTHROPIC_API_KEY not set` | env var missing | `export ANTHROPIC_API_KEY=...` |
| `MCP handshake timeout` | dart-mcp subprocess crashed at startup | Run `python3 -m dart_mcp.server` directly to see the error |
| `tools/list returns 0 tools` | Wrong PYTHONPATH | `export PYTHONPATH="$PWD/dart_mcp/src:..."` |
| Loop hangs | Claude waiting on tool_result that never arrived | Check audit.jsonl for the last call; likely a parser raised silently |
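The environment-related rows can be caught before a run starts. The following is a small preflight sketch, assuming only the two environment variables named on this page; the `preflight` function is hypothetical and not part of dart-agent.

```python
import os
from pathlib import Path


def preflight(env=None) -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY not set")
    root = env.get("DART_EVIDENCE_ROOT")
    if not root:
        problems.append("DART_EVIDENCE_ROOT not set")
    elif not Path(root).is_dir():
        problems.append(f"DART_EVIDENCE_ROOT is not a directory: {root}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

Passing the environment as a mapping keeps the check testable without touching the real process environment.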
- dart-agent — the wrapper loop
- dart-mcp — the typed surface that gets exposed
- Architecture deep dive
- `docs/live-mode.md`: the equivalent doc in repo form
Agentic-DART — autonomous DFIR agent · architecture-first, not prompt-first · MIT license · github.com/Juwon1405/agentic-dart