FAQ

Project basics

What is Agentic-DART, in one sentence?

An autonomous DFIR agent on the SANS SIFT Workstation that thinks like a senior analyst — architecture-first, not prompt-first.

What does DART stand for?

Detection And Response Team. See About the name for the full four-phase plan.

Why "Agentic-DART" and not "DART"?

The "Agentic" prefix signals that this is an autonomous loop, not a wrapper around an LLM. The work unit is the agent's iteration, not the prompt.

Is this a fork of something?

No. Original work, MIT licensed. MCP (the Model Context Protocol) is from Anthropic, and Claude is the LLM used in live mode, but the architecture and code are independent.

Why did you pick this hackathon?

SANS FIND EVIL! 2026 explicitly asks for autonomous DFIR systems on the SIFT Workstation. The judging criteria align cleanly with the architectural claims this project is making.

Technical

Is the MCP surface really fixed in size?

Yes. tests/test_mcp_surface.py asserts the exact set of exposed native functions, not just a count. If a 36th appears or one of the 35 disappears, the test fails on the next CI run.
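
A minimal sketch of what an exact-set surface test could look like. The registry, the three function names, and the helper `surface_diff` are hypothetical stand-ins for illustration; they are not taken from tests/test_mcp_surface.py, where the pinned set would hold all 35 names.

```python
# Hypothetical sketch of an exact-set surface test. All names are illustrative.

# Stand-in for whatever registry the MCP server exposes (assumption).
REGISTRY = {
    "parse_mft": lambda path: [],
    "list_prefetch": lambda path: [],
    "read_evtx": lambda path: [],
}

# The pinned allowlist. The real project would list every native function here.
EXPECTED_SURFACE = frozenset({"parse_mft", "list_prefetch", "read_evtx"})


def surface_diff(registry, expected):
    """Return (unexpected, missing) function names relative to the pinned set."""
    exposed = set(registry)
    return sorted(exposed - expected), sorted(expected - exposed)


def test_mcp_surface_is_exact():
    unexpected, missing = surface_diff(REGISTRY, EXPECTED_SURFACE)
    # Both directions are checked: an added function fails just like a removed one.
    assert unexpected == [], f"unexpected functions exposed: {unexpected}"
    assert missing == [], f"pinned functions missing: {missing}"
```

Comparing sets rather than counts means that swapping one function for another also fails, even though the total stays the same.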

Does Agentic-DART work without the Claude API?

Yes. The deterministic demo path (bash examples/demo-run.sh) runs end-to-end with no API key. Live mode (real Claude API + MCP stdio) is available but optional. See Live mode.

How big is the audit log?

~3-5 KB per MCP call. A typical 25-iteration run produces an audit log of around 120-200 KB. The chain is verified on every run; tampered logs are detected.
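
To illustrate how a chained log makes tampering detectable, here is a minimal sketch of a SHA-256 hash chain (dart-audit is described as a SHA-256 chained log). The entry layout, the `GENESIS` value, and the function names are assumptions made for this example, not the dart-audit format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a payload, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})


def verify_chain(log):
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        recomputed = entry_hash(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != recomputed:
            return i
        prev_hash = entry["hash"]
    return None
```

Because each hash covers the previous one, editing any entry invalidates that entry and, unless every later hash is recomputed, everything after it.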

Why DuckDB and not SQLite?

DuckDB handles columnar joins on millions of rows orders of magnitude faster than SQLite, which matters for MFT-scale timeline correlation. SQLite is fine for the audit log; DuckDB is right for dart-corr.

Will it work on macOS / Linux outside SIFT?

Yes. macOS dev mode is documented in Running on macOS. The SIFT Workstation is the production target because that's the hackathon's target environment, but the code does not depend on SIFT-specific paths.

Why Python and not Rust / Go?

Three reasons:

- The MCP ecosystem is Python-first
- DFIR tooling (Volatility, Plaso, etc.) is Python
- The bottleneck is LLM API latency, not Python execution time

If a specific function needed to be rewritten in a faster language (e.g. an MFT parser doing 10M rows), it would still be exposed via the same MCP schema. The MCP surface is what the agent sees; the implementation is opaque.
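
The idea of a stable schema over a swappable implementation can be sketched as follows. Everything here is hypothetical: the schema layout, the function name `count_by_extension`, and the `bind` helper are invented for illustration, and the second backend merely stands in for a faster native one.

```python
from collections import Counter

# Hypothetical schema entry: this is all the agent would see.
SCHEMA = {
    "name": "count_by_extension",
    "params": {"filenames": "list[str]"},
    "returns": "dict[str, int]",
}


def _extension(name):
    """Lower-cased text after the last dot, or "" when there is no dot."""
    return name.rsplit(".", 1)[-1].lower() if "." in name else ""


def count_by_extension_python(filenames):
    """Straightforward pure-Python backend."""
    counts = {}
    for name in filenames:
        ext = _extension(name)
        counts[ext] = counts.get(ext, 0) + 1
    return counts


def count_by_extension_counter(filenames):
    """Stand-in for a faster backend (for example, a compiled extension)."""
    return dict(Counter(_extension(name) for name in filenames))


def bind(schema, impl):
    """Expose impl under schema, rejecting calls that do not match its parameters."""
    allowed = set(schema["params"])

    def call(**kwargs):
        if set(kwargs) != allowed:
            raise TypeError(f"{schema['name']} expects exactly {sorted(allowed)}")
        return impl(**kwargs)

    return call
```

Swapping `count_by_extension_python` for `count_by_extension_counter` in `bind` changes nothing from the caller's side, which is the property the paragraph above relies on.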

Safety & guarantees

Can the agent damage evidence?

No. By construction. The MCP surface has no write functions, and the evidence directory is mounted read-only at the OS level. See Architecture-first vs prompt-first.

Can the agent make stuff up?

It can, in the sense that any LLM can. The architectural guarantee is not that the agent never hallucinates. The guarantee is that:

- Every claim must cite an audit_id from a real MCP call
- The audit log is replayable and tamper-evident
- dart-corr flags contradictions as UNRESOLVED rather than hiding them

So a hallucinated finding either (a) lacks an audit_id and gets blocked at write time, or (b) has an audit_id, in which case a human reviewer can replay the call and confirm.
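
Case (a) can be sketched as a write-time gate. This is an illustrative assumption about how such a check might be structured; the finding layout, `write_finding`, and `UncitedClaimError` are hypothetical names, not the project's API.

```python
class UncitedClaimError(ValueError):
    """Raised when a finding cites no audit_id, or one that is not in the log."""


def write_finding(findings, finding, known_audit_ids):
    """Append a finding only if every audit_id it cites exists in the audit log."""
    cited = finding.get("audit_ids") or []
    if not cited:
        raise UncitedClaimError("finding cites no audit_id")
    unknown = [a for a in cited if a not in known_audit_ids]
    if unknown:
        raise UncitedClaimError(f"unknown audit_id(s): {unknown}")
    findings.append(finding)
    return finding
```

A gate like this cannot tell whether a cited call actually supports the claim; that is case (b), where a human replays the call.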

What if the LLM ignores the system prompt?

Doesn't matter. The system prompt is not a security boundary. The MCP surface is. See Architecture-first vs prompt-first.

What's NOT in scope for safety?

- Confidentiality of the evidence (the agent reads everything you mount)
- Network egress prevention (run in an air-gapped environment if you care)
- Resource exhaustion (use container limits)

These are deployment concerns. Agentic-DART does not try to solve them; they are the responsibility of whoever deploys it.

Comparison with adjacent tools

How is this different from Velociraptor?

Velociraptor is excellent for collection. Agentic-DART is for reasoning over collected evidence. They compose: a Velociraptor flow collects, then dart-agent --case reasons over the output.

How is this different from KAPE?

KAPE is similar: it is a collection and triage tool. The same compositional answer applies: KAPE collects, and Agentic-DART reasons over the output.

How is this different from a fine-tuned LLM?

This project doesn't fine-tune anything. The LLM is generic; the value comes from the architecture (MCP surface + correlation engine + audit chain + playbook). A fine-tuned LLM could replace the generic one, but it would still need this scaffolding to be safe and auditable.

How is this different from "just give the LLM bash"?

The "just give the LLM bash" approach is exactly what dart-mcp is designed to not be. See Architecture-first vs prompt-first.
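
As a rough illustration of the difference, the sketch below dispatches only names that appear in a fixed table of read-only functions, so a request for a shell has nowhere to go. The function `list_directory`, the `SURFACE` table, and `dispatch` are hypothetical and are not the dart-mcp implementation.

```python
from pathlib import Path


def list_directory(evidence_root, relative):
    """Read-only: list entry names under a directory inside the evidence root."""
    root = Path(evidence_root).resolve()
    target = (root / relative).resolve()
    try:
        target.relative_to(root)  # raises ValueError if target is outside root
    except ValueError:
        raise PermissionError("path escapes the evidence root") from None
    return sorted(p.name for p in target.iterdir())


# The whole surface: a fixed table of typed, read-only functions (assumption).
SURFACE = {"list_directory": list_directory}


def dispatch(name, evidence_root, **kwargs):
    """Call a surface function by name; anything not in the table is refused."""
    if name not in SURFACE:
        raise LookupError(f"{name!r} is not on the MCP surface")
    return SURFACE[name](evidence_root, **kwargs)
```

With this shape, the refusal does not depend on what the model was told in its prompt: a name that is not in the table simply cannot be called.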

Hackathon-specific

Are you submitting solo?

Yes. This is a personal/independent submission. The README's Author section makes that explicit.

Was AI used in the development?

Yes, openly. The "Development approach" section of the README discloses Claude as a coding collaborator. Architectural decisions, threat coverage taxonomy, MITRE mapping, and final review are human-driven; implementation, sample-evidence generation, test scaffolding, and documentation drafting were AI-accelerated. Every commit is reviewed before it lands.

What's the headline metric?

22 / 22 tests passing on a fresh clone. 60 typed MCP functions (35 native + 25 SIFT Workstation adapters). 11 / 12 MITRE ATT&CK enterprise tactics covered.

What are you most proud of?

The bypass tests. They make the architectural claim mechanical, not rhetorical.

What would you change with more time?

Three things:

- PCAP analysis for full TA0011 (Command and Control) coverage
- Sigma rule synthesis (Phase 2 work)
- A real-world dataset run against an Ali Hadi or NIST CFReDS image, with published metrics

Agentic-DART: autonomous DFIR agent · architecture-first, not prompt-first · MIT license · github.com/Juwon1405/agentic-dart
