dart playbook
The agent's playbook. A YAML file that encodes "what should a senior analyst look at next, given the current state of the case?" — without writing imperative Python.
The whole point of "architecture-first, not prompt-first" is that operator-tunable rules don't live in the model's prompt. They live in YAML the operator can read and edit.
A Python playbook would couple the rules to the agent's release cycle. A YAML playbook is data: an analyst can fork the playbook, tune for their specific case class (web-app breach vs insider threat vs ransomware), and commit it to their own runbook repo.
| Playbook | Lines | Phases | Case classes | When to use |
|---|---|---|---|---|
| senior-analyst-v1.yaml | 133 | 4 | 3 | Quick demos / simple scenarios |
| senior-analyst-v2.yaml | 845 | 10 | 10 | Methodology baseline (Mandiant + Bianco + Diamond) |
| senior-analyst-v3.yaml ⭐ | Default | 10 | 10 + UC IDs | Default. Industrialized: adds ADS + MaGMa + TaHiTI + HMM |
v3 is the default for any new case. v2 is retained as the methodology baseline (no v3 industrialization scaffolds) so pre-industrialization runs remain reproducible. v1 is kept for backward compatibility and tutorials.
v3 is the industrialization release. v2 encoded a senior analyst's reasoning. v3 encodes a mature SOC's operating model around that reasoning as YAML data so it's inspectable, forkable, and citable.
Honest framing. The four framework blocks below ship in v3 as structured YAML data. They define the contract a mature-SOC implementation should satisfy. The runtime activation of these contracts in `dart_agent` and `dart_corr` is intentionally a post-SANS work item (tracked in issue #44): activating any of them at runtime would shift the baseline measured by `scripts/measure_accuracy.py` mid-window. v2's runtime path (10-phase sequence + `next_call_decisions` + `contradiction_triggers` + `stop_conditions`) is what the agent actually executes today, and it remains intact in v3.
github.com/palantir/alerting-detection-strategy-framework
Encoded as ads_template in the v3 YAML. Defines a 9-section documentation contract for every detection: goal, categorization (MITRE ATT&CK), strategy abstract, technical context, blind spots & assumptions, false positives, validation (Atomic Red Team test ID), priority, response (SOAR runbook ref).
Lint modes documented: permissive → warn → strict. The lint pass that enforces the contract on each finding is post-SANS.
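Since the lint pass itself is not implemented yet, the following is only a minimal sketch of what enforcing the 9-section contract could look like. The section names are taken from `ads_template`; the function name `lint_finding`, the `AdsLintError` class, and the exact behaviour of each mode are assumptions for illustration, not the project's API.

```python
from typing import Dict, List

# Section names as listed under ads_template.required_sections in the v3 YAML.
REQUIRED_SECTIONS = [
    "Goal", "Categorization", "Strategy_Abstract", "Technical_Context",
    "Blind_Spots_Assumptions", "False_Positives", "Validation",
    "Priority", "Response",
]


class AdsLintError(ValueError):
    """Hypothetical error raised in strict mode for an incomplete finding."""


def lint_finding(finding: Dict[str, str], mode: str = "warn") -> List[str]:
    """Return the ADS sections that are missing or empty in `finding`.

    Assumed mode semantics (the playbook only names the modes):
      permissive -> return the list silently
      warn       -> also print one warning per missing section
      strict     -> raise AdsLintError if any section is missing
    """
    if mode not in ("permissive", "warn", "strict"):
        raise ValueError(f"unknown lint mode: {mode!r}")
    missing = [s for s in REQUIRED_SECTIONS
               if not str(finding.get(s) or "").strip()]
    if mode == "warn":
        for section in missing:
            print(f"ADS lint warning: missing section {section}")
    elif mode == "strict" and missing:
        raise AdsLintError(f"missing ADS sections: {', '.join(missing)}")
    return missing
```

Under these assumptions, an operator could start in `permissive`, review the returned lists, and only move to `strict` once existing findings are fully documented.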
FI-ISAC NL · Rob van Os (SOC-CMM author) · full paper
Encoded as magma_ucf in the v3 YAML. Three-tier traceability:
- L1 business drivers (4 entries): protect data integrity, detect ransomware before recovery denial, etc.
- L2 attack patterns (8 entries, MITRE-mapped): `AP-001` ransomware-recovery-denial through `AP-008` IP-KVM-physical-access
- L3 detection coverage: MCP function mapping per L2 pattern
CMMI 5-level maturity scale documented:
1. Initial (ad-hoc) → 2. Managed (documented) → 3. Defined (ADS-templated) → 4. Quantitatively Managed (FP/TP measured) → 5. Optimizing (TI feedback loop active)

The v3 YAML self-declares CMMI level 3 (Defined) as the current state. Per-run runtime CMMI scoring is post-SANS work.
Encoded as `hunt_cycle` in the v3 YAML, with the designed trigger condition `confidence < 0.6 AND iterations >= 8` and three phases:
- H1 Initiate — document hypothesis, attach TI context (M-Trends, DFIR Report, CISA, Sigma)
- H2 Hunt — execute targeted MCP calls, pivot through Pyramid of Pain
- H3 Finalize — emit findings + new ADS, OR document negative result, OR hand off
Runtime entry into hunt mode from the agent loop on plateau detection is post-SANS work — the data scaffold is what defines what a TaHiTI-aware run would look like.
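To make the designed trigger concrete, here is a small sketch of the plateau check. The two thresholds and the phase names come from `hunt_cycle`; the `RunState` type and `should_enter_hunt` function are hypothetical names, since runtime entry into hunt mode is not wired up yet.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.6  # from the hunt_cycle trigger
MIN_ITERATIONS = 8      # from the hunt_cycle trigger
HUNT_PHASES = ["H1_initiate", "H2_hunt", "H3_finalize"]


@dataclass
class RunState:
    """Hypothetical snapshot of the agent loop at the end of an iteration."""
    confidence: float
    iterations: int


def should_enter_hunt(state: RunState) -> bool:
    """True when confidence is still below the floor after enough iterations."""
    return (state.confidence < CONFIDENCE_FLOOR
            and state.iterations >= MIN_ITERATIONS)
```

Both conditions must hold: a low-confidence run with fewer than 8 iterations keeps following the normal sequence, and a run at exactly 0.6 confidence does not trigger the hunt cycle.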
Encoded as hunting_maturity_model in the v3 YAML. Five levels documented with what each implies:
- HMM0 Initial — no hunt
- HMM1 Minimal — TI-driven (IOC-based)
- HMM2 Procedural — published procedures (e.g. ThreatHunter-Playbook)
- HMM3 Innovative — analyst-formed hypotheses ⭐ v3 yaml self-declares
- HMM4 Leading — automated hypothesis generation (Phase 2 target)
Per-run runtime self-classification by the agent is post-SANS work. The v3 yaml's agentic_dart_self_classification: HMM3_innovative declares the framework's intended target level.
42 published references organized into 6 categories. v3 adds 17 net items to v2's 25: 15 industrialization frameworks + 2 inspiration tools + 2 new vendor research entries, minus 2 from consolidating v2's primary_methodology from 8 to 6:
- industrialization_frameworks_v3 (15, NEW in v3) — Palantir ADS, MaGMa, TaHiTI, SOC-CMM, MITRE 11 Strategies, awesome-soc (cyb3rxp), awesome-incident-response (meirwah), awesome-threat-detection (0x4D31), ThreatHunter-Playbook (OTRF), Florian Roth Detection Engineering Cheat Sheet, Crafting the InfoSec Playbook (Bollinger et al.), Atomic Red Team, Sigma schema
- related_tools_for_inspiration (2, NEW in v3) — Hayabusa, EnableWindowsLogSettings (both Yamato Security, Tokyo) cited as third-party prior art*
- primary_methodology (6, consolidated from v2's 8) — Mandiant Targeted Attack Lifecycle, Lockheed Kill Chain, MITRE ATT&CK v16, Bianco Pyramid of Pain, Diamond Model, F3EAD
- case_studies_2025 (4, carried from v2) — DFIR Report walkthroughs, M-Trends, CISA #StopRansomware advisories
- vendor_research (10, +2 vs v2: Roberto Rodriguez OTRF, Zach Mathis Yamato Security Tokyo*)
- standards (5, carried from v2) — NIST SP 800-61/86/150, ISO 27035, ENISA IH
Yamato Security is an independent Tokyo-based DFIR group; Agentic-DART has no affiliation or partnership with them. Their tools are cited as external community references and field-calibration prior art only — no code or rules are imported.
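The reference totals above can be checked with a few lines of arithmetic. The v3 counts are the per-category numbers stated in the list; the v2 counts are derived from the stated deltas, on the assumption that nothing else changed between versions.

```python
# Per-category counts for v3, as stated in the list above.
V3_COUNTS = {
    "industrialization_frameworks_v3": 15,
    "related_tools_for_inspiration": 2,
    "primary_methodology": 6,
    "case_studies_2025": 4,
    "vendor_research": 10,
    "standards": 5,
}

# v2 counts derived from the stated deltas (assumption: no other changes).
V2_COUNTS = {
    "primary_methodology": 8,  # consolidated to 6 in v3
    "case_studies_2025": 4,    # carried unchanged
    "vendor_research": 8,      # 10 in v3 minus the 2 new entries
    "standards": 5,            # carried unchanged
}


def net_added(v2: dict, v3: dict) -> int:
    """Net number of references v3 adds over v2."""
    return sum(v3.values()) - sum(v2.values())


print(sum(V2_COUNTS.values()), sum(V3_COUNTS.values()),
      net_added(V2_COUNTS, V3_COUNTS))
```

This prints `25 42 17`, matching the totals quoted for v2, v3, and the net addition.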
This section documents the methodological foundation that v2 first encoded and that v3 inherits unchanged. v3's industrialization scaffolds (above) sit on top of this lineage; the runtime path that the agent actually executes today is still the one encoded here. Operators forking v3 should read this section to understand why each phase, decision rule, and contradiction trigger is shaped the way it is.
v2 (released 2026-04-30, 845 lines) synthesizes the authoritative sources on modern DFIR practice listed below into a single executable playbook. It is, in effect, an audit-chained encoding of how a senior analyst with 10+ years of frontline IR experience would approach a case.
- Mandiant M-Trends 2026: 500K hours of 2025 IR engagements; informs the `posture` block (14-day dwell time, 22-second hand-off, 32%/11%/10% initial-access priors)
- Mandiant Targeted Attack Lifecycle: 8-phase model from Initial Recon to Complete Mission
- SANS PICERL — Preparation / Identification / Containment / Eradication / Recovery / Lessons learned
- Lockheed Martin Cyber Kill Chain — Hutchins, Cloppert & Amin 2011, Intelligence-Driven Computer Network Defense
- David Bianco — Pyramid of Pain (TTPs over IOCs) + Hunting Maturity Model
- Diamond Model of Intrusion Analysis — Caltagirone, Pendergast, Betz 2013 (adversary / capability / infrastructure / victim)
- MITRE ATT&CK Enterprise v16: 14 tactics, 200+ techniques, fully mapped
- F3EAD — Find, Fix, Finish, Exploit, Analyze, Disseminate (originally U.S. military targeting; standard in modern DFIR)
- NIST SP 800-61 / 800-86 / 800-150
- The DFIR Report — BlackSuit, Akira, Fog, Lynx, BlueSky, RansomHub, MEOWBACKCONN
- CISA #StopRansomware — Akira AA24-109A (Nov 2025)
- Verizon DBIR 2025/2026 — vulnerability exploitation +180%, third-party compromise 30% of breaches
- Sean Metcalf — Active Directory attack detection, Kerberoasting/AS-REP roasting
- Sarah Edwards — macOS forensic analysis, KnowledgeC, unified log
- Patrick Wardle — The Art of Mac Malware persistence catalog
- Hal Pomeranz — Linux IR workflows, auditd methodology
- Eric Zimmerman — Windows artifact field semantics
- Andrew Case — memory forensics, Volatility
- Florian Roth — detection corpus, Sigma rules
- JPCERT/CC — Detecting Lateral Movement through Tracking Event Logs
| Phase | Name | Focus |
|---|---|---|
| P0 | Volatility & scope | memory, sockets, credential signals |
| P1 | Initial access vector triage | exploit (32%) / vishing (11%) / IAB (10%) |
| P2 | Timeline reconstruction | MFT + AmCache + Prefetch + auditd + journal |
| P3 | Anomaly surfacing | list anomalies WITHOUT explaining them |
| P4 | Hypothesis formation | falsifiable, MITRE-named, data-source-named |
| P5 | Kill-chain assembly | >=3 tactics, monotonic timestamps, audit_id |
| P6 | Contradiction handling | UNRESOLVED -> revise (architecturally enforced) |
| P7 | Attribution / Diamond Model | adversary / capability / infrastructure / victim |
| P8 | Recovery-denial check | identity / virtualization / backup |
| P9 | Finding emission | audit_id citation enforced by serializer |
Each phase has:
- `rationale`: why this order. Cited to source.
- `pyramid_layer`: where it sits in Bianco's Pyramid (foundation / middle / top / orientation / deliverable)
- `mcp_calls`: which `dart-mcp` functions to invoke
- `anti_patterns`: what naive analysts do wrong
- `senior_analyst_heuristic`: what experienced analysts actually do
- `exit_criteria`: when the phase is closed
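When forking a playbook it can help to check that every phase still carries these six fields. The sketch below works on a phase already loaded into a plain dict (for example via a YAML parser); `phase_problems` is a hypothetical helper, and the rule that `mcp_calls` must be a non-empty list is an assumption rather than something the playbook states.

```python
from typing import Any, Dict, List

# The six per-phase fields listed above.
REQUIRED_PHASE_KEYS = (
    "rationale", "pyramid_layer", "mcp_calls",
    "anti_patterns", "senior_analyst_heuristic", "exit_criteria",
)
# Pyramid layer values named in the playbook documentation.
PYRAMID_LAYERS = {"foundation", "middle", "top", "orientation", "deliverable"}


def phase_problems(phase: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in one phase entry."""
    problems = [f"missing key: {key}"
                for key in REQUIRED_PHASE_KEYS if key not in phase]
    layer = phase.get("pyramid_layer")
    if layer is not None and layer not in PYRAMID_LAYERS:
        problems.append(f"unknown pyramid_layer: {layer}")
    calls = phase.get("mcp_calls")
    # Assumption: a phase with no MCP calls is a mistake.
    if calls is not None and (not isinstance(calls, list) or not calls):
        problems.append("mcp_calls must be a non-empty list")
    return problems
```

An empty return value means the phase has all six fields and a recognised pyramid layer; it says nothing about whether the content of those fields is sound.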
senior-analyst-v3.yaml is the canonical and default playbook. Below is its top-level shape — the v2 carry-over keys (target_case_classes, posture, sequence, next_call_decisions, contradiction_triggers, stop_conditions) are unchanged from v2, and four new top-level keys are added for the v3 industrialization frameworks. v2 remains in the repo for reproducibility of pre-industrialization runs.
```yaml
version: 3
name: senior-analyst-v3
created: 2026-05-01
supersedes: senior-analyst-v2

methodology_lineage:            # 13 cumulative citations (v2 + v3)
  - mandiant_targeted_attack_lifecycle
  - lockheed_kill_chain
  - mitre_attack_v16
  - bianco_pyramid_of_pain
  - diamond_model
  - f3ead
  # v3 additions:
  - palantir_ads_framework
  - magma_ucf
  - tahiti_threat_hunting
  - bianco_hunting_maturity_model
  # ... (full list in file)

# === v3 industrialization additions (4 framework blocks) ===
ads_template:                   # Palantir 9-section detection contract
  required_sections: [Goal, Categorization, Strategy_Abstract,
                      Technical_Context, Blind_Spots_Assumptions,
                      False_Positives, Validation, Priority, Response]
  lint_modes: [permissive, warn, strict]
  current_default: warn

magma_ucf:                      # FI-ISAC NL three-tier UCF
  l1_business_drivers: [...]
  l2_attack_patterns: [...]
  l3_detection_coverage: [...]
  uc_id_format: "UC-DART-NNNN"
  cmmi_levels: 5

hunt_cycle:                     # TaHiTI H1 / H2 / H3
  trigger: "any phase exits with confidence < 0.6 after iterations >= 8"
  phases: [H1_initiate, H2_hunt, H3_finalize]

hunting_maturity_model:         # Bianco HMM 0-4, operationalized
  levels: [HMM0_initial, HMM1_minimal, HMM2_procedural,
           HMM3_innovative, HMM4_leading]
  agentic_dart_self_classification: HMM3_innovative

# === v2 carry-over (unchanged) ===
target_case_classes: [...]      # 10 case classes (insider, remote-hands,
                                # LotL, ransomware, identity, vishing,
                                # exploit, third-party, cloud-hybrid,
                                # division-of-labour)

posture:                        # M-Trends priors
  dwell_time_assumption_days: 14
  initial_access_priors: [...]
  attacker_speed_assumption: {...}

sequence:                       # 10 phases, P0-P9 (unchanged from v2)
  - phase: P0_scope_and_volatility
    pyramid_layer: orientation
    rationale: |
      Memory and network state evaporate on reboot. Process tree,
      open sockets, and loaded drivers must be captured before
      anything else, even before reading disk artifacts.
      Senior-analyst principle (Eric Zimmerman): "Order of volatility
      is not a suggestion; it's a one-way door."
    mcp_calls: [get_process_tree, detect_credential_access]
    anti_patterns:
      - "Pulling the disk image before snapshotting memory"
      - "Rebooting 'to be safe' - destroys all volatile evidence"
      - "Running antivirus scan as first action - may quarantine evidence"
    exit_criteria:
      process_tree_captured: true
      credential_access_signals_logged: true
  # ... (P1-P9, see file for full)

next_call_decisions:            # 24 state -> tool routing rules
  - when_state: "no MFT timeline yet"
    call: extract_mft_timeline
    confidence_gain: 0.20
    rationale: "MFT is foundational - Eric Zimmerman: 'MFT is god'"
  # ...

contradiction_triggers:         # 7 architectural contradictions
  - id: timestomp_predates_alert
    rule: "If $SI < $FN AND mismatch_ts < alert_ts, persistence pre-existed"
    severity: critical
    mitre: T1070.006
  # ...

stop_conditions:                # 6 termination conditions
  - condition: confidence >= 0.92
    action: emit_findings
  - condition: hypothesis_revision_count >= 5
    action: declare_complex_case_request_human
    note: |
      A case that has revised the hypothesis 5+ times is beyond what
      automated reasoning should commit to. Hand off to a human
      analyst with the audit chain attached.

references: {...}               # 6 categorized reference groups

operator_notes: |               # Senior-analyst principles
  ...
```

The agent reads the playbook at startup. Each iteration, it:
1. Determines the current phase based on what's been done
2. Reads `next_call_decisions` to pick the next MCP call
3. Invokes the call through `dart-mcp` (which is bounded by the architecture-first surface)
4. Logs the result to the audit chain (`dart-audit`)
5. Runs `dart-corr` to surface contradictions
6. If a contradiction matches a `contradiction_trigger`, the hypothesis is mandatorily revised
7. Checks `stop_conditions` to decide whether to emit findings
In deterministic mode the agent follows this YAML literally. In live mode Claude can deviate, but every call still goes through the typed dart-mcp surface, and the contradiction triggers + stop conditions still apply.
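The iteration described above can be sketched roughly as follows. This is not the `dart-agent` implementation: the state layout, the way `confidence_gain` is added to a running confidence, and the shape of a call result (a dict with a `contradictions` list) are all assumptions made for illustration. Only the rule fields (`when_state`, `call`, `confidence_gain`), the trigger `id`, and the two stop thresholds come from the playbook excerpt.

```python
from typing import Any, Callable, Dict, List, Optional


def pick_next_call(decisions: List[dict], flags: set) -> Optional[dict]:
    """Step 2: return the first routing rule whose when_state currently holds."""
    for rule in decisions:
        if rule["when_state"] in flags:
            return rule
    return None


def check_stop(confidence: float, revisions: int) -> Optional[str]:
    """Step 7: mirror the two stop_conditions shown in the playbook excerpt."""
    if confidence >= 0.92:
        return "emit_findings"
    if revisions >= 5:
        return "declare_complex_case_request_human"
    return None


def run_iteration(playbook: dict, state: Dict[str, Any],
                  surface: Dict[str, Callable[[], dict]],
                  audit_log: List[dict]) -> str:
    """Run one loop iteration and return the resulting action name."""
    rule = pick_next_call(playbook["next_call_decisions"], state["flags"])
    if rule is None:
        return "no_rule_matched"
    if rule["call"] not in surface:
        # The playbook may only pick from the typed surface, never extend it.
        raise PermissionError(f"{rule['call']} is not on the MCP surface")
    result = surface[rule["call"]]()                             # step 3
    audit_log.append({"call": rule["call"], "result": result})   # step 4
    state["flags"].discard(rule["when_state"])
    state["confidence"] = min(1.0, state["confidence"] + rule["confidence_gain"])
    found = result.get("contradictions", [])                     # step 5
    for trigger in playbook.get("contradiction_triggers", []):
        if trigger["id"] in found:                               # step 6
            state["revisions"] += 1
    return check_stop(state["confidence"], state["revisions"]) or "continue"
```

In this sketch the contradiction check and the stop check run on every iteration regardless of which rule fired, which is the property the paragraph above describes for both deterministic and live mode.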
1. Copy `senior-analyst-v3.yaml` to `<your-name>-v1.yaml`
2. Update `target_case_classes` for your scope
3. Tune `next_call_decisions` for your environment's priorities
4. Add environment-specific `contradiction_triggers`
5. Optionally adjust `ads_template`, `magma_ucf`, and `hunt_cycle` for your SOC's maturity profile
6. Run with `--playbook <your-name>-v1.yaml`
The agent will follow your sequencing while the architectural guarantees (read-only, audit-chained, contradiction-aware) are unchanged. A playbook cannot loosen architectural guarantees. It can only choose what to call from the surface, never expand the surface.
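Before running a forked playbook, an operator might want to confirm that it only names functions that exist on the surface. The helper below is a hypothetical pre-flight check over a playbook loaded into a dict; the real enforcement happens in `dart-mcp`, and the set of allowed function names passed in here is whatever the operator's installation exposes.

```python
from typing import Iterable, List


def calls_outside_surface(playbook: dict, surface: Iterable[str]) -> List[str]:
    """Return, sorted, every call the playbook names that is not on the surface."""
    requested = set()
    for phase in playbook.get("sequence", []):
        requested.update(phase.get("mcp_calls", []))
    for rule in playbook.get("next_call_decisions", []):
        requested.add(rule["call"])
    return sorted(requested - set(surface))
```

An empty list means the fork stays within the surface. A non-empty list points at entries that would be rejected at runtime, typically a typo or a function copied from another environment.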
```
dart_playbook/
├── README.md
├── senior-analyst-v1.yaml   # 133 lines, 4 phases (legacy; v0.5.2 fixed memory-fn refs)
├── senior-analyst-v2.yaml   # 845 lines, 10 phases (methodology baseline; retained for reproducibility)
└── senior-analyst-v3.yaml   # Default playbook: ten-phase methodology + 4 framework blocks (DEFAULT)
```
(From senior-analyst-v3.yaml::operator_notes, inherited unchanged from v2)
- Phase order is strict. Memory disappears. Volatility before disk, always.
- Hypotheses are falsifiable. "Something bad happened" is not a hypothesis. "T1003.001 LSASS dump via comsvcs.dll executed at 14:23:09 UTC" is.
- Contradictions are gold. When two artifacts disagree, that's the most valuable signal in the case. Smoothing it over is malpractice.
- Recovery-denial check is mandatory for any modern ransomware case (M-Trends 2026 #1 trend). Endpoint encryption is the diversion, not the impact.
- Attribution is multi-vector. Diamond Model with 4 corners or no attribution claim. Single-IOC attribution is what gets analysts fired.
- Findings cite audit_ids. Always. The serializer refuses anything else — that's not a guideline, that's architecture.
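The last principle, that the serializer refuses findings without audit citations, can be illustrated with a short sketch. The function name `serialize_finding`, the `audit_ids` field layout, and the JSON output format are assumptions; the actual serializer may differ.

```python
import json
from typing import Set


def serialize_finding(finding: dict, known_audit_ids: Set[str]) -> str:
    """Serialize a finding to JSON, refusing it unless it cites known audit_ids."""
    cited = finding.get("audit_ids") or []
    if not cited:
        raise ValueError("finding cites no audit_ids")
    unknown = [a for a in cited if a not in known_audit_ids]
    if unknown:
        raise ValueError(f"finding cites unknown audit_ids: {unknown}")
    return json.dumps(finding, sort_keys=True)
```

Putting the check inside the serializer, rather than in a review step, means there is no code path that emits an uncited finding.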
- dart-agent — how the playbook gets executed
- Architecture deep dive
- Case-PtH-Timestomp — worked example showing the 10 phases in action
- `dart_playbook/senior-analyst-v3.yaml`: the actual file (default)
- `dart_playbook/senior-analyst-v2.yaml`: methodology baseline (retained)
Agentic-DART — autonomous DFIR agent · architecture-first, not prompt-first · MIT license · github.com/Juwon1405/agentic-dart