dart-corr
Surfaces contradictions between artifacts as UNRESOLVED rather than smoothing them over. This is the single most important component for the project's "architecture-first" claim — without dart-corr, the agent would just believe whatever the first source told it.
dart-corr owns:
- The correlation rule pack (`dart_corr/correlation-rules.yaml`)
- DuckDB-backed in-process joins for time-proximity correlation
- The contradiction state machine: `OPEN` → `RESOLVED` | `UNRESOLVED`
- Two MCP-surface functions: `correlate_events` and `correlate_timeline`
dart-corr does not own:
- Hypothesis revision: that's dart-agent's job
- Storing audit entries: that's dart-audit's job
- The artifacts themselves: those come from dart-mcp functions
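This page later describes the two MCP-surface functions as typed calls in which the agent supplies source files and a hypothesis ID rather than SQL. A minimal sketch of what such a typed request could look like follows. `CorrelateRequest`, `parse_request`, and the field names are hypothetical and are not taken from the dart-mcp source.

```python
from dataclasses import dataclass

# Hypothetical field names; the real dart-mcp schema may differ.
ALLOWED_KEYS = {"source_files", "hypothesis_id", "time_window_sec"}


@dataclass(frozen=True)
class CorrelateRequest:
    source_files: tuple
    hypothesis_id: str
    time_window_sec: int = 15

    def __post_init__(self):
        # Validate only; a frozen dataclass cannot be mutated here.
        if not self.source_files:
            raise ValueError("at least one source file is required")
        if not self.hypothesis_id:
            raise ValueError("hypothesis_id is required")
        if self.time_window_sec <= 0:
            raise ValueError("time_window_sec must be positive")


def parse_request(payload: dict) -> CorrelateRequest:
    """Build a typed request, rejecting any field outside the schema."""
    unknown = set(payload) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    return CorrelateRequest(
        source_files=tuple(payload.get("source_files", ())),
        hypothesis_id=str(payload.get("hypothesis_id", "")),
        time_window_sec=int(payload.get("time_window_sec", 15)),
    )
```

Because unknown fields are rejected outright, a payload that tries to carry a raw query (for example a `sql` key) fails before anything reaches the engine.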
When two artifacts disagree on a fact, dart-corr flags it.
Example from Case-PtH-Timestomp:
| Source | Claim |
|---|---|
| Auth events (4624) | Pass-the-Hash at 14:23:09 UTC |
| MFT $SI vs $FN | Timestomp at 14:21:55 UTC (11 sec earlier) |
A naïve LLM agent might pick whichever claim supports its current hypothesis. dart-corr raises UNRESOLVED and forces the agent to revise — there must be a third explanation that reconciles both, or the hypothesis is wrong.
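The `OPEN` → `RESOLVED` | `UNRESOLVED` state machine behind this behavior can be sketched as a small transition table. This is a hedged illustration, not the dart-corr implementation: `Status`, `ALLOWED`, and `transition` are hypothetical names, and only the two transitions named in the notation are modeled. Whether an `UNRESOLVED` contradiction can later move to `RESOLVED` is not stated here, so the sketch leaves that out.

```python
from enum import Enum


class Status(Enum):
    OPEN = "OPEN"
    RESOLVED = "RESOLVED"
    UNRESOLVED = "UNRESOLVED"


# Assumption: only OPEN has outgoing transitions, as in OPEN -> RESOLVED | UNRESOLVED.
ALLOWED = {
    Status.OPEN: {Status.RESOLVED, Status.UNRESOLVED},
    Status.RESOLVED: set(),
    Status.UNRESOLVED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Keeping the table explicit means a contradiction cannot silently drift back to `OPEN` or skip straight past review.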
```python
# Illustrative only. Per the implementation note further down (v0.7.1), the real
# engine lives in dart_corr/src/dart_corr/__init__.py; dart_mcp.correlate_events
# and dart_mcp.correlate_timeline are thin wrappers that delegate to it.
def correlate(events_a, events_b, time_window_sec=15):
    """Pair events that fall within the time window and record disagreements."""
    contradictions = []
    for a in events_a:
        for b in events_b:
            # a.ts and b.ts are datetime objects; compare their absolute distance.
            if abs((a.ts - b.ts).total_seconds()) <= time_window_sec:
                if a.fact != b.fact:  # disagreement
                    contradictions.append({
                        "claim_a": a.fact, "source_a": a.source, "ts_a": a.ts,
                        "claim_b": b.fact, "source_b": b.source, "ts_b": b.ts,
                        "status": "UNRESOLVED",
                    })
    return contradictions
```

The agent's playbook requires it to handle UNRESOLVED before emitting findings. Skipping is not an option: the finding emitter inside DeterministicAnalyst (in dart_agent/__init__.py) refuses to write a finding while a relevant UNRESOLVED contradiction is open.
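The refusal described above can be pictured as a guard in front of the emitter. The following is a minimal sketch under stated assumptions, not the code in DeterministicAnalyst: `Contradiction`, `UnresolvedContradictionError`, and `emit_finding` are hypothetical names, and "relevant" is assumed to mean "attached to the same hypothesis ID".

```python
from dataclasses import dataclass


@dataclass
class Contradiction:
    hypothesis_id: str  # hypothetical link between a contradiction and a hypothesis
    status: str         # "OPEN", "RESOLVED", or "UNRESOLVED"


class UnresolvedContradictionError(RuntimeError):
    """Raised when a finding is blocked by an UNRESOLVED contradiction."""


def emit_finding(hypothesis_id, text, contradictions):
    """Return a finding dict, or raise if a relevant contradiction is UNRESOLVED."""
    blocking = [
        c for c in contradictions
        if c.hypothesis_id == hypothesis_id and c.status == "UNRESOLVED"
    ]
    if blocking:
        raise UnresolvedContradictionError(
            f"{len(blocking)} unresolved contradiction(s) for {hypothesis_id}"
        )
    return {"hypothesis_id": hypothesis_id, "finding": text}
```

Raising instead of returning a warning makes the block impossible to ignore: the caller has to deal with the contradiction before a finding can exist.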
dart-corr runs in-process (no server, no port). For multi-million-row MFT timelines, naïve Python joins run out of memory. DuckDB handles joins over 5M+ rows in seconds, with window functions for time-proximity, all without leaving the process.
```python
import duckdb

con = duckdb.connect(":memory:")
# CSV reading is built in, so no extension needs to be installed or loaded here.
con.execute("CREATE TABLE auth AS SELECT * FROM read_csv('auth.csv')")
con.execute("CREATE TABLE mft AS SELECT * FROM read_csv('mft.csv')")
# Range join: pair each auth event with timestomped MFT entries within 15 seconds.
rows = con.execute("""
    SELECT a.user, a.ts, m.path, m.timestomp
    FROM auth a
    JOIN mft m
      ON a.ts BETWEEN m.ts - INTERVAL 15 SECOND AND m.ts + INTERVAL 15 SECOND
    WHERE m.timestomp = TRUE
""").fetchall()
```

The agent doesn't write SQL. dart-corr exposes correlate_events and correlate_timeline as typed MCP calls: the agent supplies the source files and a hypothesis ID, and the engine returns the contradictions.
```
dart_corr/
├── README.md                  # design contract + usage
├── pyproject.toml             # package metadata (duckdb + PyYAML)
├── correlation-rules.yaml     # operator-tunable rule pack (9 default rules)
├── src/dart_corr/
│   └── __init__.py            # the engine: three public correlate_* functions
└── tests/
    └── test_dart_corr.py      # 14 unit tests, run independently of dart_mcp
```
Implementation note (v0.7.1): As of v0.7.1, `dart_corr` is a real standalone package, not a docs-only scaffold. The three public functions (`correlate_events`, `correlate_timeline`, `correlate_download_to_execution`) plus `load_rules()` live in `dart_corr/src/dart_corr/__init__.py` and have 14 dedicated tests in `dart_corr/tests/test_dart_corr.py` (all passing). The MCP wire surface is unchanged: `dart_mcp.correlate_events` and friends are thin wrappers that delegate to `dart_corr`, with `correlate_timeline` additionally enforcing a SQL-injection allow-list at the boundary before calling the engine. Both call paths produce identical output.
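The boundary check mentioned in the note can be illustrated with a small allow-list guard. This is a sketch under assumptions, not the dart_mcp wrapper itself: `ALLOWED_COLUMNS`, `check_columns`, and `correlate_timeline_boundary` are hypothetical names, the column set is made up for illustration, and the engine is passed in as a plain callable standing in for the `dart_corr` function.

```python
# Hypothetical allow-list; the real list lives in dart_mcp and may differ.
ALLOWED_COLUMNS = frozenset({"ts", "user", "path", "source"})


def check_columns(columns):
    """Return the columns unchanged, or raise if any is not on the allow-list."""
    for col in columns:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"column not allowed: {col!r}")
    return list(columns)


def correlate_timeline_boundary(columns, engine):
    """Validate identifiers first, then delegate to the engine callable."""
    safe = check_columns(columns)
    return engine(safe)
```

An allow-list compares against known-good names instead of trying to detect bad input, so a value such as `"ts; DROP TABLE auth"` is rejected simply because it is not in the set.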
- Architecture deep dive
- Threat model
- Case-PtH-Timestomp: worked example of `UNRESOLVED` driving revision
Agentic-DART — autonomous DFIR agent · architecture-first, not prompt-first · MIT license · github.com/Juwon1405/agentic-dart