History / dart corr

Revisions

  • wiki(dart-corr): reflect v0.7.1 - extracted to real package

    Companion to agentic-dart commit 49e772c, which extracts dart_corr from a docs-only scaffold into a real standalone package with code, 14 unit tests, and an operator-tunable rule pack.

    Wiki changes:

    - dart-corr.md 'Files' block: replaced the old tree (which showed a nonexistent correlation-rules.yaml and pointed implementation at dart_mcp) with the real v0.7.1 layout: pyproject.toml, correlation-rules.yaml, src/dart_corr/__init__.py, tests/test_dart_corr.py.
    - dart-corr.md 'Implementation note': replaced the scaffold caveat with the v0.7.1 reality. dart_corr is a real package, the MCP wire surface is preserved through thin wrappers in dart_mcp, and correlate_timeline keeps the SQL-injection defense at the boundary.
    - Home.md TOC entry for dart-corr: removed the '(implementation currently inside dart_mcp; mid-2026 target)' subscript. The package is real now.
    - Architecture-deep-dive.md Package ownership table: removed the '*scaffold (v0.7.1) — implementation lives in dart_mcp*' subscript on the dart_corr row. dart_corr now genuinely owns what the table says it owns.

    The agentic-dart README has been updated in lockstep with the matching scaffold-removal language and the test count (79 → 93 total tests across both packages). All numbers and language now reconcile across README, Wiki, and the dart_corr package itself.

    Juwon1405 committed May 17, 2026
  • fix(dart-corr): honest scaffold status across three Wiki pages

    User flagged a real issue: dart_corr/ on GitHub is a directory containing only README.md, but multiple Wiki pages describe dart-corr as if it were a functioning component with its own files. This commit brings the Wiki language in line with the actual v0.7.1 source-tree state.

    Three changes:

    (1) Wiki/dart-corr.md '## Files' section: the 'tree' diagram falsely listed dart_corr/correlation-rules.yaml as a file that exists. It does not exist in the repo. The Implementation note was correct (it pointed at dart_mcp/__init__.py) but the file tree contradicted it. Both replaced with an honest tree showing only README.md under dart_corr/, plus exact line numbers for the three real correlate_* functions inside dart_mcp.

    (2) Wiki/Home.md Core-components TOC entry: added an inline qualifier '(implementation currently inside dart_mcp; standalone package is a mid-2026 target — see the page)' to the dart-corr bullet, so a reader scanning the TOC does not click through expecting a fully-populated package.

    (3) Wiki/Architecture-deep-dive.md package-ownership table: added a subscript '*scaffold (v0.7.1) — implementation lives in dart_mcp*' to the dart_corr row, so the architectural diagram and the ownership table tell the same truth.

    What is NOT changed:

    - The architectural design (dart-corr OWNS contradiction detection as a logical responsibility) is correct and stays.
    - The MCP-surface functions (correlate_events, correlate_timeline, correlate_download_to_execution) are real, registered, and reachable, as verified by tests/test_mcp_surface.py.
    - The Case-PtH-Timestomp and Case-IP-KVM walkthroughs accurately describe what those functions do; the 'dart-corr' references in those pages are correct as descriptions of the logical component, not as claims about file locations.

    Why the discrepancy existed: the v0.4-era plan was to ship dart_corr/ as a standalone package before the SANS submission. When the v0.5 timeline tightened, the correlation logic was inlined into dart_mcp (where the type system was already enforced) and the dart_corr/ extraction was deferred to mid-2026. The main README, the agentic-dart README, and dart_corr/README.md were all updated honestly at that time; some Wiki pages were not. Now they are.

    Juwon1405 committed May 17, 2026
  • wiki(qa-r13-15): FAQ MITRE 10/12 fix + dart-corr DuckDB ASOF→regular JOIN

    == Round 13/14/15, paired with main repo commit 4495790 ==

    Two wiki fixes this round:

    ### FAQ.md: '11/12 MITRE ATT&CK enterprise tactics' over-claim

    Note: this fix is identical in shape to round 12's MITRE fix (already in commit ef63a96). This commit catches the second cite location in FAQ (the headline-metric paragraph at 'What's the headline metric?') that the round-12 sweep missed.

    Measured by walking dart-mcp function names against MITRE tactic buckets: 10/12 enterprise tactics covered. TA0009 (Collection) and TA0011 (Command-and-Control) are roadmap items. C2 was already disclosed in the FAQ 'What would you change with more time?' answer; Collection wasn't. Fixed the headline metric to '10/12' with an explicit TA list and a link to Phase-1 for the gap analysis.

    ### dart-corr.md: DuckDB ASOF JOIN syntax error

    The advertised SQL block was:

        ASOF JOIN mft m
          ON a.ts BETWEEN m.ts - INTERVAL 15 SECOND
                      AND m.ts + INTERVAL 15 SECOND

    DuckDB's ASOF JOIN only accepts a single inequality (>=, <=, >, <) in the ON clause. BETWEEN is two inequalities, so this raises:

        BinderException: Multiple ASOF JOIN inequalities

    Reproduced on duckdb 1.5.2 (the version pinned in CI).

    The wiki narrative wants a symmetric ±15-second window for time proximity. The right shape for that is a regular JOIN with the BETWEEN clause in WHERE:

        FROM auth a, mft m
        WHERE a.ts BETWEEN m.ts - INTERVAL 15 SECOND
                       AND m.ts + INTERVAL 15 SECOND
          AND m.timestomp = TRUE

    Verified the new block returns the expected contradiction row (alice@14:22:00 ↔ /etc/shadow timestomp@14:21:55, within window).

    == Verification ==

    - Re-ran every Python block on every wiki page (7 total). 6/7 were already clean; this fix brings it to 7/7. Each block now actually runs on a fresh duckdb 1.5.2 install.
    - Re-ran scripts/measure_accuracy.py: recall=1.0, FPR=0.0, hallucination=0 (no regression from the doc fix).

    == Pattern internalised ==

    DuckDB's ASOF JOIN is a different beast from a regular range JOIN. ASOF is for 'find the most recent prior row' (single inequality); range JOINs are for 'find any row within window' (two inequalities). The wiki's narrative wanted the latter. Going forward, any wiki SQL that runs against DuckDB needs the same dry-run-on-fresh-duckdb check as the rest of the code blocks.

    Juwon1405 committed May 8, 2026
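The ASOF-versus-range distinction described in the commit above can be illustrated without a database. The following is a minimal pure-Python sketch of the two matching semantics, not the DuckDB query and not code from dart_corr. The calendar date, the two extra MFT rows, and the helper names (`ts`, `window_join`, `asof_join`) are assumptions made for illustration; only the alice and /etc/shadow timestamps come from the commit message.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=15)


def ts(hms):
    # Hypothetical fixed date: the commit message gives only times of day.
    return datetime.strptime("2026-01-01 " + hms, "%Y-%m-%d %H:%M:%S")


# (user, timestamp) rows standing in for the auth table.
auth = [("alice", ts("14:22:00"))]

# (path, timestamp, timestomp flag) rows standing in for the mft table.
mft = [
    ("/etc/shadow", ts("14:21:55"), True),  # from the commit message
    ("/tmp/later", ts("14:22:05"), True),   # hypothetical row after the logon
    ("/tmp/far", ts("14:30:00"), True),     # hypothetical row outside the window
]


def window_join(auth_rows, mft_rows, window=WINDOW):
    """Range join: every timestomped MFT row within +/- window of the auth row."""
    matches = []
    for user, a_ts in auth_rows:
        for path, m_ts, stomped in mft_rows:
            # Two inequalities, inclusive on both ends like SQL BETWEEN.
            if stomped and m_ts - window <= a_ts <= m_ts + window:
                matches.append((user, path))
    return matches


def asof_join(auth_rows, mft_rows):
    """ASOF-style join: only the most recent MFT row at or before the auth row."""
    ordered = sorted(mft_rows, key=lambda row: row[1])
    keys = [row[1] for row in ordered]
    matches = []
    for user, a_ts in auth_rows:
        idx = bisect_right(keys, a_ts)  # single inequality: m.ts <= a.ts
        if idx:
            matches.append((user, ordered[idx - 1][0]))
    return matches


if __name__ == "__main__":
    print(window_join(auth, mft))
    print(asof_join(auth, mft))
```

With this data the range join pairs alice with both rows inside the window, while the ASOF-style join returns only the single most recent prior row, which is why a symmetric window needs the two-inequality form.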
  • wiki(qa-r11): 11 hallucinations across 9 pages (function signatures, CLI flags, file refs)

    Pairs with main repo commit c34f661. Round 11 extended round 10's 'wiki/docs cite-vs-reality' sweep to all wiki pages round 10 didn't touch. Found 11 hallucinations across 9 pages.

    == Defects fixed ==

    ### dart-mcp.md: 22 function signatures wrong

    This page was the headline catalog of native MCP functions ('The 60 functions') and was citing every one of them with fictional kwargs like host=, target=, path=. This is the most important page after the README for anyone trying to understand the MCP surface. A judge clicking dart-mcp from the sidebar would have hit fictional signatures for nearly every function. Fixed:

        get_amcache(path) → get_amcache(hive_path)
        parse_prefetch(target) → parse_prefetch(prefetch_path)
        parse_shimcache(host) → parse_shimcache(system_hive)
        get_process_tree(host) → get_process_tree(process_csv)
        analyze_usb_history(host, time_window) → analyze_usb_history(system_hive, setupapi_log)
        parse_shellbags(host) → parse_shellbags(ntuser_hive)
        extract_mft_timeline(host, start, end) → extract_mft_timeline(mft_path, start, end)
        list_scheduled_tasks(host) → list_scheduled_tasks()
        detect_persistence(host) → detect_persistence()
        analyze_event_logs(host, event_ids, time_window) → analyze_event_logs(events_json)
        parse_unified_log(host, subsystem, time_window) → parse_unified_log(unifiedlog_json)
        parse_knowledgec(host) → parse_knowledgec(knowledgec_db)
        parse_fsevents(host) → parse_fsevents(fsevents_csv)
        parse_browser_history(host, browser) → parse_browser_history(history_db)
        analyze_downloads(host) → analyze_downloads(downloads_source)
        correlate_download_to_execution(host) → correlate_download_to_execution(downloads, executions)
        detect_exfiltration(host, time_window) → detect_exfiltration()
        analyze_windows_logons(host) → analyze_windows_logons(security_events_json)
        detect_lateral_movement(host) → detect_lateral_movement()
        analyze_kerberos_events(host) → analyze_kerberos_events(security_events_json)
        analyze_unix_auth(host, time_window) → analyze_unix_auth(auth_log_path)
        detect_privilege_escalation(host) → detect_privilege_escalation()
        analyze_web_access_log(path, rules) → analyze_web_access_log(access_log)
        detect_webshell(path) → detect_webshell(webroot)
        detect_brute_force_rdp(host) → detect_brute_force_rdp(security_events_json)
        detect_credential_access(host) → detect_credential_access()
        detect_ransomware_behavior(host) → detect_ransomware_behavior()
        detect_defense_evasion(host) → detect_defense_evasion()
        detect_discovery(host) → detect_discovery()
        correlate_timeline(start, end, sources) → correlate_timeline(events)

    All verified against live inputSchema.required. No-arg functions (the post-Phase-1 detect_* family) had fictional '(host)' parameters that don't exist in the schema at all.

    ### Case-PtH-Timestomp.md: list_scheduled_tasks(host=...)

    Same residual fix as docs/case-pth-timestomp.md (round 10 caught 3 of 4 fictional signatures on this page; r11 caught the last one).

    ### Operator-guide.md / Running-on-macOS.md: --evidence flag

    Both pages advertised '--evidence /path/to/evidence' as a CLI flag. Round 10 caught the same hallucination in Live-mode.md but missed these two operator-facing pages: the SIFT VM install and macOS dev-mode pages a judge would land on after the README directs them to operator-guide. Fixed both to use 'export DART_EVIDENCE_ROOT=...' (the actual env-var pattern) before invoking the agent.

    ### Case-IP-KVM.md / Running-on-SIFT.md / Writing-case-studies.md: missing --out

    All three advertised 'python3 -m dart_agent --case ID --max-iterations 25' but --out is a required argparse argument. Without it the CLI errors with 'argument --out is required'. Added --out to the example invocations on all three pages.

    ### FAQ.md: '36th appears or one of the 35'

    The 'Is the MCP surface really fixed in size?' answer used '35' as the surface-count anchor. Total surface is 60 (35 native + 25 SIFT adapters), so the 'a 36th appears' phrasing has been stale since v0.5. Fixed to 'a 61st appears or any of the 60 (35 native + 25 SIFT adapters) disappears'. The same page's overview (line 99) already cited 60 correctly, making the line-25 mistake an inter-paragraph drift inside one page, caught by re-reading from a judge's flow rather than from a count-grep.

    ### dart-corr.md: illustrative pseudocode framing

    The pseudocode block was labeled '# dart_corr/__init__.py — simplified', which an attentive reader could mistake for a pointer at a real file. dart_corr/ contains only README.md; the actual correlation code is in dart_mcp/__init__.py. The page's 'Implementation note' at the bottom already says this, but reading the pseudocode header in isolation gives the wrong impression. Reframed the comment to 'Illustrative — real implementation lives in dart_mcp/__init__.py' inline so the framing is correct at point-of-read.

    == Verification approach ==

    For each function-signature fix:

    1. Pulled the live inputSchema.required from list_tools()
    2. Verified the kwarg names match what dart_mcp/__init__.py actually accepts
    3. Where the old wiki signature included optional kwargs that don't exist (e.g., 'time_window' on detect_exfiltration), dropped them rather than mapping to a different optional

    For CLI fixes: confirmed against 'python3 -m dart_agent --help' output (only --case, --out, --max-iterations, --mode, --prompt, --model, --dry-run exist).

    == Pattern internalized ==

    Round 10 found a few signature hallucinations on the prominent Case-PtH page. Round 11 showed they were endemic on the headline catalog page (dart-mcp.md): every single one of 22 cited functions had a fictional kwarg. Likely cause: the wiki was drafted from a v0.3-era memory of the surface, then never re-synced to the actual schema during the v0.4/v0.5 expansions. Going forward: any wiki page that lists multiple function signatures gets re-grep'd against list_tools() schema after every surface change, not just every release.

    Juwon1405 committed May 8, 2026
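The verification approach in the commit above (compare each documented signature against the live input schema) lends itself to a small script. The following is a hedged sketch of that kind of check, not the project's actual tooling. The `LIVE_SCHEMAS` dictionary is a hand-written stand-in for what list_tools() would return; its JSON-Schema-style `properties`/`required` shape and the `"string"` types are assumptions, and `parse_signature` and `find_drift` are hypothetical helper names. The old and fixed signatures are taken from the commit message.

```python
import re

# Matches "name(arg1, arg2)" as it appears in wiki prose.
SIGNATURE_RE = re.compile(r"^\s*(\w+)\s*\((.*)\)\s*$")


def parse_signature(text):
    """Split 'tool(a, b)' into ('tool', ['a', 'b'])."""
    match = SIGNATURE_RE.match(text)
    if match is None:
        raise ValueError(f"not a signature: {text!r}")
    name, arg_text = match.groups()
    params = [p.strip() for p in arg_text.split(",") if p.strip()]
    return name, params


def find_drift(documented, live_schemas):
    """Return {tool_name: [problems]} for signatures that disagree with the schema."""
    drift = {}
    for text in documented:
        name, params = parse_signature(text)
        schema = live_schemas.get(name)
        if schema is None:
            drift[name] = ["tool not present in live surface"]
            continue
        known = set(schema.get("properties", {}))
        required = schema.get("required", [])
        # A documented parameter the schema does not know is a hallucination.
        problems = [f"unknown parameter: {p}" for p in params if p not in known]
        # A required parameter the docs omit makes the example fail on copy-paste.
        problems += [f"missing required parameter: {r}" for r in required if r not in params]
        if problems:
            drift[name] = problems
    return drift


# Hypothetical stand-in for the live schema; the real one comes from list_tools().
LIVE_SCHEMAS = {
    "get_amcache": {"properties": {"hive_path": {"type": "string"}}, "required": ["hive_path"]},
    "parse_prefetch": {"properties": {"prefetch_path": {"type": "string"}}, "required": ["prefetch_path"]},
    "detect_exfiltration": {"properties": {}, "required": []},
}

OLD_WIKI = ["get_amcache(path)", "parse_prefetch(target)", "detect_exfiltration(host, time_window)"]
FIXED_WIKI = ["get_amcache(hive_path)", "parse_prefetch(prefetch_path)", "detect_exfiltration()"]

if __name__ == "__main__":
    for tool, problems in find_drift(OLD_WIKI, LIVE_SCHEMAS).items():
        print(tool, problems)
```

Checking documented names against `properties` as well as `required` matters for the no-argument detect_* family: a fictional '(host)' parameter passes a required-only check because nothing is required.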
  • wiki(qa-r10): kill function-signature + file-existence hallucinations across 6 pages

    Pairs with main repo commit 8a1917b. Round 10 was a 'judge follows every advertised command line by line' pass. It surfaced 6 distinct hallucinations a SANS judge would have hit if they tried to reproduce anything from the wiki.

    == Defects fixed ==

    ### Accuracy.md: broken script reference

    Advertised 'bash scripts/run-accuracy-suite.sh'. That script doesn't exist and never has. The actual reproducer is 'python3 scripts/measure_accuracy.py' with the standard PYTHONPATH export. A judge running the README's accuracy claim through this page would have hit:

        bash: scripts/run-accuracy-suite.sh: No such file or directory

    Replaced with the real measure_accuracy.py invocation, which was verified end-to-end (recall=1.0, FPR=0.0, hallucination_count=0, evidence_integrity_preserved=true).

    ### Case-PtH-Timestomp.md: 3 function-signature errors

    All three are the same class of mistake: the wiki cited positional/keyword args that don't exist on the actual MCP tools:

        'dart-agent --hunt' → 'python3 -m dart_agent --case ... --out ... --mode deterministic'
        'get_process_tree(host=...)' → 'get_process_tree(process_csv=...)'
        'analyze_windows_logons(host=...)' → 'analyze_windows_logons(security_events_json=...)'
        'parse_prefetch(target=...)' → 'parse_prefetch(prefetch_path=...)'

    These same mistakes live in docs/case-pth-timestomp.md (fixed in the paired repo commit). Verified by pulling live inputSchema.required from list_tools() for each tool.

    ### dart-agent.md: run_loop() and 4 fictional files

    The page advertised:

    - 'run_loop() in dart_agent/src/dart_agent/__init__.py'
    - A file inventory citing loop.py, decision.py, hypothesis.py, serializer.py, none of which exist.

    The actual structure is __init__.py + __main__.py + live.py. The senior-analyst loop is the DeterministicAnalyst class's .run() method (4 internal phases: _phase_timeline → _phase_hypothesis → _phase_validate_usb → _phase_finalize). Rewrote both the 'What it owns' bullet and the Files block to match reality. Added an explanatory note that the agent is small enough to keep its control flow in __init__.py.

    ### dart-audit.md: 3 hallucinations in one example

    The advertised AuditLogger.log() example used:

    - outputs={...}: actual kwarg is 'output' (singular)
    - cpu_ms=42: no such kwarg
    - bytes_read=1024: no such kwarg

    Real signature is:

        log(tool_name, inputs, output, iteration, token_count_in, token_count_out, finding_ids=None)

    Same page advertised audit_id type as 'UUID4'; actual is 8-character hex (secrets.token_hex(4)). Same page advertised 'output/<run_id>/<audit_id>.json' as the per-call output storage location; that directory layout doesn't exist, and outputs are referenced by SHA-256 digest only in deterministic mode. Fixed all three. Verified the corrected example works as a copy-paste: wrote a test audit log, verified the chain, ran CLI (verify + trace), all green.

    ### dart-corr.md: serializer.py hallucination

    Page claimed UNRESOLVED contradictions are blocked by 'the serializer (dart_agent/serializer.py)'. There is no serializer.py file. The blocking happens inside DeterministicAnalyst's finding emission path in __init__.py. Rewrote the sentence to point at the real location.

    ### Live-mode.md: 2 hallucinations in the headline example

    - '--evidence /mnt/case-evidence': no such CLI flag. Real pattern is 'export DART_EVIDENCE_ROOT=/path' before invoking the agent.
    - 'Claude sees exactly 35 typed forensic functions': should be 60 (35 native + 25 SIFT adapters). Stale from the v0.4 surface, missed in earlier rounds because Live-mode.md wasn't part of the surface-count grep targets.

    Fixed both. Added an explicit '(Add --dry-run to use a scripted mock Claude with no API key)' line for CI / offline reproduction.

    == Verification approach ==

    For each defect:

    1. Read the wiki claim
    2. Pulled the actual code/schema (inputSchema, argparse output, filesystem ls, AuditLogger signature via inspect)
    3. Compared advertised ↔ actual
    4. Fixed the wiki, then re-verified the fixed example by either running it (Accuracy.md, dart-audit.md) or by checking it would no longer raise on a copy-paste

    == Pattern internalised ==

    Round 9 caught output-key hallucinations in code examples. Round 10 caught argument-name hallucinations and file-path hallucinations in tutorial prose, a different surface that print-output dry-runs don't cover. Going forward, any wiki/docs page that references a function by name + signature should be diff-checked against the live inputSchema.required list whenever the underlying code changes.

    Juwon1405 committed May 8, 2026
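The dart-audit.md fix in the commit above relied on checking an advertised call against the function's signature via inspect. A minimal sketch of that check follows, assuming a stub in place of the real logger. The `log` function here is a stand-in that only copies the parameter list quoted in the commit message; its body, the `call_matches` and `new_audit_id` helpers, and the example argument values are hypothetical.

```python
import inspect
import secrets


def log(tool_name, inputs, output, iteration,
        token_count_in, token_count_out, finding_ids=None):
    """Stub with the parameter list quoted in the commit; the body is a placeholder."""
    return None


def call_matches(func, **kwargs):
    """True if func could be called with exactly these keyword arguments."""
    try:
        # bind() raises TypeError on unknown kwargs or missing required ones,
        # without actually calling the function.
        inspect.signature(func).bind(**kwargs)
        return True
    except TypeError:
        return False


def new_audit_id():
    # secrets.token_hex(4) yields 4 random bytes rendered as 8 hex characters.
    return secrets.token_hex(4)


# Keyword names from the old wiki example (values are made up).
old_example = dict(tool_name="get_amcache", inputs={}, outputs={}, iteration=1,
                   token_count_in=10, token_count_out=20, cpu_ms=42, bytes_read=1024)

# Keyword names from the corrected signature (values are made up).
fixed_example = dict(tool_name="get_amcache", inputs={}, output={}, iteration=1,
                     token_count_in=10, token_count_out=20)

if __name__ == "__main__":
    print(call_matches(log, **old_example))
    print(call_matches(log, **fixed_example))
    print(new_audit_id())
```

Binding against the signature catches both failure modes named in the commit (a misspelled kwarg such as 'outputs' and kwargs that do not exist at all) without needing a real audit log on disk.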
  • wiki: add 12 missing pages, fix all 32 broken links

    The wiki sidebar and Home page referenced 13 pages that didn't exist, producing the GitHub 'create new page' UI when clicked. Adds:

    Concepts:
    - Glossary: DFIR / agent / MCP terms

    The 5 packages:
    - dart-agent: senior-analyst wrapper loop
    - dart-corr: cross-artifact correlation engine
    - dart-audit: SHA-256 chained audit log
    - dart-playbook: YAML sequencing rules
    (dart-mcp already existed)

    Reference:
    - Comparison: vs Velociraptor / Plaso / EZ tools / SOAR / vanilla LLMs

    Running it:
    - Running-on-SIFT: SANS SIFT VM 5-minute setup
    - Running-on-macOS: macOS-specific mount conventions
    - Live-mode: real Claude API + MCP stdio integration

    Case studies:
    - Case-PtH-Timestomp: Pass-the-Hash + timestomp pre-existence
    - Case-IP-KVM: IP-KVM remote-hands insider scenario
    - Writing-case-studies: guide for contributing new case studies

    Project:
    - Accuracy: reproducible accuracy methodology + numbers

    The Roadmap-Phase-2/3/4 links in Home.md were repointed to the existing Roadmap page's anchors (those were never separate pages). The Contributing link in dart-mcp.md now points to CONTRIBUTING.md in the main repo. _Sidebar.md restructured into 6 named sections so the 25-page wiki is navigable.

    Final broken-link count: 0.

    Juwon1405 committed Apr 30, 2026