History / dart audit

Revisions

wiki: align evidence model with code — per-case evidence_root, drop obsolete shared sample-evidence, link live demo video

Juwon1405 committed Jun 15, 2026

5d4e0a4
wiki(qa-r10): kill function-signature + file-existence hallucinations across 6 pages Pairs with main repo commit 8a1917b. Round 10 was a 'judge follows every advertised command line by line' pass — surfaced 6 distinct hallucinations a SANS judge would have hit if they tried to reproduce anything from the wiki. == Defects fixed == ### Accuracy.md — broken script reference Advertised 'bash scripts/run-accuracy-suite.sh'. That script doesn't exist and never has. The actual reproducer is 'python3 scripts/measure_accuracy.py' with the standard PYTHONPATH export. A judge running the README's accuracy claim through this page would have hit: bash: scripts/run-accuracy-suite.sh: No such file or directory Replaced with the real measure_accuracy.py invocation, which was verified end-to-end (recall=1.0, FPR=0.0, hallucination_count=0, evidence_integrity_preserved=true). ### Case-PtH-Timestomp.md — 3 function-signature errors All three are the same class of mistake — the wiki cited positional/keyword args that don't exist on the actual MCP tools: 'dart-agent --hunt' → 'python3 -m dart_agent --case ... --out ... --mode deterministic' 'get_process_tree(host=...)' → 'get_process_tree(process_csv=...)' 'analyze_windows_logons(host=...)' → 'analyze_windows_logons(security_events_json=...)' 'parse_prefetch(target=...)' → 'parse_prefetch(prefetch_path=...)' These same mistakes live in docs/case-pth-timestomp.md (fixed in the paired repo commit). Verified by pulling live inputSchema.required from list_tools() for each tool. ### dart-agent.md — run_loop() and 4 fictional files The page advertised: - 'run_loop() in dart_agent/src/dart_agent/__init__.py' - A file inventory citing loop.py, decision.py, hypothesis.py, serializer.py — none of which exist. The actual structure is __init__.py + __main__.py + live.py. The senior-analyst loop is the DeterministicAnalyst class's .run() method (4 internal phases: _phase_timeline → _phase_hypothesis → _phase_validate_usb → _phase_finalize). Rewrote both the 'What it owns' bullet and the Files block to match reality. Added an explanatory note that the agent is small enough to keep its control flow in __init__.py. ### dart-audit.md — 3 hallucinations in one example The advertised AuditLogger.log() example used: - outputs={...} — actual kwarg is 'output' (singular) - cpu_ms=42 — no such kwarg - bytes_read=1024 — no such kwarg Real signature is: log(tool_name, inputs, output, iteration, token_count_in, token_count_out, finding_ids=None) Same page advertised audit_id type as 'UUID4' — actual is 8-character hex (secrets.token_hex(4)). Same page advertised 'output/<run_id>/<audit_id>.json' as the per-call output storage location — that directory layout doesn't exist; outputs are referenced by SHA-256 digest only in deterministic mode. Fixed all three. Verified the corrected example works as a copy-paste — wrote a test audit log, verified the chain, ran CLI (verify + trace) all green. ### dart-corr.md — serializer.py hallucination Page claimed UNRESOLVED contradictions are blocked by 'the serializer (dart_agent/serializer.py)'. There is no serializer.py file. The blocking happens inside DeterministicAnalyst's finding emission path in __init__.py. Rewrote the sentence to point at the real location. ### Live-mode.md — 2 hallucinations in the headline example - '--evidence /mnt/case-evidence' — no such CLI flag. Real pattern is 'export DART_EVIDENCE_ROOT=/path' before invoking the agent. - 'Claude sees exactly 35 typed forensic functions' — should be 60 (35 native + 25 SIFT adapters). Stale from the v0.4 surface, missed in earlier rounds because Live-mode.md wasn't part of the surface-count grep targets. Fixed both. Added an explicit '(Add --dry-run to use a scripted mock Claude with no API key)' line for CI / offline reproduction. == Verification approach == For each defect: 1. Read the wiki claim 2. Pulled the actual code/schema (inputSchema, argparse output, filesystem ls, AuditLogger signature via inspect) 3. Compared advertised ↔ actual 4. Fixed the wiki, then re-verified the fixed example by either running it (Accuracy.md, dart-audit.md) or by checking it would no longer raise on a copy-paste == Pattern internalised == Round 9 caught output-key hallucinations in code examples. Round 10 caught argument-name hallucinations and file-path hallucinations in tutorial prose — a different surface that print-output dry-runs don't cover. Going forward, any wiki/docs page that references a function by name + signature should be diff-checked against the live inputSchema.required list whenever the underlying code changes.

Juwon1405 committed May 8, 2026

1c089f4
wiki: add 12 missing pages, fix all 32 broken links The wiki sidebar and Home page referenced 13 pages that didn't exist, producing the GitHub 'create new page' UI when clicked. Adds: Concepts: Glossary — DFIR / agent / MCP terms The 5 packages: dart-agent — senior-analyst wrapper loop dart-corr — cross-artifact correlation engine dart-audit — SHA-256 chained audit log dart-playbook — YAML sequencing rules (dart-mcp already existed) Reference: Comparison — vs Velociraptor / Plaso / EZ tools / SOAR / vanilla LLMs Running it: Running-on-SIFT — SANS SIFT VM 5-minute setup Running-on-macOS — macOS-specific mount conventions Live-mode — real Claude API + MCP stdio integration Case studies: Case-PtH-Timestomp — Pass-the-Hash + timestomp pre-existence Case-IP-KVM — IP-KVM remote-hands insider scenario Writing-case-studies — guide for contributing new case studies Project: Accuracy — reproducible accuracy methodology + numbers The Roadmap-Phase-2/3/4 links in Home.md were repointed to the existing Roadmap page's anchors (those were never separate pages). The Contributing link in dart-mcp.md now points to CONTRIBUTING.md in the main repo. _Sidebar.md restructured into 6 named sections so the 25-page wiki is navigable. Final broken-link count: 0.

Juwon1405 committed Apr 30, 2026

b73bb8e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

History / dart audit

Revisions