History

Revisions

wiki: drop macOS host-install guide+links — host is Linux only; macOS stays an analysis target

Juwon1405 committed Jun 15, 2026

5a38ba3
wiki: align evidence model with code — per-case evidence_root, drop obsolete shared sample-evidence, link live demo video

Juwon1405 committed Jun 15, 2026

5d4e0a4
wiki: bring release log to v1.2.0, document model-aware auth + live Sigma pack, add self-learning loop design

Juwon1405 committed Jun 15, 2026

519383f
Home: add demo video link

Juwon1405 committed Jun 15, 2026

cbac95d
docs(accuracy): move benchmark scores to docs/benchmarks single source of truth; keep invariants (hallucination 0, read-only boundary, needle-in-haystack)

Juwon1405 committed Jun 15, 2026

57aa9fb
wiki: correct tool count to live 73 (48 native + 25 SIFT); drop v1.0.2 version pin

The current-surface counts were stale: 72 (47 native) -> 73 (48 native) after the Sigma matcher tool landed. Fixed in Glossary, Live-mode, Phase-1 (the live-surface line), and Roadmap. The Glossary's 'As of v1.0.2' version pin is dropped so the count needn't carry a release number. The Phase-1 changelog row for v0.7.1 keeps its then-current '72'; that's an accurate historical record, not the live count.

Juwon1405 committed Jun 15, 2026

4fe34ed
docs(wiki): present venv as optional isolation, not a recommended step

install.sh no longer creates or requires a virtualenv. Reword the Operator guide so the venv is opt-in isolation rather than a recommended SIFT step, matching the installer's actual behavior.

Juwon1405 committed Jun 14, 2026

d899e2e
docs(wiki): correct 'venv-first' to reflect installer no longer forces a private venv

Juwon1405 committed Jun 14, 2026

3cb275b
docs(wiki): reconcile tactic coverage to 10/12 (was 11)

FAQ/Phase-1/Roadmap claimed TA0011 (C2) was covered by detect_dns_tunneling (only TA0009 deferred = 11/12), contradicting accuracy-report + README + DEVPOST + Pages (10/12). Per the conservative scoped-rule standard, both TA0009 (Collection) and full TA0011 (C2) are Phase-2; detect_dns_tunneling adds partial DNS-tunneling C2 indicators. Dated v0.6.1 history rows left as-is.

Juwon1405 committed Jun 13, 2026

fd62af1
docs(wiki): fix stale tool-count tripwire in FAQ

The surface is 72 tools, so the 'exact set' tripwire is a 73rd tool appearing, not a 68th. Also drop a doubled 'the'.

Juwon1405 committed Jun 13, 2026

e53e876
docs(wiki): update IP-KVM case page to the current case-01

The page still described the pre-restructure standalone case: a flat case-01-ipkvm-insider/ dir with invented CSV filenames, a '12 findings' count (the case has 5), and an old dart_agent CLI with a now-dead path.

Align it to reality: artifacts are the bundled examples/sample-evidence/ tree; reproduction is bash examples/demo-run.sh with the audit log under examples/out/find-evil-ref-01/; the measured block matches the current measure_accuracy output (67 files, 3-entry chain). Also fix the same old case name in the case-writing guide.

Juwon1405 committed Jun 13, 2026

6e30068
docs(wiki): de-pin stale test count in threat model

The threat model said '20-test suite'; the suite is now 150 tests. Drop the hard number to match how the rest of the wiki phrases it ('the full pytest suite'), so it can't drift again.

Juwon1405 committed Jun 13, 2026

11078fd
docs(wiki): update tool-surface as-of version to v1.0.2

Juwon1405 committed Jun 12, 2026

84e06f1
docs(wiki): real-case workflow uses run_eval.py --evidence + host collector model

Step 6: incident host runs the Velociraptor offline collector (no install) -> evidence.zip -> adapter --source zip|image -> run_eval.py --evidence.

Juwon1405 committed Jun 11, 2026

cec324c
revert(wiki): document ANTHROPIC_API_KEY-only auth, drop 'claude login'

Reusing the local Claude Code login triggers a refresh-token rotation that logs the Claude Code client out; the wiki now documents the API-key path only. Keeps the install --full and demo/output structure.

Juwon1405 committed Jun 11, 2026

6f30dc3
docs(wiki): simplify SIFT/Live-mode setup and surface 'claude login'

Running-on-SIFT: install.sh --full one-shot, separate Authenticate step (API key or claude login), demo+run, and out/<tier>/<case>/<timestamp>/ outputs.

Live-mode: credentials are API key or claude login; run_eval.py is the entry.

Juwon1405 committed Jun 11, 2026

98bb2eb
docs(wiki): fix remaining stale CFReDS case path and v1.0.0-as-current in Glossary

Juwon1405 committed Jun 11, 2026

3a0ca09
docs(wiki): align Accuracy/Home with canonical evidence and tiered cases

Remove the public --variant / sample-evidence-realistic concept from Accuracy (single canonical evidence_root + CI fixture), retier the case tables to self-evaluation/external-evaluation, fix case links to the new index-only paths, rename ground-truth.json to truth.json, and drop a stale tool-count. Dated historical roadmap entries in Phase-1 keep their original case numbers.

Juwon1405 committed Jun 10, 2026

37bd1cc
docs: align wiki with current live-mode scope

Document live mode through ANTHROPIC_API_KEY and --dry-run, remove public zero-cost/OAuth setup claims, and update Claude MCP registration to dart_mcp.server_stdio.

Refresh accuracy evidence counts to 62 reference files and 67 realistic files, clarify that the measured identical result applies to case-01 F-001/F-013, and remove stale 50-file language.

Update operator, SIFT, macOS, roadmap, and Phase 1 pages to the 72-tool surface and current full-suite validation model without stale 35-tool or 75-test guidance.

Fix the Home architecture link and describe external entries as case-study slots instead of fully measured benchmark rows.

QA: git diff --check passed for the wiki.

Juwon1405 committed Jun 10, 2026

f9dc340
docs: glossary tool-surface as of v1.0.0

Juwon1405 committed Jun 5, 2026

2b06a4b
docs(wiki): fix remaining stale 'the 60' (Live-mode) and sonnet-4 model name (Accuracy)

Juwon1405 committed Jun 5, 2026

c6b8958
docs(wiki): fix stale 35/60 surface counts and default model name

Live registry is 47 native + 25 SIFT = 72; code default is claude-haiku-4-5 (sonnet-4-6 is the --model higher-fidelity override).

- dart-agent.md: default claude-sonnet-4 -> claude-haiku-4-5; 35-function -> 47.
- Live-mode.md: default sonnet-4 -> haiku-4-5; 60 typed -> 72; cost-example model name sonnet-4 -> sonnet-4-6 with the haiku default made explicit.
- SIFT-adapter-layer.md: 35 forensic functions -> 47.

(Phase-1.md v0.4/v0.5 timeline rows keep 35/60 as point-in-time history.)

Juwon1405 committed Jun 5, 2026

56610a7
docs(wiki): sync MCP-function-catalog to the live 47-native surface

The category table and function list were stale at 35 native / 60 total. The live registry (test_mcp_surface asserts the exact set) is 47 native + 25 SIFT = 72.

Updated category counts (macOS 4->5, Linux 3->6, Linux+macOS 1->2, Cross-platform 14->21, native total 35->47, grand total 60->72) and added the 7 functions that shipped in v0.6.1-v0.7.1 with their code descriptions and ATT&CK mappings:

- parse_registry_hive
- grep_shell_history_for_c2
- detect_credential_file_access
- scan_pth_files_for_supply_chain_iocs
- detect_nodejs_install_hooks
- detect_pypi_typosquatting
- detect_python_backdoor_persistence

Phase-1.md: "72 native" -> "47 + 25 = 72".

Juwon1405 committed Jun 5, 2026

30606a5
docs(wiki): align Accuracy and Glossary with the realistic-variant enrichment design

- Accuracy.md: the realistic row claimed the generator synthesizes "security events 516" -- it does not; the security EventLog is hand-curated at ~11,530 lines. Only the two IOC-only logs (web access, unix auth) are noise-injected. Dropped the "production-shape / production-noise-injected" overstatement (the enriched ratio is ~1:30).
- Glossary.md: the MCP surface is 47 native + 25 SIFT = 72, not "72 native".

Juwon1405 committed Jun 5, 2026

2ff2513
wiki(dart-corr): reflect v0.7.1, extracted to real package

Companion to agentic-dart commit 49e772c which extracts dart_corr from a docs-only scaffold into a real standalone package with code, 14 unit tests, and an operator-tunable rule pack.

Wiki changes:

- dart-corr.md 'Files' block: replaced the old tree (which showed a nonexistent correlation-rules.yaml and pointed implementation at dart_mcp) with the real v0.7.1 layout: pyproject.toml, correlation-rules.yaml, src/dart_corr/__init__.py, tests/test_dart_corr.py.
- 'Implementation note': replaced the scaffold caveat with the v0.7.1 reality: dart_corr is a real package, the MCP wire surface is preserved through thin wrappers in dart_mcp, and correlate_timeline keeps the SQL-injection defense at the boundary.
- Home.md TOC entry for dart-corr: removed the '(implementation currently inside dart_mcp; mid-2026 target)' subscript. The package is real now.
- Architecture-deep-dive.md Package ownership table: removed the '*scaffold (v0.7.1) — implementation lives in dart_mcp*' subscript on the dart_corr row. dart_corr now genuinely owns what the table says it owns.

The agentic-dart README has been updated in lockstep with the matching scaffold-removal language and the test count (79 → 93 total tests across both packages). All numbers and language now reconcile across README, Wiki, and the dart_corr package itself.

Juwon1405 committed May 17, 2026

60260cc
fix(dart-corr): honest scaffold status across three Wiki pages

User flagged a real issue: dart_corr/ on github is a directory containing only README.md, but multiple Wiki pages describe dart-corr as if it were a functioning component with its own files. This commit brings the Wiki language in line with the actual v0.7.1 source-tree state.

Three changes:

(1) Wiki/dart-corr.md '## Files' section: the 'tree' diagram falsely listed dart_corr/correlation-rules.yaml as a file that exists. It does not exist in the repo. The Implementation note was correct (it pointed at dart_mcp/__init__.py) but the file tree contradicted it. Both replaced with an honest tree showing only README.md under dart_corr/, plus exact line numbers for the three real correlate_* functions inside dart_mcp.

(2) Wiki/Home.md Core-components TOC entry: added an inline qualifier '(implementation currently inside dart_mcp; standalone package is a mid-2026 target — see the page)' to the dart-corr bullet, so a reader scanning the TOC does not click through expecting a fully-populated package.

(3) Wiki/Architecture-deep-dive.md package-ownership table: added a subscript '*scaffold (v0.7.1) — implementation lives in dart_mcp*' to the dart_corr row, so the architectural diagram and the ownership table tell the same truth.

What is NOT changed:

- The architectural design (dart-corr OWNS contradiction detection as a logical responsibility) is correct and stays.
- The MCP-surface functions (correlate_events, correlate_timeline, correlate_download_to_execution) are real, registered, and reachable, as verified by tests/test_mcp_surface.py.
- Case-PtH-Timestomp and Case-IP-KVM walkthroughs accurately describe what those functions do; the 'dart-corr' references in those pages are correct as descriptions of the logical component, not as claims about file locations.

Why the discrepancy existed: v0.4-era plan was to ship dart_corr/ as a standalone package before the SANS submission. When the v0.5 timeline tightened, the correlation logic was inlined into dart_mcp (where the type system was already enforced) and the dart_corr/ extraction was deferred to mid-2026. The main README, the agentic-dart README, and dart_corr/README.md all updated honestly at that time; some Wiki pages did not. Now they do.

Juwon1405 committed May 17, 2026

3df5ffb
wiki: Case-IP-KVM measured accuracy block matched to actual case-01 data

QA round caught: Case-IP-KVM.md claimed '12 ground-truth findings' and '8 files' in the measured-accuracy block. case-01-ipkvm-insider actually has 5 ground-truth findings (verified by counting ground_truth_findings array in examples/case-studies/case-01-ipkvm-insider/ground-truth.json), referencing 7 distinct evidence paths (3 logical artifacts x 2 variant copies + audit.jsonl).

corrected:

- L49 '12 ground-truth findings' -> '5 ground-truth findings'
- L65 '12 / 12 ground-truth findings' -> '5 / 5 ground-truth findings'
- L68 '(8 files, all SHA-256 unchanged)' -> '(7 evidence paths, all SHA-256 unchanged)'

the '12 findings' figure appears to have been a pre-v0.5 hallucination that survived previous QA rounds because the block was inside a fenced code example and reviewers tend to read those as opaque.

Juwon1405 committed May 16, 2026

382bf69
wiki QA pass: file count 49->50, test count 31->75 (current snapshots only)

post-v0.7.1 QA audit caught two latent drifts:

evidence file count:

- Accuracy.md L64 sample-evidence-realistic '49 files' was correct at the v0.7.0 evidence-fidelity enrichment time but v0.7.1 added linux/cron/sample.crontab fixture, raising the count to 50. measure_accuracy --variant realistic now reports evidence_files_measured: 50 against ground truth F-001 + F-013, which matches the actual repo state.

test count:

- Operator-guide.md L55 step-by-step quick-start
- Phase-1.md L50 Empirical-validation 'fresh clone' summary
- Roadmap.md L60 Phase-1 validation summary
- Running-on-macOS.md L57 step header + L134 Apple Silicon notes

all said '31 tests' (the v0.5.2 snapshot baseline). v0.7.1 ships '75 of 75 tests passing'. updated only the present-tense fresh-clone claims; the historical v0.5.2 release row in Phase-1.md L109 ('-> 31 tests passing') is preserved verbatim as a dated milestone.

Juwon1405 committed May 16, 2026

af7ec2b
wiki: sync to v0.7.1: 11 cases, 72 MCP functions, case-11 highlight

- Accuracy.md: '61 files' -> '49 files'; new v0.7.0 section covering case-11 supply-chain attack class; new v0.7.0 case-library summary table (11 cases / 99 findings split 69 layer-1 + 30 layer-2 + 32/36 function coverage)
- Glossary.md: 'As of v0.6.0' -> 'As of v0.7.1: 72 native MCP tools'
- Home.md: case-studies section rewritten to mention 11 cases / 99 findings plus case-11 as recommended judge walkthrough
- MCP-function-catalog.md: previously missed v0.6.1 functions (parse_macos_quarantine, parse_linux_cron_jobs, detect_dns_tunneling) + v0.7.1 functions (parse_linux_text_log, parse_linux_shell_history) now properly documented with MITRE technique mappings and references
- Phase-1.md: timeline extended with v0.5.4, v0.6.0, v0.6.1, v0.7.0, v0.7.1 milestones

deliberately not touched; these are version-anchored historical records: v0.5.4 CFReDS section (locked at first external benchmark), playbook 'target_case_classes: 10 case classes' (playbook scenario classes, not evidence cases), v0.4 / v0.5 release rows.

Juwon1405 committed May 16, 2026

141623f
wiki: reflect v0.6.1 TA0011 entry: detect_dns_tunneling ships

Three pages had TA0011 (Command-and-Control) listed as 'deferred to Phase 2' or 'partial coverage'. v0.6.1's detect_dns_tunneling adds:

- Iodine and dnscat2 tool signature detection
- Shannon-entropy on subdomain labels (threshold 3.8)
- Long-label heuristic (>50 chars, near DNS spec max 63)
- Rare query-type flagging (TXT / NULL / CNAME with subdomain)
- Per-parent-domain volume in sliding window
- BIND9 / dnsmasq / generic FQDN-extraction fallback parsers

This opens active TA0011 coverage at the analysis layer. Full PCAP-based C2 detection is still Phase 2, but the typed MCP surface now meaningfully covers the tactic via DNS log analysis. Pages updated: FAQ.md L99, Phase-1.md L36, Roadmap.md L41.

TA0009 Collection remains the single tactic explicitly deferred; that is collector-side (live memory capture) rather than analysis-side, which is by design for an architecture that consumes pre-collected evidence.

Juwon1405 committed May 14, 2026

9803433
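
For readers unfamiliar with two of the heuristics named in the v0.6.1 entry above (Shannon entropy on subdomain labels with a 3.8 threshold, and the long-label check at more than 50 characters), the following is a minimal sketch of the idea. It is not the project's detect_dns_tunneling code. The function names, treating the last two labels as the parent domain, measuring entropy in bits per character, and using strict comparisons against the thresholds are all assumptions made for illustration.

```python
import math
from collections import Counter

# Thresholds quoted in the commit message; strict ">" comparison is an assumption.
ENTROPY_THRESHOLD = 3.8   # bits per character (unit assumed)
LONG_LABEL_CHARS = 50     # DNS labels are capped at 63 characters


def shannon_entropy(label: str) -> float:
    """Shannon entropy of the character distribution in one DNS label."""
    if not label:
        return 0.0
    total = len(label)
    return sum(
        (n / total) * math.log2(total / n) for n in Counter(label).values()
    )


def flag_subdomain_labels(fqdn: str, parent_label_count: int = 2) -> list[str]:
    """Hypothetical helper: return reasons a query name looks tunnel-like.

    Assumes the parent domain is the last `parent_label_count` labels,
    which is a simplification (it ignores multi-label public suffixes).
    """
    labels = fqdn.rstrip(".").lower().split(".")
    subdomain_labels = labels[:-parent_label_count]
    reasons = []
    for label in subdomain_labels:
        if shannon_entropy(label) > ENTROPY_THRESHOLD:
            reasons.append(f"high-entropy label: {label}")
        if len(label) > LONG_LABEL_CHARS:
            reasons.append(f"long label ({len(label)} chars): {label}")
    return reasons
```

A label needs at least 14 distinct characters to exceed 3.8 bits, so short ordinary hostnames such as `www` cannot trip the entropy check, while encoded payload chunks usually do.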
