v0.51.0 — first real corporate-share benchmark

The first published head-to-head against upstream Snaffler on a
real Windows NTFS share, not LLM-curated paths.

The number

Tool	Caught	Missed	FPs	F1 at Red+
Upstream Snaffler	16	59	4	0.337
ShareSift v0.51	54	21	62	0.565

2525 files. 75 synthetic-but-format-shaped credentials across 16
categories. Operator triage policy (Red+).

ShareSift catches 3.4× as many credentials as Snaffler (54 vs. 16),
at the cost of about 15× the false positives (62 vs. 4). That is the
genuine tradeoff: the path classifier is aggressive on
binary-extension noise (.msi/.iso/.psd). Run Black-only for P=0.833
if you don't want them; run Red+ if you don't want 59 real
credentials silently missed.
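
The precision, recall, and F1 figures can be re-derived from the table's raw counts. The sketch below assumes "Caught" is true positives, "Missed" is false negatives, "FPs" is false positives, and that F1 is the usual harmonic mean of precision and recall; the `Row` class and `report` helper are illustrative names, not part of ShareSift.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One table row; the field-to-metric mapping is an assumption."""
    tool: str
    caught: int  # assumed: true positives
    missed: int  # assumed: false negatives
    fps: int     # assumed: false positives

    @property
    def precision(self) -> float:
        return self.caught / (self.caught + self.fps)

    @property
    def recall(self) -> float:
        return self.caught / (self.caught + self.missed)

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r)


# Counts copied from the table above.
ROWS = [
    Row("Upstream Snaffler", caught=16, missed=59, fps=4),
    Row("ShareSift v0.51", caught=54, missed=21, fps=62),
]


def report(rows):
    """Format one summary line per tool."""
    return [
        f"{r.tool}: P={r.precision:.3f} R={r.recall:.3f} F1={r.f1:.3f}"
        for r in rows
    ]


if __name__ == "__main__":
    for line in report(ROWS):
        print(line)
```

Under those assumptions the F1 values round to 0.337 and 0.565, matching the table, and the Red+ precision for ShareSift comes out at 54/116 ≈ 0.466.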

Why this corpus exists

The v0.50 scorecard had one honesty caveat: the Windows precision
number (P=0.984 on snaffler-blind) came from LLM-labeled paths,
not real share content. v0.51 replaces it with:

- 2525 actual files on an NTFS partition built from a reproducible
  JSON manifest via Stauffer's DiskForge
- 75 positives across 16 categories: one per ShareSift rule
  generation v0.46→v0.50, plus the classic high-value categories
- 2420 corporate-share noise + 20 precision-stress filenames
- UNC backslash form (\\corp-fs01\…), which is what the rule engine
  sees on real SMB shares
- One docker run from the committed seed → byte-identical corpus

Honest caveat

The 16 positive categories were authored to exercise ShareSift's
rule coverage. Snaffler's defaults don't ship with rules for
German cred filenames, CMD set "VAR=val", browser-creds
meta-coverage, etc. A neutrally curated corpus would show Snaffler
at maybe 40–50% recall (an estimate, not a measured result). The
categories ShareSift covers are real corporate-share shapes
(operator-reported in Snaffler's own issue tracker), not invented
for benchmark-chasing, but the operational gap is amplified by
category selection. Full
disclosure in docs/diskforge_winshare_v1_results.md.

What didn't change

The 4-generation held-out discipline cycle is still the
methodology contribution. v3 still at 100%, v4 still at 70%
baseline. The benchmark adds the operational head-to-head story
on top.

Reproducing

git clone --branch v0.51.0 https://github.com/byevincent/ShareSift.git
cd ShareSift
uv sync --group pysnaffler-integration
bash tools/diskforge_winshare/build_corpus.sh
.venv/bin/python tools/run_full_sweep.py

Same seed = byte-identical corpus = same numbers.
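
One way to check the byte-identical claim between two builds is to hash every file in each corpus and compare the results. This is a minimal sketch, not a ShareSift tool: it assumes the corpus is available as a mounted or extracted directory tree, and `corpus_digest` is a hypothetical helper name. It compares file paths and contents only, not NTFS metadata such as timestamps.

```python
import hashlib
from pathlib import Path


def corpus_digest(root: Path) -> str:
    """Return a single SHA-256 over every file's relative path and bytes."""
    h = hashlib.sha256()
    # Sort so the digest does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        h.update(rel.encode("utf-8"))
        h.update(b"\0")  # separator between the path and the content
        with path.open("rb") as f:
            # Read in 1 MiB chunks so large files are not loaded whole.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        h.update(b"\0")  # separator between files
    return h.hexdigest()
```

Calling `corpus_digest` on the output directory of two separate builds and comparing the returned strings gives a quick equality check; differing digests mean at least one path or file body differs.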

Artifacts

sharesift — 77MB single-file binary (Stage 1 + rule engine)
Full source — git clone --branch v0.51.0

🤖 Generated with Claude Code
