Releases: byevincent/ShareSift

v0.55.2 — Cascade walker/decode/rules

12 Jun 00:22

Four engagement-blocking fixes from HTB Cascade smoke test + top priorities from the rule corpus audit. Net: TightVNC .reg password catch lands Red end-to-end.

Fixed

Walker ACCESS_DENIED no longer crashes the share scan

The Cascade Data share scan crashed on the first denied subdir (Contractors/) even though IT/ (containing the VNC password) was readable. Both the share/smb.py and share/smb_impacket.py walkers now catch STATUS_ACCESS_DENIED, record the skipped subtree, and continue.
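A minimal sketch of the skip-and-continue pattern described above, not ShareSift's actual walker. `AccessDenied`, `walk`, `list_dir`, and the fake share layout are hypothetical stand-ins so the example runs without an SMB server.

```python
from typing import Callable, Iterator


class AccessDenied(Exception):
    """Hypothetical stand-in for an SMB STATUS_ACCESS_DENIED error."""


def walk(
    root: str,
    list_dir: Callable[[str], list[tuple[str, bool]]],
    skipped: list[str],
) -> Iterator[str]:
    """Yield file paths under root; record unreadable subtrees instead of raising."""
    try:
        entries = sorted(list_dir(root))
    except AccessDenied:
        skipped.append(root)  # remember the subtree, keep the scan alive
        return
    for name, is_dir in entries:
        path = f"{root}/{name}"
        if is_dir:
            yield from walk(path, list_dir, skipped)
        else:
            yield path


# Fake share shaped like the Cascade case: one denied subdir, one readable.
TREE = {
    "Data": [("Contractors", True), ("IT", True)],
    "Data/IT": [("VNC Install.reg", False)],
}


def fake_list_dir(path: str) -> list[tuple[str, bool]]:
    if path == "Data/Contractors":
        raise AccessDenied(path)
    return TREE[path]


skipped: list[str] = []
files = list(walk("Data", fake_list_dir, skipped))
print(files)    # files found in readable subtrees
print(skipped)  # subtrees that were denied
```

Catching the error at the listing call, rather than around the whole walk, is what keeps one denied directory from hiding its readable siblings.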

UTF-16LE files (.reg exports) decode correctly

extract.extract_text was UTF-8-decoding everything, garbling UTF-16 into W\x00i\x00n\x00 strings that content regexes couldn't match. It is now BOM-aware (UTF-16 LE/BE + UTF-8 BOM detection).
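BOM-aware decoding can be sketched roughly as follows. This is an illustration using the standard library's `codecs` BOM constants, not the body of `extract.extract_text`; the function name `decode_with_bom` is an assumption, and UTF-32 is deliberately ignored.

```python
import codecs


def decode_with_bom(data: bytes) -> str:
    """Pick a codec from the byte-order mark, defaulting to UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8", errors="replace")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le", errors="replace")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be", errors="replace")
    # No BOM: assume UTF-8 (the pre-fix behaviour for every file).
    return data.decode("utf-8", errors="replace")


# A .reg export is typically UTF-16LE with a BOM.
reg_line = '"Password"=hex:00'
raw = codecs.BOM_UTF16_LE + reg_line.encode("utf-16-le")

decoded = decode_with_bom(raw)
naive = raw.decode("utf-8", errors="replace")

print(decoded)             # readable text a content regex can match
print("\x00" in naive)     # the naive decode is interleaved with NUL bytes
```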

Added (3 rules)

  • ShareSiftKeepVncPasswordHex (Red) — TightVNC/UltraVNC "Password"=hex:... in .reg. Live-validated on HTB Cascade.
  • ShareSiftKeepRegistryAutoLogonPassword (Red) — generalizes to DefaultPassword, AutoAdminLogon, EncMasterPassword (WinSCP), PortablePassword.
  • ShareSiftKeepGitleaksHighConfidencePrefixes (Red) — Slack xox[bpe]-, GitHub gh[psuor]_/github_pat_, Stripe live, Vault hvs., Shopify, Twilio, SendGrid, npm. Closes the modern-SaaS gap Snaffler upstream predates.
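For illustration, a prefix-anchored content regex for some of the token shapes listed above might look like the sketch below. It covers only the prefixes spelled out in these notes (Slack, GitHub, Vault); the token-body character class and minimum length are assumptions, and this is not the shipped rule.

```python
import re

# Prefixes taken from the release notes; the body pattern is an assumption.
HIGH_CONFIDENCE_PREFIX = re.compile(
    r"(?<![A-Za-z0-9])"                          # do not start mid-word
    r"(?:xox[bpe]-|gh[psuor]_|github_pat_|hvs\.)"
    r"[A-Za-z0-9_-]{10,}"
)


def find_token_candidates(text: str) -> list[str]:
    """Return every substring that starts with a known high-confidence prefix."""
    return [m.group(0) for m in HIGH_CONFIDENCE_PREFIX.finditer(text)]


# Obviously fake values, built by concatenation so no real-looking secret is embedded.
sample = "\n".join(
    [
        "SLACK_TOKEN=xoxb-" + "0" * 12,
        "export GH=ghp_" + "A" * 36,
        "vault: hvs." + "b" * 24,
        "ghost_story = 'not a token'",
    ]
)

hits = find_token_candidates(sample)
print(len(hits))
```

Anchoring on a vendor prefix is what makes these patterns high-confidence: the prefix alone rules out most ordinary identifiers, such as the `ghost_story` line in the sample.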

Live-validated on HTB Cascade

sharesift hunt //10.129.13.58 -u r.thompson -p 'rY4n5eva':

  • All 4 shares walked (was 2 before)
  • Data\IT\Temp\s.smith\VNC Install.reg → Red with ShareSiftKeepVncPasswordHex
  • SYSVOL went from crashed to 14 files, 5 tier-flagged

Tests

+19. Full suite: 1458 passed, 29 skipped, 0 failed.

v0.55.1 — Kerberos ccache fixes from HTB Sauna

11 Jun 22:59

Three Kerberos ccache findings from HTB Sauna (EGOTISTICAL-BANK.LOCAL). All three surfaced live; the clock-skew fix is the biggest operational win because HTB labs commonly run ~7h ahead of attacker-box time.

Fixed

Auth(kerberos=True) no longer requires -u

The user principal lives in the ccache; before the fix, ShareSift forced a redundant -u <principal> on the CLI.

impacket kerberosLogin was called without kdcHost

Without an explicit KDC host, impacket falls back to DNS lookup for <realm>:88 which fails on attacker boxes without proper resolv.conf. New Auth.kdc_host field; both share.discovery._do_login and share.smb_impacket._do_login now pass kdcHost=auth.kdc_host or target_host, falling back to the SMB target for the AD case where DC == target.

Auto clock-skew shim

New share.auth.install_kerberos_clock_offset() reads the ccache's authtime, compares to local clock, and (if offset > 60s) monkey-patches impacket.krb5.kerberosv5.datetime to add the offset to all datetime.datetime.now(tz) calls. Surgical — only impacket's krb5 module is affected; the rest of Python sees real time. Called automatically from both impacket login dispatch sites.

Live-validated

KRB5CCNAME=/tmp/fsmith.ccache sharesift hunt //10.129.13.53 --use-kcache:

  • Clock skew (~7h) → corrected by auto-shim
  • No -u required (read from ccache)
  • kdcHost defaulted to target host
  • Hunt advances past AP-REQ to KDC_ERR_S_PRINCIPAL_UNKNOWN — that's the engagement-prep SPN-on-IP issue (operator adds DC FQDN to /etc/hosts and uses FQDN as target).

Tests

+14 (test_kerberos_fixes_v0p55p1.py). Full suite: 1439 passed, 29 skipped, 0 failed.

v0.55.0 — DFS namespace root walking (Multimaster live-validated)

11 Jun 19:15

Closes the Multimaster DFS scenario end-to-end. After v0.54.1 let DFS shares pass the probe gate, the walker still failed with STATUS_INVALID_PARAMETER on the namespace root — smbprotocol's regular Open + query_directory doesn't work because the namespace root isn't a real directory, just a referral table.

Fixes

_list_directory DFS root fallback

When tree is DFS-capable and CREATE returns INVALID_PARAMETER, fall back to smbclient.scandir which handles the namespace-root listing via its internal _resolve_dfs.

walk() PATH_NOT_COVERED graceful skip

DFS-link descent typically fails because the resolved fileserver needs operator-managed DNS (standard engagement prep — /etc/hosts entry). Walker now catches PATH_NOT_COVERED, records the skipped link in self._skipped_dfs_links, and continues. Share scan completes cleanly.

Smbclient package shadow workaround

impacket ships a smbclient.py script in venv bin/ that shadows the smbprotocol package under uv run. New _import_real_smbclient helper strips bin dirs from sys.path during the import.
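A rough sketch of the path-stripping import, assuming the shadowing script lives in a directory literally named `bin`. The helper name `import_without_bin_dirs` and the demo module `shadow_demo` are hypothetical; the real `_import_real_smbclient` may differ.

```python
import importlib
import os
import sys
import tempfile


def import_without_bin_dirs(name: str):
    """Import name while directories called 'bin' are hidden from sys.path."""
    original = list(sys.path)
    sys.path[:] = [
        p for p in original if os.path.basename(os.path.normpath(p)) != "bin"
    ]
    try:
        return importlib.import_module(name)
    finally:
        sys.path[:] = original  # always restore the caller's path


# Demo: the same module name exists in bin/ (the shadow) and lib/ (the real one).
with tempfile.TemporaryDirectory() as tmp:
    bin_dir = os.path.join(tmp, "bin")
    lib_dir = os.path.join(tmp, "lib")
    for directory, origin in ((bin_dir, "script"), (lib_dir, "package")):
        os.makedirs(directory)
        with open(os.path.join(directory, "shadow_demo.py"), "w") as fh:
            fh.write(f"ORIGIN = {origin!r}\n")

    sys.path[:0] = [bin_dir, lib_dir]  # bin/ wins under a normal import
    importlib.invalidate_caches()
    try:
        resolved_origin = import_without_bin_dirs("shadow_demo").ORIGIN
        path_restored = bin_dir in sys.path
    finally:
        sys.path.remove(bin_dir)
        sys.path.remove(lib_dir)

print(resolved_origin)
```

Note that this only helps if the shadowing module has not already been imported; a cached entry in `sys.modules` would be returned regardless of `sys.path`.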

Live-validated against HTB Multimaster

sharesift hunt //10.129.13.28 -u tushikikatomo -p finance1:

  1. dfs share probe → R ✅ (v0.54.1)
  2. Namespace root listing → Development link ✅ (v0.55 fallback)
  3. Link descent → PATH_NOT_COVERED → skipped gracefully ✅ (v0.55 walk fix)
  4. Share scan completes, pipeline continues to NETLOGON + SYSVOL ✅

The v0.53 resolver correctly resolved Development → \\FSMO\Development; walking that requires FSMO in /etc/hosts (engagement prep).

Tests

+7 (test_smb_dfs_walk_v0p55.py). Full suite: 1425 passed, 29 skipped, 0 failed.

Status

DFS scenario is now end-to-end correct from probe → list root → discover links → walk-or-skip. Combined with v0.53's referral resolution and v0.54's three engagement fixes, ShareSift handles:

  • Anonymous SMB shares (Active.htb pattern)
  • Legacy SMB targets (Server 2008 R2)
  • DFS namespace roots + links

Queued for v0.56: GOAD-validated head-to-head benchmark.

v0.54.0 — three engagement fixes from HTB smoke tests

11 Jun 19:04

Three real bugs from yesterday's HTB Active + Multimaster smoke tests, all fixed and live-validated where possible.

v0.54.1 — DFS-namespace-root probe (LIVE-VALIDATED)

Surfaced on Multimaster's `\\\dfs` share. Regular SMB2 CREATE on a DFS namespace root returns STATUS_INVALID_PARAMETER (DFS-aware Open required). v0.53's R/W probe was filtering DFS shares out before the walker could touch them.

`SmbShare._probe_access_mask` now treats INVALID_PARAMETER as probe-inconclusive with caller-supplied fallback: read=True (DFS roots ARE walkable), write=False (namespace roots aren't writable). Validated live on `\\10.129.13.28\dfs` — share enters target list with `access: R`.

v0.54.2 — SMB3 encryption auto-fallback

Surfaced on Active.htb (Server 2008 R2, only does SMB 2.0/2.1). Default `--encrypt=True` failed with "SMB encryption is required but the connection does not support it."

`SmbShare._ensure_connected` inspects the negotiated dialect after `Connection.connect()`. If the dialect is below SMB 3.0 (0x0300) and `--require-encrypt` is not set, the session is built with `require_encryption=False`, so legacy Windows targets just work. The new `--require-encrypt` flag covers opsec engagements where unencrypted traffic is unacceptable.
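The decision itself reduces to a small pure function. The sketch below is an assumption-laden illustration: the function name is hypothetical, and raising `ConnectionError` when `--require-encrypt` meets a pre-3.0 dialect is a guess at the behaviour, not something these notes confirm.

```python
SMB_3_0_0 = 0x0300  # first dialect with SMB encryption


def resolve_require_encryption(
    negotiated_dialect: int, encrypt: bool, require_encrypt: bool
) -> bool:
    """Decide the session's require_encryption value after dialect negotiation."""
    if not encrypt:
        return False
    if negotiated_dialect >= SMB_3_0_0:
        return True
    if require_encrypt:
        # Assumed behaviour: refuse rather than silently downgrade.
        raise ConnectionError(
            f"dialect 0x{negotiated_dialect:04x} cannot encrypt and "
            "--require-encrypt is set"
        )
    return False  # legacy target: fall back to an unencrypted session


legacy = resolve_require_encryption(0x0210, encrypt=True, require_encrypt=False)
modern = resolve_require_encryption(0x0311, encrypt=True, require_encrypt=False)
print(legacy, modern)
```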

v0.54.3 — Anonymous SMB via impacket fallback

Surfaced on Active.htb's `Replication` share. smbprotocol+pyspnego rejects empty credentials (`SpnegoError (16): Operation not supported or available`). impacket's null-session login works.

`SmbShare` now lazily constructs an `ImpacketSmbWalker` backend when `auth.anonymous=True`, delegating walk/read_bytes/probe_share_access. Mirrors the smbprotocol contract: sorted deterministic walk, UNC output, byte-cap on reads. Live re-validation pending (Active.htb despawned between fix and re-test).

Tests

+32 tests: 5 DFS-probe + 5 encrypt-fallback + 17 anonymous (split TestAnonymousDispatch + TestImpacketWalker) + 5 v0.35 updates. Full suite: 1418 passed, 29 skipped, 0 failed.

Queued for v0.55

  • DFS-aware Opens for walking INTO namespace roots (smbprotocol `tree.is_dfs_share` flag handling). v0.54.1 lets DFS shares enter the target list; v0.55 lets the walker descend into them. Multimaster's `Development` link still trips this — `_list_directory` on the namespace root needs DFS flags set.
  • GOAD-validated head-to-head benchmark.

v0.53.1 — HTB Active smoke-test patch + MD4 LDAP fix

11 Jun 18:23

End-to-end validation against a real AD lab. First real-AD smoke test (HTB Active, 10.129.13.21, Server 2008 R2) — ShareSift caught the GPP cpassword in Groups.xml as Red tier with the gpp_xml parser, confidence 0.99. That's the exact credential the box is designed to leak.

Three real bugs surfaced; this patch ships the highest-priority fix.

Fixed

ldap3 NTLM bind on OpenSSL 3.x

hashlib.new('md4') raised ValueError: unsupported hash type MD4 on modern Python+OpenSSL (Kali default), blocking the entire v0.52 authenticated LDAP path. share/ad.py now installs a Cryptodome.Hash.MD4-backed shim at module import. Idempotent; no-op when hashlib already supports MD4 (older OpenSSL or legacy provider enabled).

Before:

$ sharesift discover --ad-domain active.htb --dc 10.129.13.21 -u SVC_TGS -p 'X'
ldap discovery failed: ValueError: unsupported hash type MD4

After:

$ sharesift discover --ad-domain active.htb --dc 10.129.13.21 -u SVC_TGS -p 'X'
ldap: 1 enabled computer object(s)
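The MD4 fix follows a general pattern: wrap `hashlib.new` so that one missing algorithm is served by a fallback factory, and skip the patch when the algorithm already works. The sketch below shows that pattern only. `install_hash_fallback` is a hypothetical name, and the demo uses a made-up algorithm name with `hashlib.sha256` as a stand-in factory so it runs on any Python build; it is not share/ad.py's shim.

```python
import hashlib


def install_hash_fallback(name: str, factory) -> bool:
    """Make hashlib.new(name) work via factory() if hashlib lacks it.

    Returns True if the patch was installed, False if it was not needed.
    """
    try:
        hashlib.new(name)
        return False  # already supported (or already patched): no-op
    except ValueError:
        pass

    original_new = hashlib.new

    def patched_new(algo, data=b"", **kwargs):
        if algo.lower() == name.lower():
            digest = factory()
            digest.update(data)
            return digest
        return original_new(algo, data, **kwargs)

    hashlib.new = patched_new
    return True


# Demo with an algorithm name no OpenSSL build provides.
DEMO_NAME = "demo-unsupported-hash"
first = install_hash_fallback(DEMO_NAME, hashlib.sha256)
second = install_hash_fallback(DEMO_NAME, hashlib.sha256)  # idempotent
demo_digest = hashlib.new(DEMO_NAME, b"abc").hexdigest()
print(first, second)
```

With pycryptodomex installed, the real case would presumably pass `"md4"` and a factory built on `Cryptodome.Hash.MD4.new`; that wiring is an assumption here.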

Anonymous LDAP empty-result UX

When AD policy blocks anonymous searches (operationsError, typical on modern AD), we now print a hint pointing at -u/-p, -H, or -k instead of silently reporting 0 results.

Documented

docs/v0p53_htb_smoke_test.md — full HTB Active run writeup with the headline GPP cpassword catch, three bugs surfaced, queued v0.54 fixes.

Queued for v0.54

  1. smbprotocol anonymous fallback to impacket for SMB walks (pyspnego rejects empty creds; discover works because it uses impacket, but hunt --no-pass fails at the per-share probe).
  2. Auto-detect SMB3 capability and fallback to unencrypted (Server 2008 R2 only does SMB 2.0/2.1; current default --encrypt=True fails). New --require-encrypt flag for the opsec case.
  3. Live-DC validation of v0.53 DFS resolver (Active.htb has no DFS — DFS still unvalidated against real AD).

Tests

Full suite: 1391 passed, 29 skipped, 0 failed.

v0.53.0 — DFS referral resolution + GOAD benchmark harness

11 Jun 17:37

DFS just works. v0.52's hunt command now handles \\corp.local\dept\hr-shaped UNCs transparently:

# No flag needed — auto-resolved
sharesift hunt //corp.local/dept/hr -u alice -p PW \
    --output-dir /tmp/dfs-hunt

Behind the scenes: SmbShare catches STATUS_PATH_NOT_COVERED on tree-connect, queries FSCTL_DFS_GET_REFERRALS over IPC$, parses the referral chain, and retargets to the resolved fileserver. Implementation mirrors smbclient._pool.dfs_request (private API in jborean93/smbprotocol; we reimplement using public primitives so we don't bind to internals).

What shipped

DFS referral resolution

  • share/dfs.py: DfsResolution dataclass + dfs_request_via_ipc (IOCTL wire-format) + first_target_unc + resolve_dfs_path (orchestration) + is_path_not_covered
  • share/smb.py: SmbShare.auto_resolve_dfs=True (default), catches PathNotCovered, chases referrals via IPC$, retries against the resolved fileserver. Original target preserved as _original_target.
  • hunt --detect-dfs is now informational-only — auto-resolution runs regardless.

GOAD benchmark harness

For when you stand up GOAD (or any AD lab):

python tools/goad_benchmark.py \
    --ad-domain sevenkingdoms.local --dc 192.168.56.10 \
    -u khal.drogo -p horse \
    --snaffler-tsv ./snaffler_run.tsv \
    --output-dir ./goad_bench_$(date +%Y-%m-%d)

Produces scorecard.md with per-category recall comparison across 19 buckets (GPP cpassword, KeePass, AWS, browser stores, SCCM NAA, etc.) clustering Snaffler's rule labels and ShareSift's rule IDs around shared credential shapes. See docs/goad_benchmark_methodology.md for the lab setup recipe.

Tests

+36 tests (18 DFS resolution + 18 GOAD harness). Full suite: 1391 passed, 29 skipped, 0 failed.

Honest caveats

  • DFS resolution mocked-only — no live-DC validation yet. The first run against a real domain DFS namespace will surface any wire-format edge cases (V4-specific server_type bits, multi-target priority ordering when proximity differs).
  • GOAD benchmark harness pure-function-tested — the actual subprocess.run invocation and TSV-file roundtrip await the lab being up.
  • v0.52 LDAP smoke test still pending — until ShareSift is pointed at a real AD (HTB, GOAD, work), the LDAP + DFS paths are mock-validated only.

What v0.53 doesn't handle

  • Interlink referrals (referral chains across namespaces)
  • Referral caching (every connection re-queries)
  • Sticky target hints (always picks first entry, no failover)
  • Multi-DC LDAP failover

All queued for v0.54+.

See docs/v0p53_results.md for the full sprint writeup.

v0.52.0 — Snaffler-replacement enumeration sprint

11 Jun 04:29

One-command Snaffler replacement. ShareSift becomes a self-contained, Linux-native attacker workflow:

sharesift hunt --ad-domain corp.local --dc dc01.corp.local \
    -u alice -p PW --output-dir ./engagement

Takes a domain + creds and returns ranked credential findings across every joined host's readable shares. No Snaffler binary, no nxc --shares glue, no shell pipe.

What shipped

| Capability | Module / CLI |
| --- | --- |
| LDAP-based AD computer object enumeration | share/ad.py |
| AD-wide share discovery | sharesift discover --ad-domain corp.local -u U -p P |
| End-to-end Snaffler-replacement sweep | sharesift hunt --ad-domain corp.local -u U -p P --output-dir ./out |
| Pass-the-Hash via LDAP NTLM | share/ad.py (lm:nt password encoding) |
| Kerberos via LDAP SASL GSSAPI | share/ad.py (KRB5CCNAME ccache) |
| DFS detection utilities (opt-in) | hunt --detect-dfs |

Operator workflows

AD-wide credential hunt:

sharesift hunt --ad-domain corp.local --dc dc01.corp.local \
    -u alice -p PW --output-dir ./engagement

Pass-the-Hash from dumped NT hash:

sharesift hunt --ad-domain corp.local \
    -u svc_backup -H 'aad3b...:1c63...' \
    --output-dir ./engagement

Kerberos via existing ccache:

kinit alice@CORP.LOCAL
sharesift hunt --ad-domain corp.local --use-kcache \
    --output-dir ./engagement

Findings from the foundation audit

Most of the originally-scoped v0.52-v0.55 sprint (R/W ACL probe fixing Snaffler #184, Snaffler skip-list, Kerberos ccache, NetrShareEnum) was already shipped in v0.39 + v0.40. Real gaps were three: LDAP discovery, DFS, hunt command. Sprint compressed from ~5 weeks to one session.

Honest scope caveats

  • LDAP path tested against ldap3 mocks, not a live DC. First-run on GOAD will validate.
  • DFS referral resolution not yet shipped — detection utilities only, opt-in via --detect-dfs (heuristic false-positives on every FQDN host). Full referral chasing queues for v0.53.
  • No live-AD head-to-head benchmark yet. sharesift hunt vs Snaffler.exe -s -d corp.local on a GOAD-class lab queues for v0.55.

Tests

46 new (24 LDAP discovery + 11 DFS detection + 11 hunt orchestration). Full suite: 1299 passed, 51 skipped, 0 failed.

See docs/v0p52_results.md and docs/v0p52_snaffler_replacement_plan.md for the full sprint writeup.

v0.51.0 — first real corporate-share benchmark + Snaffler head-to-head

10 Jun 21:50

v0.51.0 — first real corporate-share benchmark

The first published head-to-head against upstream Snaffler on a
real Windows NTFS share, not LLM-curated paths.

The number

| Tool | Caught | Missed | FPs | F1 at Red+ |
| --- | --- | --- | --- | --- |
| Upstream Snaffler | 16 | 59 | 4 | 0.337 |
| ShareSift v0.51 | 54 | 21 | 62 | 0.565 |

2525 files. 75 synthetic-but-format-shaped credentials across 16
categories. Operator triage policy (Red+).

ShareSift catches 3.4× more credentials than Snaffler. At the
cost of 15× more false positives, which is the genuine tradeoff:
the path classifier is aggressive on binary-extension noise (.msi
/.iso/.psd). Run Black-only for P=0.833 if you don't want them;
run Red+ if you don't want 59 real credentials silently missed.
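The F1 figures and the 3.4× / 15× ratios follow directly from the caught / missed / FP counts in the table. A short worked computation, using the standard precision, recall and F1 definitions (the helper name `prf` is just for this example):

```python
def prf(caught: int, missed: int, fps: int) -> tuple[float, float, float]:
    """Precision, recall and F1 from true positives, false negatives, false positives."""
    precision = caught / (caught + fps)
    recall = caught / (caught + missed)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


snaffler = prf(caught=16, missed=59, fps=4)
sharesift = prf(caught=54, missed=21, fps=62)

snaffler_f1 = round(snaffler[2], 3)
sharesift_f1 = round(sharesift[2], 3)
catch_ratio = 54 / 16   # "3.4x more credentials"
fp_ratio = 62 / 4       # "15x more false positives"

print(f"Snaffler  P={snaffler[0]:.3f} R={snaffler[1]:.3f} F1={snaffler_f1}")
print(f"ShareSift P={sharesift[0]:.3f} R={sharesift[1]:.3f} F1={sharesift_f1}")
print(f"catch ratio {catch_ratio:.3f}, FP ratio {fp_ratio:.1f}")
```

Snaffler's precision is higher (0.800 vs roughly 0.466), but its recall of about 0.213 against ShareSift's 0.720 is what drives the F1 gap.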

Why this corpus exists

The v0.50 scorecard had one honesty caveat: the Windows precision
number (P=0.984 on snaffler-blind) came from LLM-labeled paths,
not real share content. v0.51 replaces it with:

  • 2525 actual files on an NTFS partition built from a reproducible
    JSON manifest via Stauffer's DiskForge
  • 75 positives across 16 categories — one per ShareSift rule
    generation v0.46→v0.50, plus the classic high-value categories
  • 2420 corporate-share noise + 20 precision-stress filenames
  • UNC backslash form (\\corp-fs01\…) — what the rule engine sees
    on real SMB shares
  • One docker run from the committed seed → byte-identical corpus

Honest caveat

The 16 positive categories were authored to exercise ShareSift's
rule coverage. Snaffler's defaults don't ship with rules for
German cred filenames, CMD set "VAR=val", browser-creds
meta-coverage, etc. A neutral-curated corpus would show Snaffler
at maybe 40–50% recall. The categories ShareSift covers are real
corporate-share shapes (operator-reported in Snaffler's own issue
tracker), not invented for benchmark-chasing — but the
operational gap is amplified by category selection. Full
disclosure in docs/diskforge_winshare_v1_results.md.

What didn't change

The 4-generation held-out discipline cycle is still the
methodology contribution. v3 still at 100%, v4 still at 70%
baseline. The benchmark adds the operational head-to-head story
on top.

Reproducing

git clone --branch v0.51.0 https://github.com/byevincent/ShareSift.git
cd ShareSift
uv sync --group pysnaffler-integration
bash tools/diskforge_winshare/build_corpus.sh
.venv/bin/python tools/run_full_sweep.py

Same seed = byte-identical corpus = same numbers.

Artifacts

  • sharesift — 77MB single-file binary (Stage 1 + rule engine)
  • Full source — git clone --branch v0.51.0

🤖 Generated with Claude Code

v0.48.0 — close v0.47 held-out underfit, cleanly

10 Jun 03:31

ShareSift v0.48.0 — same-day follow-up to v0.47. Closes the held-out underfit by running the discipline experiment properly: lock NEW held-out FIRST, then write rules from OLD held-out failures only, then validate.

TL;DR

| Gate | v0.47 | v0.48 |
| --- | --- | --- |
| Corpus (training) | 18/19 (95%) | 18/19 (95%) |
| Held-out v1 | 4/11 (36%) | 10/11 (91%) |
| Held-out v2 (new locked) | n/a | 7/10 (70%) |
| MSF3 / MSF2 / DiskForge recall | 1.000 / 1.000 / 0.923 | 1.000 / 1.000 / 0.923 (held) |
| v0.48 rule FP contribution | n/a | 0 across all three |

The generalization signal: ShareSiftKeepBrowserSavedCreds was authored as "generalize Firefox to Chromium-based browsers." It directly closed 2 held-out v2 probes (Chrome + Edge Login Data) that were locked BEFORE the rule was written: pattern-level generality catching parallel patterns. That's the discipline working as intended.

Full writeup in docs/v0p48_results.md.

Seven new rules (close OLD held-out, sourced #78/#135/#67/#46)

| Rule | Tier | Match | Closes |
| --- | --- | --- | --- |
| ShareSiftKeepCiscoEnableSecret | Red | Content | #78 (Cisco IOS enable secret/password/type-7) |
| ShareSiftKeepCiscoSnmpCommunity | Red | Content | #78 (SNMP RW community) |
| ShareSiftKeepCiscoSnmpCommunityRo | Yellow | Content | #78 (SNMP RO community) |
| ShareSiftKeepFileZillaSavedSites | Black | FilePath | #135 (sitemanager.xml saved FTP/SFTP) |
| ShareSiftKeepFileZillaRecentServers | Yellow | FilePath | #135 (recentservers.xml) |
| ShareSiftKeepDotNetAppSettingsConnString | Red | Content | #67 (.NET appsettings.json conn string) |
| ShareSiftKeepBrowserSavedCreds | Black | FilePath | #46 (Chrome/Edge/Brave/Opera Login Data) |

Both extra_rules.json (engine) and extra_rules.py (pysnaffler compat).

New held-out v2 (locked test set)

benchmarks/snaffler_issues/heldout_v2.jsonl — 10 probes from previously-unread Snaffler PR sources:

  • #198 (CMD set PASSWORD=)
  • #155 (Azure CLI az login --password)
  • #124 (XML <password> with nested tag)
  • #98 (loose "credential" filename keyword)
  • #46 (Chrome + Edge Login Data — Firefox cousins)

Pre-rule baseline: 5/10 (the v0.47 KeepDoubleDashPassphrase already generalized to Azure CLI patterns — free signal). Post-rule: 7/10 (browser-creds meta-rule catches Chrome + Edge).

eval_snaffler_issues.py grows --set {corpus,heldout,heldout_v2,all}.

What's NOT in v0.48 (deliberate discipline)

3 held-out v2 fails come from sources I MINED for held-out v2:

  • heldout-v2-198-cmd-set-pgpassword-quoted: set "PGPASSWORD=val"
  • heldout-v2-98-credential-in-filename: credentials_2024.xlsx
  • heldout-v2-98-credentials-export: CustomerCredentialsExport.csv

Adding rules for these in v0.48 would be tuning toward held-out v2 (discipline violation). They become v0.49 candidates — a future held-out v3 will validate them against patterns I haven't yet read.

This is how a discipline-honest research cycle should grow: each version locks the next test set BEFORE writing the rules that close the previous one.

Existing benchmark impact

| Benchmark | v0.47 R | v0.48 R | v0.48 rule FP |
| --- | --- | --- | --- |
| MSF3 | 1.000 | 1.000 | 0 |
| MSF2 | 1.000 | 1.000 | 0 |
| DiskForge | 0.923 | 0.923 | 0 |

Zero v0.48 rules fired on any of the three (neither TP nor FP). The Cisco IOS / FileZilla / ADO / browser-creds patterns don't appear in those substrates — MSF3 is AD Windows-shaped, MSF2 Linux Metasploitable, DiskForge a forensic disk image. Rules are surgical to corporate-share patterns.

Binary

77.2 MB single-file binary attached (sharesift). Verified:

wget https://github.com/byevincent/ShareSift/releases/latest/download/sharesift
chmod +x sharesift
./sharesift --version  # sharesift 0.48.0

v0.49 candidate list

  1. Close held-out v2 remaining gaps (CMD set "VAR=val" quoted variant, loose "credential" filename keyword)
  2. Lock held-out v3 from yet-unread sources (#112 SCCM, #140 Kerberos, #139 MDE Linux)
  3. After v0.49: three generations of held-out signal = calibrated confidence in "corporate-share benchmark progress"

v0.46.0 — drop-on-Kali binary + DB exporters

10 Jun 00:10

ShareSift v0.46.0 — combined ship covering engagement-DB exporters and the PyInstaller single-file binary breakthrough.

Headline

| Workflow | Before | After |
| --- | --- | --- |
| Get findings into the report tool | grep + hand-format | sharesift export --format ghostwriter |
| Get findings into SysReptor | not supported | sharesift export --format sysreptor |
| Drop ShareSift on a fresh Kali box | pipx install + 100MB deps | wget .../sharesift && chmod +x |
| Binary size | 1.5 GB (v0.38 attempt) | 77 MB (20× smaller) |
| Tests passing | 1309 | 1309 |

Single-file binary (77 MB)

wget https://github.com/byevincent/ShareSift/releases/latest/download/sharesift
chmod +x sharesift
./sharesift --version
# sharesift 0.46.0

Covers score-paths, scan-files (rule + extractor), to-snaffler-tsv, sort, query, export. Operators wanting SMB-direct, network discovery, verify, content-classifier, or report rendering use pipx install 'sharesift[smb,network-enum,content-inference,verify,report]' instead.

The size shrink came from a minimal build venv (no torch transitive pulls) + aggressive PyInstaller excludes. Two gotchas worth recording: strip and upx corrupt scipy's OpenBLAS shared lib (binary crashes at import); --clean breaks PyInstaller's PYZ archive. Both documented in docs/v0p46_results.md.

Engagement DB exporters

Three new formats off the v0.41 SQLite datastore:

sharesift export --db engagement.db --format markdown --output findings.md
sharesift export --db engagement.db --format ghostwriter --output findings.csv
sharesift export --db engagement.db --format sysreptor --output sysreptor.json
  • Markdown — pastes into Dradis, GhostWriter, SysReptor, Notion, Slack, plain delivery docs
  • GhostWriter CSV — direct CSV import; columns match the findings-page schema, tier maps to severity
  • SysReptor JSON: projects/v1 envelope with lowercased severities

All three sort tier > host > share > rel_path.
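The ordering can be expressed as a single composite sort key. A sketch, assuming findings are plain dicts and that tiers rank Black > Red > Yellow > Green (the Snaffler-style triage order); the field names mirror the sentence above but the record shape is hypothetical.

```python
# Assumed tier ranking: lower number sorts first.
TIER_RANK = {"Black": 0, "Red": 1, "Yellow": 2, "Green": 3}


def export_sort_key(finding: dict) -> tuple:
    """Sort by tier severity, then host, share and relative path."""
    return (
        TIER_RANK[finding["tier"]],
        finding["host"],
        finding["share"],
        finding["rel_path"],
    )


findings = [
    {"tier": "Yellow", "host": "fs01", "share": "Data", "rel_path": "notes.txt"},
    {"tier": "Red", "host": "fs02", "share": "IT", "rel_path": "VNC Install.reg"},
    {"tier": "Black", "host": "fs02", "share": "IT", "rel_path": "Login Data"},
    {"tier": "Red", "host": "fs01", "share": "SYSVOL", "rel_path": "Groups.xml"},
]

ordered = sorted(findings, key=export_sort_key)
for f in ordered:
    print(f["tier"], f["host"], f["share"], f["rel_path"])
```

Using one tuple key keeps the three exporters trivially consistent: each format only has to iterate the same sorted list.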

Path-prefix dedup deferred

Diagnostic showed MSF3 top-12-30 dominated by 19 copies of an Internet Explorer cache backup. Fixing requires either a path-prefix penalty or rule-action awareness (treat Yellow-from-Relay as Green); both are research-y patterns. v0.28's falsified extension-frequency hypothesis is the cautionary precedent. Top-10 already at 0.80 — not worth disturbing for a marginal gain. Re-open if a future benchmark shows the duplicate-backup pattern materially hurting top-K precision.

What's in the binary

Bundled at runtime:

  • Stage 1 path classifiers (Windows + Linux LightGBM models, ~39 MB combined)
  • Rule sets: snaffler_default.json (88 base) + extra_rules.json (v0.12 blind-spot + Gitleaks modern SaaS + v0.42 Linux gap closure)

Excluded (use pipx extras instead):

  • Content classifier (torch, ~1.5 GB)
  • SMB-direct (smbprotocol, ~30 MB)
  • Network discovery (impacket, ~100 MB)
  • Verifiers (requests/paramiko/ldap3/jwt/boto3, ~50 MB)
  • Report rendering (jinja2)

Changelog

See CHANGELOG.md and docs/v0p46_results.md for the full write-up.

Honest assessment vs Snaffler

v0.45's assessment said ShareSift was technically on-par for most engagement workflows but lagged Snaffler on two fronts: "drop binary on a box" and "feed straight into the report." v0.46 closes both. Open gaps for v0.47+: status heartbeat on long scans, HTML report's Markdown twin, path-prefix dedup with rule-action awareness.
