v0.47.0

byevincent tagged this 10 Jun 02:56
Seven new rules in extra_rules.json (+ Python mirror for
pysnaffler compat) targeting the Snaffler-issues benchmark gaps:

- ShareSiftKeepFirefoxSavedCreds      (Black, FilePath, #46)
- ShareSiftKeepGppPolicyXml           (Black, FilePath, #31)
- ShareSiftKeepGermanCredFilenames    (Red,   FileName, #53)
- ShareSiftKeepWireguardPrivateKey    (Black, Content,  #119)
- ShareSiftKeepOpenvpnAuthUserPassRef (Red,   Content,  #119)
- ShareSiftKeepCiscoAnyconnectXml     (Yellow, FileName, #119)
- ShareSiftKeepDoubleDashPassphrase   (Red,   Content,  #158)

Scorer fix: ``eval_snaffler_issues.py`` was running only one engine
per probe, while the production cascade runs both stages and takes
the max tier. The scorer now mirrors that. Found and fixed during
corpus iteration, when new FilePath rules appeared to "not fire" on
path probes.
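
A minimal sketch of the max-tier behaviour described above, not the actual
scorer code. The function names, the two toy stages, and their regexes are
hypothetical stand-ins; the tier order (Green < Yellow < Red < Black) is
assumed from Snaffler's triage colours.

```python
import re
from typing import Callable, Optional

# Assumed severity order, lowest to highest (Snaffler-style triage colours).
TIER_ORDER = ["Green", "Yellow", "Red", "Black"]

# An engine takes (path, content) and returns a tier name, or None for no hit.
Engine = Callable[[str, str], Optional[str]]


def max_tier(*tiers: Optional[str]) -> Optional[str]:
    """Return the most severe tier among the hits, or None if nothing fired."""
    hits = [t for t in tiers if t is not None]
    return max(hits, key=TIER_ORDER.index) if hits else None


def path_stage(path: str, content: str) -> Optional[str]:
    # Hypothetical stand-in for FilePath/FileName rules.
    if re.search(r"(^|/)logins\.json$", path, re.IGNORECASE):
        return "Black"
    return None


def content_stage(path: str, content: str) -> Optional[str]:
    # Hypothetical stand-in for Content rules.
    if re.search(r"^PrivateKey\s*=", content, re.MULTILINE):
        return "Black"
    if "auth-user-pass" in content:
        return "Red"
    return None


def score_probe(path: str, content: str,
                engines: tuple = (path_stage, content_stage)) -> Optional[str]:
    """Run every stage on the probe and keep the highest tier."""
    return max_tier(*(engine(path, content) for engine in engines))
```

With only ``content_stage`` in play, a path-only probe such as
``profiles/ab12.default/logins.json`` with empty content scores ``None``,
which is how a working FilePath rule can look like it did "not fire".
Running both stages and taking the max returns ``Black`` for the same probe.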

Honest scoreboard against the discipline gates:

  Corpus:    8/19 (42%) → 18/19 (95%)
  Held-out:  1/11 (9%)  → 4/11 (36%) — BELOW 50% gate
  MSF3 R:    1.000      → 1.000      held
  MSF2 R:    0.971      → 1.000      +1 catch (/root/reset_logs.sh)
  DiskForge: 0.923      → 0.923      held
  v0.47 rule FP contribution: 0 across all three benchmarks
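
The hit-rate lines above can be reproduced with a few lines of Python. This
is an illustrative sketch only: ``fmt`` and ``passes_gate`` are hypothetical
helpers, and treating the gate as "rate >= 0.50" is an assumption based on
the "BELOW 50% gate" note.

```python
GATE = 0.50  # assumed pass threshold, taken from the "50% gate" note


def fmt(hits: int, total: int) -> str:
    """Format a score the way the scoreboard does, e.g. '8/19 (42%)'."""
    return f"{hits}/{total} ({hits / total:.0%})"


def passes_gate(hits: int, total: int, gate: float = GATE) -> bool:
    """True when the hit rate meets or exceeds the gate."""
    return hits / total >= gate


# Before/after counts copied from the scoreboard.
scores = {
    "Corpus": ((8, 19), (18, 19)),
    "Held-out": ((1, 11), (4, 11)),
}

for name, (before, after) in scores.items():
    verdict = "meets gate" if passes_gate(*after) else "below gate"
    print(f"{name}: {fmt(*before)} -> {fmt(*after)} [{verdict}]")
```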

Held-out below gate is an underfitting result, not overfitting:
audit shows zero FPs from any v0.47 rule on existing benchmarks.
Shipping per option-2 discipline call — surface the gap publicly
rather than tune toward held-out (post-hoc rule-shaping would
defeat the experiment).

Full reasoning + v0.48 candidate list in docs/v0p47_results.md.

Version: 0.46.0 → 0.47.0. Tests: 1309 passed.