v7.7.16
styxx 7.7.16 — abstain_on_confab: the closed-loop detect-and-abstain primitive
Cumulative release: ships the single-pass confab gates (7.7.14) and the retrieval arm / two-signal audit_claim (7.7.15) alongside the new detect-and-abstain loop (7.7.16).
pip install -U styxx
New in 7.7.16: styxx.abstain_on_confab + AbstainDecision
from styxx import single_pass_confab, calibrate_single_pass, abstain_on_confab
cal = calibrate_single_pass(confab_entropies, correct_entropies) # per-model threshold
score = single_pass_confab(first_token_logits, entropy_threshold=cal.entropy_threshold)
decision = abstain_on_confab(model_answer, score)
decision.answer # the answer, OR "I'm not sure." if the confab gate fired
decision.abstained # bool
The deployable, framework-free form of the closed-loop honesty primitive: gate a candidate answer through a CALIBRATED confab detector and return an honest abstention when it fires.
The detector is load-bearing, and the API enforces it. A pre-registered white-box experiment (FINDING_honesty_knob_2026_05_30, SURVIVED, n=32/24 powered) showed that the underlying mechanistic abstention intervention has no intrinsic selectivity: applied ungated, it dissolves correct answers as readily as confabulations (raw selectivity −0.08; a blanket lobotomy). Only the calibrated detector (gate AUC 0.924) makes abstention targeted: the gated loop catches and abstains on 0.75 of confabs while sparing 0.875 of correct answers. abstain_on_confab therefore refuses an uncalibrated score with a ValueError; you must run calibrate_single_pass first. Detection is not optional diagnosis; it is the prerequisite for safe intervention.
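The gating contract described above can be sketched in a few lines of plain Python. This is an illustration only, not styxx's implementation: ToyScore, ToyDecision, and toy_abstain_on_confab are hypothetical stand-ins, and the rule "entropy above the threshold means the gate fired" is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional

ABSTENTION = "I'm not sure."


@dataclass(frozen=True)
class ToyScore:
    """Hypothetical stand-in for a confab score (not styxx's real type)."""
    entropy: float
    # None models a score that was never calibrated.
    entropy_threshold: Optional[float] = None


@dataclass(frozen=True)
class ToyDecision:
    """Hypothetical stand-in for AbstainDecision."""
    answer: str
    abstained: bool


def toy_abstain_on_confab(model_answer: str, score: ToyScore) -> ToyDecision:
    # Refuse to gate on an uncalibrated score: without a per-model
    # threshold, abstention would not be targeted.
    if score.entropy_threshold is None:
        raise ValueError("score is uncalibrated; calibrate a threshold first")
    # Assumption for this sketch: higher entropy than the threshold
    # means the confab gate fired.
    if score.entropy > score.entropy_threshold:
        return ToyDecision(answer=ABSTENTION, abstained=True)
    return ToyDecision(answer=model_answer, abstained=False)
```

With a threshold of 1.0, a score with entropy 0.4 passes the answer through unchanged, a score with entropy 2.3 yields the abstention string, and a score with no threshold raises ValueError before any decision is made.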
Scope: abstention, not correction (repair-to-truth is a closed negative — depth-steering is correctness-inert). A fail-safe, not a fix.
Also included
- 7.7.15: retrieval_check + audit_claim(verify_retrieval=True), a two-signal gate (model-internal confab detector for unstable errors + external-grounding arm for stable factual misconceptions).
- 7.7.14: single_pass_confab / span_confab / calibrate_single_pass, confab detection from ONE forward pass (~10× cheaper than N=10 resampling), with white-box first-token and closed-model span variants.
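To make the single-pass idea concrete, here is a minimal sketch of an entropy signal computed from one set of first-token logits, plus a simple per-model threshold search. It assumes the signal is the Shannon entropy of the softmax distribution and that confabulations tend to have higher entropy than correct answers; first_token_entropy and toy_calibrate are hypothetical names, and the balanced-accuracy search is a swapped-in technique, not necessarily how calibrate_single_pass chooses its threshold.

```python
import math
from typing import Sequence


def first_token_entropy(logits: Sequence[float]) -> float:
    """Shannon entropy (in nats) of softmax(logits), computed stably."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # shift by max to avoid overflow
    z = sum(exps)
    # Skip zero-probability terms (underflow), since p * log(p) -> 0 there.
    return -sum((e / z) * math.log(e / z) for e in exps if e > 0.0)


def toy_calibrate(
    confab_entropies: Sequence[float],
    correct_entropies: Sequence[float],
) -> float:
    """Pick the threshold with the best balanced accuracy.

    Assumption: a sample is flagged as a confab when entropy > threshold.
    """
    candidates = sorted(set(confab_entropies) | set(correct_entropies))
    best_t, best_score = candidates[0], -1.0
    for t in candidates:
        # Fraction of confabs flagged and fraction of correct answers spared.
        tpr = sum(e > t for e in confab_entropies) / len(confab_entropies)
        tnr = sum(e <= t for e in correct_entropies) / len(correct_entropies)
        score = (tpr + tnr) / 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

A uniform distribution over four tokens gives the maximum entropy log(4), while a sharply peaked one gives a value near zero, which is why a single forward pass can already separate confident from uncertain first tokens under the stated assumption.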
Full Changelog: v7.7.13...v7.7.16