Problem
#81 (PR #88) ships first-token confidence suppression with a default threshold of `0.10`. That number is a guess — there is no field distribution of "what does the top-1 raw-logit softmax look like on real prompts?" to anchor it.
Proposal
Add a debug-only mode (settings toggle or env var, behind `#if DEBUG` or a hidden defaults key) that emits one log line per generation with the first-token top-1 probability and the token string, regardless of whether suppression fires.
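As a rough illustration of what such a log line could carry, here is a minimal Python sketch. It is not the app's actual code path: the function names, the log format, and the `would_suppress` field are hypothetical, and it assumes suppression fires when the top-1 probability falls below the threshold.

```python
import math

def first_token_top1(logits, vocab):
    """Return (probability, token) for the top-1 first token.

    Applies a numerically stable softmax over the raw logits.
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max to avoid overflow
    total = sum(exps)
    best = max(range(len(logits)), key=logits.__getitem__)
    return exps[best] / total, vocab[best]

def format_debug_line(prob, token, threshold=0.10):
    """Build one log line per generation (hypothetical format).

    Assumption: the gate suppresses when prob < threshold. The line is
    produced whether or not suppression would fire.
    """
    would_suppress = prob < threshold
    return (
        f"first_token top1_p={prob:.4f} token={token!r} "
        f"would_suppress={would_suppress}"
    )
```

Logging the probability even when the gate does not fire is the point: the histogram needs the full distribution, not only the suppressed cases.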
Then dogfood for a session, plot the histogram, and pick the threshold where the long tail of "junk continuations" starts.
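The histogram step could be sketched as below, assuming each debug log line contains a field such as `top1_p=0.0731`. That field name and the helper names are hypothetical, not something the issue specifies. The script only bins and renders the probabilities; picking the threshold is still a judgment call made by looking at the shape.

```python
import re
from collections import Counter

# Hypothetical log field; adjust to whatever the debug mode actually emits.
PROB_RE = re.compile(r"top1_p=([0-9.]+)")

def parse_probs(lines):
    """Extract first-token top-1 probabilities, skipping unrelated lines."""
    probs = []
    for line in lines:
        match = PROB_RE.search(line)
        if match:
            probs.append(float(match.group(1)))
    return probs

def histogram(probs, n_bins=20):
    """Count probabilities into n_bins equal-width bins over [0, 1]."""
    # A probability of exactly 1.0 is clamped into the last bin.
    counts = Counter(min(int(p * n_bins), n_bins - 1) for p in probs)
    return [counts.get(i, 0) for i in range(n_bins)]

def render(counts):
    """Render a plain-text histogram, one row per bin."""
    n_bins = len(counts)
    rows = []
    for i, count in enumerate(counts):
        lo, hi = i / n_bins, (i + 1) / n_bins
        rows.append(f"{lo:.2f}-{hi:.2f} {'#' * count} {count}")
    return "\n".join(rows)
```

For example, `print(render(histogram(parse_probs(open("debug.log")))))` would print one row per 0.05-wide bin, where `debug.log` is a placeholder path for the captured session log.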
Why it matters
Right now the gate is opt-in with a guessed threshold, so even users who turn it on can't tell whether 0.10 is too lenient or too strict. One short telemetry pass would give us a defensible default before we consider making the gate default-on.
Acceptance
Debug-only logging path that emits top-1 probability + token for every first-token sample.