
feat(dictionary): graduated trust for non-dictionary learned words (#39) #64

Merged: AsafMah merged 1 commit into dev from feat/c4-smart-graduated-trust on Jun 7, 2026


AsafMah (Owner) commented Jun 7, 2026

Implements #39 (C4-smart) — completes the C4 dictionary epic (#18) on top of the #59 flag/Add/Block UI.

Problem

A learned word in no real dictionary (main/contacts/apps/personal) could out-rank a real dictionary word with better geometry after a single misfire — the "junk hijacks a real word" bug, worst in glide mode.

Fix (graduated trust)

As the last ranking step in getSuggestionResults (after session boost, so it can't be undone): an uncurated USER_HISTORY candidate that still outscores the best real-dictionary candidate is capped just below it, until its user-history frequency crosses a confirmation threshold (~3 repetitions). So:

  • a one-off junk word cannot out-rank a real word — a hard cap, independent of native score magnitude/sign (review caught that score-halving wasn't a guarantee and could be undone by the session boost);
  • a deliberately repeated new word still learns and, once confirmed, keeps full score;
  • when no real candidate exists, new words are left untouched (still offerable) — doesn't block new-word learning.
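
The capping rule in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Candidate` type, the `confirmed` flag (standing in for "curated, or history frequency at or above the threshold"), and the function name `capUnconfirmed` are all hypothetical, and "real" is simplified to "not from user history".

```kotlin
// Hypothetical, simplified candidate type; the real suggestion class is richer.
data class Candidate(
    val word: String,
    val score: Int,
    val fromUserHistory: Boolean,
    // Assumption: stands in for "curated, or history frequency has crossed the threshold".
    val confirmed: Boolean = true,
)

/**
 * Caps every unconfirmed user-history candidate that outscores the best
 * real (non-history) candidate to one point below that candidate's score,
 * then re-sorts by score. With no real candidate the list is returned unchanged.
 */
fun capUnconfirmed(candidates: List<Candidate>): List<Candidate> {
    val bestReal = candidates
        .filter { !it.fromUserHistory }
        .maxOfOrNull { it.score }
        ?: return candidates // no real word to protect: leave new words offerable

    return candidates
        .map { c ->
            // A hard cap relative to bestReal, so it works for any score magnitude or sign.
            if (c.fromUserHistory && !c.confirmed && c.score > bestReal) {
                c.copy(score = bestReal - 1)
            } else {
                c
            }
        }
        .sortedByDescending { it.score }
}
```

Because the cap is defined relative to the best real score rather than as a multiplier, it behaves the same for negative scores: a junk word at -10 against a real word at -50 ends up at -51, which halving the score would not achieve.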

Details

  • Decision is a pure companion helper shouldPenalizeUnconfirmedWord(uncurated, freq), unit-tested; uncurated reuses isInNonHistoryDictionary (from #43, "fix(dictionary): stop deleted/junk words from resurrecting (incl. via swipe)"); threshold (120 ≈ 3 uses) is a documented tunable constant.
  • Heavy lookups (isInNonHistoryDictionary/getFrequency) run only for history candidates that actually outscore a real word — a small subset, gated behind cheap checks.
  • Gated by new pref PREF_GRADUATED_TRUST (default on, Settings → Dictionaries).
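
A sketch of how the decision helper and the cheap-check gating described above might fit together. Only the helper name, its two parameters, and the threshold value 120 come from this description. The parameter types, the `GraduatedTrust` object, the constant name, the `needsCap` wrapper, and the choice to treat a frequency of exactly 120 as confirmed are assumptions made for illustration.

```kotlin
// Hypothetical holder object; the PR describes a companion helper on an existing class.
object GraduatedTrust {
    // Value from the description ("120 ≈ 3 uses"); the constant name is an assumption.
    const val CONFIRMATION_THRESHOLD = 120

    /** Pure decision: penalize only words that are in no real dictionary and not yet confirmed. */
    fun shouldPenalizeUnconfirmedWord(uncurated: Boolean, freq: Int): Boolean =
        uncurated && freq < CONFIRMATION_THRESHOLD
}

/**
 * Hypothetical gate: the expensive lookups are passed as lambdas so they run
 * only after the cheap checks have all passed.
 */
fun needsCap(
    enabled: Boolean,              // mirrors PREF_GRADUATED_TRUST
    isHistoryCandidate: Boolean,   // candidate came from USER_HISTORY
    score: Int,
    bestRealScore: Int?,           // null when no real-dictionary candidate exists
    isInNonHistoryDictionary: () -> Boolean,
    getFrequency: () -> Int,
): Boolean {
    if (!enabled || !isHistoryCandidate) return false
    if (bestRealScore == null || score <= bestRealScore) return false
    // Heavy lookups happen only for history candidates that outscore a real word.
    return GraduatedTrust.shouldPenalizeUnconfirmedWord(
        uncurated = !isInNonHistoryDictionary(),
        freq = getFrequency(),
    )
}
```

Keeping the decision a pure function of two values is what makes it testable on the JVM without the native scorer; the wrapper only decides when the lookups are worth paying for.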

Verification

  • Compiles; :app:testOfflineRunTestsUnitTest = 229 tests, 4 failed (all the Windows-only ParserTest set; pass on Linux CI) — zero new failures; new graduatedTrust test passes.
  • Cross-model review run; the cap-not-scale + post-session-boost design is a direct response to its findings.
  • Needs on-device playtesting to confirm the threshold and 'real candidate' set feel right (native ranking isn't JVM-coverable). I can build + install it on the S24+ on request.

Base branch is dev; leaving for review and merge.

A learned word in no real dictionary (main/contacts/apps/personal) could out-rank
a real dictionary word with better geometry after a single misfire. Now, as the
LAST ranking step in getSuggestionResults (after session boost, so it can't be
undone), an uncurated USER_HISTORY candidate that still outscores the best
real-dictionary candidate is CAPPED just below it — until its user-history
frequency crosses a confirmation threshold (~3 repetitions). This guarantees a
one-off junk word can't hijack a real word regardless of native score magnitude,
while a deliberately repeated new word still learns and keeps full score. When no
real candidate exists, new words are left untouched (still offerable).

- Capping (not score-scaling) avoids any dependence on native score calibration or
  sign; applied post-session-boost so the boost can't re-promote junk.
- Decision is a pure companion helper (shouldPenalizeUnconfirmedWord), unit-tested;
  uncurated check reuses isInNonHistoryDictionary; threshold is a tunable constant.
- Gated by new pref PREF_GRADUATED_TRUST (default on).

The actual ranking effect needs the native scorer, so the threshold and the
'real candidate' set want on-device playtesting.

AsafMah merged commit 1b7fdad into dev on Jun 7, 2026
1 check passed