fix(vocab-quiz): tighten rotation curve for fresh words (#191) by davidortinau · Pull Request #198 · davidortinau/SentenceStudio

davidortinau · 2026-05-03T02:31:01Z

Closes #191.

Summary

Fresh words were rotating out of vocab quiz rounds at turn 4 with all-correct answers — too fast for practice intent. This PR pushes the earliest legal rotation for a fresh word to turn 5 without regressing already-known words. Two knobs are tuned.

Production changes (2 lines)

| File | Change |
|------|--------|
| `VocabularyProgressService.cs:21` | `EFFECTIVE_STREAK_DIVISOR` `7.0f` → `12.0f` |
| `VocabularyQuizItem.cs:33-55` | Tier 2 trigger `OR` → `AND`, floor `(2,1)` → `(4,2)` |

Why two knobs. The divisor change alone slows mastery growth, but Tier 2's OR trigger plus its weak floor would still let a fresh word rotate out on streak alone after a single correct Text answer. Tightening Tier 2 closes that escape hatch.
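To make the interaction between the two knobs concrete, here is a minimal Python sketch of the old and new Tier 2 gates. The thresholds are the ones quoted in this PR; the `WordState` class, its field names, and the two function names are hypothetical stand-ins, not the actual `VocabularyQuizItem` members.

```python
from dataclasses import dataclass


@dataclass
class WordState:
    """Hypothetical snapshot of one quiz item (names are illustrative only)."""
    mastery: float
    streak: int
    session_correct: int
    session_text_correct: int


def tier2_ready_old(w: WordState) -> bool:
    # Old rule as described in this PR: OR trigger, floor (2, 1).
    trigger = w.mastery >= 0.50 or w.streak >= 3
    floor = w.session_correct >= 2 and w.session_text_correct >= 1
    return trigger and floor


def tier2_ready_new(w: WordState) -> bool:
    # New rule as described in this PR: AND trigger, floor (4, 2).
    trigger = w.mastery >= 0.50 and w.streak >= 3
    floor = w.session_correct >= 4 and w.session_text_correct >= 2
    return trigger and floor


# Turn 4 of a fresh word with only the divisor changed: mastery is low (0.417),
# but the streak alone still satisfies the old OR trigger.
divisor_only = WordState(mastery=0.417, streak=4, session_correct=4, session_text_correct=1)
print(tier2_ready_old(divisor_only))  # True: the escape hatch
print(tier2_ready_new(divisor_only))  # False
```

Under these assumptions, slowing the mastery climb does nothing for the turn-4 case unless the trigger and floor are tightened as well.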

Simulator (fresh word, all-correct turns)

`tools/quiz-rotation-sim/sim.py` reproduces production math exactly. Run: `python3 tools/quiz-rotation-sim/sim.py`.

| Turn | Current (÷7, OR/2,1) | This PR (÷12, AND/4,2) |
|------|----------------------|------------------------|
| 4 | mastery 0.714 → ROTATES (bug) | mastery 0.417, no rotate |
| 5 | mastery 1.000 | mastery 0.583 → ROTATES |
| 6 | n/a | mastery 0.667 |
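The mastery figures in the table are consistent with a simple "effective streak divided by divisor" score. The sketch below is not `sim.py`. It back-computes the effective-streak values implied by the table (for example 0.714 × 7 ≈ 5 and 0.583 × 12 ≈ 7) and checks that both columns follow from them. The `mastery` function, the cap at 1.0, and the `implied_streak` values are assumptions inferred from the table, not taken from the production code.

```python
def mastery(effective_streak: float, divisor: float) -> float:
    # Assumed shape: linear in the effective streak, capped at 1.0.
    return min(1.0, effective_streak / divisor)


# Effective-streak values inferred from the table, keyed by turn number.
# How production derives them (e.g. any Text weighting) is not modelled here.
implied_streak = {4: 5.0, 5: 7.0, 6: 8.0}

for turn in (4, 5):
    s = implied_streak[turn]
    print(turn, round(mastery(s, 7.0), 3), round(mastery(s, 12.0), 3))
# 4 0.714 0.417
# 5 1.0 0.583

print(6, round(mastery(implied_streak[6], 12.0), 3))
# 6 0.667
```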

Already-known words unchanged. Mastery ≥ 0.80 + streak ≥ 8 still hits Tier 1 with a single text correct (Tier 1 logic unchanged). No spaced-repetition penalty for words already mastered.

No user-data regression. Stored MasteryScore cannot decrease from the divisor change because mastery is monotonic on correct: max(streakScore, mastery) at VocabularyProgressService.cs:154. Only words mid-climb grow more slowly going forward — which is the intent.
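The monotonicity argument can be illustrated with a small sketch. Only the `max(streakScore, mastery)` expression comes from this PR; the function name, its parameters, and the example values are hypothetical.

```python
def mastery_after_correct(stored_mastery: float, effective_streak: float,
                          divisor: float = 12.0) -> float:
    # streak_score is recomputed with the new divisor on each correct answer.
    streak_score = min(1.0, effective_streak / divisor)
    # The stored score is only ever raised, never lowered, on a correct answer.
    return max(streak_score, stored_mastery)


# A word that earned 0.714 under the old divisor keeps it, even though the
# same effective streak now scores lower (7 / 12 is about 0.583).
print(mastery_after_correct(stored_mastery=0.714, effective_streak=7.0))  # 0.714
```

The stored value stays flat until the new, slower streak score catches up with it, which matches the "mid-climb words grow more slowly" behaviour described above.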

Test impact

- Jayne's repro (Repro191_NewWord_AllCorrect_DoesNotRotateOutBeforeFifthTurn, from PR #195, "test: failing repro for vocab quiz scoring bugs (#189, #191)"): FAIL → PASS ✅. This PR is branched off test/vocab-quiz-scoring-repro-189-191 so the fix lands together with its verification harness.
- ~10 mastery-math fixtures updated to track the new divisor (5 MC + 2 Text → 8 MC + 2 Text demonstrations of IsKnown; divisor literals /7.0f → /12.0f).
- New test: Tier2_TriggerRequiresBothMasteryAndStreak covers the OR → AND change.
- Renamed: Tier2_MidMastery_Rotates_2CorrectWith1Text → Tier2_MidMastery_Rotates_4CorrectWith2Text, plus Tier2_MidMastery_BlockedByLowSessionCorrect for the new floor.
- All 520 unit tests pass.

Captain & SLA review

Captain approved the two-knob proposal (full markdown lives in .squad/decisions/inbox/wash-vocab-quiz-scoring-proposal-191.md, gitignored). Language-tutor SLA review chose the turn-5 floor over a more aggressive turn-6 floor as the right balance between within-session retention demonstration and learner spaced-repetition load.

Out of scope (separate issue)

- Decouple MasteryScore from SessionRotationReady so session pacing and long-term mastery tracking become independent levers; this is the higher-leverage architectural fix the tutor flagged. Tracked in #197, "Decouple MasteryScore from SessionRotationReady (cross-session evidence requirement)".
- Wrong-answer mastery decay path is untouched (only the correct-answer divisor changes).
- No schema changes; obsolete-field write paths in ProgressService are not touched (sync compat).

Manual verification before merge

Recommend a Mac Catalyst smoke per .claude/skills/e2e-testing/references/quiz-activities.md: load a fresh word, answer correctly several turns in a row, confirm rotation timing matches the simulator (rotates at turn 5, not turn 4) and Learning Details panel reflects the slower mastery climb.

CI note

Pre-existing Linux MAUI/wasm-tools workload install failure on main is unrelated; per Captain's standing order, gh pr merge --admin is authorized if all unit tests are green and only that workload step fails.

Stream B Step 1 (Jayne). Adds 4 integration tests that pin down the expected post-state of VocabularyProgress after well-defined quiz interactions, run against a real EF Core + in-memory SQLite stack via PlanGenerationTestFixture (same pattern as MasteryAlgorithmIntegrationTests).

#189 (attempt counting / accuracy):

- Repro189_SingleCorrectRecognitionAttempt_ProducesExpectedPanelState: PASS
- Repro189_SingleCorrectRecognition_LegacyProductionFieldsRemainZero: PASS

Both pass on main, which proves the ProgressService math is correct. Captain's '2 production attempts / 50% accuracy' panel reading therefore points at the UI panel reading legacy/wrong fields or a duplicate-call path; the fix belongs in Stream A (Kaylee), not the service. Tests stay as regression guards for the service contract.

#191 (latter rounds rapidly empty):

- Repro191_NewWord_AllCorrect_DoesNotRotateOutBeforeFifthTurn: FAIL on main
- Repro191_CharacterizeCurrentBehavior_FreshWordRotatesAtTurnN: PASS (snapshot)

Captured failure: a brand-new word receiving 4 all-correct answers (3 MC followed by 1 Text, the mode the quiz auto-selects once CurrentStreak >= 3) flips ReadyToRotateOut=True at turn 4. The trigger is VocabularyQuizItem Tier 2 (mastery>=0.50 OR streak>=3, with a floor of only SessionCorrectCount>=2 and SessionTextCorrect>=1). This is the over-aggressive rotation #191 describes. The test will pass after Wash tightens the Tier 2 gates.

No production code changes.
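The captured failure can be traced turn by turn with a small sketch. It assumes only what the message above states: every answer is correct, and the quiz switches from MC to Text once the streak is at least 3. The function names are hypothetical, and the sketch checks just the session floor, not the mastery or streak trigger.

```python
from typing import Optional


def pick_mode(current_streak: int) -> str:
    # Mode selection as described above: Text once the streak reaches 3.
    return "Text" if current_streak >= 3 else "MC"


def first_turn_floor_met(min_correct: int, min_text: int,
                         max_turns: int = 10) -> Optional[int]:
    """Return the first all-correct turn at which the session floor is satisfied."""
    streak = correct = text_correct = 0
    for turn in range(1, max_turns + 1):
        mode = pick_mode(streak)  # mode is chosen from the streak before answering
        streak += 1
        correct += 1
        if mode == "Text":
            text_correct += 1
        if correct >= min_correct and text_correct >= min_text:
            return turn
    return None


print(first_turn_floor_met(2, 1))  # 4: old floor, met by the first Text answer
print(first_turn_floor_met(4, 2))  # 5: tightened floor needs a second Text answer
```

Because the first Text answer cannot arrive before turn 4, the old floor of (2, 1) is met on exactly the turn the repro flags, and by then the streak already satisfies the OR trigger.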

Closes #191.

Fresh words were rotating out of quiz rounds at turn 4 with all-correct answers, yielding only ~3 effective practice repetitions before the word disappeared. Two knobs are tuned to push the earliest legal rotation to turn 5 without regressing already-known words.

Production changes (2 lines):

1. VocabularyProgressService.cs: EFFECTIVE_STREAK_DIVISOR 7.0f -> 12.0f
   Slows the mastery climb so MasteryScore reaches Tier 1 (>= 0.80) on turn 8+ rather than turn 6, and crosses the 0.50 promotion floor on turn 6 rather than turn 4.
2. VocabularyQuizItem.cs: Tier 2 trigger OR -> AND, floor (2,1) -> (4,2)
   - Trigger: mastery >= 0.50 && streak >= 3 (was OR). Closes a corner case where a single Text correct on a fresh word could drop the word into Tier 2 via streak alone.
   - Floor: SessionCorrectCount >= 4 && SessionTextCorrect >= 2 (was >= 2 && >= 1). Requires demonstrably more session evidence before a mid-mastery word is allowed to rotate out.

Simulator: tools/quiz-rotation-sim/sim.py reproduces production math exactly. Headline (fresh, all-correct):

| Turn | Current (/7, OR/2,1) | Proposed (/12, AND/4,2) |
|------|----------------------|-------------------------|
| 4 | mastery 0.714 -> ROTATES (bug) | mastery 0.417, no |
| 5 | mastery 1.000 | mastery 0.583 -> ROTATES |

Already-known words (mastery >= 0.80, streak >= 8) still rotate at the first qualifying turn (Tier 1 unchanged).

Existing user MasteryScore data cannot regress: mastery is monotonic on correct (`max(streakScore, mastery)` in RecordAttemptAsync line 154).

Tests:

- Jayne's Repro191_NewWord_AllCorrect_DoesNotRotateOutBeforeFifthTurn flips FAIL -> PASS (PR #195 verification harness).
- ~10 mastery-math fixtures bumped to track the new divisor (5 MC + 2 Text -> 8 MC + 2 Text for IsKnown demonstrations; divisor literals /7.0f -> /12.0f).
- VocabQuizFilteringTests: Tier 2 floor test renamed and a new test Tier2_TriggerRequiresBothMasteryAndStreak added for the AND change.
- All 520 unit tests pass.

Language-tutor SLA review approved the turn-5 floor (vs turn-6) as the right balance between learner spaced-repetition load and within-session retention demonstration.

Follow-up (separate issue, not in this PR): decouple MasteryScore from SessionRotationReady so session pacing and long-term mastery tracking are independent levers.

Branched off PR #195 (Jayne's repro) so the fix lands together with its verification harness.

- PR #196 (Stream A UI fixes): closes #189/#190/#192/#193/#194
- PR #198 (Stream B scoring fix): closes #191
- PR #195 (test-only draft): superseded, closed
- Follow-ups filed: #197 (decouple Mastery from SessionRotation), #199 (test helper DifficultyWeight bug)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

davidortinau added 5 commits May 2, 2026 20:30

squad(jayne): log Stream B Step 1 outcome (vocab quiz repro #189 #191)

277e10e

squad(wash): log Stream B Step 3 — #191 fix shipped via PR #198

91aacff

squad(wash): note PR #198 body cross-link to #197

02ef06a

davidortinau mentioned this pull request May 3, 2026

Test helper: MakeAttempt does not set DifficultyWeight, masking 1.5× Text weighting #199

Open

3 tasks

davidortinau merged commit 626383a into main May 3, 2026
2 of 6 checks passed

davidortinau deleted the fix/vocab-quiz-scoring-191-rotation-curve branch May 3, 2026 14:08

davidortinau mentioned this pull request May 3, 2026

test: failing repro for vocab quiz scoring bugs (#189, #191) #195

Closed

davidortinau merged 5 commits into main from fix/vocab-quiz-scoring-191-rotation-curve
