feat(explain): break down history boost into unigram/bigram/whole-path #247
Merged
Add `HistoryBoostBreakdown` so `explain` exposes which component of history learning contributes to a path's cost: per-segment unigram sum, per-segment bigram sum, and whole-path ×5 boost. `history_rerank` now goes through the same helper, keeping the math in one place. Aimed at diagnosing reports of weakened history learning: the breakdown lets us tell whether bigram learning is silent, whole-path is over- or under-firing, or per-segment normalization is washing things out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
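For readers following along, here is a minimal sketch of what a three-component breakdown with a single combining helper might look like. The field names, the `i64` cost type, and the combining rule in `applied` (per-segment sums divided by the segment count, whole-path boost added undivided) are assumptions made for illustration; the actual definitions live in `reranker.rs` and may differ.

```rust
/// Hypothetical mirror of the PR's `HistoryBoostBreakdown`.
/// Field names and the `i64` cost type are assumptions, not the crate's API.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct HistoryBoostBreakdown {
    /// Sum of per-segment unigram boosts.
    pub unigram_sum: i64,
    /// Sum of per-segment bigram boosts (zero for a single-segment path).
    pub bigram_sum: i64,
    /// Whole-path boost, already multiplied by its weight.
    pub whole_path_boost: i64,
}

impl HistoryBoostBreakdown {
    /// Assumed combining rule: normalize the per-segment sums by the
    /// segment count, then add the whole-path boost without normalizing.
    pub fn applied(&self, seg_count: usize) -> i64 {
        // Guard against a zero denominator for empty paths.
        let n = seg_count.max(1) as i64;
        (self.unigram_sum + self.bigram_sum) / n + self.whole_path_boost
    }
}
```

Under these assumptions, a path with `unigram_sum = 300`, `bigram_sum = 100`, and `whole_path_boost = 500` over two segments has an applied boost of 700, and each component can be inspected separately to see which one is silent.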
Pull request overview
This PR refactors history-based reranking to expose a per-path breakdown of history boost contributions (unigram/bigram/whole-path) and surfaces that breakdown in explain output (text + JSON) to help diagnose whether history learning is firing as intended.
Changes:
- Added `HistoryBoostBreakdown` + `compute_history_boost()` in the reranker to compute per-component history contributions.
- Extended `ExplainPath` to include `history_breakdown`, and updated `format_text()` to render the detailed breakdown line.
- Expanded `explain` tests to validate breakdown fields and the “no history → zero breakdown” case.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| engine/crates/lex-core/src/converter/reranker.rs | Adds a reusable history boost breakdown struct + helper, and refactors history_rerank() to use it. |
| engine/crates/lex-core/src/converter/explain.rs | Adds history_breakdown to explain output and prints a detailed breakdown line; updates tests accordingly. |
- Re-export `HistoryBoostBreakdown` via `explain::HistoryBoostBreakdown` so downstream crates (lex-cli) can name the type behind the public `ExplainPath::history_breakdown` field. The definition stays in the crate-private `reranker` module.
- Capture history breakdown at the post-rerank / pre-history-rerank boundary via the observer, rather than recomputing on the final path. The recompute could disagree with the actual subtracted boost in two cases: paths whose segments were merged by `group_segments`, and rewriter-added candidates (numeric / katakana / kanji variants) that were synthesised after `history_rerank` ran. The snapshot is keyed by `surface_key()` (preserved through grouping) and absent for rewriter-added paths, which fall back to a zero breakdown.
- Add regression test `test_explain_unrelated_paths_have_zero_history_boost` asserting non-matching surfaces always report zero history boost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
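The snapshot-with-fallback behaviour described above could be sketched roughly as follows. `HistorySnapshot`, `Breakdown`, `record`, and `lookup` are hypothetical names invented for this illustration, and the key is assumed to be a `String`; only the idea (key by surface, default to zero when the path was never seen by the history reranker) comes from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the per-path breakdown (not the crate's type).
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Breakdown {
    pub unigram_sum: i64,
    pub bigram_sum: i64,
    pub whole_path_boost: i64,
}

/// Hypothetical snapshot taken by an observer just before history reranking.
#[derive(Debug, Default)]
pub struct HistorySnapshot {
    by_surface: HashMap<String, Breakdown>,
}

impl HistorySnapshot {
    /// Record the breakdown that was actually computed for a path.
    pub fn record(&mut self, surface_key: &str, breakdown: Breakdown) {
        self.by_surface.insert(surface_key.to_owned(), breakdown);
    }

    /// Look up a final path. Paths synthesised later (for example by a
    /// rewriter) have no entry and fall back to the all-zero breakdown.
    pub fn lookup(&self, surface_key: &str) -> Breakdown {
        self.by_surface.get(surface_key).copied().unwrap_or_default()
    }
}
```

Keying on the surface rather than on the segment list is what lets the lookup survive a later merge of adjacent segments, since the concatenated surface is unchanged by grouping.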
Track the segment count `history_rerank` used as its normalization denominator alongside the breakdown snapshot, and surface it on `ExplainPath` as `history_segment_count`. The `format_text` "/N segs" display now uses this value so the displayed denominator matches the one applied during normalization even when `group_segments` later merges adjacent segments. Without the change the display could disagree with `history_boost` (e.g. a 4-segment path showing "/2 segs" after grouping while the actual boost was computed against 4), which is exactly the kind of incoherence the breakdown was added to dispel. Regression test asserts `history_segment_count == segments.len()` and `history_boost == applied(history_segment_count)` for the no-grouping path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
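To make the denominator mismatch concrete, here is a small sketch that stores the segment count next to the sum at capture time. `CapturedBoost`, its fields, and the display string are hypothetical; integer division is an assumption about how normalization is done.

```rust
/// Hypothetical pairing of a per-segment sum with the denominator that was
/// in effect when the boost was computed (not the crate's actual type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CapturedBoost {
    pub per_segment_sum: i64,
    pub history_segment_count: usize,
}

impl CapturedBoost {
    /// Boost as normalized at rerank time, using the captured denominator.
    pub fn applied(&self) -> i64 {
        self.per_segment_sum / self.history_segment_count.max(1) as i64
    }

    /// Render the line with the captured denominator, so the text always
    /// agrees with `applied()` even if segments are merged afterwards.
    pub fn display(&self) -> String {
        format!(
            "{} /{} segs = {}",
            self.per_segment_sum,
            self.history_segment_count,
            self.applied()
        )
    }
}

/// What a naive recompute would show if it used the post-grouping length.
pub fn recomputed_with(per_segment_sum: i64, grouped_len: usize) -> i64 {
    per_segment_sum / grouped_len.max(1) as i64
}
```

With a sum of 400 captured over 4 segments the applied boost is 100, whereas recomputing against a grouped length of 2 would suggest 200, which is the disagreement the captured count avoids.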
Thread a single `now` through postprocess into history_rerank so the observer's precomputed breakdown is guaranteed to align with the boost actually subtracted in the pipeline, eliminating sub-second drift around the second boundary.

- reranker: rename `history_rerank` → `history_rerank_at(paths, history, now)`; the convenience wrapper that captured `now` internally is gone because the only callers were postprocess (which now pins the value) and tests (which switch to passing `now_epoch()` directly).
- postprocess: capture `now` in `PostprocessContext.now` and pass it to `history_rerank_at`. Production wrapper computes the value once.
- explain: the observer and the postprocess context share the same `now` value, so `compute_history_boost` and the pipeline see the identical decay input.
- Contract test asserts `history_rerank_at` subtracts exactly what `compute_history_boost(..., now).applied(seg_count)` reports for the same `now`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
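A rough sketch of why pinning `now` matters, under a loudly assumed decay model: the half-life formula, `decayed_boost`, `observed_boost`, and `applied_boost` below are all hypothetical and only stand in for whatever time decay the real reranker uses. `PostprocessContext` here carries just the one field mentioned in the commit message.

```rust
/// Hypothetical context carrying the timestamp captured once per request.
pub struct PostprocessContext {
    /// Seconds since the Unix epoch, pinned for the whole pipeline run.
    pub now: u64,
}

/// Assumed decay model: the boost halves every `half_life_secs`.
pub fn decayed_boost(base: f64, last_used: u64, now: u64, half_life_secs: u64) -> f64 {
    let age = now.saturating_sub(last_used) as f64;
    base * 0.5f64.powf(age / half_life_secs as f64)
}

/// What the explain observer would report for one history entry.
pub fn observed_boost(ctx: &PostprocessContext, base: f64, last_used: u64) -> f64 {
    decayed_boost(base, last_used, ctx.now, 3600)
}

/// What the pipeline would subtract for the same entry.
pub fn applied_boost(ctx: &PostprocessContext, base: f64, last_used: u64) -> f64 {
    decayed_boost(base, last_used, ctx.now, 3600)
}
```

Because both functions read the same `ctx.now`, they return identical values; if each sampled the clock on its own, a call that straddled a second boundary would feed different ages into the decay and the reported breakdown would no longer match the subtracted boost.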
Summary
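- Add `compute_history_boost()` + `HistoryBoostBreakdown` to `reranker`, and refactor `history_rerank` to apply the boost through it (no behavior change, covered by tests)
- Add a `history_breakdown` field to `ExplainPath`; for paths that have a boost, `format_text` shows the breakdown (uni_sum / bi_sum / whole×5 / /seg_count)
- Extend the existing `explain` tests with breakdown checks, and add a test confirming the empty breakdown in the no-history case

Motivation
Groundwork for triaging the anecdotal report that "conversion history learning has felt weak lately." The current `explain` output shows only the summed history boost, so it cannot distinguish whether bigram learning is silent, whether the whole-path boost is over- or under-firing, or whether per-segment normalization (/seg_count) is washing the boost out. The goal is that once a concrete example turns up, running `lextool explain ... --history ...` immediately shows which of these axes is responsible.

Output example
`bi_sum` is `+0` because the path is a single segment (no bigram pair). For multi-segment paths, bigram_sum is reported separately.

Test plan
- `cargo test --workspace --all-features` (482 tests pass)
- `cargo clippy --workspace --all-features -- -D warnings`
- `cargo fmt --all --check`
- `test_explain_with_history` verifies whole_path_boost > 0 / bigram_sum == 0
- `test_explain_history_breakdown_empty_without_history` verifies that the breakdown is all zero when there is no history
- Confirmed that `lextool explain data/lexime.dict きょう --conn data/lexime.conn --json` emits the `history_breakdown` field (all 0 when there is no history)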