Skip to content

refactor: feature-weight model with lextool tune#210

Merged
send merged 15 commits intomainfrom
refactor/feature-weight-model
Apr 5, 2026
Merged

refactor: feature-weight model with lextool tune#210
send merged 15 commits intomainfrom
refactor/feature-weight-model

Conversation

@send
Copy link
Copy Markdown
Owner

@send send commented Apr 5, 2026

Summary

  • features.rs: reranker のコスト調整ロジックを PathFeatures / FeatureWeights に分離し、重みの独立管理を可能にした
  • lextool tune: accuracy コーパスに対して FeatureWeights の grid search を行うサブコマンドを追加。Viterbi 候補を事前計算し、1,296 組の重みを <1秒 で評価する
  • 重み適用: tune 結果に基づきデフォルト重みを更新(structure=0, lv=1000, te=2000, sk=0)。accuracy 100% 維持

Before (default weights)

structure=25, length_variance=2000, te_kanji=3500, single_kanji=4000
tune pass rate: 96.7% (59/61)

After (tuned weights)

structure=0, length_variance=1000, te_kanji=2000, single_kanji=0
tune pass rate: 98.4% (60/61)

tune で当たらない 1 件 (にじゅうさん) は NumericRewriter が補っており、フルパイプラインでは pass。

Test plan

  • cargo fmt --all --check && cargo clippy --workspace --all-features -- -D warnings
  • cargo test --workspace --all-features — 全 411 テスト pass
  • mise run accuracy — 100.0% (61/61)
  • lextool tune — 実辞書+コーパスで実行確認

🤖 Generated with Claude Code

send and others added 3 commits April 6, 2026 00:41
Introduce PathFeatures and FeatureWeights to separate feature extraction
from weight application. The reranker now computes features via
extract_features() and applies weighted_cost() instead of inline
penalty calculations.

Features:
- structure_cost: capped transition cost sum
- length_variance: segment length uniformity
- script_cost: kanji+kana preference
- te_kanji_count: te-form kanji penalty occurrences
- single_kanji_count: single-char kanji penalty occurrences

Weights are loaded from settings and can be tuned independently.
This is preparation for automated weight tuning via accuracy corpus.

Accuracy: 61/61 pass (4 skip), history 6/6 — identical to baseline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a tune subcommand that pre-computes Viterbi candidates once per
reading, then evaluates all weight combinations using pure arithmetic.
Default grid: 6×6×6×6 = 1,296 combos, finishes in under a second.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update defaults based on lextool tune results against accuracy corpus:
  structure:       25 → 0
  length_variance: 2000 → 1000
  te_kanji:        3500 → 2000
  single_kanji:    4000 → 0

Accuracy unchanged (100%, 61/61). Reranker tests updated to verify
feature extraction directly rather than depending on specific weight
magnitudes for ranking assertions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 5, 2026 16:14
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors reranker scoring into an explicit feature/weight model and adds a new lextool tune subcommand to grid-search feature weights against the accuracy corpus, then updates the default reranker weights based on tuning results.

Changes:

  • Split reranker cost adjustments into PathFeatures extraction and FeatureWeights application.
  • Add lex-core::converter::tune + lextool tune to precompute candidates and evaluate many weight combinations quickly.
  • Update default reranker weights in settings/tests (lv=1000, te=2000, sk=0).

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
engine/crates/lex-core/src/settings.rs Updates settings test expectations for tuned reranker defaults.
engine/crates/lex-core/src/default_settings.toml Updates default reranker weights.
engine/crates/lex-core/src/converter/features.rs Introduces feature extraction + weight application for reranking.
engine/crates/lex-core/src/converter/reranker.rs Switches reranker step 3 to feature-based weighted adjustment; updates tests accordingly.
engine/crates/lex-core/src/converter/tune.rs Adds precompute + grid-search implementation for weight tuning.
engine/crates/lex-core/src/converter/mod.rs Exposes features internally and tune publicly.
engine/crates/lex-core/src/converter/explain.rs Adjusts explain logic after refactor, but currently contains TODO/placeholder behavior.
engine/crates/lex-cli/src/bin/lextool.rs Adds the tune CLI subcommand and text/JSON reporting.
Comments suppressed due to low confidence (1)

engine/crates/lex-core/src/converter/explain.rs:157

  • explain_segments sets single_char_kanji_penalty to a hard-coded 0 with a TODO. This changes the diagnostic output to no longer reflect the actual reranker behavior for single-char kanji penalties. Either restore the previous per-segment penalty calculation (including the dictionary compound exemption), or rename/remove the field from ExplainSegment until it can be reported accurately.
            // te_form and single_char are path-level counts; for per-segment
            // display we inline the check here.
            let te_penalty = if let Some(c) = conn {
                if let Some(prev) = prev_seg {
                    if c.is_function_word(prev.left_id)
                        && (prev.surface == "て" || prev.surface == "で")
                        && seg.surface.chars().any(crate::unicode::is_kanji)
                    {
                        settings().reranker.te_form_kanji_penalty
                    } else {
                        0
                    }
                } else {
                    0
                }
            } else {
                0
            };
            let _ = &features; // suppress unused; features used for validation
            let sc_penalty = 0i64; // TODO: per-segment single_char needs refactor
            ExplainSegment {
                reading: seg.reading.clone(),
                surface: seg.surface.clone(),
                word_cost: seg.word_cost as i64,
                segment_penalty: settings().cost.segment_penalty,
                script_cost: script_cost(&seg.surface, seg.reading.chars().count()),
                connection_cost: connection,
                te_form_kanji_penalty: te_penalty,
                single_char_kanji_penalty: sc_penalty,
                left_id: seg.left_id,

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread engine/crates/lex-core/src/converter/features.rs
Comment thread engine/crates/lex-core/src/converter/explain.rs Outdated
Comment thread engine/crates/lex-core/src/converter/tune.rs Outdated
send and others added 3 commits April 6, 2026 01:19
- Remove stale "Default: 25" doc comment on FeatureWeights.structure
- Return &str from top1_surface to avoid 130K string clones in grid search
- Split evaluate_weights into count_passes (hot loop) and collect_surfaces
  (called only for default + best) to avoid redundant evaluation
- Remove unused extract_features call in explain_segments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fall back to default weights instead of panicking when the grid
produces zero combinations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract is_single_char_kanji_penalised() in features.rs and reuse it
from both extract_features() and explain_segments(), eliminating the
duplicated logic.  The per-segment penalty was hardcoded to 0 after the
feature extraction refactor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread engine/crates/lex-core/src/converter/features.rs
Comment thread engine/crates/lex-core/src/converter/tune.rs Outdated
Comment thread engine/crates/lex-core/src/converter/reranker.rs
- Extract compute_structure_cost() and pass precomputed value to
  extract_features() to avoid duplicate transition-cost computation
  in the reranker hot path
- Add deterministic tie-break to grid_search sort (prefer weights
  closer to defaults when pass_count is equal)
- Clarify from_settings() doc: structure/script are compile-time
  constants, not loaded from settings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread engine/crates/lex-core/src/converter/tune.rs
Comment thread engine/crates/lex-core/src/converter/explain.rs Outdated
- Extract is_te_form_kanji_penalised() in features.rs, reuse from
  both extract_features() and explain_segments() (eliminates duplication)
- Use FeatureWeights::from_settings() as grid_search baseline instead
  of Default, so the tune command matches actual production behaviour

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@send send requested a review from Copilot April 5, 2026 16:48
Avoids silent divergence if the constant values are changed later.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread engine/crates/lex-core/src/converter/tune.rs
Comment thread engine/crates/lex-core/src/converter/tune.rs
Comment thread engine/crates/lex-core/src/converter/reranker.rs Outdated
- Remove structure/script from WeightGrid (fixed in from_settings)
- Add 5 unit tests for tune module (precompute, grid_search, empty
  grid, top1_surface, weight_distance)
- Remove redundant extract_features import in reranker tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@send send requested a review from Copilot April 5, 2026 17:01
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread engine/crates/lex-core/src/settings.rs
send and others added 2 commits April 6, 2026 02:08
The serde default functions for te_form_kanji_penalty (3500 → 2000)
and single_char_kanji_penalty (4000 → 0) still had old values, which
would silently apply if a custom settings TOML omitted those keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract DEFAULT_LENGTH_VARIANCE_WEIGHT, DEFAULT_TE_FORM_KANJI_PENALTY,
DEFAULT_SINGLE_CHAR_KANJI_PENALTY, and DEFAULT_STRUCTURE_COST_TRANSITION_CAP
as pub constants in settings.rs.  Both the serde fallback functions and
FeatureWeights::default() now reference these constants, eliminating the
risk of values drifting out of sync.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread engine/crates/lex-core/src/converter/features.rs
When both te_kanji and single_kanji weights are 0 (current defaults),
pass dict=None to extract_features so the per-segment loop skips
dictionary compound lookups entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread engine/crates/lex-core/src/converter/reranker.rs Outdated
te-form scoring uses connection matrix only, not dictionary lookups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@send send merged commit 1664f4e into main Apr 5, 2026
14 checks passed
@send send deleted the refactor/feature-weight-model branch April 5, 2026 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants