Skip to content

refactor(hmm): extract _rebuild_regime_rank_cache helper to close DRY violation between _label_regimes and load #2

@Pepeye

Description

@Pepeye

Context

Closed architect nit from #1. Both HMMEngine._label_regimes() (shared/core/hmm/hmm_engine.py:343) and HMMEngine.load() (shared/core/hmm/hmm_engine.py:~700) inline the same three-line sequence:

regime_returns = self.model.means_[:, 0]
sorted_ids = np.lexsort((np.arange(len(regime_returns)), regime_returns))
self._regime_id_to_rank = {int(rid): rank for rank, rid in enumerate(sorted_ids)}

If the sort key, the tiebreaker, or the mean-column convention ever changes in one place and not the other, load() would produce a different regime→rank mapping than a freshly-trained engine. The existing test_load_rebuilds_cache regression catches this before it ships, but the duplication is an avoidable trap.

Why the architect's literal suggestion is wrong

Calling _label_regimes() from load() would silently reset 5 RegimeInfo fields (probability, recommended_strategy_type, max_leverage_allowed, max_position_size_pct, min_confidence_to_act) to defaults, clobbering the values just restored from pickle.

Fix

Extract a private helper:

def _rebuild_regime_rank_cache(self) -> np.ndarray:
    if self.model is None:
        raise RuntimeError("Cannot rebuild rank cache without a trained model")
    regime_returns = self.model.means_[:, 0]
    sorted_ids = np.lexsort((np.arange(len(regime_returns)), regime_returns))
    self._sorted_regime_ids = sorted_ids
    self._regime_id_to_rank = {int(rid): rank for rank, rid in enumerate(sorted_ids)}
    return sorted_ids
  • _label_regimes() calls it at the top, uses the returned sorted_ids for the label loop, drops the tail assignment.
  • load() calls it after restoring the model, drops the inline lexsort block.

Acceptance criteria

  • HMMEngine._rebuild_regime_rank_cache() exists, asserts model is trained, and returns sorted_ids.
  • _label_regimes() calls the helper; no inline np.lexsort at its tail.
  • load() calls the helper; no inline np.lexsort in the load path.
  • New regression test: RegimeInfo fields (probability, recommended_strategy_type, max_leverage_allowed, max_position_size_pct, min_confidence_to_act) round-trip through save/load with non-default values intact.
  • uv run ruff check shared/core/hmm tests/hmm exits 0.
  • uv run ruff format --check shared/core/hmm tests/hmm exits 0.
  • uv run ty check shared/core/hmm tests/hmm reports no in-scope errors.
  • uv run python -m pytest tests/hmm -q all tests pass.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions