perf(asr): cache MLX model across sessions to avoid repeated loads by erning · Pull Request #30 · missuo/koe

erning · 2026-03-29T14:35:39Z

Problem

The MLX model is reloaded from disk on every session's connect() call. For Qwen3-ASR models (0.6B–1.7B parameters), this adds significant latency that dominates the "press and speak" interaction.

Solution

Track the loaded model path in MLXAsrManager. When loadModel is called with the same path that's already loaded, return immediately. The cached path is cleared on unloadModel.
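The path-tracking idea above can be sketched roughly as follows. This is a minimal illustration, not the actual MLXAsrManager code from this PR: the class name CachedModelManager, the injected loader closure, and the loadCount counter are hypothetical and stand in for the real MLX loading logic so the example runs without MLX.

```swift
// Hypothetical sketch of path-based model caching; the real MLXAsrManager may differ.
final class CachedModelManager<Model> {
    // Stand-in for the expensive disk load (assumption: injected for illustration).
    private let loader: (String) throws -> Model
    private var model: Model?
    private var loadedPath: String?
    // Counts real loads so the caching effect is observable.
    private(set) var loadCount = 0

    init(loader: @escaping (String) throws -> Model) {
        self.loader = loader
    }

    /// Loads the model at `path` unless that same path is already loaded.
    func loadModel(path: String) throws {
        if model != nil, loadedPath == path {
            return  // same path already loaded: reuse the cached model
        }
        // Nothing loaded, or a different path: drop the old model and load fresh.
        unloadModel()
        model = try loader(path)
        loadedPath = path
        loadCount += 1
    }

    /// Releases the model and clears the cached path.
    func unloadModel() {
        model = nil
        loadedPath = nil
    }
}

// Usage: three "sessions", two of which request the same path.
let manager = CachedModelManager<String> { path in "weights@\(path)" }
try manager.loadModel(path: "/models/a")  // real load
try manager.loadModel(path: "/models/a")  // skipped, same path
try manager.loadModel(path: "/models/b")  // real load, path changed
```

The sketch is not thread-safe; if sessions can connect concurrently, the check and the load would need to be serialized (for example with a lock or an actor).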

Files changed

Packages/KoeMLX/Sources/KoeMLX/MLXAsrManager.swift

Test plan

make build passes
First session loads model (log: "model loaded from ...")
Subsequent sessions skip loading (log: "model already loaded ... reusing")
Changing model path in config triggers a fresh load

The MLX model was reloaded from disk on every session connect, adding significant latency for Qwen3-ASR models. Track the loaded model path and skip loading when the same path is requested again.

perf(asr): cache MLX model across sessions to avoid repeated loads

52be4ca

missuo merged commit 5852917 into missuo:main Mar 29, 2026

erning deleted the perf/mlx-model-cache branch March 29, 2026 14:50

missuo merged 1 commit into missuo:main from erning:perf/mlx-model-cache