fix(wizard): move LLM test into Rust core & add no_reasoning_control option by erning · Pull Request #60 · missuo/koe

erning · 2026-04-02T06:31:27Z

Summary

Fixes #56 — LLM correction silently falls back to raw ASR text when the LLM call times out, but the wizard's Test Connection passes because it uses a completely different code path.

Root cause: The wizard sent a minimal "Hi" message via Obj-C with a hardcoded 15s timeout, while the runtime sends full system/user prompts through Rust with a default 8s timeout. Reasoning models like GLM-5-turbo take 15+ seconds (ignoring reasoning_effort: "none"), so the runtime always timed out while the test always passed.

Changes

- Move LLM test into Rust core: an sp_llm_test() FFI function that shares the exact same correct() code path as the runtime (same prompts, dictionary, timeout, temperature, top_p). The Obj-C wizard just calls it and displays the JSON result.
- Always report elapsed time: test_correction() returns (Result<String>, Duration), so elapsed time is available even on timeout or error (e.g., "LLM correction timed out (8.0s)" instead of "(0.0s)").
- Add no_reasoning_control config option, backed by a new LlmNoReasoningControl enum:
  - reasoning_effort (default): sends "reasoning_effort": "none"; works for OpenAI o-series
  - thinking: sends "thinking": {"type": "disabled"}; works for GLM and similar models
  - none: sends nothing
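
For readers who want a concrete picture of the three modes, here is a minimal std-only sketch. It is not the code from this PR: the variant names, the FromStr parsing, and the request_fragment() helper are assumptions made for illustration. Only the enum name, the three config values, and the JSON members they send come from the description above.

```rust
use std::str::FromStr;

/// Sketch of the three modes described above (variant names are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum LlmNoReasoningControl {
    /// Send `"reasoning_effort": "none"` (the default).
    #[default]
    ReasoningEffort,
    /// Send `"thinking": {"type": "disabled"}`.
    Thinking,
    /// Send no reasoning-related member at all.
    None,
}

impl FromStr for LlmNoReasoningControl {
    type Err = String;

    /// Parse the config value as it is spelled in the PR description.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "reasoning_effort" => Ok(Self::ReasoningEffort),
            "thinking" => Ok(Self::Thinking),
            "none" => Ok(Self::None),
            other => Err(format!("unknown no_reasoning_control value: {other}")),
        }
    }
}

impl LlmNoReasoningControl {
    /// Hypothetical helper: the JSON member to add to the request body, if any.
    pub fn request_fragment(self) -> Option<&'static str> {
        match self {
            Self::ReasoningEffort => Some(r#""reasoning_effort": "none""#),
            Self::Thinking => Some(r#""thinking": {"type": "disabled"}"#),
            Self::None => None,
        }
    }
}
```

Under these assumptions, `"thinking".parse::<LlmNoReasoningControl>()` selects the GLM-style member, and an unrecognized value is rejected with an error rather than silently falling back to the default.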

Test plan

cargo build succeeds
Xcode build succeeds
Test Connection with valid config shows "Connection successful! (X.Xs)"
Test Connection with timeout_ms: 1000 shows timeout error with correct elapsed time
Setting no_reasoning_control: thinking with GLM-5-turbo responds in ~2s instead of 15s+
Setting no_reasoning_control: none sends no reasoning parameters
Hold-to-talk and tap-to-toggle still work

…runtime (missuo#56)

The wizard's Test Connection previously used a separate Obj-C HTTP implementation that sent a minimal "Hi" message with a hardcoded 15s timeout, completely different from what the runtime actually sends. This caused the test to pass while real LLM correction silently failed (e.g. timeout on reasoning models like GLM-5-turbo).

Move the test logic into koe-core so it shares the exact same code path as runtime correction: same correct() function, same config-driven timeout/temperature/top_p, same system_prompt and user_prompt template loaded from disk, same dictionary. Elapsed time is always reported, including on timeout.
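
The "elapsed time is always reported" behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the koe-core implementation: timed() and status_line() are hypothetical names, and the error type is simplified to String. The point it shows is that measuring around the call, rather than only on the success branch, gives the error path a real duration too.

```rust
use std::time::{Duration, Instant};

/// Run `op` and return its result together with the wall-clock time it took.
/// The duration is captured on both the Ok and the Err path.
pub fn timed<T, E>(op: impl FnOnce() -> Result<T, E>) -> (Result<T, E>, Duration) {
    let start = Instant::now();
    let result = op();
    (result, start.elapsed())
}

/// Hypothetical formatter for the wizard's status text.
pub fn status_line(result: &Result<String, String>, elapsed: Duration) -> String {
    let secs = elapsed.as_secs_f64();
    match result {
        Ok(_) => format!("Connection successful! ({secs:.1}s)"),
        Err(msg) => format!("{msg} ({secs:.1}s)"),
    }
}
```

With this shape, an error that surfaces after 8 seconds would be rendered with its actual elapsed time appended, instead of a placeholder zero.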

Add LlmNoReasoningControl enum with three modes:
- reasoning_effort (default): sends "reasoning_effort": "none" for OpenAI
- thinking: sends "thinking": {"type": "disabled"} for GLM and similar
- none: sends nothing

This lets users disable reasoning on non-OpenAI models like GLM-5-turbo, which ignore reasoning_effort and waste 15+ seconds on thinking tokens.

erning added 2 commits April 2, 2026 14:24

missuo merged commit 8224043 into missuo:main Apr 6, 2026

erning deleted the fix/llm-test-connection-representative branch April 7, 2026 02:00
