fix(wizard): move LLM test into Rust core & add no_reasoning_control option #60

Merged
missuo merged 2 commits into missuo:main from erning:fix/llm-test-connection-representative
Apr 6, 2026

@erning (Collaborator) commented Apr 2, 2026

Summary

Fixes #56 — LLM correction silently falls back to raw ASR text when the LLM call times out, but the wizard's Test Connection passes because it uses a completely different code path.

Root cause: The wizard sent a minimal "Hi" message via Obj-C with a hardcoded 15s timeout, while the runtime sends full system/user prompts through Rust with a default 8s timeout. Reasoning models like GLM-5-turbo take 15+ seconds (ignoring reasoning_effort: "none"), so the runtime always timed out while the test always passed.

Changes

  • Move LLM test into Rust core. An sp_llm_test() FFI function shares the exact same correct() code path as the runtime (same prompts, dictionary, timeout, temperature, top_p). The Obj-C wizard just calls it and displays the JSON result.
  • Always report elapsed time. test_correction() returns (Result<String>, Duration), so elapsed time is available even on timeout or error (e.g., "LLM correction timed out (8.0s)" instead of "(0.0s)").
  • Add no_reasoning_control config option. New LlmNoReasoningControl enum:
    • reasoning_effort (default): sends "reasoning_effort": "none" — works for OpenAI o-series
    • thinking: sends "thinking": {"type": "disabled"} — works for GLM and similar models
    • none: sends nothing
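
To make the three modes concrete, here is a minimal sketch of how such an enum could map to request fields. It is not the code from this PR: the variant names, `from_config`, and `request_fragment` are hypothetical, and rejecting unknown config values is an assumption. Only the mode names and the JSON fields come from the description above.

```rust
/// Hypothetical mirror of the `no_reasoning_control` option (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum LlmNoReasoningControl {
    /// Default mode: send `"reasoning_effort": "none"`.
    #[default]
    ReasoningEffort,
    /// Send `"thinking": {"type": "disabled"}`.
    Thinking,
    /// The `none` mode: send no reasoning parameter at all.
    Omit,
}

impl LlmNoReasoningControl {
    /// Parse the config string. Rejecting unknown values is an assumption.
    fn from_config(value: &str) -> Option<Self> {
        match value {
            "reasoning_effort" => Some(Self::ReasoningEffort),
            "thinking" => Some(Self::Thinking),
            "none" => Some(Self::Omit),
            _ => None,
        }
    }

    /// JSON fragment to splice into the request body, if any.
    fn request_fragment(self) -> Option<&'static str> {
        match self {
            Self::ReasoningEffort => Some(r#""reasoning_effort": "none""#),
            Self::Thinking => Some(r#""thinking": {"type": "disabled"}"#),
            Self::Omit => None,
        }
    }
}

fn main() {
    // Print what each config value would add to the request body.
    for name in ["reasoning_effort", "thinking", "none"] {
        let mode = LlmNoReasoningControl::from_config(name).expect("known mode");
        println!("{name:>16} -> {:?}", mode.request_fragment());
    }
}
```

Keeping the mapping in one place means the wizard test and the runtime cannot disagree about which field is sent, which is the point of sharing the code path.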

Test plan

  • cargo build succeeds
  • Xcode build succeeds
  • Test Connection with valid config shows "Connection successful! (X.Xs)"
  • Test Connection with timeout_ms: 1000 shows timeout error with correct elapsed time
  • Setting no_reasoning_control: thinking with GLM-5-turbo responds in ~2s instead of 15s+
  • Setting no_reasoning_control: none sends no reasoning parameters
  • Hold-to-talk and tap-to-toggle still work
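
The timeout items in the test plan depend on elapsed time being measured outside the call itself. The sketch below shows one way to do that using only the standard library. It is an assumption-laden stand-in, not the koe-core implementation: the real `test_correction()` signature is only known to return `(Result<String>, Duration)`, and the closure parameter, the `String` error type, and `format_outcome` are hypothetical.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

/// Run `call` on a worker thread, wait at most `timeout`, and always
/// return the elapsed time alongside the result (hypothetical signature).
fn test_correction<F>(call: F, timeout: Duration) -> (Result<String, String>, Duration)
where
    F: FnOnce() -> Result<String, String> + Send + 'static,
{
    let started = Instant::now();
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(call());
    });
    let result = match rx.recv_timeout(timeout) {
        Ok(outcome) => outcome,
        Err(RecvTimeoutError::Timeout) => Err("LLM correction timed out".to_string()),
        Err(RecvTimeoutError::Disconnected) => Err("LLM worker exited unexpectedly".to_string()),
    };
    // Measured after the wait, so a timeout reports the real duration, not 0.0s.
    (result, started.elapsed())
}

/// Format the message shown to the user (wording taken from the PR text).
fn format_outcome(result: &Result<String, String>, elapsed: Duration) -> String {
    let secs = elapsed.as_secs_f64();
    match result {
        Ok(_) => format!("Connection successful! ({secs:.1}s)"),
        Err(msg) => format!("{msg} ({secs:.1}s)"),
    }
}

fn main() {
    // Simulate a model that answers more slowly than the configured timeout.
    let slow = || {
        thread::sleep(Duration::from_millis(300));
        Ok("corrected text".to_string())
    };
    let (result, elapsed) = test_correction(slow, Duration::from_millis(100));
    println!("{}", format_outcome(&result, elapsed));
}
```

Returning the duration next to the result, rather than inside the success value, is what lets the error branch report a meaningful time.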

erning added 2 commits April 2, 2026 14:24
…runtime (missuo#56)

The wizard's Test Connection previously used a separate Obj-C HTTP
implementation that sent a minimal "Hi" message with a hardcoded 15s
timeout — completely different from what the runtime actually sends.
This caused the test to pass while real LLM correction silently failed
(e.g. timeout on reasoning models like GLM-5-turbo).

Move the test logic into koe-core so it shares the exact same code path
as runtime correction: same correct() function, same config-driven
timeout/temperature/top_p, same system_prompt and user_prompt template
loaded from disk, same dictionary. Elapsed time is always reported,
including on timeout.
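
As a rough illustration of the boundary described above, the sketch below hands a JSON result across a C ABI as a heap-allocated string. Only the name `sp_llm_test` comes from this PR. Its signature, the JSON field names, `sp_string_free`, and the fixed stand-in result are all assumptions, and a real export would additionally need a no-mangle attribute so Obj-C can link against the symbol.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Minimal JSON string escaping so error messages cannot break the payload.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the result object. The field names are hypothetical.
fn result_json(ok: bool, message: &str, elapsed_ms: u64) -> String {
    format!(
        r#"{{"ok":{ok},"message":"{}","elapsed_ms":{elapsed_ms}}}"#,
        escape_json(message)
    )
}

/// Hypothetical shape of the FFI entry point. A real version would run the
/// shared correction path; here a fixed timeout result stands in for it.
pub extern "C" fn sp_llm_test() -> *mut c_char {
    let json = result_json(false, "LLM correction timed out", 8000);
    CString::new(json)
        .map(CString::into_raw)
        .unwrap_or(std::ptr::null_mut())
}

/// Hypothetical companion that returns ownership of the string to Rust.
pub unsafe extern "C" fn sp_string_free(ptr: *mut c_char) {
    if !ptr.is_null() {
        // Reclaim the allocation made by `CString::into_raw`.
        drop(unsafe { CString::from_raw(ptr) });
    }
}

fn main() {
    // Play the role of the caller: read the JSON, then free it.
    let ptr = sp_llm_test();
    assert!(!ptr.is_null());
    let text = unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned();
    println!("{text}");
    unsafe { sp_string_free(ptr) };
}
```

With this shape the caller only parses and displays the payload, so any change to prompts or timeouts on the Rust side is automatically reflected in the test.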
Add LlmNoReasoningControl enum with three modes:
- reasoning_effort (default): sends "reasoning_effort": "none" for OpenAI
- thinking: sends "thinking": {"type": "disabled"} for GLM and similar
- none: sends nothing

This lets users disable reasoning on non-OpenAI models like GLM-5-turbo,
which ignore reasoning_effort and waste 15+ seconds on thinking tokens.
@missuo missuo merged commit 8224043 into missuo:main Apr 6, 2026
@erning erning deleted the fix/llm-test-connection-representative branch April 7, 2026 02:00

Development

Successfully merging this pull request may close these issues.

My installed app only outputs the raw speech-recognition text; no prompt takes effect