Release Extension 0.5.8 · lcomplete/huntly

Thinking Mode for Chinese AI Providers

Raw streaming pipeline for thinking-capable providers: Qwen, Zhipu (GLM), and MiniMax now use a direct fetch-based streaming path instead of the Vercel AI SDK. This allows request-body flags such as enable_thinking to be passed explicitly, unlocking native thinking/reasoning mode for these providers without requiring an API format override.
Improved loading indicator: The "preparing response" dots now stay visible throughout the entire reasoning phase, not just until the first token arrives, so the UI always reflects that the model is actively thinking.
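
As a rough illustration of the two changes above, here is a minimal TypeScript sketch of a fetch-based streaming path. It is not taken from the Huntly source: every name in it (buildRequestBody, parseSseLine, streamChat, shouldShowDots) is hypothetical. It also assumes the provider exposes an OpenAI-style chat endpoint that streams server-sent events and puts thinking text in a reasoning_content delta field. The exact flag and field names vary by provider, so check each provider's API documentation before relying on them.

```typescript
// Hypothetical sketch; names and response shape are assumptions, not Huntly's code.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface StreamDelta {
  reasoning: string; // text emitted during the thinking phase
  content: string;   // text of the final answer
}

// Build the request body by hand so provider-specific flags can be included.
function buildRequestBody(
  model: string,
  messages: ChatMessage[],
  enableThinking: boolean
): Record<string, unknown> {
  return { model, messages, stream: true, enable_thinking: enableThinking };
}

// Parse one SSE line. Returns null for blank lines, non-data lines,
// the "[DONE]" sentinel, and payloads that are not valid JSON.
function parseSseLine(line: string): StreamDelta | null {
  const trimmed = line.trim();
  if (!trimmed.startsWith("data:")) return null;
  const payload = trimmed.slice("data:".length).trim();
  if (payload === "" || payload === "[DONE]") return null;
  try {
    const json = JSON.parse(payload);
    const delta = json?.choices?.[0]?.delta ?? {};
    return {
      // Assumed field name for thinking output.
      reasoning: typeof delta.reasoning_content === "string" ? delta.reasoning_content : "",
      content: typeof delta.content === "string" ? delta.content : "",
    };
  } catch {
    return null;
  }
}

// Stream deltas straight from fetch, without an SDK in between.
async function* streamChat(
  url: string,
  apiKey: string,
  body: Record<string, unknown>
): AsyncGenerator<StreamDelta> {
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify(body),
  });
  if (!response.ok || !response.body) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep a trailing partial line for the next chunk
    for (const line of lines) {
      const delta = parseSseLine(line);
      if (delta) yield delta;
    }
  }
  const last = parseSseLine(buffer);
  if (last) yield last;
}

// Keep the loading dots up until answer text arrives, so they remain
// visible while only reasoning tokens are streaming.
function shouldShowDots(isStreaming: boolean, answerSoFar: string): boolean {
  return isStreaming && answerSoFar.length === 0;
}
```

A caller would iterate streamChat with for await, append delta.content to the answer text, and pass that text to shouldShowDots on each update. Because reasoning deltas leave the answer text empty, the indicator stays on for the whole thinking phase.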

Full Changelog: ext/v0.5.7...ext/v0.5.8