ThinkingLevelSupportEN

Overview

Some models support both a thinking mode and a standard (non-thinking) mode.
You can switch between them by setting the Thinking Level in the API Settings.
The selected thinking level (OFF/LOW/MEDIUM/HIGH) affects the model's thinking behavior, response time, and token consumption.

The Thinking Level configuration item is located under Model Management -> Basic Settings:
- OFF: Disables thinking and uses the standard response mode.
- LOW: Enables thinking with a small token budget; suitable for simple logic or for reducing token consumption.
- MEDIUM: Allocates a medium token budget, balancing translation quality and speed.
- HIGH: Allocates a larger token budget; suitable for complex semantic understanding and rare topics.

The Model ID must be set correctly for the thinking level configuration to take effect.
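
The dependency on the Model ID can be pictured with a minimal Python sketch. It assumes the application recognizes a model family by looking for a known fragment inside the Model ID; the fragments, `detect_family`, and `effective_level` are hypothetical names chosen for illustration, not the application's actual matching rules.

```python
from enum import Enum
from typing import Optional


class ThinkingLevel(Enum):
    OFF = "OFF"
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


# Hypothetical ID fragments (assumption): the real matching rules are not
# documented on this page.
FAMILY_PATTERNS = {
    "glm": "GLM series",
    "kimi": "Kimi series",
    "deepseek": "DeepSeek series",
    "gpt-5": "GPT-5 series",
}


def detect_family(model_id: str) -> Optional[str]:
    """Return the model family for a Model ID, or None if it is not recognized."""
    normalized = model_id.strip().lower()
    for fragment, family in FAMILY_PATTERNS.items():
        if fragment in normalized:
            return family
    return None


def effective_level(model_id: str, requested: ThinkingLevel) -> Optional[ThinkingLevel]:
    """Return the requested level only when the Model ID maps to a known family."""
    if detect_family(model_id) is None:
        return None  # unrecognized Model ID: the thinking level has no effect
    return requested
```

Under these assumptions, `effective_level("my-custom-model", ThinkingLevel.HIGH)` returns `None`, which mirrors the case where a mistyped Model ID leaves the thinking level setting without effect.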

| Model | `OFF` | `LOW` | `MEDIUM` | `HIGH` |
| --- | --- | --- | --- | --- |
| GLM series | Off | On | On | On |
| Kimi series | Off | On | On | On |
| DeepSeek series | Off | On | On | On |
| Mimo-v2 series | Off | On | On | On |
| Qwen-3.5 series | Off | On | On | On |
| GPT-5 series | none | low | medium | high |
| Doubao series | minimal | low | medium | high |
| Gemini 2.5 Pro | 128 Tokens | 384 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 2.5 Flash | 0 Tokens | 384 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 2.5 Flash Lite | 0 Tokens | 512 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 3 Pro | Low | Low | Low | High |
| Gemini 3.1 Pro | Low | Low | Medium | High |
| Gemini 3 Flash / Gemini 3.1 Flash | minimal | low | medium | high |
| Claude Sonnet 3.7 / Claude Haiku 4.x / Claude Sonnet 4.x / Claude Opus 4.x | Off | 1024 Tokens | 1536 Tokens | 2048 Tokens |
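
To show how the table is read, here is a minimal Python sketch that looks up the provider-side setting for a given model and thinking level. The values are copied from a few rows of the table; the names `THINKING_TABLE` and `resolve_setting` are hypothetical, and the sketch does not describe how the application builds its API requests.

```python
from typing import Union

LEVELS = ("OFF", "LOW", "MEDIUM", "HIGH")

# A subset of the rows above. Integers are token budgets; strings are
# on/off switches or effort levels. The dict layout itself is an assumption.
THINKING_TABLE = {
    "GLM series": ("Off", "On", "On", "On"),
    "GPT-5 series": ("none", "low", "medium", "high"),
    "Gemini 2.5 Pro": (128, 384, 768, 1024),
    "Gemini 2.5 Flash": (0, 384, 768, 1024),
    "Gemini 3 Pro": ("Low", "Low", "Low", "High"),
    "Claude Sonnet 4.x": ("Off", 1024, 1536, 2048),
}


def resolve_setting(model: str, level: str) -> Union[int, str]:
    """Return the table entry for a model row and a thinking level."""
    if level not in LEVELS:
        raise ValueError(f"unknown thinking level: {level!r}")
    if model not in THINKING_TABLE:
        raise KeyError(f"model not in table: {model!r}")
    return THINKING_TABLE[model][LEVELS.index(level)]
```

Two rows are worth noticing when reading the table this way: `resolve_setting("Gemini 2.5 Pro", "OFF")` still yields a 128-token budget rather than zero, and `resolve_setting("Gemini 3 Pro", "MEDIUM")` yields `"Low"`, because that row lists only Low and High.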