ThinkingLevelSupportEN
neavo edited this page May 19, 2026 · 2 revisions
- Some models support both thinking mode and normal mode
- Switch between the two modes by setting the Thinking Level in the API Settings
- The thinking level (OFF/LOW/MEDIUM/HIGH) affects the model's thinking behavior, response time, and Token consumption
- In Model Management -> Basic Settings, you can find the `Thinking Level` configuration item
  - OFF: Disable the thinking feature and use the standard response mode
  - LOW: Enable thinking and allocate a smaller Token budget, suitable for simple logic or saving consumption
  - MEDIUM: Allocate a medium Token budget, balancing translation quality and speed
  - HIGH: Allocate a higher Token budget, suitable for complex semantic understanding and rare topics
- The Model ID must be set correctly for the thinking level configuration to take effect
| Model | OFF | LOW | MEDIUM | HIGH |
|---|---|---|---|---|
| GLM series | Off | On | On | On |
| Kimi series | Off | On | On | On |
| DeepSeek series | Off | On | On | On |
| Mimo-v2 series | Off | On | On | On |
| Qwen-3.5 series | Off | On | On | On |
| GPT-5 series | none | low | medium | high |
| Doubao series | minimal | low | medium | high |
| Gemini 2.5 Pro | 128 Tokens | 384 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 2.5 Flash | 0 Tokens | 384 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 2.5 Flash Lite | 0 Tokens | 512 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 3 Pro | Low | Low | Low | High |
| Gemini 3.1 Pro | Low | Low | Medium | High |
| Gemini 3 Flash / Gemini 3.1 Flash | minimal | low | medium | high |
| Claude Sonnet 3.7 / Claude Haiku 4.x / Claude Sonnet 4.x / Claude Opus 4.x | Off | 1024 Tokens | 1536 Tokens | 2048 Tokens |
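
The table can be read as a lookup from model ID and thinking level to a per-model value. The sketch below is only an illustration of that reading, not the application's actual implementation: the `ThinkingLevel` enum, the `THINKING_TABLE` list, the `resolve_thinking` function, and the substring keys used to match Model IDs are all hypothetical. Only a few rows are copied in, and `False`/`True` stand in for the table's Off/On. It also shows why an incorrect Model ID leaves the setting without effect: no row matches.

```python
from enum import Enum


class ThinkingLevel(Enum):
    OFF = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# A few rows from the table: (model-ID substring, values for OFF/LOW/MEDIUM/HIGH).
# The substrings are hypothetical matching keys chosen for this sketch.
# More specific keys come first so "gemini-2.5-flash-lite" is not
# captured by the shorter "gemini-2.5-flash" key.
THINKING_TABLE = [
    ("gemini-2.5-flash-lite", (0, 512, 768, 1024)),   # Token budgets
    ("gemini-2.5-flash", (0, 384, 768, 1024)),        # Token budgets
    ("gemini-2.5-pro", (128, 384, 768, 1024)),        # Token budgets
    ("gemini-3.1-pro", ("Low", "Low", "Medium", "High")),
    ("gemini-3-pro", ("Low", "Low", "Low", "High")),
    ("gpt-5", ("none", "low", "medium", "high")),
    ("doubao", ("minimal", "low", "medium", "high")),
    ("deepseek", (False, True, True, True)),          # thinking Off / On
]


def resolve_thinking(model_id: str, level: ThinkingLevel):
    """Return the table value for a model ID and level, or None if no row matches."""
    normalized = model_id.lower()
    for key, values in THINKING_TABLE:
        if key in normalized:
            return values[level.value]
    # Unknown Model ID: the thinking level cannot be applied.
    return None


if __name__ == "__main__":
    print(resolve_thinking("gemini-2.5-flash-lite", ThinkingLevel.LOW))
    print(resolve_thinking("gpt-5-mini", ThinkingLevel.HIGH))
    print(resolve_thinking("my-custom-model", ThinkingLevel.HIGH))
```

Note that under this reading some models never fully disable thinking: Gemini 2.5 Pro still receives a 128 Token budget at OFF, and Gemini 3 Pro maps OFF, LOW, and MEDIUM to the same value.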