ThinkingLevelSupportEN
neavo edited this page Feb 26, 2026 · 2 revisions
- Some models support both a thinking mode and a normal mode
- You can switch between them by setting the Thinking Level in the API Settings
- The thinking level you choose (OFF/LOW/MEDIUM/HIGH) affects the model's thinking behavior, response time, and token consumption
In Model Management -> Basic Settings, you can find the Thinking Level configuration item
- OFF: Disable the thinking feature and use the standard response mode
- LOW: Enable thinking with a smaller token budget (e.g., 1024 tokens), suitable for simple logic or for reducing consumption
- MEDIUM: Allocate a medium token budget (e.g., 1536 tokens), balancing translation quality and speed
- HIGH: Allocate the maximum token budget (e.g., 2048 tokens or higher), suitable for complex semantic understanding and rare topics
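
The level descriptions above can be sketched as a small lookup. This is only an illustration: the budgets are the example figures quoted in the list, and the names `ThinkingLevel`, `EXAMPLE_BUDGETS`, and `thinking_budget` are hypothetical, not part of the application.

```python
from enum import Enum
from typing import Optional


class ThinkingLevel(Enum):
    OFF = "OFF"
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


# Example budgets copied from the level descriptions; actual values
# may differ per model and provider (assumption for illustration).
EXAMPLE_BUDGETS = {
    ThinkingLevel.LOW: 1024,
    ThinkingLevel.MEDIUM: 1536,
    ThinkingLevel.HIGH: 2048,
}


def thinking_budget(level: ThinkingLevel) -> Optional[int]:
    """Return the example token budget, or None when thinking is OFF."""
    return EXAMPLE_BUDGETS.get(level)
```

Here `thinking_budget(ThinkingLevel.OFF)` returns `None`, standing in for the standard response mode without a thinking budget.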
- Models that only support the thinking toggle
  - GLM series
    - LOW / MEDIUM / HIGH: ON
    - OFF: OFF
  - Kimi series
    - LOW / MEDIUM / HIGH: ON
    - OFF: OFF
  - DeepSeek series
    - LOW / MEDIUM / HIGH: ON
    - OFF: OFF
- Models that only support thinking levels
  - Gemini 2.5 Pro / Gemini 3 Pro / Gemini 3.1 Pro
    - HIGH: Takes effect normally
    - OFF / LOW / MEDIUM: Equivalent to LOW
- Models that fully support both thinking toggle and thinking levels
  - doubao-seed-1.6 / doubao-seed-1.8 / doubao-seed-2.0 series
  - Gemini 2.5 Flash / Gemini 3 Flash
  - Claude Sonnet 3.7 / Claude Haiku 4.x / Claude Sonnet 4.x / Claude Opus 4.x
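
The three support categories above could be expressed as a function that maps a configured level to the behavior a model actually exhibits. This is a rough sketch, not the application's real logic: the function name `effective_level` and the model-name prefixes are assumptions chosen for illustration, and any model not matched by a prefix is treated as fully supported for simplicity.

```python
# Hypothetical model-name prefixes; real identifiers may differ.
TOGGLE_ONLY = ("glm", "kimi", "deepseek")
LEVELS_ONLY = ("gemini-2.5-pro", "gemini-3-pro", "gemini-3.1-pro")

VALID_LEVELS = {"OFF", "LOW", "MEDIUM", "HIGH"}


def effective_level(model: str, level: str) -> str:
    """Return the behavior a configured level results in for a model."""
    name = model.lower()
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown thinking level: {level}")

    if name.startswith(TOGGLE_ONLY):
        # Toggle-only models: any non-OFF level just switches thinking on.
        return "OFF" if level == "OFF" else "ON"
    if name.startswith(LEVELS_ONLY):
        # Level-only models: HIGH applies; everything else behaves like LOW.
        return "HIGH" if level == "HIGH" else "LOW"
    # Full support (simplifying assumption): the level applies as configured.
    return level
```

For example, under these assumptions `effective_level("glm-4.6", "MEDIUM")` gives `"ON"`, while `effective_level("gemini-2.5-pro", "OFF")` gives `"LOW"` because thinking cannot be disabled for that group.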