ThinkingLevelSupportEN

neavo edited this page Feb 26, 2026 · 2 revisions

Overview

Some models support both a thinking mode and a normal mode.
You can switch between them by setting the Thinking Level in the API Settings.
The thinking level (OFF/LOW/MEDIUM/HIGH) affects the model's thinking behavior, response time, and Token consumption.

API Settings

The Thinking Level configuration item is located under Model Management -> Basic Settings.

OFF: Disable the thinking feature and use the standard response mode
LOW: Enable thinking with a smaller Token budget (e.g., 1024 Tokens); suitable for simple logic or for reducing Token consumption
MEDIUM: Enable thinking with a medium Token budget (e.g., 1536 Tokens); balances translation quality and speed
HIGH: Enable thinking with the maximum Token budget (e.g., 2048 Tokens or higher); suitable for complex semantic understanding and rare topics
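
The relationship between levels and budgets described above can be sketched as a simple lookup. This is only an illustration: the names `ThinkingLevel`, `EXAMPLE_BUDGETS`, and `thinking_budget` are hypothetical, and the numbers are the example values quoted in the list, not guaranteed values for any particular model.

```python
from enum import Enum
from typing import Optional


class ThinkingLevel(Enum):
    """The four levels offered by the Thinking Level setting."""
    OFF = "OFF"
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


# Example budgets copied from the descriptions above (assumption:
# the real budget may differ per model; HIGH may be larger than 2048).
EXAMPLE_BUDGETS = {
    ThinkingLevel.LOW: 1024,
    ThinkingLevel.MEDIUM: 1536,
    ThinkingLevel.HIGH: 2048,
}


def thinking_budget(level: ThinkingLevel) -> Optional[int]:
    """Return the example Token budget, or None when thinking is OFF."""
    return EXAMPLE_BUDGETS.get(level)
```

With this sketch, `thinking_budget(ThinkingLevel.OFF)` yields `None` (standard response mode), while the other three levels yield their example budgets.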

Supported Models

Models that only support the thinking toggle
- GLM series
  - LOW MEDIUM HIGH - ON
  - OFF - OFF
- Kimi series
  - LOW MEDIUM HIGH - ON
  - OFF - OFF
- DeepSeek series
  - LOW MEDIUM HIGH - ON
  - OFF - OFF
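
For the toggle-only series listed above, the level collapses to a plain on/off switch. A minimal sketch of that rule follows; the function name `thinking_enabled` is hypothetical and not taken from the application.

```python
VALID_LEVELS = {"OFF", "LOW", "MEDIUM", "HIGH"}


def thinking_enabled(level: str) -> bool:
    """Toggle-only models: any level other than OFF turns thinking on."""
    normalized = level.strip().upper()
    if normalized not in VALID_LEVELS:
        raise ValueError(f"unknown thinking level: {level!r}")
    return normalized != "OFF"
```

Under this rule, LOW, MEDIUM, and HIGH behave identically for these series; only the choice between OFF and any other level matters.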
Models that only support thinking levels
- Gemini 2.5 Pro / Gemini 3 Pro / Gemini 3.1 Pro
  - HIGH - Takes effect normally
  - OFF LOW MEDIUM - Equivalent to LOW
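
The behavior listed for these Gemini Pro models can be sketched as a normalization step in which only HIGH is distinct. The function name `effective_level_levels_only` is hypothetical and serves only to illustrate the mapping.

```python
def effective_level_levels_only(level: str) -> str:
    """Levels-only models: HIGH applies as-is; OFF, LOW, and MEDIUM act as LOW."""
    normalized = level.strip().upper()
    if normalized not in {"OFF", "LOW", "MEDIUM", "HIGH"}:
        raise ValueError(f"unknown thinking level: {level!r}")
    return "HIGH" if normalized == "HIGH" else "LOW"
```

Note that OFF does not disable thinking for these models: it is treated the same as LOW.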
Models that fully support both thinking toggle and thinking levels
- doubao-seed-1.6 / doubao-seed-1.8 / doubao-seed-2.0 series
- Gemini 2.5 Flash / Gemini 3 Flash
- Claude Sonnet 3.7 / Claude Haiku 4.x / Claude Sonnet 4.x / Claude Opus 4.x