ThinkingLevelSupportEN

neavo edited this page May 19, 2026 · 2 revisions

Overview

  • Some models support both a thinking mode and a normal (non-thinking) mode
  • To switch between them, set the Thinking Level in the API Settings
  • The thinking level (OFF/LOW/MEDIUM/HIGH) affects the model's thinking behavior, response time, and token consumption

API Settings

  • In Model Management -> Basic Settings, you can find the Thinking Level setting
    • OFF: Disable thinking and use the standard response mode
    • LOW: Enable thinking with a small token budget; suitable for simple logic or for reducing token consumption
    • MEDIUM: Enable thinking with a medium token budget, balancing translation quality and speed
    • HIGH: Enable thinking with a large token budget; suitable for complex semantic understanding and rare topics
  • The Model ID must be set correctly for the Thinking Level setting to take effect

Supported Models

| Model | OFF | LOW | MEDIUM | HIGH |
| --- | --- | --- | --- | --- |
| GLM series | Off | On | On | On |
| Kimi series | Off | On | On | On |
| DeepSeek series | Off | On | On | On |
| Mimo-v2 series | Off | On | On | On |
| Qwen-3.5 series | Off | On | On | On |
| GPT-5 series | none | low | medium | high |
| Doubao series | minimal | low | medium | high |
| Gemini 2.5 Pro | 128 Tokens | 384 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 2.5 Flash | 0 Tokens | 384 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 2.5 Flash Lite | 0 Tokens | 512 Tokens | 768 Tokens | 1024 Tokens |
| Gemini 3 Pro | Low | Low | Low | High |
| Gemini 3.1 Pro | Low | Low | Medium | High |
| Gemini 3 Flash / Gemini 3.1 Flash | minimal | low | medium | high |
| Claude Sonnet 3.7 / Claude Haiku 4.x / Claude Sonnet 4.x / Claude Opus 4.x | Off | 1024 Tokens | 1536 Tokens | 2048 Tokens |
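
The table above can be read as a lookup from (model, thinking level) to a per-model value. The following is a minimal Python sketch of that lookup, not the application's actual implementation. The names `ThinkingLevel`, `LEVEL_TABLE`, and `resolve_thinking_value` are hypothetical, and the substrings used to recognise a Model ID are assumptions; only the cell values are taken from the table. It also illustrates why the Model ID has to be set correctly: an unrecognised ID yields no value, so the setting has nothing to apply.

```python
from enum import Enum
from typing import Optional


class ThinkingLevel(Enum):
    """The four levels offered in the settings, used as column indexes."""
    OFF = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# A few rows copied from the table, ordered OFF, LOW, MEDIUM, HIGH.
# The keys are assumed Model ID substrings, chosen only for illustration.
LEVEL_TABLE = {
    "deepseek": ("Off", "On", "On", "On"),
    "gpt-5": ("none", "low", "medium", "high"),
    "gemini-2.5-pro": ("128 Tokens", "384 Tokens", "768 Tokens", "1024 Tokens"),
    "gemini-2.5-flash": ("0 Tokens", "384 Tokens", "768 Tokens", "1024 Tokens"),
    "gemini-2.5-flash-lite": ("0 Tokens", "512 Tokens", "768 Tokens", "1024 Tokens"),
}


def resolve_thinking_value(model_id: str, level: ThinkingLevel) -> Optional[str]:
    """Return the table cell for this model and level, or None if the ID is unknown."""
    normalized = model_id.lower()
    # Try longer keys first so "gemini-2.5-flash-lite" is not
    # shadowed by the shorter "gemini-2.5-flash".
    for key in sorted(LEVEL_TABLE, key=len, reverse=True):
        if key in normalized:
            return LEVEL_TABLE[key][level.value]
    return None


if __name__ == "__main__":
    print(resolve_thinking_value("gemini-2.5-flash-lite", ThinkingLevel.LOW))
    print(resolve_thinking_value("my-custom-model", ThinkingLevel.HIGH))
```

How each value is passed to the provider's API (for example as a token budget or an effort label) differs per model family and is not covered by this sketch.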
