Description
Describe the bug
Gemini 3 models use `thinkingLevel` (string: `"low"` or `"high"`) to control reasoning effort, but the code sends `thinkingBudget` (numeric), which Gemini 3 ignores. As a result, `reasoning_effort` has no effect on Gemini 3 thinking behavior.
CLI Type
Gemini CLI, Antigravity
Model Name
gemini-3-pro-preview (and all gemini-3-* variants)
LLM Client
Any OpenAI-compatible client using /v1/chat/completions
Request Information
```bash
# Test with reasoning_effort: "high"
curl -s --max-time 180 "http://localhost:8317/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer test" \
-d '{
"model": "gemini-3-pro-preview",
"messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a goat, and a cabbage. The boat can only carry the farmer and one item at a time. If left alone, the wolf will eat the goat, and the goat will eat the cabbage. How can the farmer get everything across safely? Show each step."}],
"max_tokens": 2000,
"reasoning_effort": "high"
}'
# Test with reasoning_effort: "low"
curl -s --max-time 180 "http://localhost:8317/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer test" \
-d '{
"model": "gemini-3-pro-preview",
"messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a goat, and a cabbage. The boat can only carry the farmer and one item at a time. If left alone, the wolf will eat the goat, and the goat will eat the cabbage. How can the farmer get everything across safely? Show each step."}],
"max_tokens": 2000,
"reasoning_effort": "low"
}'
```

Expected behavior
- `reasoning_effort: "high"` should produce significantly more reasoning tokens than `reasoning_effort: "low"`.
- The `thinkingLevel` parameter should be sent to the Gemini 3 API (not `thinkingBudget`).
Actual behavior (before fix)
| reasoning_effort | Parameter Sent | reasoning_tokens |
|---|---|---|
| `"high"` | `thinkingBudget: 32768` | 1310 |
| `"low"` | `thinkingBudget: 1024` | 1143 |

~13% difference: essentially the same, so the parameter is being ignored.
Actual behavior (after fix)
| reasoning_effort | Parameter Sent | reasoning_tokens |
|---|---|---|
| `"high"` | `thinkingLevel: "high"` | 1150 |
| `"low"` | `thinkingLevel: "low"` | 132 |

~9x difference: the parameter is respected.
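For reference, a minimal Go sketch of how this comparison can be reproduced against the local proxy. It reuses the endpoint and bearer token from the curl examples above, abbreviates the river-crossing prompt, and assumes the reasoning token count is exposed at the OpenAI-style `usage.completion_tokens_details.reasoning_tokens` path; adjust that path if CLIProxyAPI reports usage differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// reasoningTokens sends one chat completion request with the given
// reasoning_effort and returns the reported reasoning token count.
func reasoningTokens(effort string) (int, error) {
	payload, err := json.Marshal(map[string]any{
		"model": "gemini-3-pro-preview",
		"messages": []map[string]string{
			// Prompt abbreviated; use the full river-crossing prompt from the curl examples.
			{"role": "user", "content": "A farmer needs to cross a river with a wolf, a goat, and a cabbage... Show each step."},
		},
		"max_tokens":       2000,
		"reasoning_effort": effort,
	})
	if err != nil {
		return 0, err
	}
	req, err := http.NewRequest(http.MethodPost, "http://localhost:8317/v1/chat/completions", bytes.NewReader(payload))
	if err != nil {
		return 0, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer test")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	// Assumed OpenAI-compatible usage shape; adjust if the proxy differs.
	var out struct {
		Usage struct {
			CompletionTokensDetails struct {
				ReasoningTokens int `json:"reasoning_tokens"`
			} `json:"completion_tokens_details"`
		} `json:"usage"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return 0, err
	}
	return out.Usage.CompletionTokensDetails.ReasoningTokens, nil
}

func main() {
	for _, effort := range []string{"low", "high"} {
		n, err := reasoningTokens(effort)
		if err != nil {
			fmt.Println(effort, "error:", err)
			continue
		}
		fmt.Printf("reasoning_effort=%q -> reasoning_tokens=%d\n", effort, n)
	}
}
```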
OS Type
Irrelevant
Additional context
- Gemini 2.5 models use `thinkingBudget` (numeric: `-1`, `0`, `1024`, `8192`, `32768`, etc.).
- Gemini 3 models use `thinkingLevel` (string: `"low"` or `"high"`).
- The code was treating all Gemini models the same, sending `thinkingBudget` to both.
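To make the difference concrete, here is a minimal standalone sketch (not the translator code itself) that builds the two `thinkingConfig` shapes using the same `tidwall/sjson` paths the translators below write to:

```go
package main

import (
	"fmt"

	"github.com/tidwall/sjson"
)

func main() {
	// Gemini 2.5 style: numeric thinkingBudget plus include_thoughts.
	g25 := []byte(`{}`)
	g25, _ = sjson.SetBytes(g25, "request.generationConfig.thinkingConfig.thinkingBudget", 32768)
	g25, _ = sjson.SetBytes(g25, "request.generationConfig.thinkingConfig.include_thoughts", true)
	fmt.Println("Gemini 2.5:", string(g25))

	// Gemini 3 style: string thinkingLevel ("low" or "high") plus include_thoughts.
	g3 := []byte(`{}`)
	g3, _ = sjson.SetBytes(g3, "request.generationConfig.thinkingConfig.thinkingLevel", "high")
	g3, _ = sjson.SetBytes(g3, "request.generationConfig.thinkingConfig.include_thoughts", true)
	fmt.Println("Gemini 3:  ", string(g3))
}
```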
Root Cause
The translator code in `gemini-cli_openai_request.go` and `antigravity_openai_request.go` did not differentiate between Gemini 2.5 and Gemini 3 models. It sent `thinkingBudget` to all models, but Gemini 3 ignores this parameter and only responds to `thinkingLevel`.
Proposed Fix
File: `internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go`
File: `internal/translator/antigravity/openai/chat-completions/antigravity_openai_request.go`
Before (broken):

```go
// Reasoning effort -> thinkingBudget/include_thoughts
re := gjson.GetBytes(rawJSON, "reasoning_effort")
hasOfficialThinking := re.Exists()
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
	switch re.String() {
	case "none":
		out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 0)
	case "auto":
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
	case "low":
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
	case "medium":
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
	case "high":
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
	default:
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
	}
}
```

After (fixed):

```go
// Reasoning effort -> thinkingBudget/include_thoughts (Gemini 2.5) or thinkingLevel (Gemini 3)
re := gjson.GetBytes(rawJSON, "reasoning_effort")
hasOfficialThinking := re.Exists()
isGemini3 := util.IsGemini3Model(modelName)
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
	if isGemini3 {
		// Gemini 3 uses thinkingLevel ("low" or "high") instead of thinkingBudget.
		// Only "low" and "high" are valid; other values don't set thinkingLevel.
		switch re.String() {
		case "low":
			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingLevel", "low")
		case "high":
			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingLevel", "high")
		}
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
	} else {
		// Gemini 2.5 and others use thinkingBudget
		var budget int
		includeThoughts := true
		switch re.String() {
		case "none":
			budget = 0
			includeThoughts = false
		case "low":
			budget = util.NormalizeThinkingBudget(modelName, 1024)
		case "medium":
			budget = util.NormalizeThinkingBudget(modelName, 8192)
		case "high":
			budget = util.NormalizeThinkingBudget(modelName, 32768)
		default: // "auto" and other values
			budget = -1
		}
		out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
		if includeThoughts {
			out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
		} else {
			out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
		}
	}
}
```

New utility function added to `internal/util/gemini_thinking.go`:

```go
// IsGemini3Model returns true if the model is a Gemini 3 model (uses thinkingLevel instead of thinkingBudget).
func IsGemini3Model(model string) bool {
	lower := strings.ToLower(model)
	return strings.HasPrefix(lower, "gemini-3-")
}
```
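A small table-driven test could pin down the classification; the package name and case list below are illustrative assumptions, not code from the repository:

```go
package util

import "testing"

// TestIsGemini3Model checks the prefix-based classification on a few
// representative model names (hypothetical test, not part of the proposed fix).
func TestIsGemini3Model(t *testing.T) {
	cases := map[string]bool{
		"gemini-3-pro-preview": true,  // Gemini 3 -> thinkingLevel
		"GEMINI-3-PRO-PREVIEW": true,  // matching is case-insensitive
		"gemini-2.5-pro":       false, // Gemini 2.5 -> thinkingBudget
		"gemini-2.5-flash":     false,
	}
	for model, want := range cases {
		if got := IsGemini3Model(model); got != want {
			t.Errorf("IsGemini3Model(%q) = %v, want %v", model, got, want)
		}
	}
}
```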
Test Date: 2025-11-26
Test Environment: CLIProxyAPI via Docker Compose
Model: gemini-3-pro-preview