[Bug] Gemini 3 Does Not Utilize Reasoning Effort #353

**Describe the bug**
Gemini 3 models use `thinkingLevel` (string: `"low"` or `"high"`) to control reasoning effort, but the code sends `thinkingBudget` (numeric), which Gemini 3 ignores. As a result, `reasoning_effort` has no effect on Gemini 3 thinking behavior.

**CLI Type**
Gemini CLI, Antigravity

**Model Name**
gemini-3-pro-preview (and all gemini-3-* variants)

**LLM Client**
Any OpenAI-compatible client using `/v1/chat/completions`

**Request Information**
```bash
# Test with reasoning_effort: "high"
curl -s --max-time 180 "http://localhost:8317/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test" \
  -d '{
    "model": "gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a goat, and a cabbage. The boat can only carry the farmer and one item at a time. If left alone, the wolf will eat the goat, and the goat will eat the cabbage. How can the farmer get everything across safely? Show each step."}],
    "max_tokens": 2000,
    "reasoning_effort": "high"
  }'

# Test with reasoning_effort: "low"
curl -s --max-time 180 "http://localhost:8317/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test" \
  -d '{
    "model": "gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a goat, and a cabbage. The boat can only carry the farmer and one item at a time. If left alone, the wolf will eat the goat, and the goat will eat the cabbage. How can the farmer get everything across safely? Show each step."}],
    "max_tokens": 2000,
    "reasoning_effort": "low"
  }'
```

**Expected behavior**
- `reasoning_effort: "high"` should produce significantly more reasoning tokens than `reasoning_effort: "low"`
- The `thinkingLevel` parameter should be sent to Gemini 3 API (not `thinkingBudget`)

**Actual behavior (before fix)**
| reasoning_effort | Parameter Sent | reasoning_tokens |
|------------------|----------------|------------------|
| `"high"` | thinkingBudget: 32768 | 1310 |
| `"low"` | thinkingBudget: 1024 | 1143 |

~13% difference (1310 vs. 1143 tokens) - essentially the same, so the parameter is **ignored**.

**Actual behavior (after fix)**
| reasoning_effort | Parameter Sent | reasoning_tokens |
|------------------|----------------|------------------|
| `"high"` | thinkingLevel: "high" | 1150 |
| `"low"` | thinkingLevel: "low" | 132 |

~9x difference (1150 vs. 132 tokens) - the parameter is **respected**.
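The two figures above can be reproduced from the table values. A minimal sketch of the arithmetic follows; `compare` is a hypothetical helper written for this illustration, and the percentage is taken relative to the `"high"` run, which is the convention that yields roughly 13% for the first table.

```go
package main

import "fmt"

// compare is a hypothetical helper (not part of the project).
// It returns how many times more reasoning tokens the "high" run used than
// the "low" run, and the drop from high to low as a percentage of high.
func compare(high, low int) (ratio, dropPct float64) {
	ratio = float64(high) / float64(low)
	dropPct = 100 * float64(high-low) / float64(high)
	return ratio, dropPct
}

func main() {
	// Before the fix: thinkingBudget was sent.
	r, d := compare(1310, 1143)
	fmt.Printf("before: %.2fx, %.1f%% fewer tokens at low\n", r, d)

	// After the fix: thinkingLevel was sent.
	r, d = compare(1150, 132)
	fmt.Printf("after:  %.2fx, %.1f%% fewer tokens at low\n", r, d)
}
```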

**OS Type**
Irrelevant

**Additional context**
- Gemini 2.5 models use `thinkingBudget` (numeric: -1, 0, 1024, 8192, 32768, etc.)
- Gemini 3 models use `thinkingLevel` (string: `"low"` or `"high"`)

The code was treating all Gemini models the same, sending `thinkingBudget` to both.
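To make the difference concrete, here is a minimal sketch that prints the `thinkingConfig` fragment each model family expects for the same `reasoning_effort`. It uses only the standard library rather than the project's `gjson`/`sjson` calls. `thinkingConfigFor` is a hypothetical helper, not part of the codebase, and the budgets are the raw values from the translator code in this issue, without the project's budget normalization.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// thinkingConfigFor is a hypothetical helper (not in the codebase) that
// returns the thinkingConfig fragment for a model and reasoning_effort value.
func thinkingConfigFor(model, effort string) map[string]any {
	cfg := map[string]any{}
	if strings.HasPrefix(strings.ToLower(model), "gemini-3-") {
		// Gemini 3: string thinkingLevel; only "low" and "high" are set.
		if effort == "low" || effort == "high" {
			cfg["thinkingLevel"] = effort
		}
		cfg["include_thoughts"] = true
		return cfg
	}
	// Gemini 2.5 and others: numeric thinkingBudget.
	switch effort {
	case "none":
		cfg["thinkingBudget"] = 0
		return cfg // thoughts are not requested when thinking is off
	case "low":
		cfg["thinkingBudget"] = 1024
	case "medium":
		cfg["thinkingBudget"] = 8192
	case "high":
		cfg["thinkingBudget"] = 32768
	default: // "auto" and other values
		cfg["thinkingBudget"] = -1
	}
	cfg["include_thoughts"] = true
	return cfg
}

func main() {
	for _, model := range []string{"gemini-2.5-pro", "gemini-3-pro-preview"} {
		b, err := json.Marshal(thinkingConfigFor(model, "high"))
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %s\n", model, b)
	}
}
```

Running it shows a numeric `thinkingBudget` for the 2.5 model and a string `thinkingLevel` for the Gemini 3 model, with no `thinkingBudget` key in the latter.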

**Root Cause**
The translator code in `gemini-cli_openai_request.go` and `antigravity_openai_request.go` did not differentiate between Gemini 2.5 and Gemini 3 models. It sent `thinkingBudget` to all models, but Gemini 3 ignores this parameter and only responds to `thinkingLevel`.

**Proposed Fix**

Files:
- `internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go`
- `internal/translator/antigravity/openai/chat-completions/antigravity_openai_request.go`

**Before (broken):**
```go
// Reasoning effort -> thinkingBudget/include_thoughts
re := gjson.GetBytes(rawJSON, "reasoning_effort")
hasOfficialThinking := re.Exists()
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
    switch re.String() {
    case "none":
        out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 0)
    case "auto":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    case "low":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    case "medium":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    case "high":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    default:
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    }
}
```

**After (fixed):**
```go
// Reasoning effort -> thinkingBudget/include_thoughts (Gemini 2.5) or thinkingLevel (Gemini 3)
re := gjson.GetBytes(rawJSON, "reasoning_effort")
hasOfficialThinking := re.Exists()
isGemini3 := util.IsGemini3Model(modelName)
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
    if isGemini3 {
        // Gemini 3 uses thinkingLevel ("low" or "high") instead of thinkingBudget.
        // Only "low" and "high" are valid; other values don't set thinkingLevel.
        switch re.String() {
        case "low":
            out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingLevel", "low")
        case "high":
            out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingLevel", "high")
        }
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    } else {
        // Gemini 2.5 and others use thinkingBudget
        var budget int
        includeThoughts := true
        switch re.String() {
        case "none":
            budget = 0
            includeThoughts = false
        case "low":
            budget = util.NormalizeThinkingBudget(modelName, 1024)
        case "medium":
            budget = util.NormalizeThinkingBudget(modelName, 8192)
        case "high":
            budget = util.NormalizeThinkingBudget(modelName, 32768)
        default: // "auto" and other values
            budget = -1
        }

        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
        if includeThoughts {
            out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
        } else {
            out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
        }
    }
}
```

**New utility function added to `internal/util/gemini_thinking.go`:**
```go
// IsGemini3Model returns true if the model is a Gemini 3 model (uses thinkingLevel instead of thinkingBudget).
func IsGemini3Model(model string) bool {
    lower := strings.ToLower(model)
    return strings.HasPrefix(lower, "gemini-3-")
}
```

---

*Test Date: 2025-11-26*
*Test Environment: CLIProxyAPI via Docker Compose*
*Model: gemini-3-pro-preview*

