[Bug] Gemini 3 Does Not Utilize Reasoning Effort #353

@nestharus

Description

Describe the bug
Gemini 3 models use thinkingLevel (string: "low" or "high") to control reasoning effort, but the code sends thinkingBudget (numeric), which Gemini 3 ignores. As a result, reasoning_effort has no effect on how much Gemini 3 thinks.

CLI Type
Gemini CLI, Antigravity

Model Name
gemini-3-pro-preview (and all gemini-3-* variants)

LLM Client
Any OpenAI-compatible client using /v1/chat/completions

Request Information

# Test with reasoning_effort: "high"
curl -s --max-time 180 "http://localhost:8317/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test" \
  -d '{
    "model": "gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a goat, and a cabbage. The boat can only carry the farmer and one item at a time. If left alone, the wolf will eat the goat, and the goat will eat the cabbage. How can the farmer get everything across safely? Show each step."}],
    "max_tokens": 2000,
    "reasoning_effort": "high"
  }'

# Test with reasoning_effort: "low"
curl -s --max-time 180 "http://localhost:8317/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test" \
  -d '{
    "model": "gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "A farmer needs to cross a river with a wolf, a goat, and a cabbage. The boat can only carry the farmer and one item at a time. If left alone, the wolf will eat the goat, and the goat will eat the cabbage. How can the farmer get everything across safely? Show each step."}],
    "max_tokens": 2000,
    "reasoning_effort": "low"
  }'
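
To compare the two runs, the reasoning token count has to be read from each response. Below is a minimal Go sketch of that step. It assumes the proxy reports the count at usage.completion_tokens_details.reasoning_tokens, as in the OpenAI response format; that path is an assumption and is not confirmed by this issue. The type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatCompletion mirrors only the fields needed here. The nesting under
// usage.completion_tokens_details follows the OpenAI response shape and is
// an assumption for this proxy.
type chatCompletion struct {
	Usage struct {
		CompletionTokensDetails struct {
			ReasoningTokens int `json:"reasoning_tokens"`
		} `json:"completion_tokens_details"`
	} `json:"usage"`
}

// reasoningTokens extracts the reasoning token count from a response body.
func reasoningTokens(body []byte) (int, error) {
	var resp chatCompletion
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, fmt.Errorf("decode response: %w", err)
	}
	return resp.Usage.CompletionTokensDetails.ReasoningTokens, nil
}

func main() {
	// Illustrative body only; the number is the "low" result reported in this issue.
	sample := []byte(`{"usage":{"completion_tokens_details":{"reasoning_tokens":132}}}`)
	n, err := reasoningTokens(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("reasoning_tokens:", n)
}
```

Piping each curl response through something like this makes the "high" vs. "low" comparison a single number per run.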

Expected behavior

  • reasoning_effort: "high" should produce significantly more reasoning tokens than reasoning_effort: "low"
  • The thinkingLevel parameter should be sent to Gemini 3 API (not thinkingBudget)

Actual behavior (before fix)

reasoning_effort    Parameter sent           reasoning_tokens
"high"              thinkingBudget: 32768    1310
"low"               thinkingBudget: 1024     1143

~13% difference (1310 vs. 1143 tokens). The two runs are essentially the same, so the parameter is being ignored.

Actual behavior (after fix)

reasoning_effort    Parameter sent           reasoning_tokens
"high"              thinkingLevel: "high"    1150
"low"               thinkingLevel: "low"     132

~9x difference (1150 vs. 132 tokens). The parameter is respected.

OS Type
Irrelevant

Additional context
Gemini 2.5 models use thinkingBudget (numeric: -1, 0, 1024, 8192, 32768, etc.)
Gemini 3 models use thinkingLevel (string: "low" or "high")

The code was treating all Gemini models the same, sending thinkingBudget to both.

Root Cause
The translator code in gemini-cli_openai_request.go and antigravity_openai_request.go did not differentiate between Gemini 2.5 and Gemini 3 models. It sent thinkingBudget to all models, but Gemini 3 ignores this parameter and only responds to thinkingLevel.
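
To make the difference concrete, here is a rough standalone sketch of the thinkingConfig object each model family expects. It uses only the standard library rather than the project's gjson/sjson calls. The helper name thinkingConfigFor is hypothetical, and the sketch skips util.NormalizeThinkingBudget and util.ModelSupportsThinking, so treat it as an illustration of the payload shapes, not as the translator's actual logic.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// thinkingConfigFor is a hypothetical helper that maps an OpenAI-style
// reasoning_effort value to a Gemini thinkingConfig object.
func thinkingConfigFor(model, effort string) map[string]any {
	if strings.HasPrefix(strings.ToLower(model), "gemini-3-") {
		// Gemini 3: only "low" and "high" map to thinkingLevel.
		cfg := map[string]any{"include_thoughts": true}
		if effort == "low" || effort == "high" {
			cfg["thinkingLevel"] = effort
		}
		return cfg
	}

	// Gemini 2.5 and others: numeric thinkingBudget.
	if effort == "none" {
		return map[string]any{"thinkingBudget": 0}
	}
	budgets := map[string]int{"low": 1024, "medium": 8192, "high": 32768}
	budget, ok := budgets[effort]
	if !ok {
		budget = -1 // "auto" and unknown values: let the model decide
	}
	return map[string]any{"thinkingBudget": budget, "include_thoughts": true}
}

func main() {
	for _, model := range []string{"gemini-3-pro-preview", "gemini-2.5-pro"} {
		payload, err := json.Marshal(thinkingConfigFor(model, "high"))
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", model, payload)
	}
}
```

Running it prints one JSON object per model, which shows that the same reasoning_effort value ends up as thinkingLevel for gemini-3-* and as thinkingBudget otherwise.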

Proposed Fix

File: internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go
File: internal/translator/antigravity/openai/chat-completions/antigravity_openai_request.go

Before (broken):

// Reasoning effort -> thinkingBudget/include_thoughts
re := gjson.GetBytes(rawJSON, "reasoning_effort")
hasOfficialThinking := re.Exists()
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
    switch re.String() {
    case "none":
        out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", 0)
    case "auto":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    case "low":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 1024))
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    case "medium":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 8192))
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    case "high":
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", util.NormalizeThinkingBudget(modelName, 32768))
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    default:
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", -1)
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    }
}

After (fixed):

// Reasoning effort -> thinkingBudget/include_thoughts (Gemini 2.5) or thinkingLevel (Gemini 3)
re := gjson.GetBytes(rawJSON, "reasoning_effort")
hasOfficialThinking := re.Exists()
isGemini3 := util.IsGemini3Model(modelName)
if hasOfficialThinking && util.ModelSupportsThinking(modelName) {
    if isGemini3 {
        // Gemini 3 uses thinkingLevel ("low" or "high") instead of thinkingBudget.
        // Only "low" and "high" are valid; other values don't set thinkingLevel.
        switch re.String() {
        case "low":
            out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingLevel", "low")
        case "high":
            out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingLevel", "high")
        }
        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
    } else {
        // Gemini 2.5 and others use thinkingBudget
        var budget int
        includeThoughts := true
        switch re.String() {
        case "none":
            budget = 0
            includeThoughts = false
        case "low":
            budget = util.NormalizeThinkingBudget(modelName, 1024)
        case "medium":
            budget = util.NormalizeThinkingBudget(modelName, 8192)
        case "high":
            budget = util.NormalizeThinkingBudget(modelName, 32768)
        default: // "auto" and other values
            budget = -1
        }

        out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
        if includeThoughts {
            out, _ = sjson.SetBytes(out, "request.generationConfig.thinkingConfig.include_thoughts", true)
        } else {
            out, _ = sjson.DeleteBytes(out, "request.generationConfig.thinkingConfig.include_thoughts")
        }
    }
}

New utility function added to internal/util/gemini_thinking.go:

// IsGemini3Model returns true if the model is a Gemini 3 model (uses thinkingLevel instead of thinkingBudget).
func IsGemini3Model(model string) bool {
    lower := strings.ToLower(model)
    return strings.HasPrefix(lower, "gemini-3-")
}
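
A quick table-driven check of the prefix match, as a sketch. The function body is copied from above so the example is self-contained. The model names other than gemini-3-pro-preview are illustrative inputs, not names taken from this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// IsGemini3Model is copied from the proposed utility so this example runs standalone.
func IsGemini3Model(model string) bool {
	lower := strings.ToLower(model)
	return strings.HasPrefix(lower, "gemini-3-")
}

func main() {
	cases := []struct {
		model string
		want  bool
	}{
		{"gemini-3-pro-preview", true},
		{"Gemini-3-Pro-Preview", true},         // match is case-insensitive
		{"gemini-2.5-pro", false},              // stays on thinkingBudget
		{"models/gemini-3-pro-preview", false}, // a path-style prefix defeats HasPrefix
	}
	for _, c := range cases {
		got := IsGemini3Model(c.model)
		fmt.Printf("%-30s got=%-5v want=%v\n", c.model, got, c.want)
	}
}
```

The last case is worth noting: if callers can pass a model name with a leading path segment, the name would need to be normalized before this check.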

Test Date: 2025-11-26
Test Environment: CLIProxyAPI via Docker Compose
Model: gemini-3-pro-preview
