
Ollama qwen3 tool_calls dropped when thinking field is present #18922

@rsp2k


Description

When using qwen3 models through LiteLLM's Ollama provider, tool calls are not forwarded to the client. The response contains only content: "{}" while the actual tool_calls are dropped.

This appears to be caused by the thinking field that qwen3 includes in its responses. When Ollama returns both thinking and tool_calls, LiteLLM forwards only the empty content and drops the tool calls.

Environment

  • LiteLLM Version: 1.80.13
  • Ollama Version: Latest
  • Model: qwen3:14b
  • Provider: ollama

Steps to Reproduce

1. Direct Ollama API call (works correctly):

curl -s http://localhost:11434/api/chat -d '{
  "model": "qwen3:14b",
  "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather",
      "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"]
      }
    }
  }],
  "stream": false
}'

Ollama returns the tool call correctly:

{
  "message": {
    "role": "assistant",
    "content": "",
    "thinking": "Okay, the user is asking about the weather in Tokyo...",
    "tool_calls": [
      {
        "id": "call_bju50t97",
        "function": {
          "name": "get_weather",
          "arguments": {"location": "Tokyo"}
        }
      }
    ]
  }
}

2. Same request through LiteLLM (fails):

curl -s http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [{"type": "function", "function": {"name": "get_weather", ...}}],
    "tool_choice": "auto"
  }'

LiteLLM returns (broken):

{
  "choices": [{
    "message": {
      "content": "{}",
      "role": "assistant"
    }
  }]
}

Note: tool_calls is completely missing from the response.
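
The same failure can be observed programmatically. Here is a minimal repro sketch with the OpenAI Python SDK, assuming the same proxy URL, master key, and tool definition as the curl example above:

import os
from openai import OpenAI

# Point the standard OpenAI client at the LiteLLM proxy.
client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key=os.environ["LITELLM_MASTER_KEY"],
)

response = client.chat.completions.create(
    model="qwen3",
    messages=[{"role": "user", "content": "What is the weather in Tokyo?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)

message = response.choices[0].message
# Expected: a populated tool_calls list. Actual with qwen3: None,
# with content == "{}".
print("content:", message.content)
print("tool_calls:", message.tool_calls)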

Expected Behavior

LiteLLM should forward tool_calls from Ollama's response to the client, regardless of whether a thinking field is present.

Comparison with qwen2.5 (works)

qwen2.5 does NOT have a thinking field in its responses, and tool calling works correctly through LiteLLM:

{
  "choices": [{
    "message": {
      "content": "",
      "role": "assistant",
      "tool_calls": [{"function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"}}]
    }
  }]
}

Root Cause Hypothesis

The Ollama response handler in LiteLLM likely does not account for the thinking field that qwen3 includes, and may mishandle the message structure when that extra key is present.
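
For illustration, here is a minimal sketch of what a correct transform would need to do, assuming the handler converts Ollama messages into OpenAI format roughly like the function below (the name and structure are hypothetical and do not reflect LiteLLM's actual internals):

import json

def transform_ollama_message(ollama_message: dict) -> dict:
    """Convert an Ollama chat message into an OpenAI-style message.

    The suspected bug is a path that inspects only content and drops
    tool_calls when an unexpected key like "thinking" is present.
    """
    openai_message = {
        "role": ollama_message.get("role", "assistant"),
        "content": ollama_message.get("content") or None,
    }
    # The key requirement: forward tool_calls whenever they are present,
    # regardless of whether a "thinking" key accompanies them.
    if ollama_message.get("tool_calls"):
        openai_message["tool_calls"] = [
            {
                "id": call.get("id"),
                "type": "function",
                "function": {
                    "name": call["function"]["name"],
                    # OpenAI format carries arguments as a JSON string,
                    # while Ollama returns them as an object.
                    "arguments": json.dumps(call["function"]["arguments"]),
                },
            }
            for call in ollama_message["tool_calls"]
        ]
    # Optionally surface the reasoning text instead of discarding it.
    if ollama_message.get("thinking"):
        openai_message["reasoning_content"] = ollama_message["thinking"]
    return openai_message

Run against the Ollama response shown above, this keeps tool_calls intact and preserves the reasoning text rather than collapsing the message to an empty content field.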

LiteLLM Config

model_list:
  - model_name: qwen3
    litellm_params:
      model: ollama/qwen3:14b
      api_base: http://ollama-server:11434

Impact

  • qwen3 cannot be used for agentic/tool-calling workflows through LiteLLM
  • Users must fall back to qwen2.5 or call Ollama directly (see the workaround sketch below)
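
For the direct-Ollama fallback, a minimal workaround sketch (model, endpoint, and payload taken from the curl repro above; the helper name is made up):

import requests

def ollama_chat(messages, tools, base_url="http://ollama-server:11434"):
    """Bypass the proxy and call Ollama's /api/chat directly."""
    resp = requests.post(
        f"{base_url}/api/chat",
        json={
            "model": "qwen3:14b",
            "messages": messages,
            "tools": tools,
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    # The raw Ollama message carries thinking and tool_calls side by side.
    return resp.json()["message"]

The returned message can then be adapted to OpenAI format with a transform like the one sketched in the root-cause section.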
