Bug: Ollama qwen3 tool_calls dropped when thinking field is present
Description
When using qwen3 models through LiteLLM's Ollama provider, tool calls are not forwarded to the client. The response contains only content: "{}" while the actual tool_calls are dropped.
This appears to be caused by qwen3's unique thinking field in responses. When Ollama returns both thinking and tool_calls, LiteLLM only returns the empty content.
Environment
- LiteLLM Version: 1.80.13
- Ollama Version: Latest
- Model: qwen3:14b
- Provider: ollama
Steps to Reproduce
1. Direct Ollama API call (works correctly):
curl -s http://localhost:11434/api/chat -d '{
"model": "qwen3:14b",
"messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather",
"parameters": {
"type": "object",
"properties": {"location": {"type": "string"}},
"required": ["location"]
}
}
}],
"stream": false
}'
Ollama returns correctly:
{
"message": {
"role": "assistant",
"content": "",
"thinking": "Okay, the user is asking about the weather in Tokyo...",
"tool_calls": [
{
"id": "call_bju50t97",
"function": {
"name": "get_weather",
"arguments": {"location": "Tokyo"}
}
}
]
}
}
2. Same request through LiteLLM (fails):
curl -s http://localhost:4000/v1/chat/completions \
-H "Authorization: Bearer $LITELLM_MASTER_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3",
"messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
"tools": [{"type": "function", "function": {"name": "get_weather", ...}}],
"tool_choice": "auto"
}'
LiteLLM returns (broken):
{
"choices": [{
"message": {
"content": "{}",
"role": "assistant"
}
}]
}
Note: tool_calls is completely missing from the response.
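The mismatch between the two responses can also be checked mechanically. The following is a minimal sketch, not part of LiteLLM: `dropped_tool_calls` is a hypothetical helper that takes the parsed JSON from Ollama's /api/chat and from the LiteLLM proxy and lists the tool calls that did not make it through.

```python
from typing import Any


def dropped_tool_calls(
    ollama_response: dict[str, Any], proxy_response: dict[str, Any]
) -> list[str]:
    """Return names of tool calls present in the Ollama reply but absent from the proxy reply."""
    # Tool calls as returned by Ollama's /api/chat (top-level "message").
    expected = [
        call["function"]["name"]
        for call in ollama_response.get("message", {}).get("tool_calls") or []
    ]
    # Tool calls as returned in the OpenAI-style "choices" list.
    forwarded = [
        call["function"]["name"]
        for choice in proxy_response.get("choices", [])
        for call in choice.get("message", {}).get("tool_calls") or []
    ]
    return [name for name in expected if name not in forwarded]
```

Given the two payloads from this report, the helper would report get_weather as dropped; given the qwen2.5-style response it would report nothing.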
Expected Behavior
LiteLLM should forward tool_calls from Ollama's response to the client, regardless of whether a thinking field is present.
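To make the expected mapping concrete, here is a rough sketch of a conversion that keeps tool_calls whether or not thinking is present. This is not LiteLLM's actual handler: `ollama_message_to_openai` is a hypothetical function, and the output key `reasoning_content` for the thinking text is an assumption.

```python
import json
from typing import Any


def ollama_message_to_openai(message: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: map an Ollama /api/chat message to an OpenAI-style message."""
    result: dict[str, Any] = {
        "role": message.get("role", "assistant"),
        "content": message.get("content") or "",
    }

    # Carry the reasoning text in its own field so it cannot displace
    # other fields. The key name "reasoning_content" is an assumption.
    thinking = message.get("thinking")
    if thinking:
        result["reasoning_content"] = thinking

    tool_calls = []
    for call in message.get("tool_calls") or []:
        function = call.get("function", {})
        arguments = function.get("arguments", {})
        # Ollama returns arguments as an object; OpenAI-style clients
        # expect a JSON-encoded string.
        if not isinstance(arguments, str):
            arguments = json.dumps(arguments)
        tool_calls.append(
            {
                "id": call.get("id"),
                "type": "function",
                "function": {"name": function.get("name"), "arguments": arguments},
            }
        )
    if tool_calls:
        result["tool_calls"] = tool_calls
    return result
```

Applied to the Ollama message shown in step 1, this sketch would produce a message with one get_weather tool call whose arguments are a JSON string, alongside the thinking text.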
Comparison with qwen2.5 (works)
qwen2.5 does NOT have a thinking field in its responses, and tool calling works correctly through LiteLLM:
{
"choices": [{
"message": {
"content": "",
"role": "assistant",
"tool_calls": [{"function": {"name": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"}}]
}
}]
}
Root Cause Hypothesis
The Ollama response handler in LiteLLM likely doesn't account for the thinking field that qwen3 includes. When a message carries both thinking and tool_calls, the handler may be parsing the message structure incorrectly and discarding tool_calls, leaving only the content "{}".
LiteLLM Config
model_list:
  - model_name: qwen3
    litellm_params:
      model: ollama/qwen3:14b
      api_base: http://ollama-server:11434
Impact
- qwen3 cannot be used for agentic/tool-calling workflows through LiteLLM
- Users must fall back to qwen2.5 or call Ollama directly