
[Bug]: gemini-2.5-pro doesn't accept reasoning parameters #11557

@ming-jeng

Description


What happened?

Gemini 2.5 Pro supports reasoning control via `thinking_budget` (docs).
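Per those docs, `thinking_budget` maps to a `thinkingConfig` block inside `generationConfig` on the Vertex AI side. A minimal sketch of the request shape (field names taken from the docs and the debug log below; values illustrative):

```python
# Sketch of the Vertex AI request body that reasoning control maps to.
# Field names are from the Gemini docs and the debug log below; values illustrative.
request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "What is the capital of France?"}]}
    ],
    "generationConfig": {
        "thinkingConfig": {
            "includeThoughts": True,  # also return a summary of the model's thoughts
            "thinkingBudget": 1000,   # cap on tokens the model may spend thinking
        }
    },
}
```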

But when I use the reasoning parameters from LiteLLM, such as:

  1. `reasoning_effort="low"`
  2. `thinking={"type": "enabled", "budget_tokens": 1000}`
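For item 1, the call looks like this (a minimal sketch of what I ran, same Vertex AI setup as the full repro below); it produces the same error:

```python
import litellm

# reasoning_effort variant (item 1); same Vertex AI setup as the repro below.
response = litellm.completion(
    model="vertex_ai/gemini-2.5-pro-preview-05-06",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    reasoning_effort="low",
)
```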

I do:

```python
import litellm
from litellm import completion

litellm._turn_on_debug()

gemini_pro_response = completion(
    model="vertex_ai/gemini-2.5-pro-preview-05-06",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1000},
)
```

I get this error:

```
"message": "Unable to submit request because Thinking can't be disabled for this model. Remove 'thinking_config.thinking_budget' from your request and try again.. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini",
```

Relevant log output

```
11:29:52 - LiteLLM:DEBUG: utils.py:338 -

11:29:52 - LiteLLM:DEBUG: utils.py:338 - Request to litellm:
11:29:52 - LiteLLM:DEBUG: utils.py:338 - litellm.completion(model='vertex_ai/gemini-2.5-pro-preview-05-06', messages=[{'role': 'user', 'content': 'What is the capital of France?'}], thinking={'type': 'enabled', 'budget_tokens': 1000})
11:29:52 - LiteLLM:DEBUG: utils.py:338 -

11:29:52 - LiteLLM:DEBUG: litellm_logging.py:460 - self.optional_params: {}
11:29:52 - LiteLLM:DEBUG: utils.py:338 - SYNC kwargs[caching]: False; litellm.cache: None; kwargs.get('cache')['no-cache']: False
11:29:52 - LiteLLM:DEBUG: utils.py:4299 - checking potential_model_names in litellm.model_cost: {'split_model': 'gemini-2.5-pro-preview-05-06', 'combined_model_name': 'vertex_ai/gemini-2.5-pro-preview-05-06', 'stripped_model_name': 'gemini-2.5-pro-preview-05', 'combined_stripped_model_name': 'vertex_ai/gemini-2.5-pro-preview-05', 'custom_llm_provider': 'vertex_ai'}
11:29:52 - LiteLLM:INFO: utils.py:2958 -
LiteLLM completion() model= gemini-2.5-pro-preview-05-06; provider = vertex_ai
11:29:52 - LiteLLM:DEBUG: utils.py:2961 -
LiteLLM: Params passed to completion() {'functions': None, 'function_call': None, 'temperature': None, 'top_p': None, 'stream': None, 'stream_options': None, 'stop': None, 'max_tokens': None, 'max_completion_tokens': None, 'modalities': None, 'prediction': None, 'audio': None, 'presence_penalty': None, 'frequency_penalty': None, 'logit_bias': None, 'user': None, 'response_format': None, 'seed': None, 'tools': None, 'tool_choice': None, 'max_retries': None, 'logprobs': None, 'top_logprobs': None, 'extra_headers': None, 'api_version': None, 'parallel_tool_calls': None, 'allowed_openai_params': None, 'reasoning_effort': None, 'additional_drop_params': None, 'messages': [{'role': 'user', 'content': 'What is the capital of France?'}], 'thinking': {'type': 'enabled', 'budget_tokens': 1000}, 'web_search_options': None, 'custom_llm_provider': 'vertex_ai', 'drop_params': None, 'model': 'gemini-2.5-pro-preview-05-06', 'n': None}
11:29:52 - LiteLLM:DEBUG: utils.py:2964 -
LiteLLM: Non-Default params passed to completion() {'thinking': {'type': 'enabled', 'budget_tokens': 1000}}
11:29:52 - LiteLLM:DEBUG: utils.py:338 - Final returned optional params: {'thinkingConfig': {'includeThoughts': True, 'thinkingBudget': 1000}}
11:29:52 - LiteLLM:DEBUG: litellm_logging.py:460 - self.optional_params: {'thinking': {'type': 'enabled', 'budget_tokens': 1000}}
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:305 - Checking cached credentials for project_id: pg-gemini-api-research
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:310 - Cached credentials found for project_id: pg-gemini-api-research.
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:314 - Using cached credentials
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:344 - Validating credentials for project_id: pg-gemini-api-research
11:29:52 - LiteLLM:DEBUG: utils.py:4299 - checking potential_model_names in litellm.model_cost: {'split_model': 'gemini-2.5-pro-preview-05-06', 'combined_model_name': 'vertex_ai/gemini-2.5-pro-preview-05-06', 'stripped_model_name': 'gemini-2.5-pro-preview-05', 'combined_stripped_model_name': 'vertex_ai/gemini-2.5-pro-preview-05', 'custom_llm_provider': 'vertex_ai'}
11:29:52 - LiteLLM:DEBUG: litellm_logging.py:907 -

POST Request Sent from LiteLLM:
curl -X POST \
https://us-central1-aiplatform.googleapis.com/v1/projects/pg-gemini-api-research/locations/us-central1/publishers/google/models/gemini-2.5-pro-preview-05-06:generateContent \
-H 'Content-Type: ap****on' -H 'Authorization: Be****16' \
-d '{'contents': [{'role': 'user', 'parts': [{'text': 'What is the capital of France?'}]}], 'generationConfig': {'thinkingConfig': {'includeThoughts': True, 'thinkingBudget': 1000}}}'

11:29:52 - LiteLLM:DEBUG: exception_mapping_utils.py:2268 - Logging Details: logger_fn - None | callable(logger_fn) - False
11:29:52 - LiteLLM:DEBUG: litellm_logging.py:2164 - Logging Details LiteLLM-Failure Call: []

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.
```
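To isolate whether the rejection comes from Vertex AI itself or from LiteLLM's translation, the request from the log can be replayed directly. A minimal sketch, assuming google-auth and requests are installed and Application Default Credentials are configured (the payload is copied verbatim from the curl above):

```python
import google.auth
import google.auth.transport.requests
import requests

# Acquire an access token via Application Default Credentials.
creds, project_id = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
creds.refresh(google.auth.transport.requests.Request())

url = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/"
    f"{project_id}/locations/us-central1/publishers/google/models/"
    "gemini-2.5-pro-preview-05-06:generateContent"
)

# Payload copied from the POST request in the debug log above.
body = {
    "contents": [
        {"role": "user", "parts": [{"text": "What is the capital of France?"}]}
    ],
    "generationConfig": {
        "thinkingConfig": {"includeThoughts": True, "thinkingBudget": 1000}
    },
}

resp = requests.post(
    url, headers={"Authorization": f"Bearer {creds.token}"}, json=body
)
print(resp.status_code, resp.text)
```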

Not sure if this is related to #10254.

Are you an ML Ops Team?

No

What LiteLLM version are you on?

v1.71.2

Twitter / LinkedIn details

No response

Labels: bug (Something isn't working)