Open
Labels: bug
Description
What happened?
Gemini 2.5 Pro supports reasoning control via `thinking_budget` (docs), but when I use the reasoning parameters that litellm exposes, such as:
- `reasoning_effort="low"`
- `thinking={"type": "enabled", "budget_tokens": 1000}`

the request fails. For example, with `thinking` I run:
```python
import litellm
from litellm import completion

litellm._turn_on_debug()

gemini_pro_response = completion(
    model="vertex_ai/gemini-2.5-pro-preview-05-06",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1000},
)
```
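For completeness, the `reasoning_effort` variant from the list above looks like this (a minimal sketch; same model and prompt):

```python
from litellm import completion

# The reasoning_effort form of the same request, per the list above.
gemini_pro_response = completion(
    model="vertex_ai/gemini-2.5-pro-preview-05-06",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    reasoning_effort="low",
)
```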
In both cases I get this error:

```
"message": "Unable to submit request because Thinking can't be disabled for this model. Remove 'thinking_config.thinking_budget' from your request and try again.. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini",
```
Relevant log output
```
11:29:52 - LiteLLM:DEBUG: utils.py:338 -
11:29:52 - LiteLLM:DEBUG: utils.py:338 - Request to litellm:
11:29:52 - LiteLLM:DEBUG: utils.py:338 - litellm.completion(model='vertex_ai/gemini-2.5-pro-preview-05-06', messages=[{'role': 'user', 'content': 'What is the capital of France?'}], thinking={'type': 'enabled', 'budget_tokens': 1000})
11:29:52 - LiteLLM:DEBUG: utils.py:338 -
11:29:52 - LiteLLM:DEBUG: litellm_logging.py:460 - self.optional_params: {}
11:29:52 - LiteLLM:DEBUG: utils.py:338 - SYNC kwargs[caching]: False; litellm.cache: None; kwargs.get('cache')['no-cache']: False
11:29:52 - LiteLLM:DEBUG: utils.py:4299 - checking potential_model_names in litellm.model_cost: {'split_model': 'gemini-2.5-pro-preview-05-06', 'combined_model_name': 'vertex_ai/gemini-2.5-pro-preview-05-06', 'stripped_model_name': 'gemini-2.5-pro-preview-05', 'combined_stripped_model_name': 'vertex_ai/gemini-2.5-pro-preview-05', 'custom_llm_provider': 'vertex_ai'}
11:29:52 - LiteLLM:INFO: utils.py:2958 -
LiteLLM completion() model= gemini-2.5-pro-preview-05-06; provider = vertex_ai
11:29:52 - LiteLLM:DEBUG: utils.py:2961 -
LiteLLM: Params passed to completion() {'functions': None, 'function_call': None, 'temperature': None, 'top_p': None, 'stream': None, 'stream_options': None, 'stop': None, 'max_tokens': None, 'max_completion_tokens': None, 'modalities': None, 'prediction': None, 'audio': None, 'presence_penalty': None, 'frequency_penalty': None, 'logit_bias': None, 'user': None, 'response_format': None, 'seed': None, 'tools': None, 'tool_choice': None, 'max_retries': None, 'logprobs': None, 'top_logprobs': None, 'extra_headers': None, 'api_version': None, 'parallel_tool_calls': None, 'allowed_openai_params': None, 'reasoning_effort': None, 'additional_drop_params': None, 'messages': [{'role': 'user', 'content': 'What is the capital of France?'}], 'thinking': {'type': 'enabled', 'budget_tokens': 1000}, 'web_search_options': None, 'custom_llm_provider': 'vertex_ai', 'drop_params': None, 'model': 'gemini-2.5-pro-preview-05-06', 'n': None}
11:29:52 - LiteLLM:DEBUG: utils.py:2964 -
LiteLLM: Non-Default params passed to completion() {'thinking': {'type': 'enabled', 'budget_tokens': 1000}}
11:29:52 - LiteLLM:DEBUG: utils.py:338 - Final returned optional params: {'thinkingConfig': {'includeThoughts': True, 'thinkingBudget': 1000}}
11:29:52 - LiteLLM:DEBUG: litellm_logging.py:460 - self.optional_params: {'thinking': {'type': 'enabled', 'budget_tokens': 1000}}
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:305 - Checking cached credentials for project_id: pg-gemini-api-research
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:310 - Cached credentials found for project_id: pg-gemini-api-research.
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:314 - Using cached credentials
11:29:52 - LiteLLM:DEBUG: vertex_llm_base.py:344 - Validating credentials for project_id: pg-gemini-api-research
11:29:52 - LiteLLM:DEBUG: utils.py:4299 - checking potential_model_names in litellm.model_cost: {'split_model': 'gemini-2.5-pro-preview-05-06', 'combined_model_name': 'vertex_ai/gemini-2.5-pro-preview-05-06', 'stripped_model_name': 'gemini-2.5-pro-preview-05', 'combined_stripped_model_name': 'vertex_ai/gemini-2.5-pro-preview-05', 'custom_llm_provider': 'vertex_ai'}
11:29:52 - LiteLLM:DEBUG: litellm_logging.py:907 -
POST Request Sent from LiteLLM:
curl -X POST \
https://us-central1-aiplatform.googleapis.com/v1/projects/pg-gemini-api-research/locations/us-central1/publishers/google/models/gemini-2.5-pro-preview-05-06:generateContent \
-H 'Content-Type: ap****on' -H 'Authorization: Be****16' \
-d '{'contents': [{'role': 'user', 'parts': [{'text': 'What is the capital of France?'}]}], 'generationConfig': {'thinkingConfig': {'includeThoughts': True, 'thinkingBudget': 1000}}}'

11:29:52 - LiteLLM:DEBUG: exception_mapping_utils.py:2268 - Logging Details: logger_fn - None | callable(logger_fn) - False
11:29:52 - LiteLLM:DEBUG: litellm_logging.py:2164 - Logging Details LiteLLM-Failure Call: []
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.
```
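To help triage: here is a minimal sketch that replays the exact logged payload against Vertex directly, bypassing litellm, to check whether the rejection comes from the API itself. The URL, project ID, and JSON body are taken verbatim from the logged request; the credential handling is an assumption (any valid access token works):

```python
import requests
import google.auth
import google.auth.transport.requests

# Obtain an access token via application default credentials (assumption:
# the same credentials litellm used are available in the environment).
creds, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
creds.refresh(google.auth.transport.requests.Request())

# Endpoint and payload copied from the "POST Request Sent from LiteLLM" log above.
url = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/pg-gemini-api-research"
    "/locations/us-central1/publishers/google/models/gemini-2.5-pro-preview-05-06:generateContent"
)
payload = {
    "contents": [{"role": "user", "parts": [{"text": "What is the capital of France?"}]}],
    "generationConfig": {"thinkingConfig": {"includeThoughts": True, "thinkingBudget": 1000}},
}

resp = requests.post(url, json=payload, headers={"Authorization": f"Bearer {creds.token}"})
print(resp.status_code, resp.text)
```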
Not sure if this is related to #10254.
Are you an ML Ops Team?
No
What LiteLLM version are you on?
v1.71.2
Twitter / LinkedIn details
No response
tandav