
Context msg length error #4728

@tkfischer

Description


Env = Docker setup on Mac + default settings + OpenAI gpt-4o-mini

On the second or third query in the same chat with a document upload (PDF size well within the LLM context length), I get the exception below. I later tried Gemini with its 1M-token context window and hit the same error, just on the third or fourth query.
It happens almost always when continuing a prior chat.
The document takes up only ~85K of the 1M tokens (see attached).
The queries are very simple, a few words like "what is foo?" about the document.
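A back-of-envelope check is consistent with the conversation re-attaching the full document on every turn rather than once: 2 × 85K plus a small chat overhead lands right around the reported 172,152 tokens. The numbers below are rough assumptions taken from this report (85K document, 128K gpt-4o-mini window), not measured values:

```python
# Hypothetical arithmetic: if the uploaded document (~85K tokens) is
# re-sent on every turn, the prompt grows linearly with the turn count.
DOC_TOKENS = 85_000      # reported size of the uploaded PDF (approximate)
CONTEXT_LIMIT = 128_000  # gpt-4o-mini context window
CHAT_OVERHEAD = 2_000    # rough guess: system prompt + short user queries

def prompt_tokens(turn: int) -> int:
    """Tokens sent on the given turn if the document is duplicated per turn."""
    return turn * DOC_TOKENS + CHAT_OVERHEAD

for turn in range(1, 4):
    total = prompt_tokens(turn)
    print(turn, total, "OVER LIMIT" if total > CONTEXT_LIMIT else "ok")
```

Turn 1 fits (~87K), turn 2 already exceeds 128K (~172K, matching the error), which lines up with the failure appearing on the second or third query.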

[screenshot attached]

File "/usr/local/lib/python3.11/site-packages/litellm/main.py", line 3150, in completion
raise exception_type(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2214, in exception_type
raise e
File "/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 284, in exception_type
raise ContextWindowExceededError(
litellm.exceptions.ContextWindowExceededError: litellm.ContextWindowExceededError: litellm.BadRequestError: ContextWindowExceededError: OpenAIException - This model's maximum context length is 128000 tokens. However, your messages resulted in 172152 tokens. Please reduce the length of the messages.
During task with name 'basic_use_tool_response' and id '73fa3cc3-3ac2-e5c3-7490-9a1b563c3440'
INFO: 05/07/2025 08:30:15 PM timing.py 76: [API:jzknMf_4] stream_chat_message took 8.46039366722107 seconds
INFO: 05/07/2025 08:30:41 PM h11_impl.py 499: [API:TAGQTjlW] 192.168.97.9:46014 - "GET /health HTTP/1.1" 200


Gemini

raise exception_type(
      ^^^^^^^^^^^^^^^

File "/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2214, in exception_type
raise e
File "/usr/local/lib/python3.11/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 1217, in exception_type
raise BadRequestError(
litellm.exceptions.BadRequestError: litellm.BadRequestError: VertexAIException BadRequestError - b'{\n "error": {\n "code": 400,\n "message": "The input token count (1580921) exceeds the maximum number of tokens allowed (1048575).",\n "status": "INVALID_ARGUMENT"\n }\n}\n'
During task with name 'basic_use_tool_response' and id '79e947f4-0b1d-af0a-638e-4bc43a471576'
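Until the duplication is fixed server-side, a client-side mitigation is to trim the chat history to fit the model's window before each completion call (litellm ships helpers along these lines, e.g. a token counter; check the version you run). The sketch below is a generic illustration of the idea, using a crude chars/4 heuristic in place of a real tokenizer, with illustrative names throughout:

```python
# Sketch of client-side history trimming before calling the LLM.
# The chars/4 estimate is a stand-in for a real tokenizer (e.g. tiktoken);
# all names and limits here are illustrative assumptions.

def estimate_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic, not a real tokenizer

def trim_history(messages: list[dict], limit: int) -> list[dict]:
    """Keep the first message (system prompt / document), then as many of
    the most recent messages as fit under the token limit."""
    head, tail = messages[:1], messages[1:]
    budget = limit - sum(estimate_tokens(m["content"]) for m in head)
    kept = []
    for msg in reversed(tail):  # walk newest-first so recent turns survive
        cost = estimate_tokens(msg["content"])
        if budget - cost < 0:
            break
        budget -= cost
        kept.append(msg)
    return head + list(reversed(kept))
```

Pinning the document once in the head message and trimming only the turn history keeps each request bounded instead of growing with every query.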
