
fix: Ollama Truncation#262

Merged
GangGreenTemperTatum merged 2 commits into main from fix/ollama-truncation
Sep 4, 2025

Conversation

@monoxgas
Contributor

@monoxgas commented Sep 3, 2025

Notes

Ollama has a known behavior where it silently truncates input messages rather than returning an error or any indication via the API.

To make a best-effort mitigation:

  • Emit a warning if the input_tokens reported in a litellm response is far lower than the estimated token length of the messages we sent.
  • Update the docs to call out the behavior.
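The heuristic described above can be sketched roughly as follows. This is a simplified illustration, not the PR's actual implementation: the class and function names mirror the PR description, but the ~4 characters-per-token estimate and the 0.5 ratio threshold are assumptions made here for demonstration.

```python
import warnings


class GeneratorWarning(UserWarning):
    """Stand-in for the GeneratorWarning class this PR introduces."""


def warn_on_input_truncation(
    sent_messages: list[str],
    reported_input_tokens: int,
    ratio: float = 0.5,
) -> bool:
    """Warn when the API-reported input token count is far below a rough
    estimate of what we sent (assumes ~4 characters per token)."""
    estimated_tokens = sum(len(m) for m in sent_messages) // 4
    if estimated_tokens > 0 and reported_input_tokens < estimated_tokens * ratio:
        warnings.warn(
            f"input_tokens ({reported_input_tokens}) is much lower than the "
            f"estimated size of the sent messages (~{estimated_tokens} tokens). "
            "Ollama may have silently truncated the input.",
            GeneratorWarning,
        )
        return True
    return False
```

Because Ollama gives no API-level signal, a ratio check like this can only flag *likely* truncation; users still need to raise the model's context length to actually fix it.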

See:


Generated Summary

  • Introduced a new GeneratorWarning class and integrated warning handling to detect potential Ollama input truncation.
  • Updated get_generator functions (in both documentation and implementation) to accept a dict for params, converting it to GenerateParams when needed.
  • Added token truncation detection in LiteLLMGenerator with a dedicated _warn_on_input_truncation method that warns when input tokens appear truncated.
  • Refactored XML field name resolution in the Model by adding a _get_field_xml_name helper, replacing inline logic and adding alias support.
  • Updated documentation: added a new GeneratorWarning section, revised generator API examples, and switched from "ollama/qwen3" to "ollama_chat/qwen3" including a warning block regarding truncation.
  • Added a new regex exclusion rule in .secrets.baseline to ignore files in the examples/ directory.
  • Bumped the project version to 3.3.3 in pyproject.toml.

This summary was generated with ❤️ by rigging

Generated Summary

  • Added a new GeneratorWarning class and integrated warnings for Ollama’s silent input truncation in the LiteLLM generator.
  • Updated the get_generator interface to accept dict inputs for params in addition to GenerateParams.
  • Modified documentation for the generator API, including updated examples and a warning block regarding Ollama’s context length limitations.
  • Refactored XML field name resolution by introducing a _get_field_xml_name method to standardize field name extraction in Model methods.
  • Adjusted the .secrets.baseline file by adding file exclusion rules and updating secret line numbers.
  • Removed lru_cache from the get_generator function in the generator base module.
  • Bumped the package version from 3.3.2 to 3.3.3.

This summary was generated with ❤️ by rigging

@monoxgas requested a review from a team as a code owner September 3, 2025 21:58
@dreadnode-renovate-bot added labels area/docs, area/python, and type/docs on Sep 3, 2025
Contributor

@GangGreenTemperTatum left a comment


Tested and confirmed working as per our chat 👌
FWIW, I also tested and confirmed it working on a somewhat related PR 👍

@GangGreenTemperTatum merged commit fe7c47d into main Sep 4, 2025
4 of 7 checks passed
@monoxgas deleted the fix/ollama-truncation branch November 13, 2025 22:34