v0.31.0

@benbrandt benbrandt released this 04 Jun 11:24
· 19 commits to main since this release

Breaking Changes

  • Updated tokenizers to v0.23 and tiktoken-rs to v0.12. Some Hugging Face tokenizers ship with truncation settings enabled. For those tokenizers, tokenizers v0.23 may return the truncated size instead of exposing overflow encodings, which can make a chunk appear smaller than it really is. If chunk sizes should reflect the full input text, disable truncation with tokenizer.with_truncation(None) before constructing a splitter.
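
To illustrate why a truncated size matters for chunking, here is a minimal, self-contained sketch. It does not use the tokenizers or text-splitter crates: `ToyTokenizer`, `token_count`, and `fits` are hypothetical names invented for this example, and counting whitespace-separated words stands in for real tokenization. The only call taken from the note above is `tokenizer.with_truncation(None)`, which appears solely in a comment.

```rust
/// Hypothetical stand-in for a tokenizer. This is NOT the `tokenizers` crate API.
struct ToyTokenizer {
    /// Mimics a truncation setting: the maximum token count ever reported.
    truncation: Option<usize>,
}

impl ToyTokenizer {
    /// Counts whitespace-separated words as tokens (an assumption made for
    /// illustration), capped at the truncation limit when one is set.
    fn token_count(&self, text: &str) -> usize {
        let full = text.split_whitespace().count();
        match self.truncation {
            Some(max) => full.min(max),
            None => full,
        }
    }
}

/// Returns true when the sizer reports that `text` fits in `capacity` tokens.
fn fits(tokenizer: &ToyTokenizer, text: &str, capacity: usize) -> bool {
    tokenizer.token_count(text) <= capacity
}

fn main() {
    let text = "one two three four five six seven eight";

    // With truncation enabled, the reported size is capped, so an oversized
    // chunk can look as if it fits.
    let truncating = ToyTokenizer { truncation: Some(4) };
    println!("truncating: fits in 5 tokens? {}", fits(&truncating, text, 5));

    // With truncation disabled, the size reflects the full input text.
    // With the real crate, the release note suggests calling
    // `tokenizer.with_truncation(None)` before constructing a splitter.
    let full = ToyTokenizer { truncation: None };
    println!("full: fits in 5 tokens? {}", fits(&full, text, 5));
}
```

In this toy model the truncating sizer accepts an eight-word chunk against a five-token capacity, while the non-truncating one rejects it. That is the kind of mismatch disabling truncation is meant to avoid.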

Full Changelog: v0.30.1...v0.31.0