Chunklet-py v2.2.0 "The Unification Edition"

speedyk-005 released this 22 Feb 19:55

What's New?

Check out What's New for the full scoop.

✨ Quick Summary

  • Unified API — Consistent method names across all chunkers (chunk_text, chunk_file, chunk_texts, chunk_files)
  • PlainTextChunker merged into DocumentChunker — Handle both text and documents with one class
  • SentenceSplitter rename — split() renamed to split_text(), also added split_file()
  • Shorter CLI flags — -l for --lang, -h for --host, -m for --metadata, -t for --tokenizer-timeout
  • Visualizer overhaul — Fullscreen mode, 3-row layout, smoother hovers
  • Code chunking improvements — Fixed comment artifacts, added string protection
  • More code languages — ColdFusion, VB.NET, PHP 8 attributes, Pascal support
  • Dependency fixes — No more pkg_resources headaches
  • Direct imports — Now you can do from chunklet import DocumentChunker without performance issues
  • Test coverage — From 87% to 90.67%

Install

# Upgrade to latest
pip install chunklet-py -U

# Or install a specific version
pip install chunklet-py==2.2.0

Migration

Upgrading from v2.1.x? Here's what changed:

Old                      New
chunker.chunk()          chunker.chunk_text() or chunker.chunk_file()
chunker.batch_chunk()    chunker.chunk_texts() or chunker.chunk_files()
splitter.split()         splitter.split_text()

The old methods still work — they'll just yell at you with a deprecation warning.
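
Curious what "still works, but warns" looks like in practice? Here is a minimal sketch of the usual pattern, assuming the library emits Python's standard DeprecationWarning category. The Chunker class and the uses_deprecated_api helper are hypothetical stand-ins made up for illustration; this is not chunklet's actual code.

```python
import warnings


class Chunker:
    """Hypothetical stand-in class; not chunklet's real implementation."""

    def chunk_text(self, text: str) -> list[str]:
        # Placeholder logic: split on blank lines. A real chunker does far more.
        return [part.strip() for part in text.split("\n\n") if part.strip()]

    def chunk(self, text: str) -> list[str]:
        # Old name kept as a thin alias that warns, then delegates.
        warnings.warn(
            "chunk() is deprecated; use chunk_text() instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this method
        )
        return self.chunk_text(text)


def uses_deprecated_api(func, *args, **kwargs) -> bool:
    """Return True if calling func emits a DeprecationWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        func(*args, **kwargs)
    return any(issubclass(w.category, DeprecationWarning) for w in caught)


chunker = Chunker()
print(uses_deprecated_api(chunker.chunk, "one\n\ntwo"))       # True: old name warns
print(uses_deprecated_api(chunker.chunk_text, "one\n\ntwo"))  # False: new name is silent
```

To hunt down leftover old calls in your own project, one option is to run your test suite with `python -W error::DeprecationWarning -m pytest`, which turns every such warning into an exception. That only catches them if the warnings really use the DeprecationWarning category, so treat it as a quick check rather than a guarantee.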

Full Changelog

Everything else is in the changelog.