v1.6.3

@chonk-lain released this 20 Apr 20:48
33dd1b9

lancedb handshake

🚀 Chonkie v1.6.3

Caution

Known Bug: import chonkie fails with ModuleNotFoundError: No module named 'pandas' when installed without the [table] extra. This is caused by an unconditional top-level pandas import in utils/table_converter.py. Please upgrade to v1.6.4 which fixes this issue.
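For readers who maintain packages with optional extras, the failure mode described above is commonly avoided by deferring the optional import until the feature is actually used. The sketch below illustrates that general pattern only; it is not the actual change shipped in v1.6.4, and the names require_optional and table_to_records are hypothetical helpers invented for this example.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency on first use, or raise a helpful error.

    Hypothetical helper for illustration; not part of Chonkie's API.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here.
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f'Install it with: pip install "chonkie[{extra}]"'
        ) from exc


def table_to_records(rows, columns):
    """Hypothetical converter: pandas is imported here, not at module import time."""
    pd = require_optional("pandas", "table")
    return pd.DataFrame(rows, columns=columns).to_dict(orient="records")
```

With this layout, importing the package succeeds without pandas installed, and the missing dependency only surfaces, with an actionable message, when a table feature is called.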

✨ Features

  • LanceDB Handshake: Introduced a new handshake mechanism for LanceDB integration by @chonk-lain in #546
  • Metadata Enhancements: Added filename to metadata for better traceability by @chonk-lain in #554
  • Markdown Support Improvements: Added MarkdownDocument support for CodeChunker and fixed no-op behavior in TableChunker by @chonknick in #563
  • Table Utilities: Added a table-to-JSON converter by @anaslimem in #531

🧠 Improvements

  • Chunking Consistency: Deduplicated delimiter-based text splitting across chunkers by @anaslimem in #510
  • Model Loading Robustness: Improved error handling for neural model and tokenizer loading by @chimchim89 in #472
  • Refactor Handshake IDs: Moved _generate_default_id into BaseHandshake by @chimchim89 in #455

πŸ› Fixes

  • CJK Delimiter Handling: Fixed handling of single-character delimiters in RecursiveChunker._split_text by @nightcityblade in #537
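To make the kind of behavior this fix concerns more concrete, here is a minimal, self-contained sketch of splitting text on single-character delimiters such as the CJK full stop. It is an illustration under assumptions, not the code in RecursiveChunker._split_text: the function name split_on_delimiters and the choice to keep each delimiter attached to the preceding segment are both hypothetical.

```python
import re


def split_on_delimiters(text, delimiters):
    """Split text on any of the given delimiters, keeping each delimiter
    attached to the segment that precedes it. Illustrative only."""
    if not delimiters:
        return [text] if text else []
    # Escape delimiters so they match literally; try longer ones first.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "|".join(re.escape(d) for d in ordered)
    # A capturing group makes re.split return: text, delim, text, delim, ...
    pieces = re.split(f"({pattern})", text)
    segments = []
    for i in range(0, len(pieces), 2):
        delim = pieces[i + 1] if i + 1 < len(pieces) else ""
        segment = pieces[i] + delim
        if segment:  # skip the empty tail left by a trailing delimiter
            segments.append(segment)
    return segments
```

Because every delimiter stays attached to a segment, joining the returned segments reproduces the original text exactly, which is a convenient invariant to check in tests.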

📚 Documentation

  • JavaScript Docs: Added JavaScript documentation by @chonk-lain in #545
  • Semantic Chunker Examples: Fixed embedding examples by @narumiruna in #544
  • README Cleanup: Removed outdated full API documentation link by @narumiruna in #543
  • General Docs Updates: Refactored and improved documentation by @chonk-lain in #542 and #557
  • Contribution Guidelines: Added PR checklist to CONTRIBUTING.md by @swamy18 in #465

🔧 Maintenance & Dependencies

🙌 New Contributors

Full Changelog: v1.6.2...v1.6.3