chunkshop 0.3.2 (last Postgres-only release)
The last release of the Postgres-only line. 0.4.x introduces modular
sink backends (MariaDB, ClickHouse, SQLite), with PG kept at full parity.
Highlights
- Universal if_oversize: ChunkerConfig field on every chunker config
  in both Python and Rust. Routes any chunk whose embedded_content or
  original_content exceeds the effective ceiling through a fallback
  chunker. Chains up to 5 levels deep (deeper raises an explicit error).
- fixed_overlap.max_chars (optional): char-bounded as well as
  word-bounded.
- Wrapper effective ceiling: neighbor_expand / summary_embed /
  hierarchical_summary resolve their ceiling as
  cfg.max_chars > base.max_chars > None. Wrappers inherit by default.
- Dedup'd WARN-once-per-cell when if_oversize is unset and an
  oversize chunk would be emitted. Names the chunker, the ceiling, and a
  copy-paste suggestion. No log spam.
- Coarse-row exemption on hierarchical_summary: coarse rows
  (one-per-group) are skipped from the check by design.
- Rust semantic chunker now logs tracing::warn! on hard-split,
  matching Python's semantic.py:120. Parity gap closed.
- Recursion guard: if_oversize chains beyond depth 5 raise
  OversizeRecursionError (Python) / Error::OversizeRecursion (Rust).
- NEW docs/samples/if-oversize/: runnable demo showing both the
  WARN behavior (no fallback) and the fallback chain (with fallback).
- docs/chunkers.md oversize-behavior table refreshed with a concrete
  "Setting if_oversize" section.
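
To make the ceiling precedence and the fallback chain concrete, here is a minimal Python sketch. It does not use chunkshop's API: SketchConfig, effective_ceiling, chunk, and the two splitters are hypothetical names invented for illustration, and OversizeRecursionError is a local stand-in that only shares its name with the error mentioned above. It also checks a single string per chunk rather than both embedded_content and original_content, and it assumes the primary chunker counts as depth 0, so a sixth fallback is the one that raises.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_DEPTH = 5  # from the notes: chains may run up to 5 levels deep


class OversizeRecursionError(RuntimeError):
    """Local stand-in sharing the name used in the notes; not imported from chunkshop."""


@dataclass
class SketchConfig:
    """Hypothetical config shape, for illustration only."""
    name: str
    split: Callable[[str], List[str]]             # turns text into candidate chunks
    max_chars: Optional[int] = None               # ceiling; None means unbounded
    if_oversize: Optional["SketchConfig"] = None  # fallback for oversize chunks


def effective_ceiling(cfg_max: Optional[int], base_max: Optional[int]) -> Optional[int]:
    """Wrapper precedence from the notes: cfg.max_chars > base.max_chars > None."""
    return cfg_max if cfg_max is not None else base_max


def chunk(text: str, cfg: SketchConfig, depth: int = 0) -> List[str]:
    """Split text with cfg, sending oversize pieces down the if_oversize chain."""
    if depth > MAX_DEPTH:
        # Assumption: the primary chunker is depth 0, so a sixth fallback raises.
        raise OversizeRecursionError(
            f"if_oversize chain deeper than {MAX_DEPTH} at {cfg.name!r}"
        )
    out: List[str] = []
    for piece in cfg.split(text):
        oversize = cfg.max_chars is not None and len(piece) > cfg.max_chars
        if oversize and cfg.if_oversize is not None:
            out.extend(chunk(piece, cfg.if_oversize, depth + 1))
        else:
            # With no fallback set, the oversize piece is emitted unchanged;
            # the notes describe a deduplicated WARN for this case.
            out.append(piece)
    return out


# Paragraph splitter with a 10-char ceiling, falling back to fixed-width slices.
fixed = SketchConfig(
    "fixed", lambda t: [t[i:i + 10] for i in range(0, len(t), 10)], max_chars=10
)
by_paragraph = SketchConfig(
    "paragraph", lambda t: t.split("\n\n"), max_chars=10, if_oversize=fixed
)

print(chunk("short\n\n" + "x" * 25, by_paragraph))
# ['short', 'xxxxxxxxxx', 'xxxxxxxxxx', 'xxxxx']
```

In this sketch a fallback that never shrinks its input (for example, a config whose if_oversize points back at itself) hits the depth guard instead of looping forever, which is the failure mode the recursion guard is described as catching.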
Install
pip install chunkshop==0.3.2
cargo install chunkshop-rs --version 0.3.2
What's next: 0.4.x (modular backends)
Postgres stops being load-bearing in the sink layer. MariaDB is the
first sibling backend to ship; PG stays at full parity. Spec:
docs/spec/v4.0-modular-backends.md (see commit 4b22380).
Full changelog: https://github.com/yonk-labs/chunkshop/blob/v0.3.2/CHANGELOG.md
Compare: v0.3.1...v0.3.2