v0.5.0a14 — chunkshop 0.9.1 floor: bound symbol_aware over-parse

@TheYonk TheYonk released this 09 Jun 17:14
· 22 commits to main since this release
91bf165

Fixes the #79 root cause: large/generated code corpora no longer OOM the embedding step

chunkshop 0.9.0's path-less language detection parsed ~2× more files as code, so a single generated/minified file could explode into thousands of symbol_aware chunks and exhaust memory when embedding them.

chunkshop 0.9.1 (chunkshop#71/#72) adds two on-by-default guards — a content-detection fallback for generated/minified files and max_symbols_per_file=2000. This release raises the [chunkshop] extra floor to chunkshop>=0.9.1, so pg-raggraph's chunk_strategy="chunkshop:symbol_aware" path inherits the protection with no config change.

  • Verified: a 3,000-function generated .ts drops 3,000 → 74 chunks; normal code untouched.
  • Tests: 467 unit and 17 integration tests pass under chunkshop 0.9.1.
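
To illustrate the idea behind the two guards, here is a minimal Python sketch. It is not chunkshop's implementation: `looks_generated`, `window_chunks`, `chunk_file`, the average-line-length heuristic, and the window size are all hypothetical names and values chosen for illustration. Only the 2000-symbol default comes from the notes above.

```python
# Hypothetical sketch of a bounded symbol-aware chunker.
# Not chunkshop's API; names and thresholds are assumptions.

MAX_SYMBOLS_PER_FILE = 2000  # default quoted in the release notes


def looks_generated(source: str, max_avg_line_len: int = 500) -> bool:
    """Assumed heuristic: very long average lines suggest minified output."""
    lines = source.splitlines() or [""]
    return len(source) / len(lines) > max_avg_line_len


def window_chunks(source: str, size: int = 4000) -> list[str]:
    """Fallback: fixed-size character windows instead of one chunk per symbol."""
    return [source[i:i + size] for i in range(0, len(source), size)]


def chunk_file(
    source: str,
    symbols: list[str],
    max_symbols: int = MAX_SYMBOLS_PER_FILE,
) -> list[str]:
    """Return one chunk per symbol unless either guard trips."""
    if looks_generated(source) or len(symbols) > max_symbols:
        # Guard tripped: bound the chunk count by file size, not symbol count.
        return window_chunks(source)
    return list(symbols)
```

In this sketch the fallback makes the chunk count proportional to file size rather than to the number of parsed symbols, which is what keeps the embedding batch bounded. Files under the cap that do not look generated pass through unchanged.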

Together with v0.5.0a13 (cross-file resolver spill-to-DB), both halves of #79 are addressed.

Full changelog: see CHANGELOG.md.