Release BrainPalace 26.6.15 · bxw91/brainpalace

Multi-language BM25

BrainPalace now tokenizes each document with its own natural-language analyzer (normalize → tokenize → stopwords → stem/lemmatize) instead of a language-agnostic tokenizer.
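The four analyzer stages named above can be illustrated with a minimal, standard-library-only sketch. This is not BrainPalace's code: the `analyze` and `_stem` helpers, the tiny stopword sets, and the naive suffix stripper are all hypothetical stand-ins for the stopwordsiso lists and Snowball/PyStemmer stemmers the release actually uses.

```python
import re
import unicodedata

# Toy stand-ins (assumptions): the release uses stopwordsiso and
# Snowball/PyStemmer; these tiny tables only illustrate the stage order.
STOPWORDS = {
    "en": {"the", "a", "an", "of", "and", "is", "are"},
    "de": {"der", "die", "das", "und", "ist", "ein", "eine"},
}
SUFFIXES = {
    "en": ("ing", "ed", "es", "s"),
    "de": ("en", "er", "e"),
}


def _stem(token: str, language: str) -> str:
    """Naive suffix stripping; a real Snowball stemmer is far more careful."""
    for suffix in SUFFIXES[language]:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text: str, language: str = "en") -> list[str]:
    """Run normalize -> tokenize -> stopwords -> stem for one document."""
    # Unknown language codes fall back to English, as the notes describe.
    if language not in STOPWORDS:
        language = "en"
    # 1. normalize: Unicode NFKC plus case folding
    text = unicodedata.normalize("NFKC", text).casefold()
    # 2. tokenize: runs of word characters
    tokens = re.findall(r"\w+", text)
    # 3. stopwords: drop the language's function words
    tokens = [t for t in tokens if t not in STOPWORDS[language]]
    # 4. stem: reduce each remaining token to a crude root
    return [_stem(t, language) for t in tokens]


print(analyze("The cats are running", "en"))     # ['cat', 'runn']
print(analyze("Die Katzen und der Hund", "de"))  # ['katz', 'hund']
```

The point of the per-language step is visible in the second call: a language-agnostic tokenizer would keep "die", "und", and "der" as index terms and would not conflate "Katzen" with "Katze".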

Highlights

~27 Snowball/PyStemmer languages (en, de, fr, es, ru, it, pt, nl, sv, fi, hu, ro, tr, ar, …) + a vendored Croatian (hr) stemmer; stopwords via stopwordsiso; unknown codes fall back to English.
New bm25 config block with the keys language, engine (stem|lemma), detect, and detect_min_confidence.
CLI: init --language/--bm25-engine, folders add --language, query --language; status now shows the language and engine. The MCP query tool gains a language argument.
Croatian lemma tier: pip install 'brainpalace[lemma-hr]' (simplemma, Serbo-Croatian hbs).
Engine: BM25 now uses bm25s directly (the LlamaIndex BM25Retriever wrapper has been dropped). Existing indexes auto-migrate from the stored corpus on first start; no manual action is needed.
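As a rough illustration of the new config block, the sketch below models its four keys as a Python dataclass. The key names, the stem|lemma choice, and the fallback to English come from the notes above; the class name `Bm25Config`, the field types, the default values, and the `KNOWN_LANGUAGES` subset are assumptions made for the example and may not match BrainPalace's actual schema.

```python
from dataclasses import dataclass

VALID_ENGINES = ("stem", "lemma")

# Illustrative subset of the language codes listed in the highlights.
KNOWN_LANGUAGES = {
    "en", "de", "fr", "es", "ru", "it", "pt", "nl",
    "sv", "fi", "hu", "ro", "tr", "ar", "hr",
}


@dataclass
class Bm25Config:
    # All types and defaults below are assumptions, not documented values.
    language: str = "en"
    engine: str = "stem"
    detect: bool = False
    detect_min_confidence: float = 0.5

    def __post_init__(self) -> None:
        # The notes list exactly two engines: stem and lemma.
        if self.engine not in VALID_ENGINES:
            raise ValueError(f"engine must be one of {VALID_ENGINES}")
        # Unknown language codes fall back to English.
        if self.language not in KNOWN_LANGUAGES:
            self.language = "en"


cfg = Bm25Config(language="hr", engine="lemma")
print(cfg.language, cfg.engine)            # hr lemma
print(Bm25Config(language="xx").language)  # en
```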

See docs/CHANGELOG.md for full details.
