v0.3.0 — Ukrainian text optimizer for LLMs
dormouse v0.3.0
Ukrainian text optimizer for LLMs — fewer tokens, better comprehension.
Key metrics
- 73% token savings (cloud squeeze with seq2seq)
- 88% lexicon coverage on 53K text corpus
- 98.2% exact match on seq2seq expression translation
- 150% quality preservation: GPT understands the squeezed text better than the original Ukrainian
Assets
These files are downloaded automatically on first use. To install them manually, place them in ~/.cache/dormouse/v0.3.0/:
- lexicon.db: 47K-entry UA→EN lexicon
- expr_seq2seq.pt: GRU encoder-decoder for expression translation
- expr_vocab_src.json / expr_vocab_tgt.json: source/target vocabularies
- expr_config.json: model configuration
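If you place the files manually, a quick check that nothing is missing can save a confusing first run. The sketch below is not part of dormouse: the helper `missing_assets` and the constant names are hypothetical, and it relies only on the file names and cache path listed above. It does not validate file contents.

```python
from pathlib import Path

# File names taken from the asset list in these release notes.
EXPECTED_ASSETS = [
    "lexicon.db",
    "expr_seq2seq.pt",
    "expr_vocab_src.json",
    "expr_vocab_tgt.json",
    "expr_config.json",
]

# Cache location named in these release notes.
DEFAULT_CACHE_DIR = Path.home() / ".cache" / "dormouse" / "v0.3.0"


def missing_assets(cache_dir: Path = DEFAULT_CACHE_DIR) -> list[str]:
    """Return the expected asset file names that are not present in cache_dir.

    Hypothetical helper for illustration; dormouse itself may check
    its assets differently.
    """
    return [name for name in EXPECTED_ASSETS if not (cache_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("Missing assets:", ", ".join(missing))
    else:
        print("All assets present in", DEFAULT_CACHE_DIR)
```

An empty result means all five files exist at the expected path. Any names returned are the ones still to be copied in, or left for the automatic download.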
Install
pip install dormouse