v0.3.0 — Ukrainian text optimizer for LLMs

ChuprinaDaria released this 05 May 10:44
· 24 commits to main since this release

dormouse v0.3.0

Ukrainian text optimizer for LLMs — fewer tokens, better comprehension.

Key metrics

  • 73% token savings (cloud squeeze with seq2seq)
  • 88% lexicon coverage on a 53K-text corpus
  • 98.2% exact-match accuracy for seq2seq expression translation
  • 150% quality preservation (GPT understands the squeezed text better than the original Ukrainian)
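
To make the token-savings figure concrete, here is a minimal sketch of how such a percentage is conventionally computed. The release notes do not say how the 73% was measured, so the formula is an assumption, `token_savings` is a hypothetical helper (not part of dormouse), and the token counts are invented purely for illustration.

```python
def token_savings(original_tokens: int, squeezed_tokens: int) -> float:
    """Return the fraction of tokens saved, e.g. 0.73 for 73%.

    Assumed definition: 1 - squeezed / original. The release notes
    do not state the exact measurement procedure.
    """
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    if squeezed_tokens < 0:
        raise ValueError("squeezed_tokens must be non-negative")
    return 1.0 - squeezed_tokens / original_tokens


if __name__ == "__main__":
    # Hypothetical counts: a 1000-token Ukrainian prompt squeezed to 270 tokens.
    savings = token_savings(original_tokens=1000, squeezed_tokens=270)
    print(f"{savings:.0%}")
```

Under this definition, a prompt that shrinks from 1000 to 270 tokens corresponds to the 73% savings quoted above.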

Assets

These files are downloaded automatically on first use. Alternatively, place them manually in ~/.cache/dormouse/v0.3.0/:

  • lexicon.db — 47K entry UA→EN lexicon
  • expr_seq2seq.pt — GRU encoder-decoder for expression translation
  • expr_vocab_src.json / expr_vocab_tgt.json — source/target vocabularies
  • expr_config.json — model configuration
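
If you place the assets manually, a quick check that nothing is missing can save a confusing first run. The sketch below uses only the Python standard library. The file names and cache path come from the list above, while `default_cache_dir` and `missing_assets` are hypothetical helpers written for this example, not part of the dormouse API.

```python
from pathlib import Path

# File names as listed in the release notes for v0.3.0.
ASSET_FILES = [
    "lexicon.db",
    "expr_seq2seq.pt",
    "expr_vocab_src.json",
    "expr_vocab_tgt.json",
    "expr_config.json",
]


def default_cache_dir(version: str = "v0.3.0") -> Path:
    """Return ~/.cache/dormouse/<version>/ as given in the release notes."""
    return Path.home() / ".cache" / "dormouse" / version


def missing_assets(cache_dir: Path) -> list[str]:
    """Return the names of expected asset files not found in cache_dir."""
    return [name for name in ASSET_FILES if not (cache_dir / name).is_file()]


if __name__ == "__main__":
    cache = default_cache_dir()
    missing = missing_assets(cache)
    if missing:
        print(f"Missing in {cache}: {', '.join(missing)}")
    else:
        print(f"All assets present in {cache}")
```

The check only confirms that each file exists. It does not validate file contents or sizes.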

Install

pip install dormouse