translit 0.5.0

@raeq raeq released this 06 Jun 13:14
· 151 commits to main since this release
12c7417

This release sharpens what translit is: Unicode adversarial-text defense and canonicalization, powered by Rust — TR39 visual confusable mapping, homoglyph / bidi / zalgo / invisible-character stripping, and standards-based Latin/Cyrillic/Greek transliteration. It also adds context-aware transliteration for abjad scripts and fixes a long-standing Linux packaging bug.

Highlights

Adversarial-text defense, front and center. translit maps confusables by appearance (TR39: Cyrillic р → Latin p), which is the mapping that actually reverses a homoglyph attack. unidecode, anyascii, and ftfy map phonetically (for example, unidecode turns Cyrillic р into Latin r), so they cannot undo the substitution. The new Adversarial-Text Defense guide covers the phonetic-vs-visual distinction and the XMR benchmark evidence.

from translit import strip_obfuscation, normalize_confusables, is_safe_hostname

strip_obfuscation("рroduсt")          # → "product"   (Cyrillic р→p, с→c via TR39)
normalize_confusables("раypal")        # → "paypal"
safe, details = is_safe_hostname("аpple.com")   # → (False, …)  leading Cyrillic а
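
To make the phonetic-vs-visual distinction concrete, here is a minimal standalone sketch that does not use translit at all. The two three-entry tables and the remap helper are hypothetical stand-ins for illustration; the real TR39 confusables data is far larger, and translit's internals may work differently.

```python
# Toy tables for illustration only (assumed, not taken from translit or TR39 data files).
VISUAL = {"\u0440": "p", "\u0441": "c", "\u0430": "a"}    # Cyrillic р, с, а mapped by shape
PHONETIC = {"\u0440": "r", "\u0441": "s", "\u0430": "a"}  # the same letters mapped by sound


def remap(text: str, table: dict) -> str:
    """Replace each character found in the table; leave everything else unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


# Looks like "product", but the first letter is Cyrillic р and the sixth is Cyrillic с.
spoofed = "\u0440rodu\u0441t"

visual_result = remap(spoofed, VISUAL)
phonetic_result = remap(spoofed, PHONETIC)

print(visual_result)    # product
print(phonetic_result)  # rrodust
```

Only the shape-based table recovers the string the attacker was imitating; the sound-based table yields a different string, so a comparison against "product" would still fail.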

Context-aware transliteration for Arabic, Persian, and Hebrew. transliterate(text, context=True) uses dictionary-based vowel restoration, falling back from bigram to unigram to context-free lookup, to produce readable romanization instead of consonant skeletons. Opt in by installing an extra: pip install "translit-rs[arabic]", "translit-rs[hebrew]", or "translit-rs[context]".
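
The bigram → unigram → context-free fallback order can be sketched in a few lines. This is an illustration of the general backoff idea only: the BIGRAM and UNIGRAM dictionaries, their entries, and the romanize helper are all invented for this example and make no claim about translit's actual data, data format, or linguistic accuracy.

```python
# Hypothetical toy dictionaries; keys are consonant skeletons.
# BIGRAM maps (previous word, current word) to a vocalized form of the current word.
BIGRAM = {("mlk", "ktb"): "kataba"}
# UNIGRAM maps a single word to its most likely vocalized form, ignoring context.
UNIGRAM = {"ktb": "kitab", "mlk": "malik"}


def context_free(skeleton: str) -> str:
    """Last-resort fallback: return the consonant skeleton unchanged."""
    return skeleton


def romanize(skeletons: list) -> str:
    """Vocalize each word, trying bigram, then unigram, then context-free lookup."""
    out = []
    prev = None
    for word in skeletons:
        if prev is not None and (prev, word) in BIGRAM:
            out.append(BIGRAM[(prev, word)])   # most specific: uses the previous word
        elif word in UNIGRAM:
            out.append(UNIGRAM[word])          # no usable context: per-word default
        else:
            out.append(context_free(word))     # unknown word: keep the skeleton
        prev = word
    return " ".join(out)


with_context = romanize(["mlk", "ktb", "xyz"])
without_context = romanize(["ktb"])
print(with_context)     # malik kataba xyz
print(without_context)  # kitab
```

The same skeleton ("ktb") comes out differently depending on whether a matching bigram is available, which is the point of trying the most specific dictionary first.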

Fixed

  • Linux x86_64 wheels are now built as cp39-abi3. Earlier releases only shipped a cp38-cp38 x86_64 Linux wheel, forcing a source build (Rust toolchain) on Python 3.9+. pip install translit-rs now gets a prebuilt wheel on Linux x86_64 like every other platform. (#26)
  • Documentation corrections (consistent language-profile count; verified homoglyph examples).
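
For readers wondering why the cp39-abi3 tag in the wheel fix above matters: an abi3 wheel targets CPython's stable ABI, so one file serves the tagged version and every later 3.x release, whereas a cp38-cp38 wheel matches CPython 3.8 only. The sketch below is a deliberate simplification with hypothetical helper names; it ignores platform tags and non-CPython interpreters, and pip's real resolution logic is more involved.

```python
def parse_python_tag(tag: str) -> tuple:
    """Turn a CPython tag such as 'cp39' or 'cp310' into (major, minor)."""
    if not tag.startswith("cp"):
        raise ValueError(f"not a CPython tag: {tag!r}")
    digits = tag[2:]
    return int(digits[0]), int(digits[1:])


def wheel_matches(python_tag: str, abi_tag: str, interpreter: tuple) -> bool:
    """Simplified check: does a wheel with these tags suit this CPython version?"""
    floor = parse_python_tag(python_tag)
    if abi_tag == "abi3":
        # Stable ABI: usable on the tagged version and any newer 3.x release.
        return interpreter[0] == floor[0] and interpreter >= floor
    # Version-specific ABI: usable on exactly that version.
    return interpreter == floor


print(wheel_matches("cp38", "cp38", (3, 9)))   # False: forces a source build
print(wheel_matches("cp39", "abi3", (3, 9)))   # True
print(wheel_matches("cp39", "abi3", (3, 12)))  # True
```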

Security

  • All third-party GitHub Actions pinned to commit SHAs across CI and the release pipeline; added Dependabot to keep them current. Dev/docs dependency bumps (Pygments 2.20.0, pytest 9.0.3).

Compatibility

No breaking changes. No public API, language codes, or script coverage were removed — translit-rs still has zero runtime dependencies. CJK/Indic/other scripts remain available as best-effort, unidecode-compatible coverage.

Install

pip install translit-rs

Full changelog: https://github.com/raeq/translit/blob/main/CHANGELOG.md