A Python library for transliterating Chechen text from Cyrillic to Latin script using the Chechen Latin alphabet.
pip install ce-translit
import ce_translit
# Simple usage - transliterate Chechen text
text = "Нохчийн мотт"
result = ce_translit.transliterate(text)
print(result) # Outputs: "Noxçiyŋ mott"
- Simple API: Clean, single-function interface
- Linguistically Accurate: Handles all Chechen-specific rules
- Context-Aware: Special handling for letter position rules
- Customizable: Advanced options for specialized use cases
- Pure Python: No external dependencies
- Memory Efficient: Uses minimal memory and efficient string handling
import ce_translit
# Transliterate a single word
word_result = ce_translit.transliterate("дош") # "doş"
# Transliterate a sentence
sentence = "Муха ду хьал де?"
sentence_result = ce_translit.transliterate(sentence) # "Muxa du ẋal de?"
from ce_translit import Transliterator
# Create a custom transliterator with your own rules
custom_transliterator = Transliterator(
    # Custom letter mapping
    mapping={
        **Transliterator()._mapping,  # First define base mapping
        # Then override specific mappings
        "й": "j",
        # Append completely new mappings
        "1": "j",
    },
    # Override blacklist (Words that should keep the regular 'н' at the end)
    blacklist=["дин", "гӏан", "сан"],
    # Override unsurelist (Words that should use 'ŋ[REPLACE]' at the end)
    unsurelist=["шун", "бен", "цӏен"],
)
# Use the custom transliterator
result = custom_transliterator.transliterate("1аж дин шун")
If you omit the base mapping (the **Transliterator()._mapping entry) from the custom mapping, the custom transliterator uses only the mappings you provide.
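The effect of the unpacking can be reproduced with plain dictionaries. This is a minimal sketch: base_mapping is a three-letter stand-in invented for illustration, not the library's actual table, and the variable names are hypothetical.

```python
# Hypothetical three-letter stand-in for the library's base mapping.
base_mapping = {"й": "y", "д": "d", "и": "i"}

# With the unpacking: base entries are kept, later keys override or extend them.
merged = {**base_mapping, "й": "j", "1": "j"}

# Without the unpacking: only the custom entries exist.
custom_only = {"й": "j", "1": "j"}

print(merged)       # {'й': 'j', 'д': 'd', 'и': 'i', '1': 'j'}
print(custom_only)  # {'й': 'j', '1': 'j'}
```

In the second case the entries for "д" and "и" are simply absent, which is why the base mapping usually belongs at the top of a custom mapping.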
from ce_translit import Transliterator
# Define your own list
my_blacklist = ["дин", "гӏан", "сан"]
# Create a custom transliterator with defined blacklist
custom_transliterator = Transliterator(blacklist=my_blacklist)
result = custom_transliterator.transliterate("дин")  # "дин" is blacklisted, so its final 'н' stays 'n'
The library handles several special rules in Chechen transliteration:
- Letter 'е':
  - At the start of a word → 'ye' (ex: "елар" → "yelar")
  - After 'ъ' → 'ye' (ex: "шелъелча" → "şelyelça")
  - In other positions → 'e' (ex: "мела" → "mela")
- Letter 'н' at end of words:
  - Regular handling → 'ŋ' (ex: "сан" → "saŋ")
  - Blacklisted words keep 'n' (ex: "хан" → "xan")
  - Unsurelist words use 'ŋ[REPLACE]' (ex: "шун" → "şuŋ[REPLACE]")
- Standalone 'а':
  - When 'а' is a standalone word → 'ə' (ex: "а" → "ə")
- Special Character Combinations:
  - 'къ' → 'q̇'
  - 'хь' → 'ẋ'
  - 'гӏ' → 'ġ'
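The rules above can be sketched in a few lines of standalone Python. This is a toy illustration under stated assumptions, not the library's implementation: the names (toy_transliterate_word, TOY_LETTERS and so on) are hypothetical, the tables cover only the letters used in this README's examples, and dropping a lone 'ъ' is inferred from the "şelyelça" example rather than taken from the library.

```python
# Toy illustration of the positional rules above; NOT the library's code.
# The tables cover only the letters used in this README's examples.
TOY_DIGRAPHS = {"къ": "q̇", "хь": "ẋ", "гӏ": "ġ"}
TOY_LETTERS = {
    "а": "a", "л": "l", "м": "m", "н": "n", "р": "r",
    "с": "s", "у": "u", "х": "x", "ч": "ç", "ш": "ş",
}
TOY_BLACKLIST = frozenset({"хан"})   # final 'н' stays 'n'
TOY_UNSURELIST = frozenset({"шун"})  # final 'н' becomes 'ŋ[REPLACE]'


def toy_transliterate_word(word: str) -> str:
    """Transliterate one lowercase word with the simplified rules."""
    if word == "а":  # standalone 'а'
        return "ə"
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        # Two-letter combinations take priority over single letters.
        if word[i:i + 2] in TOY_DIGRAPHS:
            out.append(TOY_DIGRAPHS[word[i:i + 2]])
            i += 2
            continue
        if ch == "е":
            after_hard_sign = i > 0 and word[i - 1] == "ъ"
            out.append("ye" if i == 0 or after_hard_sign else "e")
        elif ch == "ъ":
            pass  # Assumption: a lone 'ъ' is dropped (inferred from "şelyelça")
        elif ch == "н" and i == len(word) - 1:
            if word in TOY_BLACKLIST:
                out.append("n")
            elif word in TOY_UNSURELIST:
                out.append("ŋ[REPLACE]")
            else:
                out.append("ŋ")
        else:
            # Unknown characters pass through unchanged.
            out.append(TOY_LETTERS.get(ch, ch))
        i += 1
    return "".join(out)


for word in ["елар", "шелъелча", "мела", "сан", "хан", "шун", "а"]:
    print(word, "→", toy_transliterate_word(word))
```

Run as is, the loop prints yelar, şelyelça, mela, saŋ, xan, şuŋ[REPLACE] and ə, matching the examples in the list. The sketch ignores capitalization, punctuation and edge cases such as 'е' directly after 'къ', which the real library may treat differently.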
The library is optimized for both startup time and runtime performance:
- Data is loaded once at import time
- Efficient string handling for minimal memory usage
- Uses sets for O(1) lookups in blacklists and unsure lists
# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate
# Install development tools
pip install --upgrade hatch pytest
# Run tests
hatch run test
# Build the package
hatch build
# Test the built package
pip install --force-reinstall dist/ce_translit-1.0.0-py3-none-any.whl
# Install test dependencies
pip install pytest
# Run tests
pytest
ce-translit-py/
├── src/
│   └── ce_translit/
│       ├── __init__.py              # Public API
│       ├── _transliterator.py       # Core implementation
│       ├── data/
│       │   └── cyrl_latn_map.json   # Character mapping
├── tests/
│   └── test_transliterator.py
├── LICENSE
├── README.md
└── pyproject.toml
This project is licensed under the MIT License.
Contributions are welcome! Feel free to submit issues or pull requests on the GitHub repository.
- ce-translit-js - JavaScript version of this library