Releases: ZomiCommunity/zomi-nlp
v0.4.1 — Documentation Overhaul & Native Pipeline Enhancements
This patch release focuses on improving the documentation for Zomi NLP, making the project easier to understand, use, and extend.
What’s New
- Major expansion of the README feature list
- New sections: Coming Soon, Native Pipeline Components, CoNLL‑U Export, CLI Usage
- Updated configuration examples for consistency
- Roadmap converted to a clear version table
- Added detailed planned features for v0.5.0
- Added summary of changes
Why This Release
Although no code changes were introduced, the documentation improvements are substantial enough to justify a patch release. This ensures users browsing PyPI or GitHub get accurate, complete, and up‑to‑date information.
Zomi NLP v0.4.0 - Complete Native Pipeline
This release introduces a complete rule-based native Zomi NLP pipeline with no external dependencies!
✨ What's New
Core Native Components
| Component | Description | Features |
|---|---|---|
| ZomiTokenizer | Pure Python tokenizer | Clitic splitting, reduplication, compounds, punctuation |
| ZomiPOSTagger | Rule-based POS tagger | 600+ lexicon entries, context-aware rules |
| ZomiLemmatizer | Morphological lemmatization | Clitic removal, affix stripping, irregular forms |
| ZomiDependencyParser | Modular dependency parser | Zomi grammar rules, ergative markers |
| ZomiNER | Named Entity Recognition | PERSON, LOCATION, GPE, DATE, NUMERIC |
| ZomiMorphologicalAnalyzer | Morpheme segmentation | Prefix/suffix detection, feature extraction |
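To give a feel for the rule-based approach behind the tokenizer and tagger, here is a minimal sketch. It is not the library's code: the function names `tokenize` and `tag`, the regular expression, and the four-word lexicon are assumptions for illustration, with the tags copied from the Quick Start output in these notes. It splits punctuation from words and looks each word up in a lexicon. It does no clitic splitting, reduplication handling, or context-aware tagging.

```python
import re

# Hypothetical mini-lexicon; tags copied from the Quick Start output in these notes.
LEXICON = {"tuni": "DATE", "ka": "PRON", "pai": "VERB", "hi": "PART"}

# A run of word characters, or any single non-space symbol.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def tag(tokens):
    """Tag each token by lexicon lookup; unknown words get 'X'."""
    tagged = []
    for tok in tokens:
        if not any(ch.isalnum() for ch in tok):
            pos = "PUNCT"
        else:
            pos = LEXICON.get(tok.lower(), "X")
        tagged.append((tok, pos))
    return tagged


print(tag(tokenize("Tuni ka pai hi.")))
```

A real tagger would add contextual rules on top of the lookup, which is what the "context-aware rules" entry in the table refers to.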
Native Pipeline Architecture
User Input → Tokenizer → Tagger → Lemmatizer → Parser → NER → ZomiDoc
All five stages (Tokenizer, Tagger, Lemmatizer, Parser, NER) are native, with no external dependencies.
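The stage chain above can be pictured as a list of functions that each receive a document object, add their annotations, and pass it on. The sketch below illustrates that pattern only. `Doc`, `run_pipeline`, and the two toy stages are hypothetical names, not the `ZomiDoc` class or the library's real components, and lowercasing stands in for real lemmatization.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Doc:
    """Toy document container (a stand-in, not the library's ZomiDoc)."""
    text: str
    tokens: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)


Stage = Callable[[Doc], Doc]


def whitespace_tokenizer(doc: Doc) -> Doc:
    # Stand-in tokenizer: split on whitespace only.
    doc.tokens = doc.text.split()
    return doc


def lowercase_lemmatizer(doc: Doc) -> Doc:
    # Stand-in lemmatizer: lowercase each token.
    doc.annotations["lemma"] = [tok.lower() for tok in doc.tokens]
    return doc


def run_pipeline(text: str, stages: list) -> Doc:
    """Run each stage in order on the same document."""
    doc = Doc(text)
    for stage in stages:
        doc = stage(doc)
    return doc


doc = run_pipeline("Tuni ka pai", [whitespace_tokenizer, lowercase_lemmatizer])
print(doc.tokens, doc.annotations["lemma"])
```

Keeping every stage behind the same signature means a stage can be swapped or dropped without touching the others.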
CLI Improvements
- New zomi-nlp --doctor command for diagnostics
- Better error messages with actionable fixes
- Installation status reports
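An installation status report of this kind usually comes down to checking which optional packages can be imported. The sketch below shows one way to do that with the standard library. It is an assumption about the general technique, not the code behind the CLI, and `backend_status` and `report` are hypothetical names.

```python
import importlib.util
import sys

# Optional backends named in these release notes.
OPTIONAL_BACKENDS = ("spacy", "stanza")


def backend_status(names=OPTIONAL_BACKENDS):
    """Map each top-level package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def report():
    """Build a short, human-readable status report."""
    lines = [f"Python {sys.version_info.major}.{sys.version_info.minor}"]
    for name, found in backend_status().items():
        state = "installed" if found else "not installed (optional)"
        lines.append(f"{name}: {state}")
    return "\n".join(lines)


print(report())
```

`importlib.util.find_spec` locates a top-level package without importing it, so the check stays fast even when a heavy backend is present.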
📦 Installation
# Minimal install (native only, no dependencies)
pip install zomi-nlp
# With optional backends (spaCy/Stanza for fallback)
pip install 'zomi-nlp[full]'
🚀 Quick Start
from zomi_nlp import load
# Load native pipeline (auto-selects best backend)
nlp = load()
# Process Zomi text
text = "Tuni ka pai hi."
doc = nlp(text)
for token in doc:
    # Print 'N/A' when a token has no entity type, matching the output below
    print(f"{token.text:<12} {token.pos_:<8} {token.lemma_:<12} {token.ent_type_ or 'N/A':<8}")
Output:
Tuni DATE tuni DATE
ka PRON ka N/A
pai VERB pai N/A
hi PART hi N/A
. PUNCT . N/A
📊 Performance
| Metric | Value |
|---|---|
| Speed | ~10,000 tokens/second |
| Memory | ~50MB |
| Dependencies | None (optional spaCy/Stanza) |
| Test coverage | 64% (95+ tests) |
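The notes do not say how the speed figure was measured. To check throughput on your own hardware, a rough timing loop like the one below is enough. `tokens_per_second` is a hypothetical helper, and `str.split` stands in for the real pipeline; pass any callable that returns a sequence of tokens.

```python
import time


def tokens_per_second(tokenize, text, repeats=1000):
    """Return the approximate throughput of `tokenize` on `text`."""
    start = time.perf_counter()
    total = 0
    for _ in range(repeats):
        total += len(tokenize(text))
    elapsed = time.perf_counter() - start
    return total / elapsed if elapsed > 0 else float("inf")


# str.split is a placeholder; substitute the pipeline you want to measure.
rate = tokens_per_second(str.split, "Tuni ka pai hi .")
print(f"{rate:,.0f} tokens/second")
```

Results vary with text length, hardware, and Python version, so treat any single number as indicative only.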
🔧 Commands
# Check installation status
zomi-nlp --check
# Diagnose issues
zomi-nlp --doctor
# Process text from CLI
zomi-nlp "Tuni ka pai hi."
📚 Documentation
🔄 Full Changelog
Added
- ZomiTokenizer - Complete tokenization module
- ZomiPOSTagger - Native POS tagging with 600+ lexicon entries
- ZomiLemmatizer - Rule-based lemmatization
- ZomiDependencyParser - Modular dependency parsing
- ZomiNER - Rule-based named entity recognition
- ZomiMorphologicalAnalyzer - Morpheme analysis
- lexicons/ module with centralized word data
- --doctor CLI command for diagnostics
- 95+ comprehensive tests
Changed
- Native backend now prioritized over spaCy/Stanza
- Reorganized native/ directory structure
- Improved feature parsing with LRU caching
- Better error messages for missing dependencies
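LRU caching pays off in feature parsing because the same feature strings recur across many tokens. The sketch below shows the general idea with `functools.lru_cache`, assuming features arrive as CoNLL-U style `Key=Value|Key=Value` strings. `parse_feats` and the cache size are hypothetical, not the library's implementation.

```python
from functools import lru_cache


@lru_cache(maxsize=4096)
def parse_feats(feats):
    """Parse 'Key=Value|Key=Value' into a sorted tuple of (key, value) pairs.

    Returns a tuple rather than a dict so the cached value cannot be
    mutated by one caller and leak into another.
    """
    if not feats or feats == "_":
        return ()
    pairs = (item.split("=", 1) for item in feats.split("|"))
    return tuple(sorted((key, value) for key, value in pairs))


print(dict(parse_feats("Number=Sing|Person=1")))
parse_feats("Number=Sing|Person=1")  # identical input, served from the cache
print(parse_feats.cache_info())
```

The sketch assumes well-formed input; an item without `=` raises `ValueError`.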
Fixed
- NER over-matching (no more "Pasian sian" issues)
- Duplicate tokenization in pipeline
- Morphological analyzer feature merging
- CLI argument parsing for --doctor
🎯 Roadmap to v1.0
- v0.2.0 - spaCy/Stanza backends
- v0.3.0 - ZomiRuleBasedParser
- v0.4.0 - Complete native pipeline (this version)
- v0.5.0 - Word embeddings
- v0.6.0 - ML-based components
- v1.0.0 - Production ready
🙏 Contributors
Zomi NLP Community
Zomi language speakers and linguists
Zomi NLP v0.3.0 - ZomiRuleBasedParser Backend
🎉 Zomi NLP v0.3.0 - Native Rule-Based Parser!
This release introduces the ZomiRuleBasedParser - a pure Python, rule-based Zomi NLP backend with no external dependencies!
✨ What's New
- ✅ ZomiRuleBasedParser - Complete rule-based parser (600+ lexicon entries)
- ✅ Clitic handling - Splits ve, ta, hiam, etc.
- ✅ Dependency parsing - Full CoNLL-U output
- ✅ Constituency trees - Phrase structure generation
- ✅ 16-column CoNLL-U export - With metadata (TEXT_EN, GENRE, SOURCE, etc.)
- ✅ --doctor CLI command - Diagnose installation issues
- ✅ Better error messages - Clear, actionable fixes
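For readers new to CoNLL-U, the sketch below writes one sentence in the standard ten-column layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), using `_` for unspecified fields. These notes do not list the extra columns of the 16-column export, so they are left out here. `to_conllu` is a hypothetical helper, not the library's exporter.

```python
# Standard CoNLL-U column order.
COLUMNS = ("id", "form", "lemma", "upos", "xpos",
           "feats", "head", "deprel", "deps", "misc")


def to_conllu(sent_id, text, tokens):
    """Serialize one sentence; `tokens` is a list of dicts keyed by column name."""
    lines = [f"# sent_id = {sent_id}", f"# text = {text}"]
    for index, token in enumerate(tokens, start=1):
        row = {**token, "id": index}
        # Any column missing from the token is written as "_".
        lines.append("\t".join(str(row.get(col, "_")) for col in COLUMNS))
    # A blank line terminates each sentence.
    return "\n".join(lines) + "\n\n"


sentence = [{"form": "Ka"}, {"form": "pai"}, {"form": "ve"}, {"form": "."}]
print(to_conllu("1", "Ka pai ve.", sentence), end="")
```

Only the surface forms are filled in, to avoid inventing a linguistic analysis for the example sentence.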
🐛 Fixed
- Backend adapter parameter mapping
- ZomiToken field naming consistency
- CLI argument parsing for --doctor
📦 Installation
pip install zomi-nlp==0.3.0
🚀 Quick Start
from zomi_nlp import ZomiPipeline
# Auto-selects ZomiRuleBasedParser
nlp = ZomiPipeline()
doc = nlp("Ka pai ve.")
for token in doc:
    print(f"{token.text}: {token.pos_}")
v0.3.0rc1
Zomi NLP v0.2.1 - Documentation Update
v0.2.1 - Documentation Update
Fixed
- PyPI badge now points to official PyPI (not TestPyPI)
- Badge displays correctly on README
Note
This is a documentation-only release. No code changes.
The package on PyPI remains v0.2.0.
Installation
pip install zomi-nlp
v0.1.6-alpha3 (Pre-release)
v0.1.6-alpha3 - 2026-04-23
- fix: correct release workflow for TestPyPI uploads
v0.1.6-alpha2 - 2026-04-23
- fix: sync pyproject version with latest release
v0.1.0-alpha1 - 2026-04-23
- style: apply ruff safe fix
- refactor: switch to PEP 621 versioning (Option A)
v0.1.6-alpha1 - 2026-04-23
v0.1.6 - 2026-04-23
- ci: replace deprecated readme-renderer check with twine check
v0.1.5 - 2026-04-23
- style: apply mypy fixes
v0.1.4 - 2026-04-23
- style: apply ruff safe fixes
v0.1.3 - 2026-04-23
- fix: resolve mypy errors across adapters and utils, add proper type annotations, unify TypedDicts, and improve optional spaCy/stanza handling
v0.1.2 - 2026-04-23
- fix: resolve Ruff warnings and clean up stanza/spacy availability checks
- style: apply ruff unsafe fixes
Changelog
v0.1.1 - 2026-04-22
- fix(scripts): point VERSION_FILE to zomi_nlp/version.py instead of local path
- chore: update bump_version script
- feat(scripts): add bump_version script under zomi_nlp/scripts
- fix: Update github action workflow
- fix: Resolve pydocstyle conflicts in pyproject.toml
- fix: running linting fix
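The changelog mentions a bump_version script but does not show it. As a rough illustration of what such a script computes, the sketch below increments one part of a plain `MAJOR.MINOR.PATCH` string. The function name `bump` and its behavior are assumptions, and pre-release suffixes such as `-alpha3` are deliberately not handled.

```python
import re


def bump(version, part="patch"):
    """Return `version` with the given part incremented and lower parts reset."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("0.1.1"))           # 0.1.2
print(bump("0.4.1", "minor"))  # 0.5.0
```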
v0.1.6-alpha2 (Pre-release)
v0.1.6-alpha2 - 2026-04-23
- fix: sync pyproject version with latest release
v0.1.6-alpha1 (Pre-release)
v0.1.6-alpha1 - 2026-04-23
v0.1.6-alpha4
v0.1.6-alpha4 - 2026-04-23
- fix: correct release workflow for test upload
- fix: correct release workflow for release
- fix: correct release workflow for TestPyPI uploads