This repository contains state-of-the-art language models and a classifier for Nepali, which is the official language of Nepal and one of the officially recognized languages of India.
The models trained here are used in the Natural Language Toolkit for Indic Languages (iNLTK).

Architecture/Dataset | Nepali Wikipedia Articles |
---|---|
ULMFiT | 31.5 |
TransformerXL | 29.3 |

Dataset | Accuracy | Kappa Score |
---|---|---|
Nepali News Dataset | 98.5 | 97.7 |
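
The accuracy and kappa columns above can be reproduced from a list of true and predicted labels. The following is a minimal sketch, assuming the reported Kappa Score is Cohen's kappa (the table does not say which variant was used) and that both columns are scaled to percentages. The function name and the example labels are hypothetical, not taken from this repository.

```python
from collections import Counter
from typing import Sequence, Tuple


def accuracy_and_kappa(y_true: Sequence[str], y_pred: Sequence[str]) -> Tuple[float, float]:
    """Return (accuracy, Cohen's kappa) as fractions in [0, 1] and [-1, 1]."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    # Observed agreement: fraction of examples where the prediction is correct.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Expected agreement by chance, from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if expected == 1.0:
        # Only one class appears in both lists, so kappa is undefined.
        return observed, float("nan")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical labels, for illustration only.
y_true = ["sports", "sports", "politics", "politics"]
y_pred = ["sports", "sports", "politics", "sports"]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={acc * 100:.1f}  kappa={kappa * 100:.1f}")
```
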

Architecture | Visualization |
---|---|
ULMFiT | Embeddings projection |
TransformerXL | Embeddings projection |

Download the pretrained language models from here.

Download the classifier from here.

The tokenizer was trained using Google's SentencePiece. Download the trained model and vocabulary from here.
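
Once downloaded, the tokenizer model can be loaded with the `sentencepiece` Python package. The following is a minimal sketch, not this repository's own loading code: the file name `nepali_tokenizer.model` is a placeholder for whatever the downloaded model file is called, and the helper names are hypothetical. The `sentencepiece` import is done lazily so that the pure-Python helper works without the package installed.

```python
from typing import List

# SentencePiece marks the start of each whitespace-separated word with U+2581.
WORD_BOUNDARY = "\u2581"


def load_tokenizer(model_path: str):
    """Load a trained SentencePiece model (requires `pip install sentencepiece`)."""
    import sentencepiece as spm  # lazy import; only needed when loading a model

    return spm.SentencePieceProcessor(model_file=model_path)


def pieces_to_text(pieces: List[str]) -> str:
    """Join subword pieces back into plain text by restoring spaces."""
    return "".join(pieces).replace(WORD_BOUNDARY, " ").strip()


# Usage, assuming the downloaded model was saved under this placeholder name:
#   sp = load_tokenizer("nepali_tokenizer.model")
#   pieces = sp.encode("नेपाली भाषा", out_type=str)
#   text = pieces_to_text(pieces)
```

How a sentence is split into pieces depends on the trained vocabulary, so no expected output is shown for the encoding step.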