This curated list is an attempt to organize research and available tools in Nepali Natural Language Processing. This is by no means an exhaustive list, so contributions are welcome.
- DevNet: An Efficient CNN Architecture for Handwritten Devanagari Character Recognition (Guha et al., Nov 2019)
- Optical Character Recognition System for Nepali Language Using ConvNet (Sharma, Manish K. and Bhattarai, Bidhan. 2017)
- Dictionary Based Nepali Word Recognition using Neural Network (Dawadi et al., 2017)
- Nepali Character Recognition Using Deep Belief Nets (Neupane Aadesh, 2017)
- Improving Nepali OCR performance by using hybrid recognition approaches (Pant, Nirajan & Bal, B.K., Jul 2016)
- Literature Review of Segmentation Problems in Nepali Optical Character Recognition (Bal, Bal Krishna and Pant, Nirajan, 2016)
- Deep Learning Based Large Scale Handwritten Devanagari Character Recognition (Pant et al., 2015)
- Off-line Nepali Handwritten Character Recognition Using Multilayer Perceptron and Radial Basis Function Neural Networks (Pant et al., 2012)
- Research Report on the Nepali OCR (Bal B.K. & Rupakheti P., Sep 2009)
- Combining Multiple Feature Extraction Techniques for Handwritten Devnagari Character Recognition (Arora et al., 2008)
- Improving Nepali News Classification Using Bidirectional Encoder Representation from Transformers (Kafle et al., Nov 2022)
- Comparative Analysis of Nepali News Classification using LSTM, Bi-LSTM and Transformer Model (Wagle, S.S. & Thapa S., Oct 2021)
- Vector representation based on a supervised codebook for Nepali documents classification (Sitaula et al., Mar 2021)
- Detecting Clickbaits on Nepali News using SVM and RF (Dam et al., Mar 2021)
- An Analysis of Classification Algorithms for Nepali News (Acharya K. and Shakya S., Jul. 2020)
- Plagiarism Detection Framework Using Monte Carlo Based Artificial Neural Network for Nepali Language (Bachchan, R.K. & Timalsina A.K., Oct 2018)
- Nepali SMS filtering Using Decision Trees, Neural Network and Support Vector Machine (Shahi T.B. & Shakya S., Oct 2018)
- Nepali Text Document Classification Using Deep Neural Network (S. Subba, N. Paudel, and T. Shahi, Jun. 2019)
- Improving Nepali News Recommendation Using Classification Based on LSTM Recurrent Neural Networks (Basnet A. and Timalsina A., 2018)
- Automated News Classification using N-gram Model and Key Features of Nepali Language (Dangol et al., 2018)
- Nepali News Classification using Naïve Bayes, Support Vector Machines and Neural Networks (Shahi and Pant, 2018)
- Nepali Multi-Class Text Classification (Singh Oyesh M., 2018)
- Trend Analysis of Technology News in Nepali Newspapers. A case study: The Kathmandu Post (Adhikari S. and Timalsina A., 2017)
- Improving Nepali Document Classification by Neural Network (Kafle et al., 2016)
- Mobile SMS Spam Filtering for Nepali Text Using Naïve Bayesian and Support Vector Machine (Shahi and Yadav, 2014)
- A lexicon pool augmented Naive Bayes Classifier for Nepali Text (Thakur and Singh, 2014)
- A Comparative Analysis of Particle Swarm Optimization and K-means Algorithm For Text Clustering Using Nepali Wordnet (Sarkar et al., 2014)
- Development of Nepali Character Database for Character Recognition based on Clustering (Neupane, Adesh; 2014)
- Semantic Text Clustering Using Enhanced Vector Space Model Using Nepali Language (Chiranjibi Sitaula, 2012)
- A Machine Learning Approach to Anaphora Resolution in Nepali Language (Senapati et al., July 2020)
- Anaphoric Resolution in Nepali (Dev Bahadur Poudel and Bivod Aale Magar)
- LDC-IL: The Indian repository of resources for language technology (Choudhary Narayan, Jan 2021)
- Construction and annotation of a corpus of contemporary Nepali (Yadava et al., 2008)
- Annotation Projection-based Dependency Parser Development for Nepali (Rai, Pooja & Chatterji, Sanjay, Dec 2022)
- A Conceptual Graph Approach to the Parsing of Projective Sentences (Pradhan et al., 2019)
- Parsing in Nepali Language Using Linear Programming Problem (A. Yajnik, F. Bhutia, and S. Borah, 2019)
- Parsing Techniques using Paninian Framework on Nepali Language (A. Yajnik and D. Sharma, Nov. 2015)
- Report on Nepali Computational Grammar (Rupakheti et al.)
- Argument Structure of Nepali Verbs: A Study on Lexico-Semantic Ambiguity (KM Manger, 2018)
- Inflection and derivation in Nepali Noun, adjective and adverb (Dr. Laxmi Prasad Khatiwada, 2013)
- Collation Sequence in Nepali - PAN Localization (Gurung S. & Khatiwada L.P.)
- A collocation-based approach to Nepali postpositions (Hardie, Andrew, Jun. 2008)
- Collocational properties of adpositions in Nepali and English (Hardie, Andrew, Jun. 2007)
- (Book) Contemporary issues in Nepalese linguistics (Yadava et al., 2005)
- Architectural and System Design of the Nepali Grammar Checker (B.K. Bal, P. Shrestha and M.P. Pustakalaya)
- Structure of Nepali Grammar (Bal Krishna Bal, 2004)
- Report on Nepali Computational Grammar (Rupakheti et al.)
- (Book) Beyond Preferred Argument Structure: Sentences, pronouns, and given referents in Nepali (Genetti C. & Crain L. D., Sep 2003)
- Benefactive Constructions in Nepali (Madhav P. Poudel, 2000)
- Aspects of Nepali Grammar (Santa Barbara Papers in Linguistics, Volume 6, 1994 Dept. of Linguistics, UCSB)
- (Book) A Descriptive Grammar of Nepali and an Analyzed Corpus (Acharya Jayaraj, Jun. 1991)
- NepBERTa: Nepali Language Model Trained in a Large Corpus (Timilsina et al., Nov 2022)
- Nepali Encoder Transformers: An Analysis of Auto Encoding Transformer Language Models for Nepali Text Classification (Maskey et al., Jun 2022)
- Comparative Evaluation of Transformer-Based Nepali Language Models (Tamrakar S.R. & Silpasuwanchai C., 2022)
- Preprocessing of Nepali News Corpus for Downstream Tasks (Awale et al., Aug 2022)
- The Use of N-Gram Language Model in Predicting Nepali Words (Khadka, Bal Ram, May 2022)
- Encoder Decoder based Nepali News Headline Generation (Mishra et al., Sep 2020)
- Cross-lingual Language Model Pretraining (Conneau A. & Lample G., 2019)
- An Experience in Developing the Nepali Sense Tagged Corpus (Sarkar et al., 2015)
- Some Challenges of Automated Annotation in A Multilingual Scenario (Roy et al., 2014)
- A Proposed Nepali Synset Entry and Extraction Tool (Roy et al., 2012)
- Extending corpus annotation of Nepali: advances in tokenisation and lemmatisation (Hardie et al., 2011)
- Nepali Lexicon (Khatiwada, Laxmi P. and Gurung S., 2007)
- Nepali Lexicon Development (Bista et al.)
- Morph Analyzer of Verbs in Nepali Language (Bhutia et al., 2021)
- A vowel based word splitter to improve performance of existing Nepali morphological analyzers on words borrowed from Sanskrit (Adhikari M. & Neupane A., Apr 2020)
- Design of a Morph Analyzer for Non-Declinable Adjectives of Nepali Language (Borah et al., 2017)
- Development of a Morph Analyser for Nepali noun token (Chhetri et al., 2015)
- Building Morphological Analyzer for Nepali (Rai R. and Bhat, S.M., 2012)
- A Computational Analysis of Nepali Morphology: A Model For Natural Language Processing (Balaram Prasain, PhD, 2011)
- A Morphological Analyzer and a Stemmer for Nepali (Bal Krishna Bal)
- A morphosyntactic categorisation scheme for the automated analysis of Nepali (Hardie et al., Dec 2009)
- Morphological analysis of verbs in Nepali (Basnet, S.B. and Pandey, S.B., 2009)
- Nelralec/Bhasha Sanchar Working Paper 2 Categorisation for automated morphosyntactic analysis of Nepali: introducing the Nelralec Tagset (NT-01) (Hardie et al., 2005)
- Named Entity Recognition for Nepali: Data Sets and Algorithms (Niraula, Nobal & Chapagain, Jeevan. May 2022)
- Named Entity Recognition (NER) for Nepali (Bal et al., 2019)
- Named Entity Recognition for Nepali Language (Singh et al., 2019)
- Named Entity Recognition for Nepali Text Using Support Vector Machines (Bam and Shahi, 2014)
- Named Entity Recognition for Nepali language: A Semi Hybrid Approach (Dey et al., 2014)
- Probabilistic and Neural Network Based POS Tagging of Ambiguous Nepali text: A Comparative Study (Pradhan A. & Yajnik A., Feb 2021)
- Fine-grained part-of-speech tagging in Nepali text (Shrestha I. & Dhakal S. S., 2021)
- Nepali POS Tagging using Deep Learning Approaches (Sayami et al., Dec 2019)
- A Deep Learning Approach for Part-of-Speech Tagging in Nepali Language (Prabha et al., 2018)
- General Regression Neural Network Based PoS Tagging for Nepali Text (Archit Yajnik, Apr 2018)
- ANN Based POS Tagging For Nepali Text (Archit Yajnik, 2018)
- Part of Speech Tagging Using Statistical Approach for Nepali Text (Archit Yajnik, 2017)
- Enhancing the Performance of Part of Speech tagging of Nepali language through Hybrid approach (Sinha et al., 2015)
- Hidden Markov Model based Part of Speech Tagging for Nepali language (Paul et al., 2015)
- Support Vector Machines based Part of Speech Tagging for Nepali Text (Shahi and Dhamala, 2013)
- Hidden Markov Model Based Probabilistic Part Of Speech Tagging For Nepali Text (Jaishi M.R., Jan 2009)
- Multi-channel CNN to classify Nepali COVID-19 related tweets using hybrid features (Sitaula, C. & Shahi, T.B., Mar 2022)
- A Hybrid Feature Extraction Method for Nepali COVID-19-Related Tweets Classification (Sitaula et al., Mar 2022)
- Deep Learning-Based Methods for Sentiment Analysis on Nepali COVID-19-Related Tweets (Sitaula et al., Nov 2021)
- Aspect Based Abusive Sentiment Detection in Nepali Social Media Texts (Singh et al., Dec 2020)
- Named-Entity Based Sentiment Analysis of Nepali News Media Texts (Bal et al., Dec 2020)
- Aspect Based Sentiment Analysis of Nepali Text Using Support Vector Machine and Naive Bayes (Tamrakar et al., Nov 2020)
- Twitter Sentiment analysis during COVID-19 Outbreak in Nepal (Pokharel B.P., Jun 2020)
- Sentiment Analysis in Nepali: Exploring Machine Learning and Lexicon-based Approaches (Piryani, Rajesh et al., Jan 2020)
- Classifying sentiments in Nepali subjective texts (Thapa et al., 2016)
- Detecting Sentiment in Nepali texts: A bootstrap approach for Sentiment Analysis of texts in the Nepali language (Gupta et al., 2015)
- Sentiment Analysis on Nepali Movie Reviews using Machine Learning (A. Yadav and A. K. Pant, 2014)
- Large Vocabulary Continuous Speech Recognition for Nepali Language (Baral, Elina & Shrestha, Sagar. Dec 2020)
- Nepali Speech Recognition Using CNN, GRU and CTC (B. Bhatta, B. Joshi, and R. Maharjhan, Sep. 2020)
- Nepali Speech Recognition using RNN-CTC Model (P. Regmi, A. Dahal, and B. Joshi, Jul. 2019)
- Crowd-Sourced Speech Corpora for Javanese, Sundanese, Sinhala, Nepali, and Bangladeshi Bengali (Kjartansson et al., 2018)
- HMM based isolated word Nepali speech recognition (M. K. Ssarma, A. Gajurel, A. Pokhrel, and B. Joshi, Jul. 2017)
- Nepali Spell Checker 1.1 and the Thesaurus, Research and Development (Bal Krishna Bal et al., 2007)
- Nepali Spell Checker (Bal Krishna Bal et al.)
- A Survey on Various Stemming Techniques for Hindi and Nepali Language (Upadhyaya et al., Aug 2021)
- A novel rule-based recursive stemming algorithm for Nepali Plagiarism Detection (Shah et al., 2020)
- A Nepali Rule Based Stemmer and its performance on different NLP applications (Koirala P. & Shakya A., 2020)
- A new stemmer for Nepali language (Shrestha and Dhakal, 2016)
- An Affix Removal Stemmer for Natural Language Text in Nepali (Paul et al., 2014)
- A Hybrid Algorithm for Stemming of Nepali Text (Chiranjibi Sitaula, 2014)
- Attention based Recurrent Neural Network for Nepali Text Summarization (Timalsina et al., Jun 2022)
- Extractive Method for Nepali Text Summarization Using Text Ranking and LSTM (Khanal et al., Oct 2021)
- Encoder Decoder based Nepali News Headline Generation (Mishra et al., Sep 2020)
- Salient Sentence Extraction of Nepali Online Health News Texts (Ranabhat et al., 2019)
- Natural language processing for Nepali text: a review (Shahi T.B. & Sitaula, Chiranjibi, Oct 2021)
- Survey of NLP Resources in Low-Resource Languages Nepali, Sindhi and Konkani (Rajan, Annie & Salgaonkar, Ambuja, Jul 2021)
- Towards building advanced natural language applications: an overview of the existing primary resources and applications in Nepali (Bal Krishna Bal, 2009)
- Nepali Text to Speech Synthesis System using FreeTTS (Shah et al., 2018)
- Building a Natural Sounding Text-to-Speech System for the Nepali Language - Research and Development Challenges and Solutions (Bajracharya et al., Aug 2018)
- Enhancing the Quality of Nepali Text-to-Speech Systems (Bal, Bal Krishna and Ghimire, Rupak Raj; Aug 2017)
- Nepali Text to Speech using Time Domain Pitch Synchronous Overlap Add Method (Malla P., 2015)
- Nepali Text to Speech Synthesis System using ESNOLA Method of Concatenation (Chettri B. & Shah K.B., Jan 2013)
- Statistical and Syllabification Based Model for Nepali Machine Transliteration (Roy et al., Jul 2022)
- Low Resource English to Nepali Sentence Translation Using RNN—Long Short-Term Memory with Attention (Nemkul K. & Shakya S., Mar 2021)
- English to Nepali Sentence Translation Using Recurrent Neural Network with Attention (Nemkul K. & Shakya S., Feb 2021)
- Efforts in the Development of an Augmented English–Nepali Parallel Corpus (Duwal S. & Bal B.K., Dec 2019)
- The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English (Guzman et al., 2019)
- Neural Machine Translation: Hindi ⇔ Nepali (Laskar et al., 2019)
- A Comparative Study of SMT and NMT: Case Study of English-Nepali Language Pair (Bal, Bal Krishna and Acharya, Praveen, 2018)
- English to Nepali Statistical Machine Translation System (Paul et al., 2018)
- Expansion of the First Hindi-Nepali Word-Net based BiLingual Dictionary and the advancement of the Human-Machine Interface (Chakraborty et al., Dec 2011)
- Development of a Nepali-English MT system using the Apertium MT platform
- An Approach Towards The Construction Of The First Hindi-Nepali Word-Net Based Bi-Lingual Dictionary And The Challenges Handled (Chakraborty et al., 2011)
- Experiences in building the Nepali WordNet - insights and challenges (Chakraborty et al., 2009)
- Rule based machine translation system in the context of Nepali text to English text (Shrestha, H. K., 2008)
- Handling Honorification in Dobhase: Online English-to-Nepali Machine Translation System (Keshari et al., 2007)
- Generation of Interlinear form of Nepali text with target language as English (Shrestha et al., 2005)
- UNL Nepali Deconverter (Keshari et al., 2005)
- Nepali Word-Sense Disambiguation Using Variants of Simplified Lesk Measure (Singh et al., Aug 2021)
- Word Sense Disambiguation using WSD specific Wordnet of Polysemy Words (Dhungana et al., Sep 2014)
- Word sense disambiguation in Nepali language (Dhungana U.R. & Shakya S., May 2014)
- Word Sense Disambiguation using Clue Words (Dhungana U.R. & Shakya S., 2014)
- Knowledge based approaches to Nepali word sense disambiguation (Roy et al., 2014)
- Nepali WSD Specific WordNet (Dhungana U.R., 2012)
- Nepali Word Sense Disambiguation using Adapted Lesk Algorithm (Dhungana U.R., 2011)
- Resources for Nepali Word Sense Disambiguation (Shrestha et al., Oct. 2008)
- Word Sense Disambiguation; a Brief Survey with Application to Nepali (Shrestha et al., Jan 2008)
- Exploiting linguistic information from Nepali transcripts for early detection of Alzheimer's disease using natural language processing and machine learning techniques (Adhikari et al., Dec 2021)
- Detecting Alzheimer’s Disease by Exploiting Linguistic Information from Nepali Transcript (Thapa et al., 2020)
- Linguistic Taboos and Euphemisms in Nepali (Nobal B. Niraula, Saurab Dulal, and Diwa Koirala, 2020)
- Issues in Encoding the Writing of Nepal’s Languages (Hall et al., 2014)
- Research Report on PDA Localization (Bal et al.)
- 300-Dimensional Word Embeddings for Nepali Language (Lamsal Rabindra)
- fastText embeddings
- ELMo embeddings (and more, for many South Asian languages)
- Byte Pair Embeddings
- NPVec1
- Nepali NLP Research Papers
- Nepali NLP Resources
- Nepali NLP Toolkit
- Machine learning datasets for Nepali Researchers
If there is a mistake of any kind (paper name, link, author attribution and so on), or you want me to add new papers related to Nepali NLP, feel free to open an issue describing the case and I'll make sure to correct the mistake or add that paper. If you would rather suggest changes or updates yourself, fork the repository and create a pull request.