Natural Language Toolkit for Indic Languages aims to provide out-of-the-box support for the various NLP tasks an application developer might need
Repository containing an experimentation platform for training and running inference on wav2vec2 models.
State-of-the-art, ready-to-use mini NLP models for Indian languages
OCR Tamil is a tool that detects and recognizes Tamil text in images, with high accuracy on natural scenes
Generate large textual corpora for almost any language by crawling the web
A pipeline for transliteration, spell correction, POS tagging, and word sense disambiguation that converts code-mixed Hinglish data to Hindi in Devanagari script.
Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"
English to Marathi thesaurus.
Code for the ACL 2020 Paper on Schwa Deletion in Hindi and Punjabi
Finite-state script normalization and processing utilities
Handy web app for performing OCR on Indian languages, a.k.a. Indic Vision UI
Resources and tools for Indian language Natural Language Processing
Establish semantic relatedness across languages. Documentation: http://kshitijkarthick.github.io/tvecs
Indic evals for quantised models (AWQ, GPTQ, EXL2)
Lot Of Indic Tweets
Processing and classification of Marathi author articles using a custom vectorizer and sklearn
Python program to transliterate Malayalam text into ISO Latin script equivalents.
Transliteration for Indic languages
KPT: Kannada Pre-trained Transformer
Showcase fork: open-source transliteration models for Indian languages (Roman script to native scripts)