Indic NLP Library

This repository is a de-bloated fork of the original Indic NLP Library that integrates the UrduHack submodule and Indic NLP Resources directly. This lets you work with Urdu normalization and tokenization without installing urduhack and indic_nlp_resources separately; a separate urduhack install can sometimes be a problem because it is TensorFlow-based. This repository is created and maintained mainly for IndicTrans2 and IndicTransTokenizer.

For any queries, please get in touch with the original authors/maintainers of the respective libraries (Indic NLP Library, Indic NLP Resources, and UrduHack).

Usage:

git clone https://github.com/VarunGumma/indic_nlp_library.git

cd indic_nlp_library
pip install --editable ./
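
After installing, a quick sanity check can confirm that the package is importable. The sketch below is only illustrative: it assumes the fork keeps the upstream package name indicnlp and the upstream indic_tokenize.trivial_tokenize function, neither of which is stated in this README. The helper names is_installed and tokenize_if_available are hypothetical and exist only for this example.

```python
import importlib.util
from typing import List, Optional


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


def tokenize_if_available(text: str, lang: str) -> Optional[List[str]]:
    """Tokenize text with indicnlp when it is installed, else return None."""
    if not is_installed("indicnlp"):
        return None
    # Assumption: the fork keeps the upstream module layout and function name.
    from indicnlp.tokenize import indic_tokenize

    return list(indic_tokenize.trivial_tokenize(text, lang=lang))


if __name__ == "__main__":
    print("indicnlp installed:", is_installed("indicnlp"))
    # Prints a list of tokens if the package is installed, otherwise None.
    print(tokenize_if_available("यह एक वाक्य है।", "hi"))
```

If the first line prints False, the editable install above most likely did not complete in the active Python environment.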

Updates:

  • Integrated urduhack directly into the repository.
  • Renamed the master branch to main.
  • Integrated indic_nlp_resources directly into the repository.
  • De-bloated the repository.

About

Resources and tools for Indian language Natural Language Processing