SANGITA


A Natural Language Toolkit for Indian Languages


What is Sangita?

Sangita is a natural language toolkit for Indian languages, written in Python. The project aims to provide basic natural language processing functionality for popular Indian languages, including tokenization, lemmatisation, stemming, named entity recognition and part-of-speech tagging. Deep neural networks are used for some of these tasks.
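
To give a feel for the simplest of these tasks, here is a minimal sketch of rule-based tokenization for Hindi text using only the Python standard library. It is not Sangita's API: the function name `tokenize` and the regular expression are assumptions made for illustration. The pattern spells out the Devanagari block explicitly because Python's `\w` does not match the combining vowel signs.

```python
import re

# Illustrative only: this is NOT Sangita's tokenizer or API.
# First alternative: runs of Devanagari characters, excluding the
# danda (U+0964) and double danda (U+0965) punctuation marks.
# Second alternative: runs of ASCII letters and digits.
# Third alternative: any other single non-space character (punctuation).
_TOKEN_RE = re.compile(r"[\u0900-\u0963\u0966-\u097F]+|[A-Za-z0-9]+|\S")


def tokenize(text):
    """Split text into word and punctuation tokens (hypothetical helper)."""
    return _TOKEN_RE.findall(text)


if __name__ == "__main__":
    sentence = "मैं घर जा रहा हूँ।"
    for token in tokenize(sentence):
        print(token)
```

A rule-based splitter like this treats the danda as its own token and keeps vowel signs attached to their consonants. The other tasks listed above generally need language-specific resources or trained models.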

Dependencies

* Keras
* scikit-learn
* The corpus is stored at Sangita Data

License

The code and the models are distributed under the Apache 2.0 License.

We have used the following datasets. Their respective licenses are enclosed with them.

* Hindi Dependency Treebank - Language Technologies Research Center, IIIT Hyderabad.

  * Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Contributions

Issues relating to GirlScript Summer of Code are tagged by difficulty, and each tag carries the following points:

* Cakewalk - 10 points
* Intermediate - 20 points
* Pro - 30 points
* TopCoder - 50 points

You can follow the ongoing issues on the project board.

Check out the first evaluation milestone
