Code and Corpus for Indian Language Computation
20wfrq.txt
Malayalam_morph_analyzer.rar
README.md
T_L.UNK.M0.LR.MRG
knd_morph.tgz
tel_morph.tgz

Corpusandcodes

Code and corpus for Indian language computation

This page contains code and corpora for morphological segmentation of Dravidian languages. It currently covers Kannada, Malayalam, Telugu, and Tamil. All corpora were extracted using morphological analysers implemented at Amrita University, IIIT-H, and IIIT-M Kerala. The collection also includes cleaned Wikipedia text for Kannada, Malayalam, Telugu, and Tamil. Because GitHub does not allow files larger than 25 MB, only the models are uploaded here.

For the entire corpus and code, please contact Arun at akallararajappan@\uoc.edu