"Datuk", the Unicode Malayalam - Malayalam dictionary dataset


The Datuk Corpus

The Datuk corpus is a free and open-source Malayalam–Malayalam dictionary dataset with over 106,000 definitions for more than 83,000 Malayalam words. It is an extensively refined and semanticized version of Datuk's original digitisation work, incorporating tens of thousands of changes. Most words and definitions are tagged with grammatical information, and many records also carry additional metadata.


For documentation and other information, visit http://olam.in/open/datuk



Kailash Nadh, May 2013 - http://nadh.in