Statistically Generated Tigrinya-English Word Dictionary
Finding a free Tigrinya-English dictionary for research or educational purposes is not easy, to say the least. Building a dictionary manually from scratch (the proper way) requires a lot of time and effort. As an alternative, presented here is a statistically generated bilingual lexicon between English and Tigrinya, built using nothing but a few algorithms and parallel corpora.
No human supervision was involved, so the content is crude and contains numerous inaccuracies, but it could be a good starting point for many applications.
- This word list CANNOT serve as a reference dictionary AS IS, but is useful for research and educational purposes.
- Parts of this work have been used to improve the Free English-Tigrinya Bidirectional Dictionary Android app.
- This resource is released under the permissive MIT License; it can be freely used for any purpose with proper attribution.
- If you use this resource in a published work, please cite it as follows:
GeezLab Tigrinya BiLexicon, https://github.com/fgaim/Tigrinya-BiLexicon
- The dictionary is provided in plain text.
- It contains only single-word matches; this is very limiting but was required to improve accuracy.
- Each line in the file comprises a Tigrinya entry followed by the corresponding English word, separated by a tab.
- The entries are sorted by probability of accuracy, i.e., entries at the top of the list are most likely accurate while those at the bottom could be nonsensical.
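
The format described above can be read with a few lines of Python. The following is a minimal sketch, not part of the original resource: the names `Entry`, `parse_bilexicon`, and `keep_top` are hypothetical, and the sample lines are made up for illustration rather than taken from the actual file. It assumes UTF-8 text with one `Tigrinya<TAB>English` pair per line, and it relies on file order as the only available signal of reliability.

```python
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    rank: int      # 1-based position among valid entries; lower = more likely accurate
    tigrinya: str
    english: str


def parse_bilexicon(lines: Iterable[str]) -> List[Entry]:
    """Parse 'Tigrinya<TAB>English' lines, preserving the order of the file."""
    entries: List[Entry] = []
    for line in lines:
        parts = line.rstrip("\r\n").split("\t")
        # Skip blank lines and anything that is not exactly two non-empty fields.
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            continue
        entries.append(Entry(len(entries) + 1, parts[0].strip(), parts[1].strip()))
    return entries


def keep_top(entries: List[Entry], fraction: float) -> List[Entry]:
    """Keep only the leading share of entries, which are the most likely accurate."""
    cutoff = int(len(entries) * fraction)
    return entries[:cutoff]


# Illustrative sample only; these lines are NOT taken from the actual word list.
sample = [
    "ማይ\twater\n",
    "\n",
    "ሰላም\tpeace\n",
    "malformed line without a tab\n",
    "መጽሓፍ\tbook\n",
]

entries = parse_bilexicon(sample)
for entry in keep_top(entries, 0.67):
    print(entry.rank, entry.tigrinya, entry.english)
```

To read the real file, pass an open file handle instead of the sample list, for example `parse_bilexicon(open(path, encoding="utf-8"))`, where `path` is wherever the word list was saved. Because the entries near the bottom may be nonsensical, truncating the list with something like `keep_top` is one simple way to trade coverage for quality; a suitable fraction has to be chosen by inspection.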
The building process is entirely probabilistic, without any human intervention, and hence there is no guarantee of its accuracy. Building a proper dictionary requires professional lexicographers; this is merely a list of word mappings.