A list of research and engineering work on NLP for the Native/Indigenous languages of the Americas.

About Naki


This page tries to assemble all the research on Natural Language Processing (NLP) for the native and indigenous languages of the American continent. Our languages are in danger, especially if they are left out of the new digital boom that is reaching even the most remote communities. Although scientific and engineering work has been done in the field, much more is needed to achieve usable tools that can compete with products from the big companies (such as Google Translate, Alexa, etc.). To push this effort forward, this page aims to provide a list that is as complete as possible.

Our main aim is to encourage native speakers, researchers, and engineers to participate in this effort. Hopefully, this survey can help.

If you want more information, please read our paper: "Challenges of language technologies for the indigenous languages of the Americas". We also invite you to have a look at our presentation.

Last Update: 3/Sep/2018

Table of Contents

  1. Machine Translation
  2. Automatic Lexical Extraction
  3. Morphological analysis and segmentation
  4. Corpus and digital resources
  5. Speech Recognition
  6. POS Tagging
  7. Parsing
  8. OCR
  9. Spell checking
  10. WordNet
  11. Language ID
  12. Code-Switching and Multilingual NLP
  13. Tools, documentation and education
  14. Computational Linguistic Analysis and Surveys
  15. Contact

Machine Translation

Online demos and software

Scientific papers and theses

Automatic Lexical Extraction

Scientific papers

Corpus and digital resources

Online Corpus Resources

Scientific papers

Morphological analysis and segmentation


Scientific Papers

Speech Recognition

POS Tagging



Spell checking


Language ID

Code-Switching and Multilingual NLP

Tools, documentation and education

Computational Linguistic Analysis and Surveys


This effort can be completed only with the cooperation of all visitors. If you know of any work in this field, please let me know: push it to this repository, send an email to mmager [at], or visit my personal web page.

How to cite

If you found this information useful for your academic research, please acknowledge its use with a citation:

Mager, M., Gutierrez-Vasques, X., Sierra, G., and Meza-Ruiz, I. (2018, August). Challenges of language technologies for the indigenous languages of the Americas. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 55-69). Association for Computational Linguistics.

  author = 	"Mager, Manuel
		and Gutierrez-Vasques, Ximena
		and Sierra, Gerardo
		and Meza-Ruiz, Ivan",
  title = 	"Challenges of language technologies for the indigenous languages of the Americas",
  booktitle = 	"Proceedings of the 27th International Conference on Computational Linguistics",
  year = 	"2018",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"55--69",
  location = 	"Santa Fe, New Mexico, USA",
  url = 	""