NNLP-IL is a national initiative to build infrastructure and to research and develop advanced capabilities that advance the field of NLP in Hebrew and Arabic.
We know what you're thinking... (Why in English? 🤦♀️) For now, we have decided that English will work best for the NNLP-IL open source community. For more information, see the NNLP-IL Homepage.
NLP in Hebrew (and, to a lesser extent, in Arabic) has been left behind. The major breakthrough that would allow significant use has not yet been made, and the cost of fitting and customizing each use case on its own is very high.
- Hebrew and Arabic are difficult languages (rich in morphology), while most technological development targets morphologically poor languages.
- Modern language models require vast datasets. The accessible data in Hebrew is very limited.
- The industry's economic interest in investing in NLP in Hebrew (and to some extent also in Arabic) is limited compared to other common languages, since it is a relatively small market.
- Generic framework that will allow fitting and customizing solutions to various applications (without focusing on specific use cases).
- Open source (as much as possible) - everyone can take part, contribute and use.
- Break through the data barrier - creating tagged and untagged datasets and making them accessible to the general public.
- Usability - distributing capabilities through manuals, convenient packaging of code and more.
- You!
- The Israeli Ministry of Defence Directorate of Defense Research and Development (DDR&D).
- Israel Innovation Authority.
- The Ministry of Innovation, Science & Technology.
- Index of open tools and resources for Natural Language Processing in Hebrew - The resources index that was created and managed by Shay Palachi and the NLPH community is now maintained by the NNLP-IL community (:pray: Thanks for kickstarting the NNLP-IL community with its first major contribution).
The main purpose of this repository is to accelerate development in Hebrew and Arabic NLP, making it more relevant and easier to use. Read below to learn how you can take part in improving NNLP-IL.
We have a Code of Conduct that we expect project participants to adhere to. Please read the full text so that you can understand what actions will and will not be tolerated.
Read our Contributing Guide to learn about our development process, how to propose bugfixes and improvements, and how to build and test your changes to NNLP-IL.
NNLP-IL is Apache 2.0 licensed.