My book list
Updated Jul 7, 2024
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German
Crawler for linguistic corpora
Data resources for NLP in Bahasa Indonesia
A list of Indonesian NLP resources.
Data for the quantitative study of (Vedic) Sanskrit
My solutions to selected exercises from "Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit" by Steven Bird, Ewan Klein, and Edward Loper.
Amharic-English machine translation corpus prepared through website crawling and custom preprocessing.
A web-based engine for creating and annotating textual corpora
An advanced, extensible web front-end for the Manatee-open corpus search engine
Kanji usage frequency data collected from various sources
A curated list of NLP resources for Hungarian
🕷️ The pipeline for the OSCAR corpus
DEPRECATED - replaced by https://github.com/Esukhia/derge-kangyur
Yet another search platform for linguistic corpora.
SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/
Quran, Hadith, Translations, Tafaseer, Corpus Linguistics. Everything for NLP
Large silver-standard Russian corpus with NER, morphology, and syntax markup