Data for the quantitative study of (Vedic) Sanskrit
Updated
Aug 13, 2025 - Python
Main application code for Ambuda, a breakthrough Sanskrit library (ambuda.org)
Code and data for "Summarising Historical Text in Modern Languages" (EACL 2021)
Raw dataset for Old Persian cuneiform
Official releases of the PROIEL treebank of ancient Indo-European languages
Syllable Analysis Data Augmentation (SADA). This project introduces a glyph dictionary and a grammar-aware augmentation strategy designed to improve Khmer palm leaf manuscript recognition. By modeling the language's grammatical structure, it supports more robust OCR performance in low-resource settings.
An Ancient Greek Morphology Tagger
A tool for exploring the Linear A corpus
Semantic Dictionaries for Ancient Languages
The Ancient Greek dictionary for Hunspell (grc_GR for Notepad++, Google Chrome, Vivaldi, etc.).
Code and sample images described in the paper "DeepScribe: Localization and Classification of Elamite Cuneiform Signs Via Deep Learning"
A METAFONT glyph dataset that helps people define CJK-like glyphs with their METAFONT scripts using machine learning
No-nonsense simple transliteration between writing systems, mostly of Semitic origin
Documentation for the electronic Babylonian Library (eBL) project
This project explores advanced document image recognition methods tailored for low-resource historical German manuscripts.
An array of tools for Sanskrit for tasks such as noun declension and verb conjugation.
A program for creating a searchable local language dictionary based (mainly) on dumped Wiktionary data. Lets users collect definitions, which can be exported as a machine-readable flashcard file. Currently supports Latin, Ancient Greek, and Old English.
Online decimal-to-Maya numeral converter.
This is the Jekyll repository which holds the syllabus for the Ancient Language Processing course
Contains a Text-Fabric dataset of the Ugaritic corpus.