OCR engine for all the languages
Updated Jun 19, 2024 - Python
Python tools for performing various operations on ALTO XML files
Convert ALTO XML to plain text + minimal metadata
Extract the MODS/ALTO metadata from a set of METS/ALTO files into pandas DataFrames for data analysis
Application that highlights ALTO XML coordinates in order to validate the coordinate values
A pipeline to transfer ground truth from Transkribus to eScriptorium.
Converts TIFF images into OCR XML using Tesseract
Scripts I wrote at my job which could be helpful to others
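Several of the tools listed above read text out of ALTO XML. As a rough illustration only, and not the code of any listed repository, a minimal Python sketch of converting ALTO to plain text might look like the following. It assumes each `TextLine` holds `String` elements with a `CONTENT` attribute, matches elements by local name so that different ALTO namespace versions are accepted, and does not handle hyphenation (`HYP`) elements. The names `alto_to_text`, `local_name`, and `SAMPLE` are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}String'."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(alto_xml: str) -> str:
    """Return one line of plain text per ALTO TextLine element."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Assumption: words are direct String children carrying CONTENT.
        words = [
            child.attrib.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(words))
    return "\n".join(lines)


# Hypothetical sample document, made up for this sketch.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Hello"/><SP/><String CONTENT="world"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Second"/><SP/><String CONTENT="line"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

print(alto_to_text(SAMPLE))
```

Matching on the local name rather than a fixed namespace URI is a design choice made here so the same function works across ALTO schema versions; a stricter tool might validate the namespace instead.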