hocr

Here are 38 public repositories matching this topic...

UglyToad / PdfPig

Read and extract text and other content from PDFs in C# (port of PDFBox)

pdf csharp pdfbox netstandard pdf-files pdf-document pdf-generation hocr document-analysis pdf-extractor alto-xml page-xml layout-analysis pdf-document-processor

Updated Jun 29, 2025
C#

manisandro / gImageReader

A Gtk/Qt front-end to tesseract-ocr.

c-plus-plus gtk qt ocr scanner tesseract-ocr pdf-document hocr hocr-documents

Updated Jun 4, 2025
C++

mittagessen / kraken

OCR engine for all the languages

ocr neural-networks hocr optical-character-recognition htr handwritten-text-recognition alto-xml page-xml layout-analysis

Updated Jun 27, 2025
Python

BobLd / DocumentLayoutAnalysis

Document layout analysis resources and repositories for development with PdfPig.

pdf csharp hocr tei hocr-documents alto-xml table-extraction page-xml alto layout-analysis document-layout-analysis xycut docstrum pdfpig xy-cut recursive-xy-cut page-segmentation

Updated Oct 1, 2023
C#

UB-Mannheim / ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

ocr validation transformation hocr finereader page-xml alto ocr-d

Updated May 21, 2025
JavaScript

cneud / ocr-conversion

Conversions between various OCR formats

ocr hocr tei-xml alto-xml page-xml abbyy-xml

Updated May 13, 2023

dbmdz / mirador-textoverlay

Text Overlay plugin for Mirador 3

ocr hocr optical-character-recognition iiif mirador-plugins alto-xml mirador alto mirador-3

Updated Jun 18, 2025
JavaScript

filak / hOCR-to-ALTO

Convert between Tesseract hOCR and ALTO XML using XSL stylesheets

hocr xsl alto xslt2 xsl-stylesheets

Updated May 22, 2025
XSLT

UB-Mannheim / ocr-gt-tools

Ergonomic line-by-line transcription of scanned text.

ocr web-interface hocr transcription ground-truth

Updated Dec 16, 2020
JavaScript

dmi3kno / hocr

Text-to-tibble

r ocr tesseract rstats tesseract-ocr hocr hocr-documents tibble

Updated Apr 25, 2020
R

fakabbir / OCR

Probabilistic key-value pair extraction from invoices (non-searchable PDFs) using word weights

ocr tesseract python3 hocr

Updated Jun 12, 2021
Python

macabeus / pyslibtesseract

✏️ Integration of Tesseract for Python using a shared library

ocr tesseract hocr

Updated Mar 25, 2016
Python

GeReV / hocr-editor-ts

A visual hOCR file editor

ocr tesseract-ocr hocr hocr-documents

Updated Apr 3, 2024
TypeScript

hadro / new-york-city-directories

Some basic data and text extraction from the New York City Directories

ocr brooklyn digital-humanities hocr pdfs manhattan nypl new-york-city-directories

Updated Jun 19, 2017

GeReV / HocrEditor

A visual editor for .hocr files.

ocr tesseract-ocr hocr hocr-documents

Updated Feb 5, 2025
C#

mayurcybercz / AI-Exam-evaluation

CLI tool that recognises handwritten text from answer sheets using Tesseract OCR, then evaluates marks from the extracted text using NLP

python nlp cli json nltk tesseract-ocr hocr answer-sheets evaluate-marks

Updated Feb 14, 2019
Jupyter Notebook

hadro / brewery-guides

The data for guides to breweries across the United States from 1896 to 1918

data open-data dataset digital-humanities hocr brewing nypl digital-collections brewers brewery-guides brewing-history

Updated Apr 12, 2017

jlieth / hocr-parser

Python parser for hOCR files using lxml

python ocr hocr parsing-library hocr-documents

Updated Aug 23, 2020
Python

emmeryn / hocr-turtletext

A gem that parses positional text from hOCR output and provides convenience methods to find text.

gem ruby-on-rails hocr extract-text

Updated Oct 20, 2022
Ruby

ansonl / flyspacea-backend

Fly Space-A Facebook flight schedule photo aggregator and processor back-end server.

golang tesseract fuzzy hocr

Updated Mar 4, 2019
Go
