BA-thesis in history.
Updated Jul 13, 2017 - Python
A more complete example of programming with PDFMiner, which continues where the default documentation stops.
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned in CSV format.
OCR-D wrapper for page-xml-draw
A powerful CLI tool for visualization and encoding of PAGE-XML files
OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
Official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
A Unified Toolkit for Deep Learning Based Document Image Analysis
A Python package to structure files using visual and style information
PdfDet aims to simplify PDF layout detection tasks for users.
A toolbox of OCR models, algorithms, and pipelines based on MindSpore
Analysis of Chinese and English layouts
OCR engine for all the languages
YOLO models trained on DocLayNet: power your document intelligence with layout analysis