YOLO models trained on DocLayNet - power your Document Intelligence with Layout Analysis
Updated Jul 5, 2024 - Python
OCR engine for all languages
Analysis of Chinese and English layouts
A toolbox of OCR models, algorithms, and pipelines based on MindSpore
PdfDet aims to simplify PDF layout detection tasks for users.
A Python package to structure files using visual and style information
A Unified Toolkit for Deep Learning Based Document Image Analysis
An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
OCR-D-compliant toolset for optical layout recognition on historical German-language documents published in Brazil
A powerful CLI tool for visualization and encoding of PAGE-XML files
OCR-D wrapper for page-xml-draw
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
A more complete example of programming with PDFMiner, which continues where the default documentation stops
BA thesis in history.
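
The graph-based approach mentioned above (build a graph over the content of a PDF page, then cluster it into page segments) can be illustrated with a minimal sketch. It is not taken from any of the listed libraries: the box format, the `box_gap` and `cluster_boxes` helpers, and the `max_gap` threshold are all assumptions made for illustration, and the classification and CSV export steps are left out. Here two text boxes are treated as connected when the gap between them is small, and each connected component becomes one segment.

```python
from itertools import combinations

# Hypothetical input format: text boxes as (x0, y0, x1, y1) in page coordinates.
Box = tuple[float, float, float, float]


def box_gap(a: Box, b: Box) -> float:
    """Return the larger of the horizontal and vertical gaps (0 if the boxes overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return max(dx, dy)


def cluster_boxes(boxes: list[Box], max_gap: float = 10.0) -> list[list[int]]:
    """Group box indices into segments: connected components of the proximity graph."""
    parent = list(range(len(boxes)))  # union-find forest, one node per box

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # An edge exists between two boxes whose gap is at most max_gap (assumed rule).
    for i, j in combinations(range(len(boxes)), 2):
        if box_gap(boxes[i], boxes[j]) <= max_gap:
            parent[find(i)] = find(j)

    segments: dict[int, list[int]] = {}
    for i in range(len(boxes)):
        segments.setdefault(find(i), []).append(i)
    return sorted(segments.values())


if __name__ == "__main__":
    # Two stacked lines in a left column and one box far to the right.
    page = [(0, 0, 50, 10), (0, 15, 50, 25), (200, 0, 250, 10)]
    print(cluster_boxes(page))
```

A fixed distance threshold is the simplest possible edge rule; a real tool would likely weigh font, alignment, or style information as well, which is outside the scope of this sketch.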