MBAigner / PDFSegmenter

This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are extracted and returned in CSV format.

python pdf csv table annotations cluster-analysis document-processing layout-analysis detection-model page-segmentation

Updated Sep 11, 2020
Python
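
The description above names a pipeline: build a graph over the page content, then cluster it into page segments. The listing does not say how PDFSegmenter builds or clusters its graph, so the following is only a minimal sketch under stated assumptions: text boxes are given as hypothetical `(x0, y0, x1, y1)` tuples, two boxes are linked when the gap between them is at most a made-up `max_gap` threshold, and connected components stand in for the clustering step. None of these names come from the library, and classification and CSV export are left out.

```python
from itertools import combinations

# Hypothetical box type: (x0, y0, x1, y1) in page coordinates.
Box = tuple[float, float, float, float]


def gap(a: Box, b: Box) -> float:
    """Largest per-axis gap between two boxes (0 if they overlap or touch)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return max(dx, dy)


def build_graph(boxes: list[Box], max_gap: float) -> dict[int, set[int]]:
    """Link every pair of boxes whose gap is at most max_gap (assumed rule)."""
    graph: dict[int, set[int]] = {i: set() for i in range(len(boxes))}
    for i, j in combinations(range(len(boxes)), 2):
        if gap(boxes[i], boxes[j]) <= max_gap:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def segments(graph: dict[int, set[int]]) -> list[list[int]]:
    """Cluster the graph into connected components (stand-in for clustering)."""
    seen: set[int] = set()
    result: list[list[int]] = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        result.append(sorted(component))
    return result


boxes = [
    (10, 10, 100, 20),   # first line of a paragraph
    (10, 22, 100, 32),   # second line, 2 units below the first
    (10, 200, 50, 210),  # a cell far down the page
    (55, 200, 95, 210),  # a neighbouring cell, 5 units to the right
]
page_segments = segments(build_graph(boxes, max_gap=5.0))
print(page_segments)
```

Each inner list holds the indices of the boxes grouped into one segment. A real pipeline would go on to classify each segment (for example as text or table) before exporting anything.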

VRI-UFPR / ocrd-gbn

OCR-D compliant toolset for optical layout recognition on historical German-language documents published in Brazil

ocr tensorflow segmentation binarization layout-analysis historical-documents ocr-d

Updated Sep 24, 2021
Python

yoshihikoueno / pdfminer-layout-scanner

A more complete example of programming with PDFMiner, which continues where the default documentation stops

python pdf text-extraction pdfminer layout-analysis

Updated Jul 24, 2019
Python

pleb631 / PdfDet

PdfDet aims to simplify PDF layout detection tasks for users.

document-analysis layout-analysis pdf-document-processor layout-parser layout-detection

Updated Mar 28, 2024
Python

RapidAI / RapidLayoutRecover

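For document images, integrates the results of layout analysis, text recognition, table recognition, and formula recognition to restore the layout information.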

layout-analysis layout-recover

Updated Aug 20, 2024
Python

VRI-UFPR / page-xml-draw

A powerful CLI tool for visualization and encoding of PAGE-XML files

visualization opencv ocr segmentation image-map page-xml layout-analysis

Updated May 19, 2021
Python

ixalodecte / filestruct

A Python package to structure files using visual and style information

pdf parser layout-analysis

Updated Mar 9, 2024
Python

diegosiqueir4 / deepdoctection

A Repo For Document AI

ocr layout-analysis

Updated Apr 27, 2023
Python

VRI-UFPR / ocrd-page-xml-draw

OCR-D wrapper for page-xml-draw

visualization ocr segmentation page-xml layout-analysis ocr-d page-xml-draw

Updated May 1, 2021
Python

eustro / michael

BA thesis in history.

ocr pos-tagging layout-analysis historical-documents tree-tagger michael-the-syrian

Updated Jul 13, 2017
Python

layout-analysis

Here are 20 public repositories matching this topic...

opendatalab / MinerU

Layout-Parser / layout-parser

mittagessen / kraken

mindspore-lab / mindocr

RapidAI / RapidLayout

NormXU / Layout2Graph

JPLeoRX / detectron2-publaynet

MaitySubhajit / SelfDocSeg

ppaanngggg / yolo-doclaynet

CaseDrive / publaynet-models

MBAigner / PDFSegmenter

VRI-UFPR / ocrd-gbn

yoshihikoueno / pdfminer-layout-scanner

pleb631 / PdfDet

RapidAI / RapidLayoutRecover

VRI-UFPR / page-xml-draw

ixalodecte / filestruct

diegosiqueir4 / deepdoctection

VRI-UFPR / ocrd-page-xml-draw

eustro / michael
