johnson-magic/Awesome-Document-Layout-Analysis

Awesome-Document-Layout-Analysis

A curated list of resources dedicated to document layout analysis

1. Papers

  • *CODE denotes official code; CODE denotes unofficial code.
| Conf. | Date | Title | Highlight | code/cite |
|---|---|---|---|---|
| ICDAR2021 | 2021/5/13 | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | V,S,R | *CODE |
| KDD2020 | 2020/6/16 | LayoutLM: Pre-training of Text and Layout for Document Image Understanding | multimodal/pretrain | *CODE |
| ACL2021 | 2022/1/10 | LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding | multimodal/pretrain | *CODE |

| Conf. | Date | Title | Highlight | code/cite |
|---|---|---|---|---|
| KDD2018 | 2018/5/24 | Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale | pipeline | none |

| Conf. | Date | Title | Highlight | code/cite |
|---|---|---|---|---|
| ICDAR2017 | 2017/11/9 | DeepDeSRT: Deep Learning for Detection and Structure Recognition of Tables in Document Images | table det | 193 |

| Conf. | Date | Title | Highlight | code/cite |
|---|---|---|---|---|
| ITC-irst Technical Report | 1998 | Geometric layout analysis techniques for document image understanding: a review | traditional | 216 |
| IWDAS | 2002/8/19 | Two geometric algorithms for layout analysis | traditional | 231 |
| PSDIUT | 2003 | High performance document layout analysis | traditional | 139 |

2. Datasets

2.1 Introduction

| Dataset | Description | dataset link |
|---|---|---|
| PubLayNet | PubLayNet is a large dataset of document images whose layout is annotated with both bounding boxes and polygonal segmentations. The annotations are generated automatically by matching the PDF and XML formats of the documents. The dataset is comparable in size to established computer vision datasets, containing over 360 thousand document images in which typical document layout elements are annotated. | PubLayNet |
| DocBank | DocBank is a large-scale dataset constructed using a weak supervision approach. It enables models to integrate both textual and layout information for downstream tasks. The current DocBank dataset includes 500K document pages in total: 400K for training, 50K for validation, and 50K for testing. | DocBank |
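
As a rough illustration of what "annotated with both bounding boxes and polygonal segmentations" can look like in practice, here is a minimal sketch. It assumes the annotations follow the COCO JSON layout (`images`, `categories`, `annotations`, with `bbox` given as `[x, y, width, height]`). The `sample` record is hand-written for illustration and is not real PubLayNet data, and the helper names `count_by_category` and `bbox_to_corners` are hypothetical.

```python
from collections import Counter

# Hand-written, COCO-style sample (illustrative only, not real dataset content).
sample = {
    "images": [
        {"id": 1, "file_name": "page_0001.jpg", "width": 612, "height": 792},
    ],
    "categories": [
        {"id": 1, "name": "text"},
        {"id": 2, "name": "title"},
        {"id": 4, "name": "table"},
    ],
    "annotations": [
        {
            "id": 10, "image_id": 1, "category_id": 2,
            "bbox": [72.0, 60.0, 300.0, 24.0],
            # Polygon as a flat list of x, y pairs (COCO convention).
            "segmentation": [[72.0, 60.0, 372.0, 60.0, 372.0, 84.0, 72.0, 84.0]],
        },
        {
            "id": 11, "image_id": 1, "category_id": 1,
            "bbox": [72.0, 100.0, 468.0, 200.0],
            "segmentation": [[72.0, 100.0, 540.0, 100.0, 540.0, 300.0, 72.0, 300.0]],
        },
        {
            "id": 12, "image_id": 1, "category_id": 1,
            "bbox": [72.0, 320.0, 468.0, 150.0],
            "segmentation": [[72.0, 320.0, 540.0, 320.0, 540.0, 470.0, 72.0, 470.0]],
        },
    ],
}


def count_by_category(coco):
    """Count annotated layout elements per category name."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(names[a["category_id"]] for a in coco["annotations"])


def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


if __name__ == "__main__":
    print(count_by_category(sample))
    for ann in sample["annotations"]:
        print(ann["id"], bbox_to_corners(ann["bbox"]))
```

For a real annotation file, the same functions could be applied to the result of `json.load` on that file, provided it uses the layout assumed here.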

2.2 Comparison of datasets for table structure recognition

TO DO

3. Other technical solutions

3.1 Relevant research institutions and scholars

3.2 Related competitions

3.3 Related lectures
