document-analysis-and-understanding.md

CVPR-2023-Papers


Document Analysis and Understanding


| Title | Repo | Paper | Video |
|-------|------|-------|-------|
| Towards Flexible Multi-Modal Document Models (CVPR - Highlight) | GitHub Page, GitHub | thecvf, arXiv | YouTube |
| Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling | - | thecvf, arXiv | YouTube |
| Unifying Layout Generation with a Decoupled Diffusion Model | - | thecvf, arXiv | YouTube |
| Conditional Text Image Generation with Diffusion Models | - | thecvf | YouTube |
| Turning a CLIP Model into a Scene Text Detector | GitHub | thecvf, arXiv | YouTube |
| Unifying Vision, Text, and Layout for Universal Document Processing (CVPR - Highlight) | GitHub Page, GitHub | thecvf, arXiv | YouTube |
| Modeling Entities as Semantic Points for Visual Information Extraction in the Wild | ModelScope | thecvf, arXiv | YouTube |
| GeoLayoutLM: Geometric Pre-Training for Visual Information Extraction (CVPR - Highlight) | GitHub Page, GitHub | thecvf, arXiv | YouTube |
| Handwritten Text Generation from Visual Archetypes | GitHub, Streamlit App | thecvf, arXiv | YouTube |
| Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution | GitHub | thecvf | YouTube |
| M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis | GitHub | thecvf | YouTube |
| Disentangling Writer and Character Styles for Handwriting Generation | GitHub | thecvf, arXiv | YouTube |