Popular repositories
-
PDF-Extract-Kit (Public, forked from opendatalab/PDF-Extract-Kit)
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Python
-
MinerU (Public, forked from opendatalab/MinerU)
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
Python
-
surya (Public, forked from VikParuchuri/surya)
OCR, layout analysis, reading order, line detection in 90+ languages
Python
-
marker (Public, forked from VikParuchuri/marker)
Convert PDF to markdown quickly with high accuracy
Python
-
PyMuPDF (Public, forked from pymupdf/PyMuPDF)
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Python
-
PaddleOCR (Public, forked from PaddlePaddle/PaddleOCR)
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
Python