PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.
Updated Jul 7, 2024 - Python
A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.
Python bindings to PDFium
Analyze PDFs. With colors. And Yara.
CLI program for searching inside text and tables in PDF documents and displaying results in HTML.
A simple pdftotext conversion tool for Windows 8.1/10/11 and Fedora-, Ubuntu-, Debian-, and Arch-based Linux distros, using poppler-utils and Google's tesseract-ocr.