DOC: Add comparison with pdfplumber #1837

RitchieP · 2023-05-09T09:09:30Z

Added my take on the `pdfplumber` library compared to PyPDF.

docs/meta/comparisons.md

MartinThoma · 2023-05-18T21:04:57Z

I like it! I have two suggestions. What do you think about those?

RitchieP · 2023-05-19T01:15:28Z

Yeah, I think that would work too, since `pdfplumber` is built on top of `pdfminer.six`.

Co-authored-by: Martin Thoma <info@martin-thoma.de>

docs/meta/comparisons.md

New Features (ENH)
- Simplify metadata input (Document Information Dictionary) (#1851)
- Extend cmap compatibility to GBK_EUC_H/V (#1812)

Bug Fixes (BUG)
- Prevent infinite loop when no character follows after a comment (#1828)
- get_contents does not return ContentStream (#1847)
- Accept XYZ destination with zoom missing (default to zoom=0.0) (#1844)
- Cope with 1 Bit images (#1815)

Robustness (ROB)
- Handle missing /Type entry in Page tree (#1845)

Documentation (DOC)
- Expand file size explanations (#1835)
- Add comparison with pdfplumber (#1837)
- Clarify that PyPDF2 is dead (#1827)
- Add Hunter King as Contributor for #1806

Maintenance (MAINT)
- Refactor internal Encryption class (#1821)
- Add R parameter to generate_values (#1820)
- Make encryption_key parameter of write_to_stream optional (#1819)
- Prepare for adding AES encryption support (#1818)

Code Style (STY)
- Iterate directly over the list instead of using range (#1839)
- Minor refactorings in _encryption.py (#1822)

[Full Changelog](3.8.1...3.9.0)

mara004 · 2023-05-21T15:24:23Z

docs/meta/comparisons.md


 [`pdfminer.six`](https://pypi.org/project/pdfminer.six/) is capable of
 extracting the [font size](https://stackoverflow.com/a/69962459/562769)
 / font weight (bold-ness). It has no capabilities for writing PDF files.

-## pdfrw / pdfminer / pdfplumber
+[`pdfplumber`](https://pypi.org/project/pdfplumber/) is a library focused on extracting data from PDF documents. Since `pdfplumber` is built on top of `pdfminer.six`, there are **no capabilities of exporting or modifying a PDF file** (see [#440 (discussions)](https://github.com/jsvine/pdfplumber/discussions/440#discussioncomment-803880)). However, `pdfplumber` is capable of converting a PDF file into an image, [draw lines and rectangles on the image](https://github.com/jsvine/pdfplumber#drawing-methods), and save it as an image file.


> is capable of converting a PDF file into an image

From skimming the README, it looks like pdfplumber calls Wand for PDF rendering, which is a binding to ImageMagick, which in turn uses Ghostscript, IIRC.
So this phrase is kinda misleading, as pdfplumber is not an actual PDF rendering library (as opposed to MuPDF/Poppler/PDFium), but merely a rendering "wrapper-wrapper-wrapper".

Yes, I agree! It is not a PDF rendering library; there's just one function that converts the PDF into an image using the tools you mentioned. I'm not experienced with Wand, ImageMagick, or Ghostscript, so if you're an expert there, feel free to elaborate on my changes.

@RitchieP You could rephrase

> However, pdfplumber is capable of converting a PDF file into an image

to

> However, pdfplumber is capable of converting a PDF file into an image via ImageMagick

Definitely! I'll make a PR in a bit.
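For readers who have not used the feature under discussion, here is a minimal sketch of pdfplumber's image conversion and drawing methods. It assumes `pdfplumber` and its rendering dependencies are installed; the function name `render_with_word_boxes`, its parameters, and the file names in the usage note are hypothetical and chosen only for illustration. Which external renderer does the actual work depends on the installed pdfplumber version, so nothing here should be read as a statement about the backend.

```python
from pathlib import Path


def render_with_word_boxes(pdf_path, png_path, page_index=0, resolution=150):
    """Render one page to a PNG and outline every extracted word.

    Hypothetical helper for illustration; not part of pdfplumber or pypdf.
    """
    # Imported lazily so this module still loads where pdfplumber is missing.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_index]
        # to_image() rasterizes the page; the result is an image, not a PDF.
        image = page.to_image(resolution=resolution)
        # Draw a rectangle around each word's bounding box on the raster image.
        image.draw_rects(page.extract_words())
        image.save(png_path, format="PNG")
    return Path(png_path)
```

A call such as `render_with_word_boxes("example.pdf", "page-1.png")` would write an annotated PNG next to the script. The output is always an image file, which matches the point made in the diff: pdfplumber can draw on a rendering of the page but cannot export or modify the PDF itself.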

Update comparisons.md with pdfplumber

04bc965

Added my take on the `pdfplumber` library compared to PyPDF.

MartinThoma reviewed May 18, 2023

MartinThoma reviewed May 18, 2023

MartinThoma changed the title ~~Update comparisons.md with pdfplumber~~ DOC: Add comparison with pdfplumber May 18, 2023

MartinThoma added the nf-documentation Non-functional change: Documentation label May 18, 2023

RitchieP and others added 2 commits May 19, 2023 09:21

Update docs/meta/comparisons.md

996564f

Co-authored-by: Martin Thoma <info@martin-thoma.de>

Update docs/meta/comparisons.md

720a707

Co-authored-by: Martin Thoma <info@martin-thoma.de>

MartinThoma reviewed May 20, 2023

Update docs/meta/comparisons.md

b9d5e07

MartinThoma reviewed May 20, 2023

Update docs/meta/comparisons.md

bbd1f26

MartinThoma reviewed May 20, 2023

Update docs/meta/comparisons.md

2b6a7b2

MartinThoma merged commit e4ef5b9 into py-pdf:main May 20, 2023

mara004 reviewed May 21, 2023

RitchieP mentioned this pull request May 22, 2023

DOC: Clarification of pdfplumbers image conversion capabilities #1853

Merged
