Hi, I've encountered issues using pikepdf to convert pdf to images. Here's my code for conversion.
    from pikepdf import Pdf, PdfImage

    pdf_file = Pdf.open(filepath)
    page1 = pdf_file.pages[0]
    # Take the first image found on the first page
    relevant_key = [key for key in page1.images.keys()][0]
    rawimage = page1.images[relevant_key]
    pdfimage = PdfImage(rawimage)
    image = pdfimage.as_pil_image()
I've identified that the issue only happens in later versions for some PDFs, and I'll list my observations below.
In those versions, the conversion will fail. Error message: UnsupportedImageTypeError('/CCITTFaxDecode with decode parameter /EndOfBlock not equal True')
pikepdf==7.0.0: In this version, the conversion will work, but the image produced has inverted colors (e.g. pixels are white when they should be black).
pikepdf==3.1.0: Everything works.
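To tell the "inverted" case apart from an ordinary decoding mismatch, a check along the following lines might help. This is only a minimal sketch, not part of pikepdf or of check_conversion.py: the function names `classify_bilevel` and `invert_bilevel` are hypothetical, and it assumes both images have already been reduced to flat sequences of 1-bit pixel values (0 for black, any nonzero value for white).

```python
from typing import List, Sequence


def classify_bilevel(actual: Sequence[int], expected: Sequence[int]) -> str:
    """Compare two flat sequences of 1-bit pixels (hypothetical helper).

    Returns "match", "inverted", "mismatch", or "size-mismatch".
    """
    if len(actual) != len(expected):
        return "size-mismatch"
    # Normalise so that 255/0 and 1/0 encodings compare equal.
    a = [1 if p else 0 for p in actual]
    e = [1 if p else 0 for p in expected]
    if a == e:
        return "match"
    # Inverted means every single pixel differs from the expected one.
    if all(x != y for x, y in zip(a, e)):
        return "inverted"
    return "mismatch"


def invert_bilevel(pixels: Sequence[int]) -> List[int]:
    """Flip black and white in a flat sequence of 1-bit pixels."""
    return [0 if p else 1 for p in pixels]


expected = [0, 0, 255, 255]
decoded = [255, 255, 0, 0]
print(classify_bilevel(decoded, expected))  # prints "inverted"
```

If the result is "inverted", passing the decoded pixels through `invert_bilevel` and comparing again should give "match", which would suggest that only the black/white polarity differs and not the decoded content.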
Here are two sample PDFs where the above issue happens: 65150963-idaho-state-journal-Jun-11-1972-p-1, 101250036-jefferson-city-post-tribune-Feb-16-1967-p-1.pdf
Or you can simply go to my repo: https://github.com/oooyiyangc/pdf2img_test, and run check_conversion.py. The required packages are numpy, Pillow, and pikepdf.
My results on Ubuntu 20.04 LTS:
    ============================
    Testing pdf 1 ... (should pass)
    Converting ................. Pass
    Matching expected output ... Pass
    ============================
    Testing pdf 2 ... (should fail)
    UnsupportedImageTypeError('/CCITTFaxDecode with decode parameter /EndOfBlock not equal True')
    Converting ................. Fail
    Matching expected output ... Skipped
    ============================
    Testing pdf 3 ... (should fail)
    UnsupportedImageTypeError('/CCITTFaxDecode with decode parameter /EndOfBlock not equal True')
    Converting ................. Fail
    Matching expected output ... Skipped
    ============================

    ============================
    Testing pdf 1 ... (should pass)
    Converting ................. Pass
    Matching expected output ... Pass
    ============================
    Testing pdf 2 ... (should fail)
    Converting ................. Pass
    Matching expected output ... Fail
    ============================
    Testing pdf 3 ... (should fail)
    Converting ................. Pass
    Matching expected output ... Fail
    ============================

    ============================
    Testing pdf 1 ... (should pass)
    Converting ................. Pass
    Matching expected output ... Pass
    ============================
    Testing pdf 2 ... (should fail)
    Converting ................. Pass
    Matching expected output ... Pass
    ============================
    Testing pdf 3 ... (should fail)
    Converting ................. Pass
    Matching expected output ... Pass
    ============================
#269
Fixed in 8.4.1
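A minimal sketch for checking whether a given pikepdf version string is at or past that release. The names `FIXED_IN`, `release_tuple`, and `includes_fix` are hypothetical, and the comparison only looks at the leading numeric components, so pre-release or local version suffixes are ignored. It also does not capture that an older release such as 3.1.0 was reported as working.

```python
import re
from typing import Tuple

# Release named in this thread as containing the fix.
FIXED_IN = (8, 4, 1)


def release_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '8.4.1' -> (8, 4, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def includes_fix(version: str) -> bool:
    """True if the version is 8.4.1 or newer (suffixes such as 'rc1' are ignored)."""
    return release_tuple(version) >= FIXED_IN
```

With pikepdf installed, the version string could be obtained from the standard library via `importlib.metadata.version("pikepdf")` and passed to `includes_fix`.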
Thank you @jbarlow83! Really appreciate it!