UnsupportedImageTypeError('/CCITTFaxDecode with decode parameter /EndOfBlock not equal True') when converting PDF to image #517

oooyiyangc · 2023-09-05T19:06:40Z

Hi, I've run into a problem using pikepdf to convert PDFs to images. Here's my conversion code:

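from pikepdf import Pdf, PdfImage

# filepath is the path to the PDF being converted
pdf_file = Pdf.open(filepath)
page1 = pdf_file.pages[0]

# Take the first image on the first page
relevant_key = list(page1.images.keys())[0]
rawimage = page1.images[relevant_key]

# Wrap the raw image object and convert it to a PIL image
pdfimage = PdfImage(rawimage)
image = pdfimage.as_pil_image()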

Issue

I've found that the issue only occurs in later versions and only for some PDFs. My observations are listed below.

pikepdf==8.4.0 (to 7.1.0)
In these versions, the conversion fails with the following error:

UnsupportedImageTypeError('/CCITTFaxDecode with decode parameter /EndOfBlock not equal True')
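The error names a specific decode parameter, so it can help to look at what the affected image actually declares. Below is a minimal sketch of a helper that takes (filter name, parameters) pairs and returns the effective values of a few /CCITTFaxDecode parameters, filling in the defaults from the PDF specification. The helper name `ccitt_settings` and the sample values are hypothetical. With pikepdf, the pairs would presumably come from `PdfImage.filter_decodeparms`; that wiring is an assumption and is not exercised here.

```python
from typing import Any, Iterable, Mapping, Optional, Tuple

# Defaults for /CCITTFaxDecode parameters as given in the PDF specification.
CCITT_DEFAULTS = {"/K": 0, "/EndOfBlock": True, "/BlackIs1": False}


def ccitt_settings(
    filter_parms: Iterable[Tuple[Any, Optional[Mapping[str, Any]]]]
) -> Optional[dict]:
    """Return effective parameters of the first /CCITTFaxDecode filter.

    Returns None when no /CCITTFaxDecode filter is present.
    """
    for name, parms in filter_parms:
        if str(name) == "/CCITTFaxDecode":
            if parms is None:
                parms = {}
            # Missing keys fall back to the specification defaults.
            return {key: parms.get(key, default)
                    for key, default in CCITT_DEFAULTS.items()}
    return None


# Hypothetical parameters resembling an image that would trigger the error.
example = [("/CCITTFaxDecode", {"/K": -1, "/EndOfBlock": False, "/Columns": 2544})]
print(ccitt_settings(example))

# Assumed usage with pikepdf (not run here):
#   print(ccitt_settings(pdfimage.filter_decodeparms))
```

If the reported value of /EndOfBlock is False, the image falls into the case the error message describes.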

pikepdf==7.0.0
In this version, the conversion works, but the resulting image has inverted colors (e.g. pixels are white where they should be black).

pikepdf==3.1.0
Everything works.
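For the 7.0.0 case, one way to confirm that a mismatch is a pure color inversion rather than some other decoding difference is to compare the raw pixel buffers bit by bit. The sketch below assumes both images are available as equally sized byte buffers (for example from Pillow's `Image.tobytes()`), and that any row padding bits are inverted too, which may not hold for every image. The function name `is_bitwise_inverse` is hypothetical.

```python
def is_bitwise_inverse(actual: bytes, expected: bytes) -> bool:
    """Return True if every byte of `actual` is the complement of `expected`."""
    if not actual or len(actual) != len(expected):
        return False
    # a XOR b is 0xFF exactly when all eight bits differ.
    return all(a ^ b == 0xFF for a, b in zip(actual, expected))


# Tiny hand-made buffers standing in for real image data.
expected_pixels = bytes([0b11110000, 0b00001111])
inverted_pixels = bytes([0b00001111, 0b11110000])
print(is_bitwise_inverse(inverted_pixels, expected_pixels))
```

A True result suggests that only the black/white interpretation differs between the two outputs.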

To replicate:

Here are two sample PDFs where the above issue occurs: 65150963-idaho-state-journal-Jun-11-1972-p-1, 101250036-jefferson-city-post-tribune-Feb-16-1967-p-1.pdf

Or you can simply go to my repo: https://github.com/oooyiyangc/pdf2img_test, and run check_conversion.py. The required packages are numpy, Pillow, and pikepdf.

My results on Ubuntu 20.04 LTS:

pikepdf==8.4.0 (to 7.1.0)

============================
Testing pdf 1 ... (should pass)
Converting ................. Pass
Matching expected output ... Pass

============================
Testing pdf 2 ... (should fail)
UnsupportedImageTypeError('/CCITTFaxDecode with decode parameter /EndOfBlock not equal True')
Converting ................. Fail
Matching expected output ... Skipped

============================
Testing pdf 3 ... (should fail)
UnsupportedImageTypeError('/CCITTFaxDecode with decode parameter /EndOfBlock not equal True')
Converting ................. Fail
Matching expected output ... Skipped
============================

pikepdf==7.0.0

============================
Testing pdf 1 ... (should pass)
Converting ................. Pass
Matching expected output ... Pass

============================
Testing pdf 2 ... (should fail)
Converting ................. Pass
Matching expected output ... Fail

============================
Testing pdf 3 ... (should fail)
Converting ................. Pass
Matching expected output ... Fail
============================

pikepdf==3.1.0

============================
Testing pdf 1 ... (should pass)
Converting ................. Pass
Matching expected output ... Pass

============================
Testing pdf 2 ... (should fail)
Converting ................. Pass
Matching expected output ... Pass

============================
Testing pdf 3 ... (should fail)
Converting ................. Pass
Matching expected output ... Pass
============================
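The results above follow a regular shape: each PDF gets a conversion status, then a comparison status that is skipped when conversion raised. The sketch below shows one way such a check could be structured. It is not the actual check_conversion.py from the repo, and `run_case`, `report`, and the callable-based interface are hypothetical.

```python
from typing import Callable, Tuple


def run_case(convert: Callable[[], bytes], expected: bytes) -> Tuple[str, str]:
    """Return (conversion status, match status) for one PDF."""
    try:
        actual = convert()
    except Exception as exc:
        # Show the error and skip the comparison, as in the output above.
        print(repr(exc))
        return "Fail", "Skipped"
    return "Pass", ("Pass" if actual == expected else "Fail")


def report(convert: Callable[[], bytes], expected: bytes) -> None:
    """Print the two status lines for one PDF."""
    converted, matched = run_case(convert, expected)
    print(f"Converting ................. {converted}")
    print(f"Matching expected output ... {matched}")


def failing_convert() -> bytes:
    # Stand-in for a conversion that raises, as on the affected versions.
    raise ValueError("conversion not supported")


report(lambda: b"pixels", b"pixels")
report(failing_convert, b"pixels")
```

Passing the conversion in as a callable keeps the harness independent of the installed pikepdf version, so the same check can be rerun after each install.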


jbarlow83 · 2023-09-09T19:21:54Z

#269

jbarlow83 · 2023-09-12T06:30:35Z

Fixed in 8.4.1
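Since the fix is tied to a release, code that depends on it can check the version before converting. This is a minimal sketch that compares a version string against 8.4.1; the helper names are hypothetical. In a real environment the string could come from `importlib.metadata.version("pikepdf")`. Per the report above, 3.1.0 also worked for these files, so an older version is not necessarily affected.

```python
import re
from typing import Tuple

# Release named in this thread as containing the fix.
FIXED_IN = (8, 4, 1)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def includes_fix(version_text: str) -> bool:
    """Return True if the version is 8.4.1 or later."""
    return parse_version(version_text) >= FIXED_IN


for candidate in ("7.0.0", "8.4.0", "8.4.1"):
    print(candidate, includes_fix(candidate))
```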

oooyiyangc · 2023-09-12T15:45:34Z

Thank you @jbarlow83 ! Really appreciate it!

jbarlow83 closed this as completed Sep 12, 2023
