png file looks fine in browser, Pillow won't open it #6843

Closed
eshellman opened this issue Dec 30, 2022 · 5 comments

Comments

@eshellman

eshellman commented Dec 30, 2022

This is a rare problem, affecting 20 of the 1.04 million PNG files in Project Gutenberg!

What did you do?

from io import BytesIO
from PIL import Image
path = 'path/to/fig001.png'
with open(path, 'rb') as fp:
    image_data = fp.read()
    bio = BytesIO(image_data)
    unsized_image = Image.open(bio)

What did you expect to happen?

Image created

What actually happened?

UnidentifiedImageError                    Traceback (most recent call last)
Cell In [21], line 6
      4 image_data = fp.read()
      5 bio = BytesIO(image_data)
----> 6 unsized_image = Image.open(bio)

File ~/.local/share/virtualenvs/jupyter3-E-rb614q/lib/python3.9/site-packages/PIL/Image.py:3147, in open(fp, mode, formats)
   3145 for message in accept_warnings:
   3146     warnings.warn(message)
-> 3147 raise UnidentifiedImageError(
   3148     "cannot identify image file %r" % (filename if filename else fp)
   3149 )

UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x110ceba40>

What are your OS, Python and Pillow versions?

  • OS: MacOS, Linux
  • Python: 3.9.11, 3.8.5
  • Pillow: 9.3.0, 9.3.0

[Attached image: fig001.png]

@radarhere
Member

radarhere commented Dec 30, 2022

Hi. If I use pngcheck to inspect the image, I get

$ pngcheck fig001.png
fig001.png  CRC error in chunk iCCP (computed ae15b432, expected 9712b48a)
ERROR: fig001.png

So the image is broken, and that is why it is failing to load in Pillow.
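
For readers without pngcheck installed, the same kind of check can be approximated in pure Python. This is a minimal sketch, not pngcheck's implementation: it assumes the standard PNG layout (an 8-byte signature, then chunks of 4-byte big-endian length, 4-byte type, data, and a 4-byte CRC computed over type plus data). The name `find_bad_png_chunks` is hypothetical and only for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_bad_png_chunks(data):
    """Return (chunk_type, computed_crc, stored_crc) for each failing chunk.

    A chunk that runs past the end of the data is reported with
    computed_crc and stored_crc set to None.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file: bad signature")

    bad = []
    pos = 8
    while pos + 12 <= len(data):
        # Chunk header: 4-byte big-endian length, then 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        name = chunk_type.decode("latin-1")
        body_end = pos + 8 + length
        if body_end + 4 > len(data):
            # Declared length exceeds the remaining bytes: truncated chunk.
            bad.append((name, None, None))
            break
        # The CRC covers the chunk type and the chunk data, not the length.
        computed = zlib.crc32(data[pos + 4:body_end]) & 0xFFFFFFFF
        (stored,) = struct.unpack(">I", data[body_end:body_end + 4])
        if computed != stored:
            bad.append((name, computed, stored))
        pos = body_end + 4
        if chunk_type == b"IEND":
            break
    return bad
```

Called on the bytes of a file like the one in this issue, the expectation is a single entry for the iCCP chunk, in line with the pngcheck output above. An empty list means every chunk checksum matched.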

However, if you would like to work around this check, you can use LOAD_TRUNCATED_IMAGES like so:

from io import BytesIO
from PIL import Image, ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
path = 'fig001.png'
with open(path, 'rb') as fp:
    image_data = fp.read()
    bio = BytesIO(image_data)
    unsized_image = Image.open(bio)
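
Because `ImageFile.LOAD_TRUNCATED_IMAGES` is a module-level flag, setting it changes behaviour for every later `Image.open` call in the process. When only a handful of files need the lenient path, one option is to scope the flag to a block and restore it afterwards. The following is a sketch under that assumption; `lenient_loading` and `open_leniently` are hypothetical helper names, not part of Pillow.

```python
from contextlib import contextmanager

from PIL import Image, ImageFile


@contextmanager
def lenient_loading():
    """Temporarily set ImageFile.LOAD_TRUNCATED_IMAGES to True."""
    previous = ImageFile.LOAD_TRUNCATED_IMAGES
    ImageFile.LOAD_TRUNCATED_IMAGES = True
    try:
        yield
    finally:
        # Restore the earlier value even if opening or decoding raised.
        ImageFile.LOAD_TRUNCATED_IMAGES = previous


def open_leniently(path):
    """Open and fully decode an image while the lenient flag is set."""
    with lenient_loading():
        image = Image.open(path)
        # Image.open is lazy; decode now, while the flag is still True.
        image.load()
    return image
```

Note that the flag is global state, so this sketch is not safe if other threads open images at the same time.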

@radarhere
Member

@eshellman did that answer your question?

@eshellman
Author

Yes, that solves my problem. Thank you!

Two suggestions:

  1. Add a mention of this error mode (and the solution) in the documentation for the associated Error: https://pillow.readthedocs.io/en/stable/PIL.html#PIL.UnidentifiedImageError
  2. I'm guessing that in many or most use cases, lenient image loading would be desired. Consider adding LENIENT and STRICT switches to better communicate to users which one to use. Most of us would not have a clue about what a "TRUNCATED_IMAGE" is or whether loading such a thing is a security risk.

@radarhere
Member

I've created PR #6856 to update the documentation as per your first suggestion.

As for the second suggestion, a truncated image is one whose data ends early, e.g. #3023. For PNGs, the definition of this is broadened to include images that have missing chunk data or invalid chunk checksums. I don't know why anyone would think this constitutes a security problem. LENIENT and STRICT might better communicate what is happening in the case of PNGs, but Pillow values backwards compatibility, so I wouldn't want to remove LOAD_TRUNCATED_IMAGES, and I'm reluctant to complicate the situation by allowing two settings to control one piece of functionality.

@eshellman
Author

Thanks, that makes sense.
