Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance ZnEasy getPng: to handle some problematic PNGs #12261

Closed
chisandrei opened this issue Jan 6, 2023 · 8 comments
Closed

Enhance ZnEasy getPng: to handle some problematic PNGs #12261

chisandrei opened this issue Jan 6, 2023 · 8 comments

Comments

@chisandrei
Copy link
Contributor

When using ZnEasy class>>getPng: for some PNGs there is an error. For example for the Pharo logo from Twitter (https://pbs.twimg.com/profile_images/541743734/icone-pharo-1.png):

ZnEasy getPng: 'https://pbs.twimg.com/profile_images/541743734/icone-pharo-1.png'.

Screenshot 2023-01-06 at 14 12 41

The underlying issue is that PNGReadWriter cannot parse the picture.
We can also save the image to disk and attempt to parse it using PNGReadWriter to get the same error.

'./icone-pharo-1.png' asFileReference binaryReadStreamDo: [ :stream | 
    ImageReadWriter formFromStream: stream ]

Using pngcheck shows that there is some additional data after IEND chunk.

pngcheck -v icone-pharo-1.png            
File: icone-pharo-1.png (559959 bytes)
  chunk IHDR at offset 0x0000c, length 13
    500 x 500 image, 32-bit RGB+alpha, non-interlaced
  chunk bKGD at offset 0x00025, length 6
    red = 0x00ff, green = 0x00ff, blue = 0x00ff
  chunk pHYs at offset 0x00037, length 9: 72x72 pixels/unit (1:1)
  chunk vpAg at offset 0x0004c, length 9
    unknown private, ancillary, safe-to-copy chunk
  chunk IDAT at offset 0x00061, length 32768
    zlib: deflated, 32K window, maximum compression
  chunk IDAT at offset 0x0806d, length 32768
  chunk IDAT at offset 0x10079, length 32768
  chunk IDAT at offset 0x18085, length 14370
  chunk zTXt at offset 0x1b8b3, length 34, keyword: Software
  chunk IEND at offset 0x1b8e1, length 0
  additional data after IEND chunk
ERRORS DETECTED in icone-pharo-1.png

In the PNG specs The IEND chunk must appear LAST. It marks the end of the PNG datastream. The chunk's data field is empty. However it seems that some PNGs can have data afterwards. Not 100% sure this is the case here, but seems to be. Would be more resilient if PNGReadWriter would have a mode to handle these cases.

Tested in Pharo-11.0.0+build.418.sha.d5b3b17a7491bb98ea70539a9198163b938a7d0b (64 Bit)

@Ducasse
Copy link
Member

Ducasse commented Jan 6, 2023

Thanks andrei.

@svenvc
Copy link
Contributor

svenvc commented Jan 6, 2023

Nice error report (as usual ;-)

Let's try hacking along your suggestion (that everything after IEND can be ignored, I did not read any specs).

Change:

PNGReadWriter>>#processNextChunk
   ....
   chunkType = 'IEND' ifTrue: [ ^ #IEND	"*should* be the last chunk" ].
   ....

and

PNGReadWriter>>#nextImage
    .....
    [ stream atEnd or: [ self processNextChunk = #IEND ] ] whileFalse.    
    .....

and then it works for me.

It is not super clean as before self was returned in all cases, but #processNextChunk is internal, so we can get away with that I think.

All 42 PNGReadWriterTest tests are green (not sure that means much, but still).

What do you think ?

@chisandrei
Copy link
Contributor Author

chisandrei commented Jan 6, 2023

For me that looks better than the current version. Not sure how other decoders handle this case but I imagine they ignore what comes after the end. Might also make sense to have a "strict" mode or a way to parse that fails with an error in this case.

From what is mentioned in https://www.w3.org/TR/png IEND marks the end of the PNG.

If I look at the structure of the file, all looks ok, with AE 42 60 82 marking the end of IEND as expected.

Screenshot 2023-01-06 at 17 36 18

But then we seem to get a lot of empty chunks for some reason. No idea if that has any meaning. Seems to happen for many PNGs on Twitter.

Screenshot 2023-01-06 at 17 36 36

This document describes PNG (Portable Network Graphics), an extensible file format for the lossless, portable, well-compressed storage of static and animated raster images. PNG provides a patent-free replacement for GIF and can also replace many common uses of TIFF. Indexed-colour, greyscale, and truecolour images are supported, plus an optional alpha channel. Sample depths range from 1 to 16 bits.

@chisandrei
Copy link
Contributor Author

chisandrei commented Jan 6, 2023

Unrelated, but for more testing of PNG parsing, trying the image at http://www.schaik.com/pngsuite/ might be interesting.

The PngSuite consists of a two hundred PNG images in all possible formats to support the development of PNG viewing and conversion software

@svenvc
Copy link
Contributor

svenvc commented Jan 6, 2023

See https://github.com/DraagrenKirneh/PngSuiteExplorer

we used to score 100% percent IIRC

GitHub
Simple Pharo Widget for viewing the result of running the PNG test suite - GitHub - DraagrenKirneh/PngSuiteExplorer: Simple Pharo Widget for viewing the result of running the PNG test suite

@svenvc
Copy link
Contributor

svenvc commented Jan 10, 2023

In a recent Pharo, you can simply inspect the directory:

(FileLocator documents / 'PngSuite') children.

(FileLocator documents / 'PngSuite') childrenMatching: '*.png'.

(FileLocator documents / 'PngSuite') childrenMatching: 'x*.png'.

Those starting with an x are supposed to give an error.
But it is hard to verify that the interpreted images are correct.

@svenvc
Copy link
Contributor

svenvc commented Mar 19, 2023

I finally did a PR: #13049

@MarcusDenker
Copy link
Member

merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants