Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should unused trailing bytes be allowed in the data stream? #158

Closed
ProgramMax opened this issue Aug 8, 2022 · 3 comments · Fixed by #321
Closed

Should unused trailing bytes be allowed in the data stream? #158

ProgramMax opened this issue Aug 8, 2022 · 3 comments · Fixed by #321
Assignees
Labels

Comments

@ProgramMax
Copy link
Collaborator

#96 mentions this. I'm pulling it out into a separate issue to keep separate issues independent.

It is surprisingly common for users of zlib to write out the entire buffer rather than just the used portion of the buffer. And unfortunately, many zlib users are sensitive to whether these unused trailing bytes are zeroed out or not.

Many PNG encoders & decoders run into this same issue (especially as many use zlib). There are many PNG images in the wild with unused trailing zeros in the compressed data stream.

Should we somehow specify that this is possible? Perhaps just a note for readers to be aware of this unintuitive detail.

@randy408
Copy link

randy408 commented Aug 8, 2022

It might be useful to require decoders handle this corner case properly by ignoring the trailing data. It's impossible to decode all the images from PngSuite if it was handled any other way.

@ProgramMax
Copy link
Collaborator Author

In the Aug 8th, 2022 meeting we discussed this. We agreed that
1.) This is a common problem with pretty much all data streams,
2.) I have encountered major players who rely on these trailing bytes being zero,
3.) There are already so many images and implementations that add useless trailing bytes, and
4.) Sometimes, this is actually a useful feature (such as constant bitrate streams or hardware implementations that always use the whole buffer).

However, we got hung up on defining which bytes are useless.
I believe there is a way to clearly tell. But I haven't specifically dealt with that part of PNG encoding/decoding.
So we decided I should dig into this to learn more and see how firm we can be about defining which bytes are useless before proceeding.

This gets a bit tricky because you can have multiple compression streams within one image. And they might be deliberately padded out to a known offset so multiple cores can decode streams at the same time.

@ProgramMax ProgramMax self-assigned this Aug 8, 2022
@randy408
Copy link

randy408 commented Aug 8, 2022

So we decided I should dig into this to learn more and see how firm we can be about defining which bytes are useless before proceeding.

Compressed data beyond what is needed to decompress the last scanline (of the last pass, if interlaced) is useless.

This gets a bit tricky because you can have multiple compression streams within one image. And they might be deliberately padded out to a known offset so multiple cores can decode streams at the same time.

This is indeed tricky, but based on my research trailing data can only be at the end. Any padding in the middle of the stream may only be in the form of a flush marker or other metadata, otherwise it's no longer a valid PNG. And multi-threaded decoders that don't check for extra compressed data in the middle of the stream cannot be conformant, it can be tricked into decoding a different image compared to a single-threaded decoder, Apple has made this mistake with their implementation.

svgeesus pushed a commit that referenced this issue Jul 11, 2023
This commit adds a note that unused trailing bytes may exist in the
final IDAT chunk, explaining preferable handling.

Closes #158
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants