-
Notifications
You must be signed in to change notification settings - Fork 744
Recover from errors in ancillary chunks #665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@svgeesus this was never intended in previous revisions of PNG; to be an ancillary chunk required that the chunk name be both valid and ancillary. Another example is a "reserved" ancillary chunk. This requires clarification from the authors of PNGv3 and, perhaps, particularly if the earlier interpretation has been changed, the wording needs to be clarified before the DIS; otherwise it might become the interpretation by misinterpretation. This is a test image that was actually authored (according to the github) by ChrisL, but for everyone who doesn't know what it contains it has a chunk name "a\001IT" (using ASCII). Another example would be cHrM - that's BTW "pngcheck" and "pngcheck -v" currently error out on these files; they don't give a warning, it's a hard error which stops |
Really? From PNG 1.0
I checked it in, yes. It was created by DLiang Fun, here: |
|
It's an order thing. Ok, you've just changed the order of the checks which invalidates the previous implementations of libpng. The code of Because it's not an ancillary chunk type if it's not a chunk type in the first place. |
|
I see the original intention by @LucasChollet, and I can agree that slapping an incorrectly-positioned ancillary chunk with a hard error may be a bit too harsh. I suggest signaling this problem with a benign error instead of a warning, though. When the decoder knows what that ancillary chunk is, then the decoder knows it is an error. (Not a warning, not an "FYI", etc.) So it shouldn't be a critical error raised on a non-critical chunk; it should rather be a benign error whose behaviour can be customized by the application. |
As per the third edition of the spec, decoders should recover from errors in ancillary chunks. In section 13.1 Error handling [1]: > Anomalous situations other than syntax errors shall be treated as > follows: > 1. Encountering an unknown ancillary chunk is never an error. The > chunk can simply be ignored. More specifically, in this commit, if a chunk that is detected as ancillary, does not pass the `check_chunk_name()` function, only a _benign_ is issued, instead of an error. This allows libpng to fully decode images like [2] and [3]. It has been tested by passing them to both pngtest and Gnome's image viewer. Note that invalid-unknown-ancillary-after-IDAT.png could already be displayed but not fully decoded. [1] https://w3c.github.io/png/#13Decoders.Errors [2] https://github.com/web-platform-tests/wpt/blob/master/png/errors/support/invalid-unknown-ancillary.png [3] https://github.com/web-platform-tests/wpt/blob/master/png/errors/support/invalid-unknown-ancillary-after-IDAT.png
No; incorrectly positioned ancillary chunks cause a png_chunk_report with the appropriate settings (basically a hard error on write and a benign one on read). This should be a png_chunk_report too; attempting to re-implement png_chunk_report inline is misplaced. This is about a clearly corrupted PNG stream; the chunk name has a '\001' byte in it where the 'privately/publicly defined' character is. The patch will apply to any corrupted PNG stream; given a random stream of garbage there is a 1/4 chance that the header will be accepted and parsing will continue. The previous code rejected all but about 1/1174 of streams of garbage, then 3/4 of the remainder when the critical and reserved bits were considered. The new code just parses garbage. As I said, I believe quite clearly, it is an order thing. Setting bit 5 in the first byte of a chunk name ("type") means that error detection for corrupted streams is turned off. The chance of libpng in fact detecting a corrupted stream before the most likely bogus CRC is reduced massively. It was always possible to do this with malicious intent; setting the chunk length to 0x7fffffff is sufficient but now accidentally damaged streams will be accepted most likely with a meaningless stream of garbled warnings emitted. |
|
From the original issue
|
|
This is a real stretch and a classic example of partial quoting. I wasn't saying you couldn't do this but if you want to do this you have to change the PNGv3 spec to make it clear you have done it. From the PNGv3 spec. before the items you are quoting: There is no definition of "chunk type" here but there is a reference (in the definition of "PNG editor". You need to add a definition of the form "a 32 bit unsigned integer". (The definitions section should not refer to an undefined term). The current definition of "chunk type" appears to be here: See the table (bold emphasis added, superscript converted to markdown):
You omitted the part of what I take to be the "definition" which does define what a chunk type is (in bold) and makes my point and only quoted a part which has no part of the actual definition beyond clarifying the it is in terms of binary values, not the local or C character set. The addition to the "definitions" section is required to fix this and it changes the definition of PNG from all prior versions because in those versions "chunk type" is a value which matches the restrictions; there is no prior definition to override this. I examined exactly what a chunk type really is; deep-diving, horrible, legalistic precision. This is not the basis of any of my points. Rather it is a refutation of yours: A chunk type is:
So this is the definition; so far as I can see anything which meets that definition is a "chunk type" and anything which does not meet that definition is not a chunk type by definition.
That's what the code does and it has always been possible to handle a chunk type that way (almost all, maybe all, of the considerable amount of code I have written for PNG Group does.) But first the way the spec is written (and, yes, I've checked 1.0, 1.1 and 1.2 as well) the code has to determine that the value is a chunk type otherwise the rest of what you quote is irrelevant. However, even so, your account is wrong:
I couldn't find where you got that, it seems to be your interpretation unsupported by any reference to the spec. Specifically the text does not seem to be in PNGv3. I searched for "xamines". So far as I can see there is no mandatory order; the only requirement is that the processing of a chunk type is predicated on it being a chunk type. It is not predicated on it being a "32-bit unsigned integer", or even an array of four bytes. Your change requires an explicit and mandatory (shall) statement of the order in which the various requirements are considered and an error handling of each is required for decoders. (Note that libpng isn't a decoder, it's a library to enable a decoder.) Incidentally if you really want to do this in libpng simply delete the entire call to I encourage the groupies to participate and, if you feel in any way threatened, then please say and I will stop (as, you will recall, I did on your own list). |
ctruta
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. (Apologies for the delay in my review.)
|
@svgeesus wrote:
Agreed. I just approved the new iteration of this PR submitted by @LucasChollet. @jbowler wrote:
Sorry, John, this is a TLDR for me. I will read it eventually, I promise; but now I need to catch up with what I've missed recently. As I understand the motivation of this PR, an actual solution is needed to fix an actual problem. |
|
So I applied this change, but I also see the point that @jbowler is making:
There is clearly a purpose in error recovery, meaning that one may always receive a PNG image stream that somehow got either corrupted during a faulty transmission or botched during a faulty encoding. The viewers should still be able to render it; I mean, how is that any different from an image that has a bad IEND, or no IEND at all?... As we are working on integrating the RISC-V port in 1.6.48, we still have some time to go until we decide whether we keep this in, or we further change it to something better (or something worse). Last but not least, one can always run a |
|
@LucasChollet I apologize for being wishy-washy but the more I thought about the comments made by @jbowler the more I agreed with his point of view. There aren't (or, rather, there shouldn't be) "bad critical chunk names" or "bad ancillary chunk names"; there should be just "bad chunk names". If the PNG spec does not address this properly (and, in my own interpretation, indeed it doesn't) then we should bring this in our next standard discussion. @ProgramMax, @svgeesus, I volunteer to bring it. Until then, I will revert commit 34005e3, restoring back the behavior that we have had since times immemorial. Even if we do change it, this is something that I'd rather leave for libpng18 and later. |
|
I do also think that his interpretation of the spec is correct, and I was surprised no one wanted to address it, but I don't think it was my role to do it. |
|
So I pushed commit e046c0d.
No argument from me here. In fact, I've been doing PNG recovery of various sorts, including this one, myself, in optipng.
No new parsing is needed. I can be done, admittedly not in a straightforward manner, but it can be done by the means of a customized exception handler. Specifically in optipng, I implemented the option --fix in C, using a C++-like exception handling library named "cexcept". Regardless: I agree that a better solution is still needed. This thing has been waiting for decades now, so it probably can wait a few more months. Right now I would like to incorporate some of the new PRs that piled up, and ship libpng-1.6.48, and incorporate the RISC-V submission, and ship libpng-1.6.49, and... But hey, we'll get to this, too, eventually 😉 |
It's double plus not the role of a "reference" library to decode broken data streams. The role is to reject them. @svgeesus if you want libpng to be a dumping ground for all the failed PNG implementations out there I suggest you pay money; see the last paragraph below. This is the NetScape problem: they took GIF and they reimplemented it and then, because they were the defacto standard, all of us had to reimplement our, absolutely correct, parsers (or, in my case encoder) to conform to the undocumented NetScape standard. (Deferring clear codes in my case; try it and see if the bug has been fixed. Probably not.) A standard is not a standard if it can be redefined by implementations, albeit mighty Tech Bro implementations, that simply don't obey the standard. libpng is triple plus NOT a "PNG editor". It's not a reference library in any meaningful sense at present either because of all the shyte it accepts. Yet it is still NOT A PNG EDITOR. It is not the job of a decoder to edit streams into a meaningful form. This is COMPLETELY RIDICULOUS. In limited circumstances, when the base stream is well formed, libpng can be used as the basis for a PNG editor. In cases like this where it is not well formed a more sophisticated program is required. IRC contrib/tools/pngfix.c does stuff like that in a minor way. Someone needs to write a program to do this, given that completely corrupted streams are now acceptable PNG streams. This would be a "support" thing; supporting software that is clearly broken (the software that generated this stream) is a thing we get paid for. Read the GPL. |
|
"The more we learn the less we know", @jbowler. Speaking for myself only, I'd be the second to last to want libpng to be a dumping ground of anything. However, there is such thing as a "street spec". Like it or not, that ship has sailed. I'm not saying that we must modify the library or the specification in order to accommodate this sort of out-of-spec images; but I believe that we should at least discuss it. The PNG spec already has a tradition of making non-normative recommendations to encoders and decoders, which sets it apart from many other codecs. If anything, it puts the "Portable" into the "Portable Network Graphics". At the very least, we can kindly ask @svgeesus to implement in pngcheck something that I, too, implemented in optipng. (Type |
|
I think your current approach is the best I have seen so far and, I hope, it will be very productive. I will do my best to postpone my normal obcomments at least until the list has got so far as a formal proposal. |
As per the third edition of the spec, decoders should recover from errors in ancillary chunks.
In section 13.1 Error handling [1]:
More specifically, in this commit, if a chunk that is detected as ancillary, does not pass the
check_chunk_name()function. Only a warning is issued, instead of an error.This allows libpng to fully decode images like [2] and [3]. It has been tested by passing them to both pngtest and Gnome's image viewer. Note that invalid-unknown-ancillary-after-IDAT.png could already be displayed but not fully decoded.
[1] https://w3c.github.io/png/#13Decoders.Errors
[2] https://github.com/web-platform-tests/wpt/blob/master/png/errors/support/invalid-unknown-ancillary.png
[3] https://github.com/web-platform-tests/wpt/blob/master/png/errors/support/invalid-unknown-ancillary-after-IDAT.png