Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined behavior in libjpeg's example.txt...? #379

Closed
dangelog opened this issue Oct 11, 2019 · 5 comments
Closed

Undefined behavior in libjpeg's example.txt...? #379

dangelog opened this issue Oct 11, 2019 · 5 comments
Assignees
Labels

Comments

@dangelog
Copy link

@dangelog dangelog commented Oct 11, 2019

Hi,

During an unrelated analysis of Qt's JPEG loading code (which uses libjpeg's API, and by default libjpeg-turbo's) I've noticed the pattern of using setjmp + longjmp to handle libjpeg's errors. As written in Qt, the pattern triggered the typical undefined behavior of setjmp/longjmp: a non-volatile variable local to the function where setjmp is called gets written into after the setjmp, and then is accessed again after the longjmp.

The interesting part is that the setjmp/longjmp pattern is actually strongly inspired by libjpeg's own reading example. In https://github.com/libjpeg-turbo/libjpeg-turbo/blob/master/example.txt line 303 declares a local object (cinfo), line 331 does the setjmp, line 340 writes something into that object, and then a longjmp triggered by the error routine may jump back to line 335 where cinfo is accessed again. Since cinfo nor any of its subobjects are declared volatile, my thesis is: their state is indeterminate after the longjmp (cf. man 3 longjmp and paragraph §7.13.2.1.3 of www.open-std.org/jtc1/sc22/wg14/www/docs/n2346.pdf ). I'll up the ante and say outright that for some conforming compiler the behavior is therefore undefined.

The workaround I am pushing into Qt here is to make cinfo and similar objects declared in another function that then calls the one actually doing the work (and containing the setjmp/longjmp pair). By doing this I am making them non-local, thus removing the precondition for indeterminate state.

Of course no compiler seems to be exploiting this UB in any way or form. This code looks over 20 years old, and it's also found in a very very similar fashion in GDK, ImageMagick and who knows how many other projects (see the links in the patch above).

Does the above make sense, and if so, should the example.txt be adjusted accordingly?

Thanks for reading.

@dcommander

This comment has been minimized.

Copy link
Member

@dcommander dcommander commented Oct 11, 2019

I am happy to change example.txt accordingly. Would simply declaring cinfo with the volatile keyword be sufficient for the example?

More generally, this relates to #311 and #378. At the end of the day, the libjpeg API is a legacy API. libjpeg-turbo was never originally intended to do anything other than accelerate it, but as more and more O/S distributions and applications adopted libjpeg-turbo instead of libjpeg, I fell bass-ackwards into maintaining the de facto industry standard JPEG library (which was then adopted by ISO, making it an official standard as well.) But there's a limit to what I can do with the libjpeg API, given that my own downstream projects (VirtualGL and TurboVNC) don't use it (they use the TurboJPEG API) and given the need to maintain backward compatibility. The TurboJPEG API wraps setjmp()/longjmp() error handling in a manner similar to what you describe above. That is generally what I recommend for any application/framework for which setjmp()/longjmp() is problematic. Referring to #311, one possible way forward that would eliminate the need for setjmp()/longjmp() would be to introduce a new error manager (allocated using some new API function, perhaps named jpeg_trap_errors() or similar) that would set an internal state variable if an error or warning is encountered but that would otherwise take no action. The application would be required to call another new function (perhaps named jpeg_get_error()) to retrieve the error state and take action on it. This would make error handling in the libjpeg API work similarly to that of OpenGL, and example.txt could be changed accordingly in the context of implementing that new feature. I'm happy to do that work if some organization steps forward to fund my labor. That would at least fix one of the problematic aspects of the libjpeg API, although there are others (including the fact that the API relies upon exposed structures, so it cannot be easily extended without breaking backward ABI compatibility.) Referring to #313, I would really like to see the open source community come together and figure out a new API to replace the libjpeg API. Perhaps that new API could be inspired by the TurboJPEG API but with modifications to bring in some of the libjpeg API features that it currently lacks (buffered I/O, etc.)

@dcommander

This comment has been minimized.

Copy link
Member

@dcommander dcommander commented Oct 11, 2019

Actually, forget what I said about funding. The proposed error management modifications would be a worthwhile use of the libjpeg-turbo General Fund, and they wouldn't be that difficult or time-consuming (probably a day or two of work on my part.) I would just need a sense of whether those modifications are going to beneficial for enough downstream projects.

@dangelog

This comment has been minimized.

Copy link
Author

@dangelog dangelog commented Oct 11, 2019

I am happy to change example.txt accordingly. Would simply declaring cinfo with the volatile keyword be sufficient for the example?

Unfortunately it's not so simple, becuase libjpeg's APIs want pointer-to-$something, not pointer-to-$something-volatile. For instance jpeg_create_compress wants a pointer to struct jpeg_compress_struct; declaring one as volatile and then passing it raises warnings/errors about the discarded qualifier (and triggers UB again).

I don't see a clean fix for this, and I wouldn't want anyone to invest significant time to overhaul libjpeg's error handling just for such a minor issue. The solution I've used in Qt is quite simple: make a function that declares the objects that need to survive across the setjmp/longjmp, and have that function call another one that does the setjmp/longjmp. Since they're no longer local to the latter function, it becomes safe to access them after the longjmp.

Of course any other idea is welcome :)

@dcommander

This comment has been minimized.

Copy link
Member

@dcommander dcommander commented Oct 11, 2019

OK, well, the core of this issue is that example.txt suggests API usage that may generate undefined behavior. How do we fix example.txt so it doesn't do that? If volatile won't work, then perhaps example.txt just needs to use a wrapper function like you described.

@libjpeg-turbo libjpeg-turbo deleted a comment from LaySoe Oct 11, 2019
@dangelog

This comment has been minimized.

Copy link
Author

@dangelog dangelog commented Oct 12, 2019

Using a wrapper sounds definitely the easiest option IMHO. Ultimately it's what I did in Qt to minimize the diff against existing code:

https://codereview.qt-project.org/c/qt/qtbase/+/276244/5/src/plugins/imageformats/jpeg/qjpeghandler.cpp

Of course in the case of example.txt the diff doesn't have to be that minimal, and of course example.txt is C and not C++.

@dcommander dcommander closed this in 410c028 Nov 6, 2019
@dcommander dcommander added the fixed label Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.