pthread_key_create() and unloading via dlclose() probably mutually incompatible #653

nicowilliams · 2016-02-10T21:34:12Z

Viktor prods me to comment on this in relation to rt 3824 https://rt.openssl.org/Ticket/Display.html?id=3824.

There's no guarantee that a thread-specific's value destructor will be called for all keys before a shared object providing that destructor gets unloaded via last dlcose() call. There seems to be no standard or otherwise portable, or even non-portable way to do this.

One can imagine wrapping all thread-specific values with a wrapper that is put on a global list so that a .fini section can clean them up. But this doesn't work: the destructor function pointer will still be associated with the thread-specific key and as soon as a thread exits the process will blow up as the destructor pointer will be invalid (or worse: perhaps some other object got loaded, or mmap() occurred, such that said pointer now lies in valid address space, but it will still be invalid).

The thing to do in [self-]initialization is to call dladdr() -if available- then dlopen() the resulting dli_fname, then leak the resulting dl handle, thereby making sure that the library is never unloaded.

The text was updated successfully, but these errors were encountered:

mattcaswell · 2016-02-10T22:06:37Z

I'd be interested to see a patch to do this.

nicowilliams · 2016-02-10T22:28:24Z

I'll write one on Friday.

nicowilliams · 2016-02-10T22:34:04Z

Also, where there is no dladdr() the library could dlopen() itself by its SONAME. RTLDs generally have to deduplicate on SONAME, so that dlopen(MY_SONAME) should work, but there is a very small price to pay: users won't be able to trivially elfedit/patchelf such an object to change its SONAME.

mattcaswell · 2016-02-10T22:46:54Z

@nicowilliams great! I'm looking forward to seeing that.

nicowilliams · 2016-02-10T22:58:40Z

The thought occurs that valgrind ought to have an option to ignore leaks where pthread_once() appears in the allocation stack.

nicowilliams · 2016-02-12T20:31:57Z

I can't believe I forgot, but there's the link-editor -z nodelete option, which does this in a much simpler way.

What I don't quite know is how to hack on the OpenSSL build configuration system to use this, much less use it where it's actually supported by ld(1).

nicowilliams · 2016-02-12T20:33:15Z

Is there a good resource for how to hack on the OpenSSL build configuration system? IIUC Richard Levitte is revamping it.

levitte · 2016-02-12T20:38:37Z

There's at least a start in Configurations/... README and README.design. I'm not claiming perfection, but I hope it can get you somewhere.

Otherwise, I think that @dot-asm had something along these lines started, but I haven't seen much news on that lately (and yeah, @dot-asm, this was a very purposeful nudge ;-))

mattcaswell · 2016-02-18T15:40:30Z

If we go with -z nodelete are we going to restrict ourselves to a much smaller set of platforms that can support this? I mean dlopen() is POSIX right, so isn't that going to be the more portable approach?

mattcaswell · 2016-04-06T12:10:16Z

Ping @nicowilliams. So what to do about this one? I did some testing and discovered/verified a couple of things:

Using OpenSSL via dlopen in a mutli-threaded app does leak memory if threads are destroyed after dlcose() is called.
In a single threaded environment, or if threads are destroyed prior to dlclose() being called, then no memory is leaked
Using the RTLD_NODELETE flag to dlopen solves the above problem
Interestingly the OpenSSL atexit() handler gets called when dlclose() is called rather than at application exit (I was worred that it might crash if there was an atexit() handler for a function that has been unloaded)
RTLD_NODELETE is a non-standard flag - but it does seem to be fairly widely supported. As far as I could determine (via google), at least Linux, Solaris, OpenBSD, FreeBSD, HP-UX all seem to support it.

Probably, where we can, we should build with -z nodelete so that we get that behaviour by default, and should also document somewhere that RTLD_NODELETE should be used (or dlcose() called after threads are closed). @nicowilliams, are you planning on progressing this?

mattcaswell · 2016-05-25T22:29:55Z

Ping @nicowilliams ...any further thoughts on this?

mattcaswell · 2016-06-07T09:15:31Z

I am now thinking that, given my above results, perhaps a documentation fix is sufficient. Perhaps there are times where using -z nodelete is undesirable. It's only really a problem when using dlopen() and on most platforms you can just use RTLD_NODELETE (which are probably the same ones where you have a -z nodelete option anyway).

mattcaswell · 2016-06-07T12:32:49Z

I also tested this on Windows using LoadLibrary and FreeLibrary (instead of dlopen and dlclose). The behaviour is much the same on Windows as it is on Unix (except there is no RTLD_NODELETE AFAIK).

If using threads and OpenSSL is loaded via dlopen(), and subsequently closed again via dlclose() *before* the threads are destroyed, then OpenSSL will not free up the per thread resources. We need to document this restriction, and provide some guidance on what to do about it. I did some testing and discovered/verified a few of things (at least this is the behaviour on Linux): - Using OpenSSL via dlopen in a mutli-threaded app does leak memory if threads are destroyed after dlcose() is called. - In a single threaded environment, or if threads are destroyed prior to dlclose() being called, then no memory is leaked - Using the RTLD_NODELETE flag to dlopen solves the above problem - Interestingly the OpenSSL atexit() handler gets called when dlclose() is called rather than at application exit (I was worred that it might crash if there was an atexit() handler for a function that has been unloaded) - RTLD_NODELETE is a non-standard flag - but it does seem to be fairly widely supported. As far as I could determine (via google), at least Linux, Solaris, OpenBSD, FreeBSD, HP-UX all seem to support it. I also tested on Windows (using LoadLibrary instead of dlopen and FreeLibrary instead of dlclose) and experienced similar behaviour, except that (AFAIK) there is no equivalent of RTLD_NODELETE on Windows. GitHub Issue #653 Reviewed-by: Richard Levitte <levitte@openssl.org>

mattcaswell · 2016-06-07T12:45:52Z

Documentation updated in c796e02. Closing.

nicowilliams mentioned this issue Feb 10, 2016

Uninitalization considered harmful #654

Closed

mattcaswell added this to the 1.1.0 milestone May 16, 2016

mattcaswell closed this as completed Jun 7, 2016

jay mentioned this issue Oct 7, 2016

OpenSSL 1.1.0b crashes my app on exit, because OPENSSL_cleanup() is called too late! curl/curl#1055

Closed

coolshrid added a commit to coolshrid/openssl that referenced this issue May 2, 2024

Fixes openssl#22965: Removed trailing whitespace openssl#653

194d6aa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pthread_key_create() and unloading via dlclose() probably mutually incompatible #653

pthread_key_create() and unloading via dlclose() probably mutually incompatible #653

nicowilliams commented Feb 10, 2016

mattcaswell commented Feb 10, 2016

nicowilliams commented Feb 10, 2016

nicowilliams commented Feb 10, 2016

mattcaswell commented Feb 10, 2016

nicowilliams commented Feb 10, 2016

nicowilliams commented Feb 12, 2016

nicowilliams commented Feb 12, 2016

levitte commented Feb 12, 2016

mattcaswell commented Feb 18, 2016

mattcaswell commented Apr 6, 2016

mattcaswell commented May 25, 2016

mattcaswell commented Jun 7, 2016

mattcaswell commented Jun 7, 2016

mattcaswell commented Jun 7, 2016

pthread_key_create() and unloading via dlclose() probably mutually incompatible #653

pthread_key_create() and unloading via dlclose() probably mutually incompatible #653

Comments

nicowilliams commented Feb 10, 2016

mattcaswell commented Feb 10, 2016

nicowilliams commented Feb 10, 2016

nicowilliams commented Feb 10, 2016

mattcaswell commented Feb 10, 2016

nicowilliams commented Feb 10, 2016

nicowilliams commented Feb 12, 2016

nicowilliams commented Feb 12, 2016

levitte commented Feb 12, 2016

mattcaswell commented Feb 18, 2016

mattcaswell commented Apr 6, 2016

mattcaswell commented May 25, 2016

mattcaswell commented Jun 7, 2016

mattcaswell commented Jun 7, 2016

mattcaswell commented Jun 7, 2016