LD_PRELOAD jemalloc with LD_AUDIT library segfaults #2472
Comments
Similar to the issue you linked, this appears to be a circular dependency / init order issue, i.e. jemalloc is used before TLS is initialized, while jemalloc depends on a functioning TLS. The stack trace you shared shows: TLS init calls malloc (which resolves to jemalloc), and then jemalloc tries to access TLS, which in turn triggers TLS init again. I can't think of a good solution right now. Maybe try avoiding allocations in the AUDIT lib, or make it not use jemalloc? TLS and pthreads are hard dependencies, and it will almost certainly cause issues if they are not initialized before jemalloc. |
The test audit lib just returns 1 from la_version, so is it the loader that is allocating? Would it make sense to have a bootstrap TLS path with a simple allocator that doesn't use malloc, e.g. one allocating out of a static char array? |
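As a rough illustration of the "allocate out of a static char array" idea, here is a minimal bump allocator sketch. Everything in it (the `bootstrap` namespace, `kArenaSize`, `alloc`, `owns`) is hypothetical and is not part of jemalloc or glibc. It assumes single-threaded use during early startup and never frees memory.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical bootstrap arena: all names and sizes here are illustrative only.
namespace bootstrap {

constexpr std::size_t kArenaSize = 64 * 1024;  // assumed size, not a tuned value

// Static storage, so no malloc and no TLS access is needed to use it.
alignas(std::max_align_t) static unsigned char arena[kArenaSize];
static std::size_t offset = 0;  // not thread-safe: assumes single-threaded early init

// Bump-allocate `size` bytes, rounded up to max alignment.
// Returns nullptr when the arena is exhausted; memory is never released.
static void* alloc(std::size_t size) {
    constexpr std::size_t align = alignof(std::max_align_t);
    const std::size_t rounded = (size + align - 1) & ~(align - 1);
    if (rounded < size || rounded > kArenaSize - offset) {
        return nullptr;  // overflow in rounding, or not enough space left
    }
    void* p = arena + offset;
    offset += rounded;
    return p;
}

// Lets a free() wrapper recognise bootstrap pointers and ignore them.
static bool owns(const void* p) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    const auto base = reinterpret_cast<std::uintptr_t>(arena);
    return addr >= base && addr < base + kArenaSize;
}

}  // namespace bootstrap
```

An allocator wrapper could route requests here while its own initialization is still in progress and use `owns` to skip bootstrap pointers on free. Whether that would be enough to break the cycle described in this thread is not established here.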
The TLS storage actually isn't managed by jemalloc. On Linux it's accessing the
Does this only happen when adding the AUDIT lib? Can you try changing the below to
|
With that change it still crashes but in a different place:
|
This is a very nasty issue that prevents reliable jemalloc usage with LD_PRELOAD. It's kind of a gamble whether it will work or not with certain programs. Couldn't a suitable thread-local storage implementation be statically linked into it, or a workaround added that detects whether an allocation was requested during initialization and performs an alternative allocation, just for TLS? |
I built a simple
When jemalloc is compiled with the So I believe the issue is not related to glibc brought from LD_AUDIT. |
That problem should have been fixed since https://src.fedoraproject.org/rpms/glibc/c/8aee7e3563ec434ce692fbce0b81ef9ba53c2a0a?branch=rawhide This issue looks like a regression. |
I'm wondering if this might actually be a glibc bug rather than a jemalloc bug. It seems like the combination of TLS usage in LD_PRELOAD with LD_AUDIT causes the failure. Looking at the glibc-2.36 source on Debian, the TLS error is thrown at elf/dl-reloc.c:140. I set a breakpoint there and printed a backtrace:
The backtrace is:
I don't claim to understand rtld.c, but hazarding a rough guess from the backtrace, it looks like the problem is caused by Here's a smaller reproducer:
This results in:
If I make the size of This is on aarch64 Debian:
Is there any limit on TLS size? If not it seems like this may be a glibc bug? |
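For readers unfamiliar with static TLS, the kind of declaration under discussion can be sketched as below. This is not the reproducer from the comment above. The names `tls_buf` and `touch_tls` and the 4096-byte size are assumptions chosen for illustration. The `tls_model("initial-exec")` attribute asks the compiler to place the variable in the static TLS block, which is the space the loader reports it cannot allocate.

```cpp
#include <cstddef>

// Hypothetical size; the thread above reports that the size of the TLS object matters.
constexpr std::size_t kTlsBytes = 4096;

// Thread-local buffer forced into the initial-exec model (GCC/Clang attribute).
// In a shared library, such a variable consumes static TLS space at load time.
static thread_local unsigned char tls_buf[kTlsBytes]
    __attribute__((tls_model("initial-exec")));

// Touch both ends of the buffer so the TLS access is not optimised away,
// and report how many bytes of TLS this object occupies.
extern "C" std::size_t touch_tls() {
    tls_buf[0] = 1;
    tls_buf[kTlsBytes - 1] = 1;
    return sizeof(tls_buf);
}
```

Built as a shared object and preloaded, a library like this competes for the same static TLS space as any other initial-exec TLS user. The exact size at which loading fails depends on the glibc version and on what else is loaded.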
We've also run into something similar with the We've stripped things back and built jemalloc from source. The segfault occurs if we just run:
gdb gives the following stacktrace:
jemalloc was built using the same options as the conda-forge feedstock:
We ran a At this point we are a bit out of our depth. The stacktrace above suggests some type of chicken-and-egg problem with dynamic loading and TLS. We have noticed that if we remove the Does this look like the same issue to you, or should I open a separate issue for this? System details
|
What version of jemalloc are you using?
Version 5.3.0
Also tested trunk
What operating system and version?
Linux - Red Hat Enterprise Linux 8 (rpm --query redhat-release reports: redhat-release-8.6-0.1.el8.x86_64)
Also tested on Debian
What runtime / compiler are you using?
g++ 6.3.1
What did you do?
Created a simple library suitable for loading with LD_AUDIT
Ran 'ls' with the following LD_PRELOAD and LD_AUDIT:
LD_PRELOAD=$MY_SCRATCH_DIR/mylibs/lib/libjemalloc.so LD_AUDIT=libsimple.so ls
(note: this causes all executables that I have tried to fail in the same way)
What did you expect to see?
The output from ls
What did you see instead?
When jemalloc is compiled with the defaults, ls gives the following error message:
ls: error while loading shared libraries: libjemalloc.so: cannot allocate memory in static TLS block
When jemalloc is compiled with the --disable-initial-exec-tls flag, ls segfaults.
I found the flag mentioned in this issue: #1237
Source Code for repro:
Compile with:
g++ -m64 simple.cpp -shared -ldl -o libsimple.so
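For context, an audit library of the kind described in the comments ("just returns 1 from la_version") can be sketched as follows. This is an assumption about its general shape based on that description and on the rtld-audit interface, not the original simple.cpp.

```cpp
// Minimal LD_AUDIT library sketch (assumed shape, not the original repro).
// The dynamic loader calls la_version() once when it loads the audit module;
// the return value tells the loader which audit interface version the
// library expects, and returning 0 would make the loader ignore the library.
extern "C" unsigned int la_version(unsigned int version) {
    (void)version;  // the loader's supported interface version, unused here
    return 1;       // value taken from the description in this thread
}
```

The function needs C linkage (`extern "C"`) when compiled as C++, so that the loader can find the symbol under its unmangled name.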
The stack trace of the core dump from the test using a jemalloc built with --disable-initial-exec-tls is:
Note: https://bugzilla.redhat.com/show_bug.cgi?id=1878932 mentions a similar sounding problem.