
SIGFPE in private __libc_early_init in glibc 2.34+ #5437

Open
derchr opened this issue Mar 29, 2022 · 23 comments

Comments

@derchr
Contributor

derchr commented Mar 29, 2022

Describe the bug
This bug may affect not only drcachesim but also drmemory, drcpusim, and probably other clients as well.

When I run drcachesim like this: `./drrun -disable_rseq -t drcachesim -offline -- ls`, I get a SIGFPE:

```
[1] 2736 floating point exception (core dumped) ./drrun -disable_rseq -t drcachesim -offline -- ls
```

To Reproduce
Steps to reproduce the behavior:

  1. Pointer to a minimized application: ls should work
  2. Precise command line for running the application. ./drrun -disable_rseq -t drcachesim -offline -- ls
  3. Exact output or incorrect behavior. See above

Please also answer these questions:

  • What happens when you run without any client?
Running without any client works (thanks to -disable_rseq)

  • What happens when you run with debug build ("-debug" flag to drrun/drconfig/drinject)?
    Same behaviour

Expected behavior
No crash

Screenshots or Pasted Text

```
Program received signal SIGFPE, Arithmetic exception.
0x00007ffff770eb89 in ?? ()
(gdb) bt
#0  0x00007ffff770eb89 in ?? ()
#1  0x0000000000800000 in ?? ()
#2  0xffffffffffffffff in ?? ()
#3  0x0000000000000007 in ?? ()
#4  0xabababababababab in ?? ()
#5  0x00007ffff7ff0810 in ?? ()
#6  0x00007ffff7e76f77 in privload_os_finalize (privmod=0x7ffdb3ba77d8)
    at /home/derek/Git/dynamorio/core/unix/loader.c:693
#7  0x00007ffff7d54a7b in privload_load_process (privmod=0x7ffdb3ba77d8)
    at /home/derek/Git/dynamorio/core/loader_shared.c:818
#8  0x00007ffff7d54265 in privload_load (filename=0x7fffffffafa0 "/usr/lib/libc.so.6",
    dependent=0x7ffdb3ba7140, client=false) at /home/derek/Git/dynamorio/core/loader_shared.c:683
#9  0x00007ffff7e7700a in privload_locate_and_load (impname=0x7ffff7a2668a "libc.so.6",
    dependent=0x7ffdb3ba7140, reachable=false) at /home/derek/Git/dynamorio/core/unix/loader.c:710
#10 0x00007ffff7e7681f in privload_process_imports (mod=0x7ffdb3ba7140)
    at /home/derek/Git/dynamorio/core/unix/loader.c:566
#11 0x00007ffff7d549da in privload_load_process (privmod=0x7ffdb3ba7140)
    at /home/derek/Git/dynamorio/core/loader_shared.c:811
#12 0x00007ffff7d54265 in privload_load (filename=0x7fffffffb2b0 "/usr/lib/libm.so.6",
    dependent=0x7ffdb3ba6ab8, client=false) at /home/derek/Git/dynamorio/core/loader_shared.c:683
#13 0x00007ffff7e7700a in privload_locate_and_load (impname=0x7ffff675785c "libm.so.6",
    dependent=0x7ffdb3ba6ab8, reachable=false) at /home/derek/Git/dynamorio/core/unix/loader.c:710
#14 0x00007ffff7e7681f in privload_process_imports (mod=0x7ffdb3ba6ab8)
    at /home/derek/Git/dynamorio/core/unix/loader.c:566
#15 0x00007ffff7d549da in privload_load_process (privmod=0x7ffdb3ba6ab8)
    at /home/derek/Git/dynamorio/core/loader_shared.c:811
#16 0x00007ffff7d54265 in privload_load (filename=0x7fffffffb5c0 "/usr/lib/libstdc++.so.6",
    dependent=0x7ffdb3ba5478, client=false) at /home/derek/Git/dynamorio/core/loader_shared.c:683
#17 0x00007ffff7e7700a in privload_locate_and_load (impname=0x7fffb3bb486a "libstdc++.so.6", dependent=0x7ffdb3ba5478, reachable=false)
    at /home/derek/Git/dynamorio/core/unix/loader.c:710
#18 0x00007ffff7e7681f in privload_process_imports (mod=0x7ffdb3ba5478) at /home/derek/Git/dynamorio/core/unix/loader.c:566
#19 0x00007ffff7d549da in privload_load_process (privmod=0x7ffdb3ba5478) at /home/derek/Git/dynamorio/core/loader_shared.c:811
#20 0x00007ffff7d54265 in privload_load (filename=0x7fffffffb8d0 "/home/derek/Git/dynamorio/build/ext/lib64/debug/libdrsyms.so", dependent=0x7ffdb3b71fb8, client=true)
    at /home/derek/Git/dynamorio/core/loader_shared.c:683
#21 0x00007ffff7e7700a in privload_locate_and_load (impname=0x7fffb3b2cfdc "libdrsyms.so", dependent=0x7ffdb3b71fb8, reachable=true)
    at /home/derek/Git/dynamorio/core/unix/loader.c:710
#22 0x00007ffff7e7681f in privload_process_imports (mod=0x7ffdb3b71fb8) at /home/derek/Git/dynamorio/core/unix/loader.c:566
#23 0x00007ffff7d549da in privload_load_process (privmod=0x7ffdb3b71fb8) at /home/derek/Git/dynamorio/core/loader_shared.c:811
#24 0x00007ffff7d52a9a in privload_process_early_mods () at /home/derek/Git/dynamorio/core/loader_shared.c:139
#25 0x00007ffff7d52c84 in loader_init_epilogue (dcontext=0x7ffdb3ba0080) at /home/derek/Git/dynamorio/core/loader_shared.c:203
#26 0x00007ffff7bc2128 in dynamorio_app_init_part_two_finalize () at /home/derek/Git/dynamorio/core/dynamo.c:670
#27 0x00007ffff7e7a6f4 in privload_early_inject (sp=0x7fffffffdab0, old_libdr_base=0x0, old_libdr_size=140737488345328)
    at /home/derek/Git/dynamorio/core/unix/loader.c:2154
#28 0x00007ffff7e234c7 in reloaded_xfer () at /home/derek/Git/dynamorio/core/arch/x86/x86.asm:1179
#29 0x0000000000000001 in ?? ()
#30 0x00007fffffffded5 in ?? ()
#31 0x0000000000000000 in ?? ()
```

Versions

  • What version of DynamoRIO are you using?
    current master (562e797) and also 9.0.1

  • Does the latest build from https://github.com/DynamoRIO/dynamorio/releases solve the problem?
    No

  • What operating system version are you running on?
    Manjaro Linux (derivative of Arch Linux)

  • Is your application 32-bit or 64-bit?
    64bit

Additional context
This time I wasn't able to test glibc 2.33, so it's not clear whether this problem is specific to glibc 2.34+.

Logs:
log.0.3045.txt
ls.0.3045.txt

When I run without -offline, another issue occurs: DynamoRIO hangs while waiting on a pipe:

```
#0  0x00007ffff7694f0b in open64 () from /usr/lib/libc.so.6
#1  0x00005555555b959a in named_pipe_t::open_for_read (this=0x7ffff7f8d0d0)
    at /home/derek/Git/dynamorio/clients/drcachesim/common/named_pipe_unix.cpp:145
#2  0x00005555555d574a in ipc_reader_t::init (this=0x7ffff7f8d010)
    at /home/derek/Git/dynamorio/clients/drcachesim/reader/ipc_reader.cpp:77
#3  0x00005555555a7805 in analyzer_t::start_reading (this=0x5555557b9820)
    at /home/derek/Git/dynamorio/clients/drcachesim/analyzer.cpp:227
#4  0x00005555555a83be in analyzer_t::run (this=0x5555557b9820)
    at /home/derek/Git/dynamorio/clients/drcachesim/analyzer.cpp:296
#5  0x00005555555a48d2 in main (argc=12, targv=0x7fffffffd948)
    at /home/derek/Git/dynamorio/clients/drcachesim/launcher.cpp:356
```

I will eventually also create an issue for this.

@derekbruening
Contributor

This is what was reported on the list at https://groups.google.com/g/DynamoRIO-Users/c/CKQD11eXyfs and was about to be filed, so this will serve as the tracking issue.

Re: dr$sim online hanging on a pipe: be sure there isn't a stale pipe file from a prior aborted run which can cause such a hang.
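For context on why a stale pipe can hang the reader: opening a FIFO read-only blocks until some process opens the write end, so a leftover pipe with no tracer attached blocks forever inside open(). A minimal sketch (the path here is hypothetical):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int
main(void)
{
    const char *path = "/tmp/demo_pipe"; /* hypothetical pipe path */
    if (mkfifo(path, 0600) != 0)
        perror("mkfifo");
    /* open() on a FIFO for reading blocks until a writer appears, so a
     * stale pipe with no writer hangs right here -- matching the open64()
     * frame in the backtrace above. */
    int fd = open(path, O_RDONLY);
    if (fd >= 0)
        close(fd);
    unlink(path);
    return 0;
}
```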

@derekbruening derekbruening changed the title CRASH running drcachesim with offline tracing on any app SIGFPE in private __libc_early_init in glibc 2.34+ Mar 29, 2022
@derchr
Contributor Author

derchr commented Mar 29, 2022

> This is what was reported on the list at https://groups.google.com/g/DynamoRIO-Users/c/CKQD11eXyfs and was about to be filed, so this will serve as the tracking issue.

Oh, I only searched through the GitHub issues.

> Re: dr$sim online hanging on a pipe: be sure there isn't a stale pipe file from a prior aborted run which can cause such a hang.

I made sure to delete the old pipe with rm /tmp/drcachesimpipe

@derekbruening
Contributor

Pasting key info from https://groups.google.com/g/DynamoRIO-Users/c/CKQD11eXyfs

Tested on commit 5e13602, on Arch Linux x86_64 (glibc package version 2.35-3) and Ubuntu 21.10 x86_64 (glibc-bin package version 2.34-0ubuntu3.2).
Even if I use the pre-compiled binary (DynamoRIO-Linux-9.0.1.tar.gz), it doesn't work.

It looks like the problem is related to #5134.
When I use commit 26b5fb (with commit 1dec190 cherry-picked to fix a build error) it works, but when I use commit f3d907d (also with commit 1dec190 cherry-picked) it crashes.

The output below is from commit 5e13602 on Ubuntu 21.10.

```
$ gdb -q --args ./bin64/drrun -debug -c api/samples/bin/libbbcount.so -- ls
Program received signal SIGFPE, Arithmetic exception.
0x00007ffff783065e in ?? ()
(gdb) bt
#0  0x00007ffff783065e in __nptl_tls_static_size_for_stack () at ../nptl/nptl-stack.h:59
#1  __pthread_early_init () at ../sysdeps/nptl/pthread_early_init.h:46
#2  __libc_early_init (initial=<optimized out>) at libc_early_init.c:44
#3  0x00007ffff7e7a512 in privload_os_finalize (privmod=0x7ffdb3ba77b8) at ../core/unix/loader.c:693
#4  0x00007ffff7d55dd3 in privload_load_process (privmod=0x7ffdb3ba77b8) at ../core/loader_shared.c:818
#5  0x00007ffff7d555ad in privload_load (filename=0x7fffffffbd30 "/lib/x86_64-linux-gnu/libc.so.6", dependent=0x7ffdb3b71fb8, client=false) at ../core/loader_shared.c:683
#6  0x00007ffff7e7a5ad in privload_locate_and_load (impname=0x7fffb3b1c927 "libc.so.6", dependent=0x7ffdb3b71fb8, reachable=false) at ../core/unix/loader.c:710
#7  0x00007ffff7e79dae in privload_process_imports (mod=0x7ffdb3b71fb8) at ../core/unix/loader.c:566
#8  0x00007ffff7d55d32 in privload_load_process (privmod=0x7ffdb3b71fb8) at ../core/loader_shared.c:811
#9  0x00007ffff7d53d9a in privload_process_early_mods () at ../core/loader_shared.c:139
#10 0x00007ffff7d53f8c in loader_init_epilogue (dcontext=0x7ffdb3ba0080) at ../core/loader_shared.c:203
#11 0x00007ffff7bc224a in dynamorio_app_init_part_two_finalize () at ../core/dynamo.c:670
#12 0x00007ffff7e7dd20 in privload_early_inject (sp=0x7fffffffdf10, old_libdr_base=0x0, old_libdr_size=140737488346448) at ../core/unix/loader.c:2154
(gdb) x/i $pc
=> 0x7ffff7aae65e <__libc_early_init+142>:        div    rsi
(gdb) p $rsi
$2 = 0
```

Probably dl_tls_static_size and dl_tls_static_align (at glibc/nptl/nptl-stack.h:58 in the 2.34-0ubuntu3.2 glibc source) are zero.

```
(gdb) x/10i __libc_early_init+98
   0x7ffff7aae632 <__libc_early_init+98>:        mov    rax,QWORD PTR [rip+0xa488f]        # 0x7ffff7b52ec8
   0x7ffff7aae639 <__libc_early_init+105>:        xor    edx,edx
   0x7ffff7aae63b <__libc_early_init+107>:        mov    rsi,QWORD PTR [rax+0x2b0]
=> 0x7ffff7aae642 <__libc_early_init+114>:        mov    rbx,QWORD PTR [rax+0x2a8]
   0x7ffff7aae649 <__libc_early_init+121>:        mov    rcx,QWORD PTR [rax+0x18]
   0x7ffff7aae64d <__libc_early_init+125>:        add    rbx,rsi
   0x7ffff7aae650 <__libc_early_init+128>:        mov    rax,rbx
   0x7ffff7aae653 <__libc_early_init+131>:        mov    QWORD PTR [rip+0xab4d6],rcx        # 0x7ffff7b59b30
   0x7ffff7aae65a <__libc_early_init+138>:        sub    rax,0x1
   0x7ffff7aae65e <__libc_early_init+142>:        div    rsi
(gdb) x/g $rax+0x2b0
0x7ffff7d7cef0:        0x0000000000000000
(gdb) x/g $rax+0x2a8
0x7ffff7d7cee8:        0x0000000000000000
```
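Indeed, those two fields are exactly what glibc 2.34's __nptl_tls_static_size_for_stack() (nptl/nptl-stack.h) feeds into a roundup, which the "div rsi" above implements. A minimal standalone sketch of the failing arithmetic, with the GLRO fields simplified to plain globals:

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for ld.so's GLRO(dl_tls_static_size) and
 * GLRO(dl_tls_static_align); under DR's private loading they remain 0
 * because the real ld.so never performed its TLS setup for this libc. */
static size_t dl_tls_static_size = 0;
static size_t dl_tls_static_align = 0;

/* Mirrors the roundup in __nptl_tls_static_size_for_stack():
 * (size + align - 1) / align * align.  With align == 0 the integer
 * division faults -- the "div rsi" instruction disassembled above. */
static size_t
tls_static_size_for_stack(void)
{
    return (dl_tls_static_size + dl_tls_static_align - 1)
           / dl_tls_static_align * dl_tls_static_align;
}

int
main(void)
{
    printf("%zu\n", tls_static_size_for_stack()); /* SIGFPE on x86 */
    return 0;
}
```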

So without the __libc_early_init call, glibc 2.32 crashes; but with it 2.34 crashes? Can't win. There must be some other magic hardcoded initialization done specially for libc in 2.34 by ld.so??

@wakabaplus

Sorry for the delay in filing the issue.

I tested on an Ubuntu 22.04 daily build, and drrun gives a SIGFPE.
But from #5431 (comment), Ubuntu 22.04 doesn't give it?

@derekbruening
Contributor

> But from #5431 (comment), Ubuntu 22.04 doesn't give it?

That comment was from running without any client: the SIGFPE occurs only with a client that imports from libc. It does reproduce on Ubuntu 22.04 with a libc-importing client.

@wakabaplus

> So without the __libc_early_init call, glibc 2.32 crashes; but with it 2.34 crashes?

It looks like glibc 2.34 added a call to __pthread_early_init(); at libc_early_init.c:44 in the glibc 2.34 source code.
As you can see from the backtrace I submitted, __pthread_early_init() results in a crash.

This article explains that glibc 2.34 removed libpthread and integrated it into libc.so.6, so DynamoRIO's loader will fail to load libpthread.so.
That seems to have something to do with the crash.
https://developers.redhat.com/articles/2021/12/17/why-glibc-234-removed-libpthread

@derekbruening
Contributor

Unfortunately glibc is going in the direction of Android's Bionic: tight integration between the loader and libpthread, with hardcoded private dependences between them, such that the loader cannot easily be replaced for private loading, as it no longer uses clean public interfaces to load libc and libpthread. This is why Android support breaks with each release as they change the internal TLS layout: #3543, #3683

@derekbruening
Contributor

Here is a proposal for avoiding DR having to perform custom undocumented
incantations to initialize libc/libpthread and to set up the TLS exactly how
ld.so/libc/libpthread (now merged for Linux; already merged for Android)
want it, which is fragile and breaks across versions.

Instead of DR being the private ld.so and loading the client lib and all
its imported-from libs, we have the real ld.so do that. We keep today's
scheme of mangling all the app seg refs so that the private libs have their
regular TLS (with DR using the other segment on x86 Linux; on aarchxx
Linux (and x86 Mac and x86 Windows) using a slot inside the private TLS,
which should still work here but it may be initialized later).

We create a "client executable" ELF file ("cliex") with ifuncs for the
entire DR API which resolve to the real libdynamorio.so addresses and with
exports for every redirected symbol like malloc which call the
libdynamorio.so handler. Those exports are also ifuncs. The ifuncs all
locate libdynamorio.so either by walking /proc/self/maps or via a TLS ref
(on x86 Linux) or something.

How is the client lib loaded: dynamically by the cliex, and the cliex
always imports from libdl.so?

What about client libs with no dependences other than libdynamorio? Do we
have two different schemes, having DR load these clients, so we have a mode
that works regardless of distro libraries?

Xref #1285 Mac private loader: though Mac has an issue with having a
disjoint set of private libs in the first place, so this scheme would not
solve all the problems there.
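To illustrate the ifunc mechanism the proposal relies on, here is a minimal sketch. cliex_malloc, cliex_malloc_resolver, and dr_redirect_malloc are invented names, and a real resolver would locate libdynamorio.so (e.g., via /proc/self/maps) rather than forwarding to the system allocator:

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef void *(*malloc_fn_t)(size_t);

/* Stand-in for libdynamorio.so's redirected allocator; a real cliex
 * resolver would locate it, e.g., by walking /proc/self/maps. */
static void *
dr_redirect_malloc(size_t sz)
{
    return malloc(sz);
}

/* Resolver: ld.so calls this once at relocation time, and the address it
 * returns becomes the permanent binding for cliex_malloc. */
static malloc_fn_t
cliex_malloc_resolver(void)
{
    return dr_redirect_malloc;
}

/* One GNU indirect-function export, as each redirected symbol exported by
 * the proposed "cliex" would be declared. */
void *cliex_malloc(size_t sz) __attribute__((ifunc("cliex_malloc_resolver")));

int
main(void)
{
    void *p = cliex_malloc(16); /* bound through the resolver by ld.so */
    printf("allocated %p\n", p);
    free(p);
    return 0;
}
```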

@derekbruening
Contributor

@johnfxgalea @abhinav92003 looking for some feedback on which way to go here:
am I forgetting something that DR being the private loader provides beyond
redirection? The TLS field, if shared with the private libs, should be similar either way.

The third solution is to drop private library support completely and try to provide
a full set of utilities. This seems too limiting, as there will always be something
not provided that a client wants to use. (This is the expected long-term
direction on Windows to get to earliest injection with a non-minimal client (the
"drwinapi" in the code), but it would still be limiting there.)

@johnfxgalea
Contributor

> am I forgetting something that DR being the private loader provides beyond redirection?

Yeah, redirection and the copying of own library versions (to avoid resource conflicts and re-entrancy issues). I don't think dropping support for the private loader is the best solution... AFAIR, DR has limited support for disabling the private loader, but one still has to deal with gcc flags.

I like your proposed cliex solution, although I'm not sure how resolving ifuncs on Windows would work in a nice fashion.

> Do we have two different schemes, having DR load these clients, so we have a mode that works regardless of distro libraries?

Are you concerned about performance of the proposed solution here? My first impression was, in the long run, to always use the cliex approach to help with maintainability, but I'm not really sure whether to keep the two.

@derekbruening
Contributor

> > am I forgetting something that DR being the private loader provides beyond redirection?
>
> Yeah, redirection and the copying of own library versions (to avoid resource conflicts and re-entrancy issues). I don't think dropping support for the private loader is the best solution... AFAIR, DR has limited support for disabling the private loader, but one still has to deal with gcc flags.

In case it wasn't clear, the proposal is not to eliminate private library copies isolated from app libraries, but to eliminate DR as the loader of those private libraries and instead use a private copy of ld.so.

> I like your proposed cliex solution, although I'm not sure how resolving ifuncs on Windows would work in a nice fashion.

This would be only for Linux + Android.

> > Do we have two different schemes, having DR load these clients, so we have a mode that works regardless of distro libraries?
>
> Are you concerned about performance of the proposed solution here? My first impression was, in the long run, to always use the cliex approach to help with maintainability, but I'm not really sure whether to keep the two.

It would be simpler with one approach, but part of me likes having the fallback of a scheme that has no dependence on changes in ld.so/libc/libpthread for no-dependency clients. Maybe those are so rare nowadays that it's not worth the maintenance burden.

@derekbruening
Contributor

For workarounds until a long-term solution is developed:

For some of the simpler C clients, setting set(DynamoRIO_USE_LIBC OFF) should be a workaround for this problem. That used to be set for many of DR's provided samples, but is not the case today; turning it back on should enable those to run. drcov is currently built without any libc dependence and would be expected to run just fine on a glibc 2.34 system. (As noted up front, C++ clients like drmemtrace/drcachesim have no simple workaround other than trying to statically link with the C++ and C libraries, if PIC versions of those are available.)
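For the static-linking route, the usual symbol-interception mechanism is the GNU linker's --wrap option (it also comes up in the long-term options later in this thread). A minimal sketch, where logging and forwarding to __real_malloc stands in for whatever redirection a client would actually install:

```c
/* Build with: gcc -Wl,--wrap=malloc wrap_demo.c */
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Provided by the linker under --wrap=malloc: the original malloc,
 * reachable even though direct calls to "malloc" are rerouted. */
extern void *__real_malloc(size_t size);

/* All calls to malloc in the wrapped objects resolve here instead. */
void *
__wrap_malloc(size_t size)
{
    fprintf(stderr, "malloc(%zu) intercepted\n", size);
    return __real_malloc(size);
}

int
main(void)
{
    void *p = malloc(32); /* becomes __wrap_malloc under --wrap */
    free(p);
    return 0;
}
```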

@derekbruening
Contributor

The use-actual-loader proposal above should solve the client-using-pthread problems #956 and #2848.

@RobertHenry6bev

It seems DynamoRIO doesn't run on Ubuntu 22.04. This is a major bummer, as our summer research interns were hoping to use DynamoRIO on modern Ubuntu 22.04. I'd rather not have to fall back to Ubuntu 20.04.

@derekbruening
Contributor

> It seems DynamoRIO doesn't run on Ubuntu 22.04. This is a major bummer, as our summer research interns were hoping to use DynamoRIO on modern Ubuntu 22.04. I'd rather not have to fall back to Ubuntu 20.04.

Please consider helping to solve the problem; having more contributors and active maintainers in the community helps tremendously. Also note that core DR and no-external-library clients should work fine on 22.04 (see #5437 (comment)).

@RobertHenry6bev

My apologies for the comment yesterday. We'll try to make DR work for our needs on 22.04. Anything involving glibc triggers repressed nightmares from decades ago.

derekbruening added a commit that referenced this issue Oct 21, 2022
Adds a workaround for the SIGFPE in glibc 2.34+ __libc_early_init() by
setting two ld.so globals located via hardcoded offsets, making this
fragile and considered temporary.

Tested on glibc 2.34 where every libc-using client crashes with SIGFPE
but they work with this fix.

Adds an Ubuntu22 GA CI run, but if we have failures due to other
reasons the plan is to drastically shrink the set of tests run, or to
abandon the run if it's too much work right now.

Issue: #5437
@derekbruening
Contributor

derekbruening commented Oct 21, 2022

I'm surprised nobody else has put effort into solving this. Today I tried writing reasonable values into the two ld.so variables identified above, which fixes the SIGFPE and allows the clients I tested to work as they had before. This is rather hacky, as hardcoded values are needed for the variables' GLRO offsets, unless someone knows of a way to find them more cleanly (decoding some exported function to find an offset would be a little better). This is PR #5695; a sketch of this kind of write follows the list below.

Summarizing the situation:

  • Hacky workaround for now
  • Long-term possible choices:
    • Keep going with whatever tweaks/hacks are needed to keep the current DR-as-private-loader scheme working.
    • Static PIC libc + libc++ with --wrap, and don't support other libs unless the user can find a static PIC version and link it with --wrap. This would work on Mac as well, where private libs may never work well, but it is limiting to users on Linux, where previously there was a lot of freedom in library use and many libraries "just worked" with no effort. It will also be annoying for the DR repo's own tools, which rely on dynamic compression libs, libunwind, etc. If we go this route, a bonus would be that a tool could make a single static lib with DR plus itself and its deps for simplified deployment.
    • Set up the private ld.so and client as though the kernel loaded them and let ld.so relocate and load everything else. This seems the most promising and as noted above may help Android as well but is a big refactor to how things are done and will likely encounter a bunch of issues even if the core setup is not too much code.
    • Switch from glibc to musl: require clients to use musl if they use libc. We'd have to experiment to see whether musl is missing any key features in its libc that many clients rely on; and we'd have to see whether its pthreads has its own problems.
    • Don't support library usage at all in general: this may seem like a non-starter but the model of linking DR and a client into the app ends up with this restriction, though some library use is possible if carefully done: e.g., the STL from the app can be used with placement new, and maybe a separate libc++ copy with --wrap could be linked. The DR API does have a lot of resources in it. Most users end up wanting other libs though.
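As referenced above, here is a sketch of the kind of hardcoded-offset write the hacky workaround performs. The 0x2a8/0x2b0 offsets are the ones visible in the 64-bit disassembly earlier in this thread; the values and names are illustrative, not what PR #5695 actually uses:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Offsets of dl_tls_static_size and dl_tls_static_align within ld.so's
 * GLRO struct, per the disassembly above.  Hardcoding them is exactly
 * what makes this approach fragile across glibc versions. */
#define GLRO_DL_TLS_STATIC_SIZE_OFFS 0x2a8
#define GLRO_DL_TLS_STATIC_ALIGN_OFFS 0x2b0

/* Illustrative values: any nonzero alignment avoids the divide-by-zero
 * SIGFPE in __nptl_tls_static_size_for_stack(). */
static void
set_tls_static_globals(unsigned char *glro_base)
{
    *(size_t *)(glro_base + GLRO_DL_TLS_STATIC_SIZE_OFFS) = 2048;
    *(size_t *)(glro_base + GLRO_DL_TLS_STATIC_ALIGN_OFFS) = 64;
}

int
main(void)
{
    /* Demo against a local buffer standing in for the real GLRO struct. */
    static size_t fake_glro[0x2c0 / sizeof(size_t)];
    memset(fake_glro, 0, sizeof(fake_glro));
    set_tls_static_globals((unsigned char *)fake_glro);
    printf("size=%zu align=%zu\n",
           fake_glro[GLRO_DL_TLS_STATIC_SIZE_OFFS / sizeof(size_t)],
           fake_glro[GLRO_DL_TLS_STATIC_ALIGN_OFFS / sizeof(size_t)]);
    return 0;
}
```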

derekbruening added a commit that referenced this issue Oct 22, 2022
Adds a workaround for the SIGFPE in glibc 2.34+ __libc_early_init() by
setting two ld.so globals located via hardcoded offsets, making this
fragile and considered temporary.  (Improvements might include decoding
__libc_early_init or other functions to find the offsets, which is also
fragile; making runtime options to set them for a non-rebuild fix;
disabling the call to __libc_early_init which doesn't seem to be needed
for 2.34).

Tested on glibc 2.34 where every libc-using client crashes with SIGFPE
but they work with this fix.

Adds an Ubuntu22 GA CI run but it has many failures due to the rseq
issue #5431.  Adds a workaround for this by having drrun set -disable_rseq
if it detects glibc 2.35+.  Even with this we have a number of test failures
so for now we use a label to just run 4 sanity-check tests.  This should
be enough to detect glibc changes that break the offsets here.

Issue: #5437, #5431
derekbruening added a commit to DynamoRIO/drmemory that referenced this issue Oct 22, 2022
Updates DR to cacb5424e for workarounds for 2 Ubuntu22 issues (glibc
SIGFPE and rseq failure).

Issue: DynamoRIO/dynamorio#5437, DynamoRIO/dynamorio#5431
@SweetVishnya
Contributor

SweetVishnya commented Oct 24, 2022

I tried to run my client with the proposed workaround. Now I get a hang with the release build on Ubuntu 22.04 instead of a crash. The debug build reports the following error:

```
<Starting application /fuzz/build-ubuntu22.04/tests/synthetic/bin64/fread (6226)>
<Initial options = -no_dynamic_options -client_lib '/fuzz/build-ubuntu22.04/dynamorio/bin64/../tools/lib64/debug/libtracer.so;0;"-o" "tmp"' -client_lib64 '/fuzz/build-ubuntu22.04/dynamorio/bin64/../tools/lib64/debug/libtracer.so;0;"-o" "tmp"' -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -early_inject -emulate_brk -no_inline_ignored_syscalls -disable_rseq -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<(1+x) Handling our fault in a TRY at 0x00007f7aa419573f>
<Application /fuzz/build-ubuntu22.04/tests/synthetic/bin64/fread (6226).  Internal Error: DynamoRIO debug check failure: /fuzz/dynamorio/core/synch.c:261 res
(Error occurred @0 frags in tid 6226)
version 9.0.19289, custom build
-no_dynamic_options -client_lib '/fuzz/build-ubuntu22.04/dynamorio/bin64/../tools/lib64/debug/libtracer.so;0;"-o" "tmp"' -client_lib64 '/fuzz/build-ubuntu22.04/dynamorio/bin64/../tools/lib64/debug/libtracer.so;0;"-o" "tmp"' -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -early_inject 
0x00007f785ff8e670 0x00007f7aa3fbd015
0x00007f785ff8e8c0 0x00007f7aa40bae86
0x00007f785ff8e910 0x00007f7aa41dbfd6
0x00007f785ff8e9b0 0x00007f7aa4195260
0x00007f785ff87890 0x00007f7aa41dbe0b
0x00007f785ff87930 0x00007f7aa4195260
0x00007f785ff88010 0x00007f7aa41dc162
0x00007f785ff880b0 0x00007f7aa4195260
0x00007f785ff88790 0x00007f7aa41dc162
0x00007f785ff88830 0x00007f7aa4195260
0x00007f785ff88f10 0x00007f7aa41dc162
0x00007f785ff88fb0 0x00007f7aa4195260
0x00007f785ff89690 0x00007f7aa41dc162
0x00007f785ff89730 0x00007f7aa4195260
0x00007f785ff89e10 0x00007f7aa41dc162
/fuzz/build-ubuntu22.04/dynamorio/lib64/debug/libdynamorio.so=0x00007f7aa3ed2000
/fuzz/build-ubuntu22.04/dynamorio/bin64/../tools/lib64/debug/libtracer.so=0x00007f7a5fec7000
/lib/x86_64-linux-gnu/libc.so.6=0x00007f7aa3a41000
/usr/lib64/ld-linux-x86-64.so.2=0x00007f7aa3e4b000>
```

GDB backtrace:

```
#0  is_at_do_syscall (dcontext=0x7fb7ddb47080, 
    pc=0x7fba21dca2e2 <compute_memory_target+15> "H\211\275\270\365\377\377H\211\265\260\365\377\377H\211\225\250\365\377\377H\211\215\240\365\377\377L\211\205\230\365\377\377H\213\205\250\365\377\377H\203\300(H\211E\360H\307E\330", esp=0x7fb7ddb76e20 "") at /fuzz/dynamorio/core/synch.c:261
#1  0x00007fba21dcbfd6 in main_signal_handler_C (sig=11, siginfo=0x7fb7ddb7eaf0, ucxt=0x7fb7ddb7e9c0, xsp=0x7fb7ddb7e9b8 "`R\330!\272\177") at /fuzz/dynamorio/core/unix/signal.c:5906
#2  0x00007fba21d85260 in xfer_to_new_libdr () at /fuzz/dynamorio/core/arch/x86/x86.asm:1203
#3  0x0000000000000007 in ?? ()
#4  0x0000000000000000 in ?? ()
```

P.S. I commented out all code in my client except the Boost options parsing, which I statically link with. It works well. I'll continue uncommenting code part by part to determine what breaks the client.

@derekbruening
Contributor

> #1 0x00007fba21dcbfd6 in main_signal_handler_C (sig=11, siginfo=0x7fb7ddb7eaf0, ucxt=0x7fb7ddb7e9c0, xsp=0x7fb7ddb7e9b8 "`R\330!\272\177") at /fuzz/dynamorio/core/unix/signal.c:5906

As you can see, there is a SIGSEGV; the assert in synch.c while processing the SIGSEGV is a secondary effect. I would suggest getting a callstack at the SIGSEGV point and debugging from there, as well as callstacks of all threads for the release-build hang. If this is not related to the private loader (i.e., the crash/hang is not in a private library), please open a separate issue.

@SweetVishnya
Contributor

SweetVishnya commented Oct 26, 2022

I managed to get the real backtrace:

```
#0  0x00007fbee5378253 in __GI___libc_cleanup_push_defer (buffer=buffer@entry=0x7ffef630c0b0) at ./nptl/libc-cleanup.c:30
#1  0x00007fbee5349b77 in __vfscanf_internal (s=s@entry=0x7ffef630c6c0, format=format@entry=0x7fbea1c272dc "%u.%u.%u", argptr=argptr@entry=0x7ffef630c6a8, mode_flags=mode_flags@entry=2)
    at ./stdio-common/vfscanf-internal.c:372
#2  0x00007fbee53493e2 in __GI___isoc99_sscanf (s=0x7ffef630c932 "5.4.0-131-generic", format=0x7fbea1c272dc "%u.%u.%u") at ./stdio-common/isoc99_sscanf.c:31
#3  0x00007fbea1b11d3c in boost::filesystem::detail::(anonymous namespace)::syscall_initializer::syscall_initializer (this=0x7fbea1d18240) at libs/filesystem/src/operations.cpp:883
#4  0x00007fbea1b18585 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=32767) at libs/filesystem/src/operations.cpp:894
#5  0x00007fbea1b1859f in _GLOBAL__sub_I.32767_operations.cpp(void) () at libs/filesystem/src/operations.cpp:3787
#6  0x00007fbee5a924ed in privload_call_lib_func (func=0x7fbea1b18588 <_GLOBAL__sub_I.32767_operations.cpp(void)>) at /fuzz/dynamorio/core/unix/loader.c:1068
#7  0x00007fbee5a90e4f in privload_call_entry (dcontext=0x7fbca17fd080, privmod=0x7fbca17d45b0, reason=1) at /fuzz/dynamorio/core/unix/loader.c:627
#8  0x00007fbee5967ff6 in privload_call_entry_if_not_yet (dcontext=0x7fbca17fd080, privmod=0x7fbca17d45b0, reason=1) at /fuzz/dynamorio/core/loader_shared.c:121
#9  0x00007fbee596a1e7 in privload_load_finalize (dcontext=0x7fbca17fd080, privmod=0x7fbca17d45b0) at /fuzz/dynamorio/core/loader_shared.c:829
#10 0x00007fbee59683c9 in loader_init_epilogue (dcontext=0x7fbca17fd080) at /fuzz/dynamorio/core/loader_shared.c:217
#11 0x00007fbee57d62ee in dynamorio_app_init_part_two_finalize () at /fuzz/dynamorio/core/dynamo.c:675
#12 0x00007fbee5a94f75 in privload_early_inject (sp=0x7ffef630e870, old_libdr_base=0x0, old_libdr_size=1) at /fuzz/dynamorio/core/unix/loader.c:2245
#13 0x00007fbee5a3b24a in reloaded_xfer () at /fuzz/dynamorio/core/arch/x86/x86.asm:1179
#14 0x0000000000000002 in ?? ()
#15 0x00007ffef63107c9 in ?? ()
#16 0x00007ffef63107e5 in ?? ()
#17 0x0000000000000000 in ?? ()
```

P.S. For now I am just replacing all boost::filesystem uses with std::filesystem in my project.
P.P.S. std::filesystem::canonical also crashes DR, so I removed it.

@SweetVishnya
Contributor

SweetVishnya commented Oct 27, 2022

Now I am getting a SIGFPE with a 32-bit client:

```
#0  0xf72a288d in __libc_early_init ()
#1  0xf7dca603 in privload_os_finalize (privmod=0x41c73b54) at /fuzz/dynamorio/core/unix/loader.c:749
#2  0xf7cd4bda in privload_load_process (privmod=0x41c73b54) at /fuzz/dynamorio/core/loader_shared.c:818
#3  0xf7cd4457 in privload_load (filename=0xffb62b60 "/usr/lib32/libc.so.6", dependent=0x40bfa4f8, client=false) at /fuzz/dynamorio/core/loader_shared.c:683
#4  0xf7dca6a7 in privload_locate_and_load (impname=0xf7772c66 "libc.so.6", dependent=0x40bfa4f8, reachable=false) at /fuzz/dynamorio/core/unix/loader.c:765
#5  0xf7dc9ccb in privload_process_imports (mod=0x40bfa4f8) at /fuzz/dynamorio/core/unix/loader.c:569
#6  0xf7cd4b4b in privload_load_process (privmod=0x40bfa4f8) at /fuzz/dynamorio/core/loader_shared.c:811
#7  0xf7cd2d6d in privload_process_early_mods () at /fuzz/dynamorio/core/loader_shared.c:139
#8  0xf7cd2f4d in loader_init_epilogue (dcontext=0x40c017c0) at /fuzz/dynamorio/core/loader_shared.c:203
#9  0xf7b7a524 in dynamorio_app_init_part_two_finalize () at /fuzz/dynamorio/core/dynamo.c:675
#10 0xf7dcd7c6 in privload_early_inject (sp=0xffb64380, old_libdr_base=0x0, old_libdr_size=0) at /fuzz/dynamorio/core/unix/loader.c:2245
#11 0xf7d81b2f in reloaded_xfer () at /fuzz/dynamorio/core/arch/x86/x86.asm:1187
```

derekbruening added a commit that referenced this issue Mar 9, 2023
Adds the same workaround for the SIGFPE in glibc 2.34+
__libc_early_init() as for 64-bit in PR #5695: we hardcode the 32-bit
offsets of the two globals written by the workaround.

Tested on glibc 2.34 where every libc-using client crashes with SIGFPE
but they work with this fix.

Adds an Ubuntu22 GA CI 32-bit run.

Issue: #5437
derekbruening added a commit that referenced this issue Mar 10, 2023
Adds the same workaround for the SIGFPE in glibc 2.34+
__libc_early_init() as for 64-bit in PR #5695: we hardcode the 32-bit
offsets of the two globals written by the workaround.

Tested on glibc 2.34 where every libc-using client crashes with SIGFPE
but they work with this fix.

Adds an Ubuntu22 GA CI 32-bit run.

Issue: #5437
@SweetVishnya
Contributor

@derekbruening, thank you! The workaround in #5902 resolved my issue.

@derekbruening
Contributor

Feedback from #6693 (comment) on wanting to keep the use of the system glibc:

> In my opinion, DynamoRIO's capability of working with the system glibc reasonably well is a great advantage over Pin. I think we should just keep an eye on new versions of glibc and fix broken fields, unless this becomes too unmanageable of course. :)
