Thread debugging is not available for relocatable scylla package #4673
In the scylla-debuginfo package we have /usr/lib/debug/opt/scylladb/libreloc/libthread_db-1.0.so-666.development-0.20190711.73a1978fb.el7.x86_64.debug, but we do not actually have libthread_db.so.1 in /opt/scylladb/libreloc, because it does not appear in the ldd output for the scylla binary. To debug threads, we need to add the library to the relocatable package manually.

Fixes #4673
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190711111058.7454-1-syuu@scylladb.com>
In the scylla-debuginfo package we have /usr/lib/debug/opt/scylladb/libreloc/libthread_db-1.0.so-666.development-0.20190711.73a1978fb.el7.x86_64.debug, but we do not actually have libthread_db.so.1 in /opt/scylladb/libreloc, because it does not appear in the ldd output for the scylla binary. To debug threads, we need to add the library to the relocatable package manually.

Fixes #4673
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190711111058.7454-1-syuu@scylladb.com>
(cherry picked from commit 842f75d)
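The packaging gap described in this commit message can be sketched in Python. This is only an illustration, not the project's packaging script: `parse_ldd_output`, `libs_to_package` and the sample ldd output are hypothetical. It shows why a library that the binary never links against, such as libthread_db.so.1, has to be listed by hand.

```python
import re

# Matches ldd lines such as "libfoo.so.1 => /lib64/libfoo.so.1 (0x00007f...)".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")


def parse_ldd_output(text):
    """Map each library soname in ldd output to its resolved path."""
    libs = {}
    for line in text.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def libs_to_package(ldd_text, extra_libs):
    """Combine ldd-discovered libraries with ones that must be added by hand.

    extra_libs maps soname -> path for libraries the binary does not link
    against (so ldd cannot see them) but that a debugger loads itself.
    """
    libs = parse_ldd_output(ldd_text)
    for soname, path in extra_libs.items():
        libs.setdefault(soname, path)
    return libs


# Hypothetical, shortened ldd output used only for illustration.
SAMPLE_LDD = """\
    linux-vdso.so.1 (0x00007ffd2a5f2000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f2d1c000000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f2d1bc00000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f2d1c400000)
"""

packaged = libs_to_package(
    SAMPLE_LDD, {"libthread_db.so.1": "/lib64/libthread_db.so.1"}
)
```

In this sketch libthread_db.so.1 appears in `packaged` only because it was passed in explicitly; `parse_ldd_output(SAMPLE_LDD)` alone does not contain it.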
Even with
|
I don't see anything interesting in strace, looks like gdb reads debug info files and then fails. When I execute the command for the second time, there's no shared lib reading:
Also, the backtrace has everything "optimized out", which is unusual. It may be a problem with matching debug info. When debugging the same crash on a non-relocatable binary, there are plenty of things in the backtrace which are not optimized out. |
Cc @espindola |
@bhalevy I think you could use |
@slivne I'm not aware of any workaround. @espindola, did you have a chance to look at this? |
I may be off here, but I'll ask anyway: can we use the Docker image that was used to build this version to analyze the core dump?
|
On Wed, Aug 14, 2019 at 2:20 PM Shlomi Livne ***@***.***> wrote:
> I guess I'm off - but I'll try - can we use the docker that was used to build the version to analyze the coredump ?

I tried that, same problem. |
What about using the frozen toolchain? It even has gdb installed. |
Same problem.
|
I guess it happens because the core contains links to paths which no longer exist. Or perhaps rpm's post-processing to generate split debuginfo got confused. |
I'll try to reproduce to see which is which. |
No. I came back from vacation on Monday and have been going over review requests on the types.hh refactoring. |
Well, one problem is that gdb thinks the binary is ld.so. It is correct in thinking so, but we need to trick it into thinking the binary is scylla.bin. |
(that problem happens with |
With
in fact they're even nicer because gdb colorizes them. |
On Thu, Aug 15, 2019 at 01:08:57AM -0700, Avi Kivity wrote:
> Well, one problem is that gdb thinks the binary is ld.so. It is correct in thinking so, but we need to trick it into thinking the binary is scylla.bin.

We can set the interpreter path when linking; then there will be no need to run Scylla via ld.so.

--
Gleb.
|
We don't know the interpreter path during link time. |
On Thu, Aug 15, 2019 at 01:16:10AM -0700, Avi Kivity wrote:
> We don't know the interpreter path during link time.

There is always a hack with patchelf.

--
Gleb.
|
Yes, but patchelf itself is dynamically linked. What we can do is apply the ld.so trick to patchelf, then use patchelf during installation to adjust the binary. |
Will there ever be a binary that we don't relocate? Why can't we recompile patchelf statically? Does it really need to be dynamically linked? |
Also, if we use relative paths, can't we know the path to the interpreter at build time? So we could compile scylla setting RPATH to |
The interpreter has to be an absolute path. |
Our current relocation works by invoking the dynamic linker with the executable as an argument. This confuses gdb since the kernel records the dynamic linker as the executable, not the real executable.

Switch to install-time relocation with patchelf: when installing the executable and libraries, all paths are known, and we can update the path to the dynamic loader and to the dynamic libraries.

Since patchelf itself is dynamically linked, we have to relocate it dynamically (with the old method of invoking it via the dynamic linker). This is okay since it's a one-time operation and since we don't expect to debug core dumps of patchelf crashes.

We lose the ability to run scylla directly from the uninstalled tarball, but since the nonroot installer is already moving in the direction of requiring install.sh, that is not a great loss, and certainly the ability to debug is more important.

Fixes scylladb#4673.
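The two relocation methods in this commit message can be contrasted with a short Python sketch. It is a hedged illustration rather than the contents of install.sh: the helper names and the paths under /opt/scylladb are assumptions made for the example. Only `--library-path` (an ld.so option) and the `--set-interpreter` and `--set-rpath` options of patchelf are real.

```python
def ld_so_launch_argv(ld_so, libdir, real_exe, args=()):
    """Old method: start the dynamic linker and pass the real binary to it.

    The kernel executes ld_so, so ld_so is what gets recorded as the
    executable, which is what confuses gdb.
    """
    return [ld_so, "--library-path", libdir, real_exe, *args]


def install_time_commands(patchelf, ld_so, libdir, target):
    """New method: rewrite target once, when all install paths are known.

    patchelf is itself dynamically linked, so each step still starts it
    through ld_so (the old method), as the commit message describes.
    """
    steps = (
        ["--set-interpreter", ld_so, target],
        ["--set-rpath", libdir, target],
    )
    return [ld_so_launch_argv(ld_so, libdir, patchelf, step) for step in steps]


# Hypothetical install layout, for illustration only.
LD_SO = "/opt/scylladb/libreloc/ld.so"
LIBDIR = "/opt/scylladb/libreloc"
EXE = "/opt/scylladb/libexec/scylla"
PATCHELF = "/opt/scylladb/libexec/patchelf"

old_argv = ld_so_launch_argv(LD_SO, LIBDIR, EXE, ["--version"])
install_cmds = install_time_commands(PATCHELF, LD_SO, LIBDIR, EXE)
```

Each entry of `install_cmds` could be handed to `subprocess.run(cmd, check=True)` during installation. After that the binary can be executed directly, so the kernel records the real executable instead of the dynamic linker.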
I was just checking that. The kernel passes the PT_INTERP string to open_exec(elf_interpreter), and that becomes do_open_execat(AT_FDCWD, filename, 0), so it should work with relative paths, no? |
relative paths do work, I just created a file with |
But do thread-locals work? The issue was with thread locals, not lack of backtraces. |
@espindola the paths are relative to $CWD, while we want them to be relative to $ORIGIN. |
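The interpreter discussion can be made concrete with a small reader for the PT_INTERP program header. This is an illustrative sketch, not code from the Scylla tree: `read_interp` and `effective_interp` are hypothetical helper names, and only little-endian ELF64 images are handled. It shows that a relative interpreter string resolves against the working directory of the process, not against the location of the binary.

```python
import os
import struct

PT_INTERP = 3  # program header type that names the requested dynamic linker


def read_interp(data):
    """Return the PT_INTERP string of a little-endian ELF64 image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    # ELF64 header: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Program header: p_type at 0, p_flags at 4, p_offset at 8, p_filesz at 32.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, base)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None


def effective_interp(interp, cwd):
    """Path the kernel would open for interp in a process started in cwd.

    An absolute interp is returned unchanged; a relative one depends on cwd.
    """
    return os.path.normpath(os.path.join(cwd, interp))
```

For a real file, `read_interp(open(path, "rb").read())` returns the requested loader. A relative value such as `libreloc/ld.so` would only work when the process is started from the matching directory, which is the $CWD versus $ORIGIN problem raised above.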
I wrote https://github.com/avikivity/scylla/commits/patchelf, which should fix the problem, except that it triggers a bug (in patchelf or debugedit) so we can't create rpms any more. |
@tgrabiec I think they don't, because the executable (as far as gdb is concerned) is ld.so instead of scylla. |
|
I'll just drop rpath modifications and rely on LD_LIBRARY_PATH. That means the binary has to be called through the thunk, but we have to have that for GNUTLS_SYSTEM_PRIORITY_FILE. |
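A thunk of the kind mentioned here could look roughly like the following Python sketch. It is an assumption-laden illustration: the real thunk is not shown in this thread, `thunk_env` and `run_through_thunk` are hypothetical names, and the file paths in the usage note are made up. Only the two environment variable names come from the comment above.

```python
import os


def thunk_env(base_env, libdir, gnutls_priority_file):
    """Return the environment a wrapper (thunk) would hand to the real binary."""
    env = dict(base_env)
    existing = env.get("LD_LIBRARY_PATH")
    # Bundled libraries go first so they take precedence over system copies.
    env["LD_LIBRARY_PATH"] = libdir if not existing else libdir + ":" + existing
    env["GNUTLS_SYSTEM_PRIORITY_FILE"] = gnutls_priority_file
    return env


def run_through_thunk(real_exe, args, env):
    """Replace the current process with the real binary; does not return."""
    os.execve(real_exe, [real_exe, *args], env)
```

A caller might use `run_through_thunk(exe, sys.argv[1:], thunk_env(os.environ, "/opt/scylladb/libreloc", "/opt/scylladb/libreloc/gnutls.config"))`, where both paths are placeholders. Because the real binary is executed directly, its interpreter must already have been set with `patchelf --set-interpreter`.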
On Sun, Aug 18, 2019 at 01:04:07AM -0700, Avi Kivity wrote:
> `patchelf --set-interpreter` does not trigger the bug. `patchelf --set-rpath` does, with either $ORIGIN or a full path. We could set rpath in the linker command line, but then ./build/release/scylla wouldn't work any more.

So is the bug related or unrelated to what gdb thinks the executable name is? Because if it is related, we may try to fool it by changing argv[0] to point to the binary directly.

--
Gleb.
|
We already did that (with exec -a), but I think the kernel records the true binary. |
Strangely, my hack failed testing, but it passed testing with |
Looks like both are needed. |
With a bit of extra help, works on .deb too. |
Our current relocation works by invoking the dynamic linker with the executable as an argument. This confuses gdb since the kernel records the dynamic linker as the executable, not the real executable.

Switch to install-time relocation with patchelf: when installing the executable and libraries, all paths are known, and we can update the path to the dynamic loader and to the dynamic libraries.

Since patchelf itself is dynamically linked, we have to relocate it dynamically (with the old method of invoking it via the dynamic linker). This is okay since it's a one-time operation and since we don't expect to debug core dumps of patchelf crashes.

We lose the ability to run scylla directly from the uninstalled tarball, but since the nonroot installer is already moving in the direction of requiring install.sh, that is not a great loss, and certainly the ability to debug is more important.

dh_strip barfs on some binaries which were treated with patchelf, so exclude them from dh_strip. This doesn't lose any functionality, since these binaries didn't have debug information to begin with (they are already-stripped Fedora executables).

Fixes scylladb#4673.
Our current relocation works by invoking the dynamic linker with the executable as an argument. This confuses gdb since the kernel records the dynamic linker as the executable, not the real executable.

Switch to install-time relocation with patchelf: when installing the executable and libraries, all paths are known, and we can update the path to the dynamic loader and to the dynamic libraries.

Since patchelf itself is dynamically linked, we have to relocate it dynamically (with the old method of invoking it via the dynamic linker). This is okay since it's a one-time operation and since we don't expect to debug core dumps of patchelf crashes.

We lose the ability to run scylla directly from the uninstalled tarball, but since the nonroot installer is already moving in the direction of requiring install.sh, that is not a great loss, and certainly the ability to debug is more important.

dh_strip barfs on some binaries which were treated with patchelf, so exclude them from dh_strip. This doesn't lose any functionality, since these binaries didn't have debug information to begin with (they are already-stripped Fedora executables).

Fixes scylladb#4673.

(cherry-picked from commit 698b72b)

Backport notes:
- 3.1 doesn't call install.sh from the debian packager, so add an adjust_bin and call it from the debian rules file directly
- adjusted install.sh for 3.1 prefix (/usr) compared to master prefix (/opt/scylladb)
Scylla version: 3.1

Without thread debugging, most of the `scylla-gdb.py` commands won't work. Thread-locals can't be read.

I guess this is due to `/usr/lib/debug/opt/scylladb/libreloc/libthread_db.so.1`, which matches `/opt/scylladb/bin/../libreloc/libpthread.so.0`, being missing.

GDB log with `set debug libthread-db 1`:
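For reference, a gdb invocation that enables this logging and points gdb at a directory containing libthread_db.so.1 could be assembled as in the sketch below. The helper name `gdb_core_argv` and the example paths are hypothetical. The gdb settings used (`set debug libthread-db`, `set libthread-db-search-path`) and the `-iex` option, which runs a command before the inferior is loaded, are standard gdb features. Whether this is enough depends on gdb seeing the real executable, which is the problem discussed in the comments.

```python
def gdb_core_argv(exe, core, libthread_db_dir):
    """Build a gdb command line for inspecting core with thread debugging.

    -iex is used so both settings take effect before gdb loads exe and core.
    """
    return [
        "gdb",
        "-iex", "set debug libthread-db 1",
        "-iex", "set libthread-db-search-path " + libthread_db_dir,
        exe,
        core,
    ]


# Hypothetical paths, for illustration only.
argv = gdb_core_argv(
    "/opt/scylladb/libexec/scylla",
    "/var/lib/scylla/coredump/core.12345",
    "/opt/scylladb/libreloc",
)
```

The resulting list can be passed to `subprocess.run(argv)` from an interactive terminal.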