Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

relocatable: switch from run-time relocation to install-time relocation #4863

Merged
merged 1 commit into from Aug 19, 2019

Conversation

avikivity
Copy link
Member

@avikivity avikivity commented Aug 18, 2019

Our current relocation works by invoking the dynamic linker with the
executable as an argument. This confuses gdb since the kernel records
the dynamic linker as the executable, not the real executable.

Switch to install-time relocation with patchelf: when installing the
executable and libraries, all paths are known, and we can update the
path to the dynamic loader and to the dynamic libraries.

Since patchelf itself is dynamically linked, we have to relocate it
dynamically (with the old method of invoking it via the dynamic linker).
This is okay since it's a one-time operation and since we don't expect
to debug core dumps of patchelf crashes.

We lose the ability to run scylla directly from the uninstalled
tarball, but since the nonroot installer is already moving in the
direction of requiring install.sh, that is not a great loss, and
certainly the ability to debug is more important.

dh_strip barfs on some binaries which were treated with patchelf,
so exclude them from dh_strip. This doesn't lose any functionality,
since these binaries didn't have debug information to begin with
(they are already-stripped Fedora executables).

Fixes #4673.

Our current relocation works by invoking the dynamic linker with the
executable as an argument. This confuses gdb since the kernel records
the dynamic linker as the executable, not the real executable.

Switch to install-time relocation with patchelf: when installing the
executable and libraries, all paths are known, and we can update the
path to the dynamic loader and to the dynamic libraries.

Since patchelf itself is dynamically linked, we have to relocate it
dynamically (with the old method of invoking it via the dynamic linker).
This is okay since it's a one-time operation and since we don't expect
to debug core dumps of patchelf crashes.

We lose the ability to run scylla directly from the uninstalled
tarball, but since the nonroot installer is already moving in the
direction of requiring install.sh, that is not a great loss, and
certainly the ability to debug is more important.

dh_strip barfs on some binaries which were treated with patchelf,
so exclude them from dh_strip. This doesn't lose any functionality,
since these binaries didn't have debug information to begin with
(they are already-stripped Fedora executables).

Fixes scylladb#4673.
@avikivity
Copy link
Member Author

Updated: some hacks to make .deb generation works (documented in patch).

Copy link
Contributor

@espindola espindola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the right way to patch the binary at install time.

Given that we still need a shell script, it is not clear that it is not better to use a relative path for the interpreter instead of patching the binary at install time.

#!/bin/bash -e
export GNUTLS_SYSTEM_PRIORITY_FILE="\${GNUTLS_SYSTEM_PRIORITY_FILE-$prefix/libreloc/gnutls.config}"
export LD_LIBRARY_PATH="$prefix/libreloc"
exec -a "\$0" "$prefix/libexec/$bin" "\$@"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are creating a script anyway, it seems better to use a CWD relative path for the interpreter. That way the script can be similar to

cd $prefix/libexec
exec -a "$0" ./$bin "$@"

And we don't need to patch the binary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't change CWD from under the executable's feet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. LGTM

@@ -34,7 +34,9 @@ override_dh_installinit:
dh_installinit --no-start --name node-exporter

override_dh_strip:
dh_strip -Xlibprotobuf.so.15 -Xld.so --dbg-package={{product}}-server-dbg
# The binaries (ethtool...patchelf) don't pass dh_strip after going through patchelf. Since they are
# already stripped, nothing is lost if we exclude them, so that's what we do.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know what is special about these?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

adjust_bin() {
local bin="$1"
# We could add --set-rpath too, but then debugedit (called by rpmbuild) barfs
# on the result. So use LD_LIBRARY_PATH in the thunk, below.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know why? What is the error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error manifests in debugedit complaining that DWARF version 0 is unsupported, followed by another failure. Random googling shows a bug in patchelf where the linker used the rpath string as a string pool and pointed other strings into it (not 100% sure), and since patchelf nuked it those other strings got corrupted. Maybe it's related and maybe not.

@bhalevy
Copy link
Member

bhalevy commented Aug 19, 2019

@fruch, we probably need to adjust scylla-ccm/ccmlib/scylla_node.py the same way to get debuggable coredumps.

https://github.com/scylladb/scylla-ccm/blob/468f559e248fb0144496b6807baacf4404b4a10d/ccmlib/scylla_node.py#L197-L201

@avikivity
Copy link
Member Author

@bhalevy ccm should use install.sh instead of duplicating it

@avikivity
Copy link
Member Author

@espindola is you are satisfied, please mark it on github, or @tgrabiec will never merge it

@slivne slivne requested a review from tgrabiec August 19, 2019 20:35
@slivne
Copy link
Contributor

slivne commented Aug 19, 2019

@tgrabiec if all is ok please merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Thread debugging is not available for relocatable scylla package
5 participants