New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix BinaryEdit::getResolvedLibraryPath for Ubuntu 22.04 #1362
Fix BinaryEdit::getResolvedLibraryPath for Ubuntu 22.04 #1362
Conversation
- the main issue was `/sbin/ldconfig -p` on Ubuntu 22.04 has an extra line at the end: "Cache generated by: ldconfig (Ubuntu GLIBC 2.35-0ubuntu3) stable release version 2.35" - this would result in an occasionaly segfault - refactored the rest to avoid strtok which is known to have a fair amount of problems
template <typename ContainerT = std::vector<std::string>, | ||
typename PredicateT = std::string (*)(const std::string&)> | ||
inline ContainerT | ||
delimit( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For as much as I hate in/out parameters, I'd rather replace this with
std::vector<std::string> vals;
boost::split(vals, line, boost::token_compress_on);
just to have one less hand-rolled solution in this codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes perfect sense. I've actually had this fix for a while but hadn't submitted it bc I figured that would be the case and hadn't looked around for any existing solution.
*pos = '\0'; | ||
if (strcmp(key, filename.c_str()) == 0) { | ||
paths.push_back(val); | ||
if(ldconfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just terrible, and I'm honestly surprised it took this long to get a nasty bug here. I vote we replace this and the manual directory list lookup below by just using
if(void *h = dlopen(filename.c_str(), RTLD_LAZY)) {
char origin[PATH_MAX];
if(dlinfo(h, RTLD_DI_ORIGIN, name) == 0) {
_emplace_if_exists(origin); // check is useless, but uniform
}
}
The only caveat here is that
If the calling object (i.e., the shared library or executable from which dlopen() is called) contains a DT_RPATH tag, and does not contain a DT_RUNPATH tag, then the directories listed in the DT_RPATH tag are searched.
If we want to work around this, we could just continue to do the manual search of LD_LIBRARY_PATH first to let users override.
Begrudgingly, the DYNINST_COMPILER_SEARCH_DIRS has to stay because ldconfig doesn't always have the compiler-specific search directories available. At least on my system, dlopen
wouldn't find libgcc_s.so which lives in /usr/lib/gcc/x86_64-linux-gnu/12/libgcc_s.so.
With this, we'd have the heuristic:
- Use the provided filename if file exists (via stat rules)
- Search DYNINST_REWRITER_PATHS
- Search LD_LIBRARY_PATH
- Use ld lookup rules via
dlopen
.
man dlopen
for reference
o (ELF only) If the calling object (i.e., the shared library or executable from which dlopen() is called) contains a DT_RPATH tag, and does not contain a DT_RUNPATH tag, then the directories listed in the DT_RPATH tag are searched.
o If, at the time that the program was started, the environment variable LD_LIBRARY_PATH was defined to contain a colon-separated list of directories, then these are searched. (As a security measure, this variable is ignored for set-user-ID and set-group-ID programs.)
o (ELF only) If the calling object contains a DT_RUNPATH tag, then the directories listed in that tag are searched.
o The cache file /etc/ld.so.cache (maintained by ldconfig(8)) is checked to see whether it contains an entry for filename.
o The directories /lib and /usr/lib are searched (in that order).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want me to proceed with this? I'll probably first use with the RTLD_NOLOAD to make sure it's not already open bc I'm concerned whether there is a guarantee that loading the library will not have any unintended side effects outside of just finding the library, e.g., cause Dyninst to start parsing that library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had an internal discussion about this yesterday. The confounding factors are the RTLD_NOLOAD won't let you open a file unless it's already open. This means the only way to inspect RTLD_DI_ORIGIN is to actually load the library into Dyninst's address space. Aside from the security implications, it's just an inefficient way of doing it. We're pretty much stuck to parsing the output of ldconfig.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about NOLOAD on the library containing this function, grabbing that DI_ORIGIN and searching that? In many cases, that could give us the path we want. We could also NOLOAD dlopen the executable, get the link path, and search the dirs of the libs it is linked to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I fully understand what NOLOAD does. My point is that since this function is being called, unless the function is statically linked, the library is loaded so it will return a handle. And the same goes for passing nullptr to dlopen (which returns the handle to the exe). As long as those are dynamic objects, you can get the handle via NOLOAD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this helps to clarify:
auto* _handle = dlopen("libdyninstAPI.so", RTLD_LAZY|RTLD_NOLOAD);
// ... get origin of handle ...
auto _lib_directory = dirname(...);
In other words, the RT library you want is extremely likely to be installed in the same directory as the dyninstAPI library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is supposed to resolve any library name, so we have to at least consider everything in ld.so.cache which is likely a larger set than the union of DYNINST_REWRITER_PATHS, LD_LIBRARY_PATH, DYNINST_COMPILER_SEARCH_DIRS, and $ORIGIN.
@kupsch @bigtrak
This code is really just parsing the basedirname of the files reported from the cache. Why don't we just read the cache directly and report the basedirname or, if we want to implement a parser, from /etc/ld.so.conf?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is supposed to resolve any library name
Ok well, then how about adding a vector hints parameter which gets searched after LD_LIBRARY_PATH for situations like finding the RT library. If you are concerned about the ABI, we can just implement one with the same signature that calls the new one with zero hints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok well, then how about adding a vector hints parameter which gets searched after LD_LIBRARY_PATH for situations like finding the RT library.
What kinds of hints were you thinking? I think adding the "self" $ORIGIN would be good to do regardless. Refactoring this function to check for the file existence before proceeding on to the next lookup location would also be good. This would put off parsing ldconfig unless it's absolutely necessary and would make the $ORIGIN check an actual short circuit. Nevermind. I keep forgetting this function doesn't do what's printed on the tin. It returns all paths that contain the given file name. Shortcircuiting would break that behavior.
P.S. dlopen(nullptr)
is guaranteed to give you the current binary whereas dlopen("libdyninstAPI.so")
might not.
This is testing fine and fixes the issue in my 22.04 container. We can pick up the discussion about how to update/replace using ldconfig when we have time. |
This is testing fine and fixes the issue in my 22.04
container. We can pick up the discussion about how to
update/replace using ldconfig when we have time.
I'll set this testing in my 22.04 environment, and also
on recent fedora -- I'd just set those up for testing
the point release, and perfect timing on this.
Bolo
|
/sbin/ldconfig -p
on Ubuntu 22.04 has an extra line at the end: "Cache generated by: ldconfig (Ubuntu GLIBC 2.35-0ubuntu3) stable release version 2.35"strtok
(which is known to have a fair amount of problems) and consistently apply checking whether path existsldconfig -p
on Ubuntu 20.04ldconfig -p
on Ubuntu 22.04