rustc OSX LC_LOAD_DYLIB paths are broken #28640

Open
m4b opened this issue Sep 24, 2015 · 14 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. O-macos Operating system: macOS

Comments

@m4b
Contributor

m4b commented Sep 24, 2015

I've noticed that the /usr/local/bin/rustc has several dylib load commands for various rust libraries which have incorrect/nonexistent paths prefixed to them.

There are two points I'd like to bring up, the first being much more serious.

Bad Paths

In my setup various tools all report that /usr/local/bin/rustc loads/requires the following dynamic libraries:

x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_driver-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_trans-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_privacy-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_borrowck-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_resolve-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_lint-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_typeck-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libflate-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_data_structures-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libarena-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libgraphviz-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libgetopts-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librbml-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_back-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libsyntax-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libserialize-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libterm-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/liblog-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libfmt_macros-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_llvm-198068b3.dylib
x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
/usr/lib/libSystem.B.dylib
/usr/lib/libedit.3.dylib
/usr/lib/libc++.1.dylib

However, most of these (the rust libs) appear to be artifacts of the initial compilation build (on a side note, non-absolute paths on OSX typically should be prefixed with @rpath, @loader_path, or @executable_path). If you run:

DYLD_PRINT_LIBRARIES=true /usr/local/bin/rustc --version

you will notice all of the libraries actually get bound to /usr/local/lib/<name of dylib>. This only accidentally works, because dyld's default library search path(s) are:

$(HOME)/lib:/usr/local/lib:/lib:/usr/lib

and when interpreting a binary if dyld fails to find a library at the specified LC_LOAD_DYLIB path, it then takes the basename of the library with the bad path, e.g.:

x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib -> libstd-198068b3.dylib

and looks for that dynamic library in the default path list; in our case it just so happens the libraries were installed to /usr/local/lib, so it finds them and runs as normal. If they were installed to a nonstandard path, /usr/local/bin/rustc would fail to run with something like:

dyld: Library not loaded: x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
  Referenced from: /usr/local/bin/rustc
  Reason: image not found
Trace/BPT trap: 5

You can verify this by moving /usr/local/lib/libstd-198068b3.dylib somewhere else (don't do this unless you know how to undo it), or by manually editing the basename of the imported libraries in the rustc binary (don't do this either unless you know how to undo it); either case will fail with an error similar to the above.

Therefore /usr/local/bin/rustc is only accidentally running correctly on OSX as of this writing.
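The lookup described above can be sketched as a small model. This is only an illustration of the behavior reported in this comment, not dyld's actual algorithm: `resolve`, `fallback_dirs`, the `/Users/me` home directory, and the in-memory `present` set (a stand-in for the filesystem) are all hypothetical.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Fallback directories quoted above; `home` stands in for $(HOME).
fn fallback_dirs(home: &str) -> Vec<PathBuf> {
    vec![
        Path::new(home).join("lib"),
        PathBuf::from("/usr/local/lib"),
        PathBuf::from("/lib"),
        PathBuf::from("/usr/lib"),
    ]
}

/// Simplified model: try the recorded path first (relative names are taken
/// relative to `cwd`), then retry with only the leaf name in each fallback
/// directory. `present` is a stand-in for the filesystem.
fn resolve(
    install_name: &str,
    cwd: &Path,
    home: &str,
    present: &HashSet<PathBuf>,
) -> Option<PathBuf> {
    // Path::join leaves an absolute install name unchanged.
    let direct = cwd.join(install_name);
    if present.contains(&direct) {
        return Some(direct);
    }
    let leaf = Path::new(install_name).file_name()?;
    fallback_dirs(home)
        .into_iter()
        .map(|dir| dir.join(leaf))
        .find(|candidate| present.contains(candidate))
}

fn main() {
    let name =
        "x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib";

    // Installed under /usr/local/lib: found only through the fallback list.
    let mut present = HashSet::new();
    present.insert(PathBuf::from("/usr/local/lib/libstd-198068b3.dylib"));
    println!("{:?}", resolve(name, Path::new("/tmp"), "/Users/me", &present));

    // Installed under a nonstandard prefix: nothing is found.
    present.clear();
    present.insert(PathBuf::from("/opt/rust/lib/libstd-198068b3.dylib"));
    println!("{:?}", resolve(name, Path::new("/tmp"), "/Users/me", &present));
}
```

In this model the second call returns `None`, which corresponds to the "image not found" failure shown above.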

If /usr/local/lib/ is the preferred location for rust libraries (and why not), then I highly suggest outputting LC_LOAD_DYLIB paths like /usr/local/lib/libstd-198068b3.dylib, etc.

If you want this to be dynamic or configurable, then I suggest using @rpath, although this adds more complexity; more information can be found in Apple's official documentation.

Too Many Libraries

This is a less serious issue, but I believe most of the libraries printed above are actually not required to run rustc on OSX. Please correct me if I'm wrong, but it looks like the only imports in rustc are:

2038 __ZN4main20h62bf81b281987584efdE (8) ~> x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_driver-198068b3.dylib
2040 __ZN2rt10lang_start20hd654f015947477d622wE (8) ~> x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
2048 _rust_stack_exhausted (8) ~> x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
2050 _exit (8) ~> /usr/lib/libSystem.B.dylib
2028 dyld_stub_binder (8) -> /usr/lib/libSystem.B.dylib

Hence, the minimal set of dynamic libraries required is:

/usr/local/lib/librustc_driver-198068b3.dylib
/usr/local/lib/libstd-198068b3.dylib
/usr/lib/libSystem.B.dylib

which I believe closely matches the library dependencies for the GNU/Linux rustc distribution.
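As a rough illustration of how the minimal set follows from the import listing, the sketch below collects the distinct libraries that at least one import is bound to. The `~>` and `->` separators are copied from the listing above; `library_of` and `required_libraries` are hypothetical helpers, and the exact output format of the tool that produced the listing is an assumption.

```rust
use std::collections::BTreeSet;

// The import listing from above, one import per line.
const LISTING: &str = "\
2038 __ZN4main20h62bf81b281987584efdE (8) ~> x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_driver-198068b3.dylib
2040 __ZN2rt10lang_start20hd654f015947477d622wE (8) ~> x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
2048 _rust_stack_exhausted (8) ~> x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
2050 _exit (8) ~> /usr/lib/libSystem.B.dylib
2028 dyld_stub_binder (8) -> /usr/lib/libSystem.B.dylib
";

/// Returns the library an import line is bound to, if the line has one.
fn library_of(line: &str) -> Option<&str> {
    let (_, lib) = line.split_once("~>").or_else(|| line.split_once("->"))?;
    let lib = lib.trim();
    if lib.is_empty() {
        None
    } else {
        Some(lib)
    }
}

/// Distinct libraries that at least one import is bound to.
fn required_libraries(listing: &str) -> BTreeSet<&str> {
    listing.lines().filter_map(library_of).collect()
}

fn main() {
    for lib in required_libraries(LISTING) {
        println!("{}", lib);
    }
}
```

Only three distinct libraries come out of the five imports, matching the list above once the build-directory prefix is replaced by /usr/local/lib.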

@alexcrichton
Member

This is actually largely all expected behavior depending on how you look at it! For the first part, it looks like the entry in LC_LOAD_DYLIB is the same as the dylib's LC_ID_DYLIB directive. This is set via the -install_name flag to the linker, but because the compiler doesn't pass that option, the name defaults to the -o path:

     -install_name name
                 Sets an internal "install path" (LC_ID_DYLIB) in a dynamic
                 library. Any clients linked against the library will record
                 that path as the way dyld should locate this library.  If
                 this option is not specified, then the -o path will be
                 used.  This option is also called -dylib_install_name for
                 compatibility.

So to "fix" this we'd want to pass -install_name to all our linker invocations. Note that we're currently intentionally not using rpath or other such features (although it's debatable as to whether we should!).

Do you know if there's a benefit (beyond looking nicer) to doing this? It looks like we wouldn't necessarily get any concrete benefit, but I'm certainly no expert in this area!

For the second part, this is actually a little subtle with dylib generation and how we call the linker. Due to the way linkage in Rust works we continually link dylibs downstream instead of just linking them once, so when the rustc binary is generated it's linked (via -l) to all the upstream dylibs (e.g. all the libs you're seeing). The linker appears to assume that all dynamic libraries must be linked, so it emits LC_LOAD_DYLIB sections for all of them.

Now we also pass the -dead_strip flag to the linker which also allows stripping unused dynamic libraries, except that we don't actually generate any dynamic libraries that can be stripped because we don't pass this option:

     -mark_dead_strippable_dylib
                 Specifies that the dylib being built can be dead strip by
                 any client.  That is, the dylib has no initialization side
                 effects.  So if a client links against the dylib, but never
                 uses any symbol from it, the linker can optimize away the
                 use of the dylib.

The clause there about no initialization side effects makes me wary that we'd want to start passing it because a statically linked native library may have side effects (even though no Rust code does). Like the previous point, though, do you know of any concrete benefits of enabling dead stripping of dylibs? Everything will eventually be transitively needed anyway, so I'd expect it to not buy us too much beyond aesthetics of course.
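The interaction between unused dylibs and the strippable marker can be modeled in a few lines. This is a sketch of the rule as described in this comment, not of the linker's implementation: `LinkedDylib`, its fields, and `load_commands` are hypothetical names, and the symbol counts in `main` are illustrative.

```rust
/// One dylib handed to the linker via -l.
struct LinkedDylib<'a> {
    install_name: &'a str,
    /// Models whether the dylib was built with -mark_dead_strippable_dylib.
    dead_strippable: bool,
    /// Number of symbols the client actually binds from this dylib.
    symbols_used: usize,
}

/// Install names that end up as LC_LOAD_DYLIB entries under this model:
/// a dylib is dropped only if it is unused AND marked dead-strippable.
fn load_commands<'a>(linked: &[LinkedDylib<'a>]) -> Vec<&'a str> {
    linked
        .iter()
        .filter(|d| d.symbols_used > 0 || !d.dead_strippable)
        .map(|d| d.install_name)
        .collect()
}

fn main() {
    // Today: nothing is marked strippable, so unused dylibs stay listed.
    let unmarked = [
        LinkedDylib { install_name: "libstd-198068b3.dylib", dead_strippable: false, symbols_used: 2 },
        LinkedDylib { install_name: "libterm-198068b3.dylib", dead_strippable: false, symbols_used: 0 },
    ];
    println!("{:?}", load_commands(&unmarked));

    // If the Rust dylibs were marked, the unused one could be dropped.
    let marked = [
        LinkedDylib { install_name: "libstd-198068b3.dylib", dead_strippable: true, symbols_used: 2 },
        LinkedDylib { install_name: "libterm-198068b3.dylib", dead_strippable: true, symbols_used: 0 },
    ];
    println!("{:?}", load_commands(&marked));
}
```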

@m4b
Contributor Author

m4b commented Sep 25, 2015

Point 1

Good call checking the LC_ID_DYLIB.

So I would suggest using the install-name flag as you suggested; I believe this is typical in OSX builds and unusual to leave it off.

As for benefits of using this, honestly the aesthetic argument is enough for me, as the current setup looks strange and non-standard. I would argue when in Rome, do as the Romans do in this case.

A second reason is, again, that the dynamic library loads are only accidentally working because of dyld's default search paths. Relying on this behavior seems unnecessary and potentially dangerous (what if the default search locations change, someone uses non-standard environment variables, etc.), when it can all be resolved by setting the install names properly, with something like -install_name /usr/local/lib/$(basename $i).
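A minimal sketch of how such an argument pair could be derived from a library's output path, mirroring `-install_name /usr/local/lib/$(basename $i)`. The function `install_name_args` and its `prefix` parameter are hypothetical; nothing here is an existing rustc option or internal API.

```rust
use std::path::Path;

/// Builds the two linker arguments for an absolute install name by joining
/// a configurable prefix with the leaf name of the output file.
/// Returns None if `output` has no file name component.
fn install_name_args(output: &Path, prefix: &Path) -> Option<Vec<String>> {
    let leaf = output.file_name()?;
    let install_name = prefix.join(leaf);
    Some(vec![
        "-install_name".to_string(),
        install_name.to_string_lossy().into_owned(),
    ])
}

fn main() {
    let output = Path::new(
        "x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib",
    );
    // Default prefix as suggested above; a different prefix could be passed
    // in for a nonstandard install location.
    println!("{:?}", install_name_args(output, Path::new("/usr/local/lib")));
}
```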

Lastly, and this may or may not be a security issue, but again, the dylib's name (LC_ID_DYLIB) and LC_LOAD_DYLIB are called install names for a reason; it's where they reside and where the linker looks first.

To illustrate this, in a directory, say /tmp, create the following folder:

x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/

Now cp libstd-198068b3.dylib to that folder and run:

DYLD_PRINT_LIBRARIES=true rustc --version

and you should see that dyld loaded the libstd-198068b3.dylib in the directories we just created. In other words, it loaded a dynamic library with the same name from those directories, which isn't the system library... This cannot happen when the install name is absolute (or properly prefixed with @rpath).

Again, this may or may not be a security issue; it doesn't look like the libraries are signed, so I could in principle alter any symbol's assembly in the tmp libstd-198068b3.dylib to do whatever I like; regardless though, this can all be avoided and resolved by simply setting the install names to the expected values, and it just looks nice to boot :)
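A static-analysis tool could flag the risky case with a check along these lines. This is a sketch: `InstallNameKind` and `classify` are hypothetical names, and treating every name that is neither absolute nor `@`-prefixed as resolvable against the current directory is a simplification of the behavior demonstrated above.

```rust
#[derive(Debug, PartialEq)]
enum InstallNameKind {
    /// Starts with '/': always refers to one fixed location.
    Absolute,
    /// Starts with @rpath/, @loader_path/ or @executable_path/.
    LoaderRelative,
    /// Anything else: looked up relative to the current directory first.
    CwdRelative,
}

fn classify(install_name: &str) -> InstallNameKind {
    const PREFIXES: [&str; 3] = ["@rpath/", "@loader_path/", "@executable_path/"];
    if install_name.starts_with('/') {
        InstallNameKind::Absolute
    } else if PREFIXES.iter().any(|p| install_name.starts_with(*p)) {
        InstallNameKind::LoaderRelative
    } else {
        InstallNameKind::CwdRelative
    }
}

fn main() {
    let names = [
        "/usr/lib/libSystem.B.dylib",
        "@rpath/libstd-198068b3.dylib",
        "x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib",
    ];
    for name in names.iter() {
        println!("{:?}: {}", classify(name), name);
    }
}
```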

Point 2

Again, for me the aesthetic argument is sufficient; in addition, if they are serving no purpose, they should be removed.

As for side-effects, that is definitely a legitimate concern; given this condition, then since the majority of the dead libraries are rust libraries (which as you say have no side-effects), I'd suggest marking them as dead libraries at the very least. It cleans up the binary, and makes it clearer what exactly this binary depends on, say for static analysis tools, someone looking at the binary, someone computing the transitive closure of symbols against a library ;), etc.

As for the other two native libraries, which could be side-effecting, I only see:

/usr/lib/libedit.3.dylib
/usr/lib/libc++.1.dylib

As for the side-effect concern itself, I believe that warning only applies to loading a library purely for its initialization side-effect, and no other reason. I'm not sure how common this is, and I suspect not at all.

If libedit and libc++ are transitive dependencies, then I believe dyld will load and call every library's init routines before any symbol linking or passing control to main occurs, and so I don't think this is something you need to worry about (unless, like above, you are loading those libraries purely for their side-effects and you know the library isn't required and hence loaded by any other library or its transitive dependencies).

Of course don't take my word, absolutely test, but I believe all of the above caveats would also apply to the GNU/Linux version of rustc, which has the exact minimal dependencies I listed above.

So if the worry is not including libc++ in the load commands for its side-effects, this will equally apply to the GNU/Linux case, but it doesn't seem to be an issue there (and again, I believe the same condition applies to ld.so in that it loads and runs the init code of all libraries prior to passing control to the binary's main, etc.).

That being said, and given the above, I'd suggest the following as a conservative library declaration:

/usr/local/lib/librustc_driver-198068b3.dylib
/usr/local/lib/libstd-198068b3.dylib
/usr/lib/libSystem.B.dylib
/usr/lib/libedit.3.dylib
/usr/lib/libc++.1.dylib

Although again, I really don't think the latter two are needed at all, and only clutter things.

@steveklabnik steveklabnik added the O-macos Operating system: macOS label Sep 28, 2015
@m4b
Contributor Author

m4b commented Sep 28, 2015

On a related note, I've also noticed that executables or dynamic libraries import technically incorrect rust dynamic libraries if you go by those libraries' LC_ID_DYLIB (again, it works for the reasons above, namely that dyld doesn't find them and then uses the basename to resolve them against the default lib dirs).

E.g. compiling the following:

fn main() {
    println!("{}", 5);
}

with rustc -C prefer-dynamic foo.rs -o foo produces an executable with the following imports:

1018 __ZN2io5stdio6_print20h89c6f7f6ec14651d44gE (8) ~> x86_64-apple-darwin/stage2/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
1020 __ZN2rt10lang_start20hd654f015947477d622wE (8) ~> x86_64-apple-darwin/stage2/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
1028 _rust_stack_exhausted (8) ~> x86_64-apple-darwin/stage2/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib
1000 dyld_stub_binder (8) -> /usr/lib/libSystem.B.dylib
1010 __ZN3fmt3num16i32.fmt..Display3fmt20hd8c7f550e968736bw5ME (8) -> x86_64-apple-darwin/stage2/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib

NOTE the stage2 in the install_name path for the rust libs.

As stated above, otool -l /usr/local/lib/libstd-198068b3.dylib states that its LC_ID_DYLIB is:

x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/libstd-198068b3.dylib

NOTE the stage1.

rustc --version is rustc 1.3.0 (9a92aaf 2015-09-15), OSX 10.10.5

@alexcrichton
Member

I think it'd be fine to start passing -install_name, but from looking around it's not clear to me that there's a value to pass in. For example there doesn't seem to be any definitive answer as to what a standard value is for a project that doesn't know where it's going to be installed. Targeting a solution at just our own installation won't be enough as these are the same artifacts that all Rust programs link against, so I'm not sure what should be passed in.

At the very least we could pass -install_name output_filename.dylib to strip the folders and such, but I'm not sure if that's really much better than where we're at today.

For the -mark_dead_strippable_dylib option, I think we'd be fine to just start passing that; it probably wouldn't cause too many problems. Those two libraries that rustc links to aren't marked as strippable, however, and it has to be up to the linker currently to elide the symbols.

For the running against the "wrong libraries", this is actually somewhat intentional behavior in the sense that it's fine and it happens on all other systems as well.

@m4b
Contributor Author

m4b commented Sep 28, 2015

I would suggest default prefixing with /usr/local/lib since that's where you install them by default anyway; and then if a user wants a non-standard install prefix, let them pass that in at compile time and use that. Again, @rpath and other friends can be helpful in this matter depending on the use case, but it is probably overkill for now.

I've noticed #17219 has other issues with the library install name paths, so it might be something you just want to set to /usr/local/lib and deal with it from there.

It will also prevent a same-named library at a matching relative path from being loaded at execution time, which I mentioned above as well; after thinking about it more, this is probably pretty serious, and it can be entirely avoided by using absolute install_names.

@m4b
Contributor Author

m4b commented Sep 28, 2015

As for the wrong libraries, I can't understand how this is intentional or fine. To be blunt, it's just incorrect. A generated binary's library dependencies should reference exactly the name recorded in each library's LC_ID_DYLIB.

For example, scanning the dynamic libraries in /usr/local/lib with something like:

cd /usr/local/lib; for i in *.dylib; do otool -l "$i" | grep -A 3 LC_ID_DYLIB; done

The only offenders are rust libraries with non-absolute paths.

Similarly for the binaries in /usr/local/bin:

cd /usr/local/bin; for i in *; do echo "BINARY $i"; otool -l "$i" | grep -A 3 LC_LOAD_DYLIB; done

The only offenders seem to be rust generated binaries.

As such I'm not sure what you mean when you say it's fine or happens on all other systems as well? On Linux you have sonames, so it's a different system, and what's occurring here can't come up. In fact, this is the purpose of ldconfig -p: it maps sonames to actual install locations; but if your soname is bad, your binary won't load.

@alexcrichton
Member

Currently there's a number of methods by which you can install rust on OSX:

  1. Build from source
  2. Run the standard installers
  3. Homebrew
  4. Multirust

Unfortunately we can't just assume that /usr/local is the root for installation as it's only true some of the time, and worse still the builder of the compiler may not know where it's being installed to. As a result I don't think we can encode absolute paths as part of the build, but they can certainly be tweaked with install_name_tool after the fact.

When I say we're mixing up libraries what I mean is that the ones in /usr/local/lib are built in stage1 whereas the ones in /usr/local/lib/rustlib/$triple/lib are built during stage2. Artifacts are linked against the stage2 versions but end up running with the stage1 ones (due to DYLD_LIBRARY_PATH most of the time). We actually require these two stages to be the same for plugins, so it's fine to switch them up.

And don't get me wrong, it definitely looks like there's something to do here! I'm not sure how much of it should be in the compiler vs as part of an installation, but it sounds like at least some change needs to be made!

@alexcrichton
Member

cc @brson

@brson
Contributor

brson commented Oct 8, 2015

It seems like this problem affects more than just the libs we distribute, but also all OS X libs that rustc generates.

It's baffling that dynamic linkers all seem to expect you to magically know features of your runtime environment from your build environment. Linux has a similar problem that the path to the dynamic linker is embedded in the binary.

Dynamic linkers are such a nightmare.

@comex
Contributor

comex commented Jan 3, 2017

I think this is fixed. I now get

          cmd LC_LOAD_DYLIB
      cmdsize 64
         name @rpath/libstd-713ad88203512705.dylib (offset 24)

with rustc installed via rustup.

@m4b
Contributor Author

m4b commented Jan 3, 2017

@comex can you update with output for the LC_RPATH value?

@comex
Contributor

comex commented Jan 3, 2017

Oh, yeah. Actually, LC_RPATH is not quite right. There are two:

Load command 14
          cmd LC_RPATH
      cmdsize 128
         path @loader_path/../../Users/comex/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib (offset 12)
Load command 15
          cmd LC_RPATH
      cmdsize 64
         path /usr/local/lib/rustlib/x86_64-apple-darwin/lib (offset 12)

The first one is right, although I'm about to file another issue about the nonsensicality of using a relative path in this instance. The second one is wrong, as /usr/local/lib/rustlib does not exist.

@steveklabnik
Member

Triage: I don't have a Mac, so I can't retry this. The original report was in 2015, with a couple of comments in 2017. A lot has changed. Can anyone reproduce this?

@comex
Contributor

comex commented Oct 25, 2019

In my previous post here (from 2017), I think I pasted the rpath not of a toolchain dylib but of a binary compiled with -C prefer-dynamic; sorry for not making that clear at the time. Let's try this again.

LC_ID_DYLIB and LC_LOAD_DYLIB still use rpath:

% otool -Lvv ~/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libtest*.dylib                
/Users/comex/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libtest-6343af43832822dc.dylib:
	@rpath/libtest-6343af43832822dc.dylib (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libstd-330229df6d027e2b.dylib (compatibility version 0.0.0, current version 0.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.50.4)
	/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)

Here is the rpath for a dylib from the toolchain:

% otool -lvv ~/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libtest*.dylib | grep -A 2 RPATH
          cmd LC_RPATH
      cmdsize 32
         path @loader_path/../lib (offset 12)

No /usr/local/lib, which is good. @loader_path is documented in man dyld as "the path to the directory containing the mach-o binary which contains the load command using @loader_path"... so it makes sense for dylibs to refer to each other this way.
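To see why `@loader_path/../lib` works for the toolchain dylibs, the expansion can be sketched as follows. This is a purely lexical model of the `man dyld` description quoted above, not dyld's implementation; `normalize` and `expand_loader_path` are hypothetical helpers, and `normalize` assumes an absolute input path.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically drops "." and resolves ".." without touching the filesystem.
/// Assumes an absolute path, where ".." at the root stays at the root.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Expands a leading @loader_path against the directory containing `image`,
/// the Mach-O file that carries the load command.
fn expand_loader_path(rpath: &str, image: &Path) -> Option<PathBuf> {
    let rest = rpath.strip_prefix("@loader_path")?;
    let dir = image.parent()?;
    Some(normalize(&dir.join(rest.trim_start_matches('/'))))
}

fn main() {
    let image = Path::new(
        "/Users/comex/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libtest-6343af43832822dc.dylib",
    );
    // ../lib from the dylib's own directory leads back to that same directory,
    // which is where the other toolchain dylibs such as libstd live.
    println!("{:?}", expand_loader_path("@loader_path/../lib", image));
}
```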

However, when compiling a binary:

If I only pass -C prefer-dynamic without -C rpath, the binary is just broken:

% rustc -C prefer-dynamic a.rs
% otool -lvv a | grep -A 2 RPATH
# no output
% ./a
dyld: Library not loaded: @rpath/libstd-330229df6d027e2b.dylib
  Referenced from: /private/tmp/./a
  Reason: image not found
zsh: abort      ./a

This has a separate report: #50001 (Cannot execute Hello World on macOS with prefer-dynamic)

If I pass -C rpath, the binary does work, but the rpaths are even stranger than in 2017:

% cd /tmp/foobar
% rustc -C prefer-dynamic -C rpath /tmp/a.rs
% otool -lvv a | grep -A 2 RPATH       
          cmd LC_RPATH
      cmdsize 136
         path @loader_path/../../../Users/comex/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib (offset 12)
--
          cmd LC_RPATH
      cmdsize 72
         path /private/tmp/foobar/lib/rustlib/x86_64-apple-darwin/lib (offset 12)

The first rpath is similar to before: a relative path from the directory containing the binary to the dylib, even though in this case that means backing all the way out to the root directory. It works but will stop working if the binary is moved. I'm not sure if I ever filed an issue for it, but #58343 seems to be about the same issue on Linux.

The second rpath is broken... differently from before. For some reason, rustc has appended lib/rustlib/x86_64-apple-darwin/lib to /private/tmp/foobar, the directory containing the binary. This path does not exist; /private/tmp/foobar is completely empty except for that binary.
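The shape of the first rpath is what one gets from computing a relative path between the output directory and the toolchain's library directory. The sketch below reproduces that computation under stated assumptions: `relative_to` is a hypothetical helper, both inputs are assumed to be absolute and already normalized, and this is not taken from rustc's source.

```rust
use std::path::{Component, Path, PathBuf};

/// Relative path leading from directory `from` to `to`.
/// Both paths are assumed to be absolute and free of "." and "..".
fn relative_to(from: &Path, to: &Path) -> PathBuf {
    let from: Vec<Component> = from.components().collect();
    let to: Vec<Component> = to.components().collect();
    // Length of the shared leading part (at least the root for absolute paths).
    let common = from.iter().zip(&to).take_while(|(a, b)| a == b).count();
    let mut out = PathBuf::new();
    // One ".." for every remaining component of `from`...
    for _ in common..from.len() {
        out.push("..");
    }
    // ...then descend into what is left of `to`.
    for comp in &to[common..] {
        out.push(comp.as_os_str());
    }
    out
}

fn main() {
    let libdir = Path::new(
        "/Users/comex/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib",
    );
    // Binary built in /private/tmp/foobar: three ".." segments back to the root.
    println!("{}", relative_to(Path::new("/private/tmp/foobar"), libdir).display());
    // The same binary one directory deeper would need a different rpath.
    println!("{}", relative_to(Path::new("/private/tmp/foobar/bin"), libdir).display());
}
```

Because the number of `..` segments depends on where the binary sits, an rpath computed this way is tied to the build-time location of the binary.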
