linking staticlib files into shared libraries exports all of std:: #33221

Open
froydnj opened this issue Apr 26, 2016 · 12 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@froydnj
Contributor

froydnj commented Apr 26, 2016

Consider this toy example:

// sl.rs
#[no_mangle]
// extern "C" gives hello() the C ABI that driver.cpp declares.
pub extern "C" fn hello() {
    println!("hello world")
}

// driver.cpp
extern "C" {
  void hello();
}

void
really_hello()
{
  hello();
}

Compile and link:

$ rustc --crate-type staticlib --emit link=sl.a sl.rs
$ g++ -o hello.so -fPIC -shared driver.cpp sl.a

With rust 1.8.0, we have:

$ ls -l hello.so
-rwxr-xr-x 1 froydnj froydnj 2141544 Apr 26 11:02 hello.so

which is quite large (2MB!) for such a simple program. Despite all of std being compiled with the moral equivalent of -ffunction-sections, adding -Wl,--gc-sections does very little to slim down the binary:

$ g++ -o hello.so -fPIC -shared driver.cpp sl.a -Wl,--gc-sections
$ ls -l hello.so
-rwxr-xr-x 1 froydnj froydnj 2141544 Apr 26 11:02 hello.so

That's only about 400 bytes eliminated, which seems suboptimal.

The problem is that all of the public functions in libstd.rlib are marked as global symbols. When sl.a is linked into a shared library, all of those global symbols from libstd.rlib are treated as symbols that the newly created shared library should export as publicly visible. This creates bloat in the form of a large PLT that the shared library must tote around, and it renders -Wl,--gc-sections ineffective, because virtually everything is transitively reachable from these public functions in libstd.rlib. hello.so has ~5000 visible functions when it should really only have a handful. It even contains code for parsing floating-point numbers, which nothing in the functions shown above needs.
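
One way to check the export count is to run `nm -D --defined-only hello.so` and count the `T` entries. The sketch below is an illustration and not part of the original report: the function name `count_exported_text_symbols` is hypothetical, the sample input is made up, and it assumes the usual three-column `nm` layout (address, type letter, name).

```rust
/// Counts lines of `nm`-style output whose type letter is `T`
/// (a global symbol defined in the text section).
/// Assumes the three-column layout: address, type letter, name.
fn count_exported_text_symbols(nm_output: &str) -> usize {
    nm_output
        .lines()
        .filter(|line| {
            let mut fields = line.split_whitespace();
            let _address = fields.next();
            fields.next() == Some("T")
        })
        .count()
}

fn main() {
    // Made-up sample; real input would come from `nm -D --defined-only hello.so`.
    let sample = concat!(
        "0000000000001139 T hello\n",
        "0000000000001150 T really_hello\n",
        "0000000000001000 t local_helper\n"
    );
    println!("exported text symbols: {}", count_exported_text_symbols(sample));
}
```

Lowercase `t` entries are local symbols, so they are deliberately not counted.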

This example is admittedly contrived, but Firefox's use of Rust is not terribly dissimilar from this: we compile all the crates we use into rlibs, link all of the rlibs together into a staticlib, and then link the staticlib into our enormous shared library, libxul. We're pretty careful with symbol visibility; we have hundreds of thousands of symbols in libxul, but fewer than 500 exported symbols. We would very much like it if:

  1. libxul didn't suddenly grow thousands of newly-exported symbols overnight.
  2. libxul didn't contain Rust code from std (or otherwise) that it doesn't use.

We didn't think terribly hard about this when we enabled Rust on our desktop platforms (though we should have), but our Android team cares quite a bit about binary size, and Rust support taking up this much space would be a hard blocker on our ability to ship Rust on Android. It would be somewhat less than the above because we'd be compiling for ARM, but it'd still be significant. (I assume the situation is similar on Mac and Windows, though I haven't checked.)

cc @alexcrichton @rillian @glandium

@alexcrichton
Member

> We're pretty careful with symbol visibility; we have hundreds of thousands of symbols in libxul, but fewer than 500 exported symbols.

I'm curious, how do y'all end up doing this? We have a few options in Rust for what's going on here, but it may not quite overlap with what you're doing:

  1. First up, you can compile with LTO (-Clto) when creating a staticlib. This will internalize as much as possible on the Rust side and LLVM is basically doing --gc-sections at that point. This gives me a 194K shared object.

  2. Next, if you have a whitelist of symbols, you can use a linker script like this:

    {
      global:
        really_hello;
      local: *;
    };
    

    For me that generates a 102K shared library without LTO, and 72K with LTO.

  3. We can tweak the visibility of symbols by default in Rust (similar to #32887, "don't make intra-crate calls to exported functions go through the PLT or similar", I think?), but unfortunately I don't know much about all the visibility choices across platforms, so I dunno if this'd actually help.

Those are some ideas off the top of my head at least, but depending on what Gecko is already doing we can likely do something similar :)
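
The file in option 2 is what GNU ld calls a version script, passed at link time with `-Wl,--version-script=<file>`. If the whitelist already exists as a list of names, the file can be generated rather than written by hand. This is a minimal sketch under that assumption; `render_version_script` is a hypothetical helper, not an existing rustc or Cargo API.

```rust
/// Renders an anonymous GNU ld version script that exports only the
/// listed symbols and makes every other symbol local.
fn render_version_script(exports: &[&str]) -> String {
    let mut script = String::from("{\n");
    if !exports.is_empty() {
        script.push_str("  global:\n");
        for symbol in exports {
            script.push_str("    ");
            script.push_str(symbol);
            script.push_str(";\n");
        }
    }
    script.push_str("  local: *;\n};\n");
    script
}

fn main() {
    // `really_hello` is the only entry point the toy example needs to export.
    print!("{}", render_version_script(&["really_hello"]));
}
```

The names must be spelled exactly as they appear in the symbol table, so C++ entry points would need their mangled form.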

@alexcrichton
Member

cc @brson, @rust-lang/tools

@froydnj
Contributor Author

froydnj commented Apr 26, 2016

> I'm curious, how do y'all end up doing this?

We compile with -fvisibility=hidden and wrap all the system headers we use to ensure that visibility is restored to the default visibility for any symbols in system headers. We don't have to wrap symbols on Darwin, though; perhaps the compiler ensures things in system headers have the appropriate visibility?

On Windows I believe we don't have to do any of this because the default is to have local symbols in the library and you explicitly export what you want. (We wrap STL headers on Windows, but for different reasons.)

> First up, you can compile with LTO (-Clto) when creating a staticlib.

Ah, that's super-useful all on its own! We should definitely start doing this.

> Next, if you have a whitelist of symbols, you can use a linker script like this...For me that generates a 102K shared library without LTO, and 72K with LTO.

Is that with -Wl,--gc-sections enabled when linking? Is the LTO you speak of here applied at staticlib generation time, or at link time for the final shared library?

I don't think a linker script would work for us, because of the messiness of specifying the appropriate symbols (C++ symbol mangling, lovely), but it might be worth investigating.

> We can tweak the visibility of symbols by default in Rust

That was the possibility I initially thought of, but then I assume you'd have to compile objects separately for shared vs. static libraries or play weird linking tricks.

The number of symbols that std exports seems rather high (~5K), but I think that has to do with things like exporting functions for all the arithmetic operations on every arithmetic type, and various partial/total-ordering trait operations. I feel like we shouldn't have to do that in general, even if LTO would take care of dead code elimination for us.

@retep998
Member

retep998 commented Apr 27, 2016

> (I assume the situation is similar on Mac and Windows, though I haven't checked.)

This is not really an issue on Windows with msvc due to symbols only being exported from a DLL if they are marked as dllexport or if they are included as part of a .def file (unless nothing is specified via those methods in which case everything is exported). Anything which isn't referenced by those exported symbols is stripped by the linker. While Rust doesn't emit dllexport, it does emit a .def file when it controls creation of the DLL, which will make for very slim DLLs when using the upcoming cdylib crate type.

@alexcrichton
Member

@froydnj

Awesome, thanks for the info! I was discussing symbol visibility with @brson a bit today, and our prospects for something like changing the default to hidden visibility may be bleak (due to backcompat concerns now). In general, though, it seems that not a lot of thought has gone into the visibility of symbols in Rust beyond "internal or public", and hidden/protected/default visibility (at least in LLVM terms) is a whole new suite of choices within the "this is a public symbol" option.

All that basically to say, the compiler probably can't generate hidden symbols today (LLVM certainly can, we just haven't bound it), and it's not the clearest how we'd want to do that just yet. Should be possible in the long term of course though!

> Ah, that's super-useful all on its own! We should definitely start doing this.

One note about LTO is that it retains all #[no_mangle] non-Rust ABI reachable entry points, so I believe this would mean that the Rust C API would still all go into the PLT, for example.
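
To make that distinction concrete, here is a small sketch. The function names are hypothetical, and the comments restate the behavior described in this comment rather than a guarantee about any particular compiler version.

```rust
/// Internal helper with the Rust ABI. Nothing outside this crate refers to
/// it by name, so LTO is free to inline or internalize it.
fn greeting() -> &'static str {
    "hello world"
}

/// Entry point with the C ABI and an unmangled name. Per the comment above,
/// Rust LTO keeps symbols like this one exported.
#[no_mangle]
pub extern "C" fn sl_greeting_len() -> usize {
    greeting().len()
}

fn main() {
    println!("{}", sl_greeting_len());
}
```

Only `sl_greeting_len` would be expected to stay visible to a C or C++ caller; `greeting` is an implementation detail.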

> Is that with -Wl,--gc-sections enabled when linking?

Yeah, although once you enable Rust LTO the --gc-sections option shouldn't actually do much more (unless you rely on it for stripping C code).

> Is the LTO you speak of here applied at staticlib generation time, or at link time for the final shared library?

Ah yeah, so to clarify, Rust LTO isn't like C LTO, where we have a special object file format or something like that and the linker takes care of it. Rather, Rust LTO is our way of saying "take the whole world of Rust code, optimize it all together, then emit one object file". We do this by loading LLVM bitcode from rlibs, internalizing all symbols, then throwing it at the LLVM optimizer.

So to answer your question, this LTO happens at staticlib generation time. The .a archive will just have a smaller object file (as much of it will be stripped), and that object shouldn't necessarily be special in any way.

> That was the possibility I initially thought of, but then I assume you'd have to compile objects separately for shared vs. static libraries or play weird linking tricks.

Yeah right now we use the same object file for libstd.rlib as well as libstd.so, but these would have very different symbol visibility requirements, so that's problem 1 we'd have to solve :(

> The number of symbols that std exports seems rather high (~5K), but I think that has to do with things like exporting functions for all the arithmetic operations on every arithmetic type, and various partial/total-ordering trait operations.

Right, yeah, most of this stuff is just for future monomorphizations. An example of this is that all format strings in Rust (e.g. format!("foo: {}", "bar")) will generate a static symbol describing the parsed format string. If this format string is in a generic function, then monomorphizations of the generic function will need to reference the format string, so the symbol is made public (and we only have one level of public right now, which, as you've found, means "in the PLT").

All that is to say that these symbols are basically just a ton of internal implementation details, and it should be totally fine to hide all of them so long as other Rust code can still link to them (aka this sounds exactly like hidden visibility).
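
The format-string case can be pictured with a generic function like the following. This is only an illustrative sketch: `describe` is a hypothetical name, and the comments paraphrase the mechanism described above rather than documenting current compiler internals.

```rust
use std::fmt::Display;

/// Generic function containing a format string. A downstream crate that
/// instantiates `describe::<T>` gets its own monomorphized copy, and, per
/// the explanation above, that copy must be able to refer to the data
/// generated for "foo: {}" in the defining crate.
pub fn describe<T: Display>(value: T) -> String {
    format!("foo: {}", value)
}

fn main() {
    // Two instantiations: describe::<&str> and describe::<i32>.
    println!("{}", describe("bar"));
    println!("{}", describe(42));
}
```

Other Rust crates need to link against that data, but a C or C++ consumer of the final shared library never does, which is why hidden visibility would fit.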

@rillian
Contributor

rillian commented Apr 27, 2016

Adding -Clto to our staticlib build in gecko reduced the number of T symbols from 5230 to 280. So this does help a lot.

However, the library filesize went from 5033882 to 6321450.

@alexcrichton
Member

@rillian you're sure optimizations are turned on, right? (e.g. -O)

@rillian
Contributor

rillian commented Apr 28, 2016

No, that was an unoptimized build. Trying with -O now.

@rillian
Contributor

rillian commented Apr 28, 2016

Ok, in an opt build -Clto changes the staticlib file size from 4314802 to 1669346 bytes. Thanks for the hint!
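
For a sense of scale, the two measurements in this thread can be expressed as relative changes. This is a small sketch; `percent_change` is a hypothetical helper, and the byte counts are the ones quoted in the comments above.

```rust
/// Percentage change from `before` to `after`; a negative value means the
/// file got smaller.
fn percent_change(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}

fn main() {
    // Unoptimized build: -Clto made the staticlib larger.
    println!("unoptimized: {:+.1}%", percent_change(5_033_882, 6_321_450));
    // Optimized build (-O): -Clto made the staticlib smaller.
    println!("optimized:   {:+.1}%", percent_change(4_314_802, 1_669_346));
}
```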

@rillian
Contributor

rillian commented Apr 28, 2016

See https://bugzilla.mozilla.org/show_bug.cgi?id=1268547 about turning this on for gecko.

@michaelwoerister
Member

This might be the same underlying problem as in #37530.

steveklabnik added the T-compiler label and removed the A-compiler label on Mar 24, 2017
Mark-Simulacrum added the C-bug label on Jul 25, 2017
@alexcrichton
Member

Some more comments on this thread -- https://internals.rust-lang.org/t/rust-staticlibs-and-optimizing-for-size/5746

8 participants