Duplication of libLLVM-*.so in the distributed rustc component. #70838

Closed
eddyb opened this issue Apr 6, 2020 · 11 comments · Fixed by #72000
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. T-release Relevant to the release subteam, which will review and decide on the PR/issue.

Comments

@eddyb
Member

eddyb commented Apr 6, 2020

Since #57286, libLLVM-*.so has been distributed in the rustc component in two locations:

  • lib/libLLVM-*.so (presumably as a runtime dependency of bin/rustc)
  • lib/rustlib/$target/lib/libLLVM-*.so (not sure, maybe for linking custom drivers?)
    • likely belongs in the rustc-dev component, if the duplication can't be avoided

The difference can be seen in these consecutive nightlies' rustc components:

All 3 of those libLLVM-8svn.so files are identical (even across nightlies, yay for deterministic builds):

e400d157a7ceaee26f8680cb6f7430bd98c251dbb32e1f10549dda9f5769dc42  before/rustc-nightly-x86_64-unknown-linux-gnu/rustc/lib/rustlib/x86_64-unknown-linux-gnu/lib/libLLVM-8svn.so
e400d157a7ceaee26f8680cb6f7430bd98c251dbb32e1f10549dda9f5769dc42  after/rustc-nightly-x86_64-unknown-linux-gnu/rustc/lib/libLLVM-8svn.so
e400d157a7ceaee26f8680cb6f7430bd98c251dbb32e1f10549dda9f5769dc42  after/rustc-nightly-x86_64-unknown-linux-gnu/rustc/lib/rustlib/x86_64-unknown-linux-gnu/lib/libLLVM-8svn.so
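
For anyone who wants to repeat this check on an unpacked component, here is a minimal sketch that groups files by identical content. It is an illustration only: `collect_files` and `find_duplicates` are names invented here, it compares raw bytes rather than SHA-256 digests (the standard library has no SHA-256), and it holds whole files in memory, which is wasteful for a 67 MiB library but keeps the example short.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every regular file under `dir`.
/// Note: directory symlinks are followed, so a symlink loop would recurse forever.
fn collect_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect_files(&path, out)?;
        } else if path.is_file() {
            out.push(path);
        }
    }
    Ok(())
}

/// Return groups of paths whose contents are byte-for-byte identical.
/// Groups with a single member (unique files) are dropped.
fn find_duplicates(root: &Path) -> io::Result<Vec<Vec<PathBuf>>> {
    let mut files = Vec::new();
    collect_files(root, &mut files)?;

    // Key on the full contents, so there are no hash collisions to worry about.
    let mut by_content: HashMap<Vec<u8>, Vec<PathBuf>> = HashMap::new();
    for path in files {
        by_content.entry(fs::read(&path)?).or_default().push(path);
    }

    let mut groups: Vec<Vec<PathBuf>> = by_content
        .into_values()
        .filter(|paths| paths.len() > 1)
        .collect();
    // Sort for a stable, readable result.
    for group in &mut groups {
        group.sort();
    }
    groups.sort();
    Ok(groups)
}
```

Calling `find_duplicates` on the root of an extracted `rustc` component should report the two `libLLVM-*.so` paths as one group if they are still shipped twice.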

One fascinating aspect is that the size only increases from 82.22 MiB to 83.18 MiB, despite libLLVM-8svn.so alone being 67 MiB uncompressed, so compression is really helping.

Even the .tar.gz only goes from 111.41 MiB to 113.26 MiB.
(I'm using archive sizes from the 2019-01-06 and 2019-01-07 directory listings in https://static.rust-lang.org/dist, FWIW)


This part of the build system seems relevant, but I'm not sure which of the two this creates:

// Ensure that `libLLVM.so` ends up in the newly build compiler directory,
// so that it can be found when the newly built `rustc` is run.
dist::maybe_install_llvm_dylib(builder, target_compiler.host, &sysroot);

cc @alexcrichton @rust-lang/release

@eddyb eddyb added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-heavy Issue: Problems and improvements with respect to binary size of generated code. labels Apr 6, 2020
@jonas-schievink jonas-schievink added the T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. label Apr 6, 2020
@eddyb eddyb added the T-release Relevant to the release subteam, which will review and decide on the PR/issue. label Apr 6, 2020
@Mark-Simulacrum
Member

cc @cuviper -- you've touched code related to this a bunch recently I think, so maybe would have some time to investigate fixing this. I suspect we want to only ship one LLVM per tarball, at least, though ideally we would only have just the one LLVM library total as well.

@eddyb
Member Author

eddyb commented Apr 6, 2020

Half-way through writing up this issue I realized the codegen-backends connection.
I'd suggest we try to get rid of the lib/rustlib/$target/lib/libLLVM-*.so copy and see what breaks, it's possible it's been dead weight for a while now.

And only if we have to keep both for some odd reason, we'd move it to rustc-dev.

@cuviper
Member

cuviper commented Apr 6, 2020

> One fascinating aspect is that the size only increases from 82.22 MiB to 83.18 MiB, despite libLLVM-8svn.so alone being 67 MiB uncompressed, so compression is really helping.
>
> Even the .tar.gz only goes from 111.41 MiB to 113.26 MiB.

The rust-installer purposely orders the tarred files so that similar filenames are adjacent in the compression stream, especially in hope of helping with such duplicate files.
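
As a rough illustration of an ordering with that property, the sketch below sorts archive member paths by their reversed bytes, so entries that share a file name (and, failing that, an extension) land next to each other. This is a sketch under assumptions, not rust-installer's actual code, and `sort_for_compression` is a name made up for this example.

```rust
/// Order archive member paths so that entries with the same trailing
/// characters (same file name first, then same extension) become neighbours.
fn sort_for_compression(paths: &mut [String]) {
    // Compare the paths from their last byte backwards.
    paths.sort_by(|a, b| a.bytes().rev().cmp(b.bytes().rev()));
}
```

With the two `libLLVM-8svn.so` paths from the listing above plus a couple of unrelated files, the two LLVM copies end up adjacent, which gives a compressor with a large enough window the chance to encode the second copy as a reference to the first.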

> And only if we have to keep both for some odd reason, we'd move it to rustc-dev.

I think we may still want the rustlib copy for the odd cross-compiled tool using librustc*, but that definitely belongs in rustc-dev.

@eddyb
Member Author

eddyb commented Apr 6, 2020

> I think we may still want the rustlib copy for the odd cross-compiled tool using librustc*, but that definitely belongs in rustc-dev.

Is that... even possible? Is rustc-dev a per-target component in the same way rust-std is?

@cuviper
Member

cuviper commented Apr 6, 2020

> Is rustc-dev a per-target component in the same way rust-std is?

Yes it is -- try rustup component list -- but I don't know if anyone really needs that...

@eddyb
Member Author

eddyb commented Apr 6, 2020

Actually, can you try something like cross-compiling a copy of src/rustc to i686, after deleting the per-target libLLVM-*.so?

I suspect that it would actually run just fine on an i686 host, because rustc-dev's rlibs should only really contain the rmeta information and maybe some object code that librustc_driver-*.so is missing (which we could get rid of if we kept MIR for everything but idk how feasible that is).

That is, the result of linking against lib/rustlib/$target/lib/librustc_driver-*.so should still use lib/librustc_driver-*.so at runtime, so libLLVM-*.so in rustlib is unnecessary.

But maybe I'm missing something. Like running the cross-compiled i686 on an x64 host without anything more than the rustc-dev component?

@cuviper
Member

cuviper commented Apr 6, 2020

Hmm, I can't actually get cross-builds to work with rustc-dev. I added this to rustc.rs:

#![feature(rustc_private)]
extern crate rustc_driver;

... so it would pick that up from the sysroot. Trying to cross-compile i686, I get:

error[E0463]: can't find crate for `rustc_macros` which `rustc_driver` depends on
  --> ./rustc.rs:30:5
   |
30 |     rustc_driver::set_sigpipe_handler();
   |     ^^^^^^^^^^^^ can't find crate

I do have both i686 and x86_64 builds of rustc_macros in the sysroot. I suspect the error is because the i686 build has different metadata than my host build, so it can't find the one it wants for the host. That may mean that rustc-dev is really tied to being used on the same host.

It works if I just compile native x86_64 though. Then I tried removing the rustlib/.../libLLVM.so, and it gets essentially the same error as #68714:

error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" [...] "-lLLVM-9-rust-1.44.0-nightly" "-lutil" "-lutil" "-ldl" "-lrt" "-lpthread" "-lgcc_s" "-lc" "-lm" "-lrt" "-lpthread" "-lutil" "-lutil"
  = note: /usr/bin/ld: cannot find -lLLVM-9-rust-1.44.0-nightly
          collect2: error: ld returned 1 exit status

AFAICS, building normal crates still works fine with that extra libLLVM removed though.


In short:

  • Cross-compiling with rustc-dev seems broken -- maybe it should be shipped native-only.
    • The only difference would be in the manifest created for rustup.
  • Linking to rustc-dev crates will inherit -lLLVM, so that needs to be in the library path.

@eddyb
Member Author

eddyb commented Apr 6, 2020

> Linking to rustc-dev crates will inherit -lLLVM, so that needs to be in the library path.

Ah, fascinating. What happens if you build such an executable and then remove the rustc-dev component entirely? Can it pick up the dylibs outside rustlib or does it hardcode the paths inside?

@cuviper
Member

cuviper commented Apr 6, 2020

> Ah, fascinating. What happens if you build such an executable and then remove the rustc-dev component entirely? Can it pick up the dylibs outside rustlib or does it hardcode the paths inside?

The final executable still isn't actually linked to LLVM, since it didn't need any of those symbols:

$ readelf -d ./rustc | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [librustc_driver-ebfe476c9299964b.so]
 0x0000000000000001 (NEEDED)             Shared library: [libstd-426ca51550eed098.so]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

I can run this with rustup run nightly ./rustc, which sets LD_LIBRARY_PATH=$sysroot/lib, or by setting that path myself. But LD_LIBRARY_PATH=$sysroot/lib/rustlib/.../lib doesn't work if we've removed that library, since librustc_driver needs libLLVM itself.
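
To make the transitive part concrete, here is a deliberately simplified model: one search directory, a table of NEEDED entries per object, and a walk that reports what cannot be found. It ignores everything else a real dynamic loader does (system paths, RUNPATH, caches), and `unresolved` plus the shortened library names in the usage note are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Walk the NEEDED graph starting at `root` and return every library that
/// is not present in `available` (the contents of the one search directory).
fn unresolved(
    root: &str,
    needed: &HashMap<&str, Vec<&str>>,
    available: &HashSet<&str>,
) -> BTreeSet<String> {
    let mut missing = BTreeSet::new();
    let mut seen = HashSet::new();
    let mut stack = vec![root];

    while let Some(object) = stack.pop() {
        if let Some(deps) = needed.get(object) {
            for &dep in deps {
                if !seen.insert(dep) {
                    continue; // already handled
                }
                if available.contains(dep) {
                    // Found: its own NEEDED entries must be resolved too.
                    stack.push(dep);
                } else {
                    missing.insert(dep.to_string());
                }
            }
        }
    }
    missing
}
```

If `rustc` needs `librustc_driver.so` and `librustc_driver.so` needs `libLLVM.so`, a directory that contains the driver but not LLVM leaves `libLLVM.so` unresolved, even though the executable itself never lists it.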

@eddyb
Member Author

eddyb commented Apr 6, 2020

Okay, it being dependent on LD_LIBRARY_PATH makes sense.
Overall, probably the safe thing to do would be to move the duplicate .so to rustc-dev, oh well.

@cuviper
Member

cuviper commented Apr 7, 2020

Oh, llvm-tools-preview also needs LLVM, and those binaries are installed in rustlib/$target/bin with RUNPATH $ORIGIN/../lib. I guess they might be able to find the LLVM in $sysroot/lib anyway?
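
For reference, a small sketch of where that RUNPATH entry points. It expands `$ORIGIN` to the binary's directory and folds `..` lexically, without touching the filesystem (a real loader resolves the path through the filesystem, which can differ when symlinks are involved). `resolve_runpath` and the `/sysroot` prefix in the usage note are illustrative assumptions.

```rust
use std::path::{Component, Path, PathBuf};

/// Expand `$ORIGIN` in a RUNPATH entry to the directory holding `binary`,
/// then fold away `.` and `..` components lexically.
fn resolve_runpath(binary: &Path, runpath: &str) -> PathBuf {
    let origin = binary.parent().unwrap_or(Path::new(""));
    let expanded = runpath.replace("$ORIGIN", &origin.to_string_lossy());

    let mut resolved = PathBuf::new();
    for component in Path::new(&expanded).components() {
        match component {
            Component::ParentDir => {
                resolved.pop();
            }
            Component::CurDir => {}
            other => resolved.push(other.as_os_str()),
        }
    }
    resolved
}
```

For a tool at `/sysroot/lib/rustlib/x86_64-unknown-linux-gnu/bin/llvm-objdump`, `$ORIGIN/../lib` resolves to `/sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib`, which is the per-target directory discussed here rather than `$sysroot/lib`. Whether the tools can still reach the `$sysroot/lib` copy by some other route is the open question above.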

bors added a commit to rust-lang-ci/rust that referenced this issue May 13, 2020
Move the target libLLVM to rustc-dev

For running the compiler, we usually only need LLVM from `$sysroot/lib`,
which rustup will make available with `LD_LIBRARY_PATH`. We've also been
shipping LLVM in the `$target/lib` directory, which bloats the download
and installed size. The one time we do need the latter is for linking
`rustc-dev` libraries, so let's move it to that component directly.

Here are the dist sizes that I got before and after this change:

    archive                                                   before  after
    rust-1.45.0-dev-x86_64-unknown-linux-gnu.tar.gz           182M    159M
    rust-1.45.0-dev-x86_64-unknown-linux-gnu.tar.xz           107M    91M
    rustc-1.45.0-dev-x86_64-unknown-linux-gnu.tar.gz          100M    78M
    rustc-1.45.0-dev-x86_64-unknown-linux-gnu.tar.xz          68M     53M
    rustc-dev-1.45.0-dev-x86_64-unknown-linux-gnu.tar.gz      146M    168M
    rustc-dev-1.45.0-dev-x86_64-unknown-linux-gnu.tar.xz      92M     108M

The installed size should reduce by exactly one `libLLVM.so` (~70-80M),
unless you also install `rustc-dev`, and then it should be identical.

Resolves rust-lang#70838.
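
The per-archive changes implied by those figures can be worked out with a few lines. This is only arithmetic on the numbers quoted in the commit message; the shortened archive labels and the `delta` helper are invented for the example, and the unit is "M" exactly as printed.

```rust
/// (archive label, before, after) in "M", copied from the commit message.
const DIST_SIZES: [(&str, i64, i64); 6] = [
    ("rust .tar.gz", 182, 159),
    ("rust .tar.xz", 107, 91),
    ("rustc .tar.gz", 100, 78),
    ("rustc .tar.xz", 68, 53),
    ("rustc-dev .tar.gz", 146, 168),
    ("rustc-dev .tar.xz", 92, 108),
];

/// Signed size change (after minus before) for one archive,
/// or `None` if the label is not in the table.
fn delta(label: &str) -> Option<i64> {
    DIST_SIZES
        .iter()
        .find(|(name, _, _)| *name == label)
        .map(|(_, before, after)| after - before)
}
```

For the `.tar.xz` archives, `rustc` shrinks by 15M while `rustc-dev` grows by 16M, so the library is moved between components rather than removed from the distribution as a whole.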
@bors bors closed this as completed in c60b675 May 22, 2020