Avoid shipping duplicate artifacts in the host and target sysroot #42645

Open
alexcrichton opened this issue Jun 13, 2017 · 20 comments
Labels
  • C-enhancement (Category: An issue proposing an enhancement or a PR with one.)
  • C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such.)
  • E-help-wanted (Call for participation: Help is requested to fix this issue.)

Comments

@alexcrichton
Member

All released compilers have identical dynamic libraries in two locations. On Unix-like systems these are (.dylib is the macOS extension; on Linux the files end in .so):

  • $sysroot/lib/*.dylib
  • $sysroot/lib/rustlib/$target/lib/*.dylib

All of these artifacts are byte-for-byte equivalent (they're just copies of one another). These duplicate artifacts inflate our installed size, inflate downloads, and cause weird bugs like #39870. Although #39870 itself is fixed, the fix is just a hack for now; ideally it would be solved properly by fixing this issue!

Some possible thoughts I personally have on this are:

  • Symlinks won't work because they don't work on Windows
  • Hard links may work here, but I'm not sure. This'd require a lot of updates to lots of tools (rust-installer, rustup, etc)
  • Simply not shipping one of these is going to be very difficult. $sysroot/lib is required for rustc itself to run correctly (that dir is typically in LD_LIBRARY_PATH or the equivalent) and $sysroot/lib/rustlib/$target/lib is where the compiler looks for target libraries. The compiler can't look in $sysroot/lib for libs as that's typically got a ton of libs on Unix systems.
  • The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files), but it'd basically be a file with the literal contents rustc-look-in-your-libdir. That way something like $sysroot/lib/rustlib/$target/lib/libstd.dylib would exist but essentially be an empty file (not a valid dynamic library). rustc would then look at $sysroot/lib/libstd.dylib for that file instead.

Unsure if I'm on the right track there, but hopefully can get discussion around this moving!
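
The pseudo-symlink idea in the last bullet can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name resolve_lib, the size cutoff, and the "trimmed contents equal the marker" rule are hypothetical and are not how rustc actually searches its sysroot.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Marker contents proposed in the issue. Matching on trimmed equality is an
/// assumption made for this sketch.
const MARKER: &str = "rustc-look-in-your-libdir";

/// Hypothetical helper: given `$sysroot` and a candidate library path under
/// `$sysroot/lib/rustlib/$target/lib`, return the path that should be loaded.
fn resolve_lib(sysroot: &Path, candidate: &Path) -> io::Result<PathBuf> {
    // A real library is far larger than the marker, so check the size first
    // and avoid reading big files into memory.
    let max_stub_len = 2 * MARKER.len() as u64;
    if fs::metadata(candidate)?.len() > max_stub_len {
        return Ok(candidate.to_path_buf());
    }

    let bytes = fs::read(candidate)?;
    let is_marker = std::str::from_utf8(&bytes)
        .map(|s| s.trim() == MARKER)
        .unwrap_or(false);
    if !is_marker {
        // Small, but not a stub: treat it as an ordinary file.
        return Ok(candidate.to_path_buf());
    }

    // Redirect to the file of the same name in `$sysroot/lib`.
    let name = candidate.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "candidate has no file name")
    })?;
    Ok(sysroot.join("lib").join(name))
}
```

With this shape, only the lookup side changes: the stub still occupies the expected file name, so existing "does this library exist" checks keep working.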

@alexcrichton added the E-help-wanted label Jun 13, 2017
@durka
Contributor

durka commented Jun 13, 2017

Can you clarify:

$sysroot/lib/rustlib/$target/lib is where the compiler looks for target libraries. The compiler can't look in $sysroot/lib for libs as that's typically got a ton of libs on Unix systems.

What kinds of Bad Stuff (tm) would happen if there were "a ton of libs" in the place the compiler looks for target libraries?

@alexcrichton
Member Author

Oh sure yeah, I'm basically thinking of #20342, which is the direct consequence of looking in all of $sysroot/lib for libs.

@est31
Member

est31 commented Jun 13, 2017

inflate downloads

I'm not sure about that. Afaik we order files by name inside download folders to avoid precisely that.

@retep998
Member

@est31 The $sysroot/lib/*.dylib libraries are in a different component than the $sysroot/lib/rustlib/$target/lib/*.dylib libraries. Because they're in different components, compression can't eliminate the redundancy.

@brson
Contributor

brson commented Jun 15, 2017

The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir

This seems unlikely to work with the dynamic linker in cases where rpath is disabled or unavailable. I believe the main reason for the historical redundancy here is so that the dylibs rustc needs are literally located in /usr/local/lib.

@brson
Contributor

brson commented Jun 15, 2017

I favor a hardlink solution, but teaching rust-installer/rustup how to do that across components is pretty hairy.

A (relatively) simple solution would be to leave the components as they are, but have rustup deduplicate them with hardlinks at install time. If that were combined with a proposed optimization to have rustup use the combined package when possible, the effect would be that downloads were deduplicated (via compression), and disk space was deduplicated (via hardlinks).
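
A minimal sketch of what such an install-time pass might look like, assuming both component directories are already unpacked on the same filesystem. The function name dedupe_with_hardlinks is hypothetical and this is not rustup's actual code; it only uses std::fs::hard_link from the standard library.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical install-time pass: for every regular file in `target_lib`
/// that has a byte-identical file of the same name in `host_lib`, replace the
/// target copy with a hard link to the host copy. Returns how many files were
/// replaced.
fn dedupe_with_hardlinks(host_lib: &Path, target_lib: &Path) -> io::Result<usize> {
    // Collect the entries up front so the directory is not modified while it
    // is being iterated.
    let entries = fs::read_dir(target_lib)?.collect::<io::Result<Vec<_>>>()?;

    let mut linked = 0;
    for entry in entries {
        // Skip directories and symlinks.
        if !entry.file_type()?.is_file() {
            continue;
        }
        let target_file = entry.path();
        let host_file = host_lib.join(entry.file_name());

        // Skip names with no regular-file counterpart in the host libdir.
        let host_meta = match fs::metadata(&host_file) {
            Ok(meta) if meta.is_file() => meta,
            _ => continue,
        };
        // Cheap size check before comparing the full contents.
        if host_meta.len() != entry.metadata()?.len() {
            continue;
        }
        if fs::read(&host_file)? != fs::read(&target_file)? {
            continue;
        }

        // Not crash-safe: a real installer would want to link under a
        // temporary name and rename it into place.
        fs::remove_file(&target_file)?;
        fs::hard_link(&host_file, &target_file)?;
        linked += 1;
    }
    Ok(linked)
}
```

Because the components on disk stay unchanged from the packager's point of view, this keeps the rust-std package self-contained while still saving the disk space after installation.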

@brson
Contributor

brson commented Jun 15, 2017

The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir

Doing this in the opposite direction seems like it would work (pseudo-symlinks in libdir), but then you lose the consistency of having all the real libs in libdir, and seemingly the installer would have to be responsible for setting that up.

@alexcrichton
Member Author

Oh sorry yeah I was thinking that $sysroot/lib/rustlib/$target/lib/*.dylib would be a "pseudo symlink" to the versions in $sysroot/lib, that way we wouldn't mess with the libraries that rustc itself needs to execute.

Upon further reflection though I do agree that this seems like a rustup problem sort of. We still want to produce a rust-std package with all of the libraries in it, not a bunch of "pseudo symlink" pointers which point to nonexistent libraries. We basically want rustup toolchains and toolchains installed via make install to have this "symlink behavior" but everything else should stay as-is today.

@cuviper
Member

cuviper commented Jun 20, 2017

FWIW, in Fedora packaging I do replace the rustlib libraries with actual symlinks to the libdir. I suppose it wouldn't hurt if those were "pseudo" symlinks, but I want to be careful about that redirection. Namely, I've got /usr/lib/rustlib/$target/lib/ so all targets share a common /usr/lib/rustlib/, and then 64-bit rustc will get its libraries from /usr/lib64 because that's how Fedora arranges things.

(I've kind of hacked that in place after ./x.py install since rustbuild et al. don't allow separating the libdir and rustlibdir paths, but maybe they should.)

@MikeMcQuaid

We basically want rustup toolchains and toolchains installed via make install to have this "symlink behavior" but everything else should stay as-is today.

A suggestion: use symlinks in make install on Unix systems that support them and punt on a Windows solution for now. It seems the complaints about double-packaging and related issues are currently exclusively from Unix packagers so I think you could get away with just addressing it there for now.

@retep998
Member

Windows users would still like to avoid having to download those libraries twice. It's not really critical or anything, just something that would be helpful in the future when someone gets around to it maybe.

@MikeMcQuaid

Windows users would still like to avoid having to download those libraries twice.

Yep but they have to do that today. I'm not sure it's worth avoiding a straightforward solution to the problem for Unix systems because there's no obvious solution for Windows.

@metux

metux commented Jul 12, 2017

Oh sorry yeah I was thinking that $sysroot/lib/rustlib/$target/lib/*.dylib would be a "pseudo symlink" to the versions in $sysroot/lib, that way we wouldn't mess with the libraries that rustc itself needs to execute.

Wait a second - (host) rustc itself links against libraries in the (target) sysroot?
Seriously?!

@Mark-Simulacrum added the C-enhancement label Jul 27, 2017
@jonwolski

Are symlinks or hard links not an option just due to Windows support? If so, perhaps it is worth pointing out that Windows 10 supports symbolic and hard links without the privilege escalation that was necessary in Vista, 7, and 8 (and XP if you include "junctions").

Forgive me if I'm stating the obvious. I do not really understand this issue. I just would not want to see an easy solution overlooked due to unfamiliarity with Windows' current capabilities.

https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/#RRmytWmTlOwHQ8YZ.97

@steveklabnik
Member

Triage: I don't think that anything has changed here, but I'm not sure.

@eddyb
Member

eddyb commented Apr 5, 2020

@Mark-Simulacrum did either of us open an issue about the libLLVM-*.so duplication?

@Mark-Simulacrum
Member

Not to my knowledge, no. It would probably be good to do so.

@eddyb
Member

eddyb commented Apr 6, 2020

Filed #70838.

@pwnorbitals

This may be fixed now that #70838 is closed

@jyn514
Member

jyn514 commented Jun 27, 2022

Triage: the only duplicate artifacts I see now are libstd.so and libtest.so, which sounds like it's a lot less than before (12 MB between them). But those two are still duplicated.

(posting for future reference: it turns out libtest.so is shipped in the host sysroot so that rustdoc can compile doctests)
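
For anyone repeating this triage later, a check along these lines could list what is still duplicated. It is a rough sketch, not the method used for the numbers above: find_duplicates is a hypothetical helper, and it compares bytes only, so it cannot tell independent copies from files that are already hard-linked.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical audit helper: list files that exist under the same name in
/// both directories with identical bytes, together with their size in bytes.
/// The result is sorted by file name.
fn find_duplicates(host_lib: &Path, target_lib: &Path) -> io::Result<Vec<(String, u64)>> {
    let mut dups = Vec::new();
    for entry in fs::read_dir(host_lib)? {
        let entry = entry?;
        // Only regular files are compared; directories and symlinks are skipped.
        if !entry.file_type()?.is_file() {
            continue;
        }
        let other = target_lib.join(entry.file_name());
        if !other.is_file() {
            continue;
        }
        let bytes = fs::read(entry.path())?;
        if bytes == fs::read(&other)? {
            let name = entry.file_name().to_string_lossy().into_owned();
            dups.push((name, bytes.len() as u64));
        }
    }
    dups.sort();
    Ok(dups)
}
```

Pointing it at $sysroot/lib and $sysroot/lib/rustlib/$target/lib and summing the sizes gives the amount of space the duplicates take.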

@jyn514 changed the title from "Avoid shipping duplicate artifacts" to "Avoid shipping duplicate artifacts in the host and target sysroot" Jun 27, 2022
@workingjubilee added the C-optimization label Oct 8, 2023