Skip to content

Really avoid redundant downloads when bootstrapping #36069

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Really avoid redundant downloads when bootstrapping #36069

wants to merge 1 commit into from

Conversation

hasufell
Copy link

Follow-up fix of ab5309e

This also fixes packaging for source distros that don't allow
fetching during build-time, which was unconditionally done for
the sha256 file.

Follow-up fix of ab5309e

This also fixes packaging for source distros that don't allow
fetching during build-time, which was unconditionally done for
the sha256 file.
@rust-highfive
Copy link
Contributor

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@alexcrichton
Copy link
Member

Thanks for the PR! This looks pretty good to me, but I think that with normal use the foo.sha256 file never gets moved to the final location, right? That is, if foo.sha256 doesn't exist ahead of time, it's not created even though we download it?

@alexcrichton
Copy link
Member

Oh right and the other piece is that if we need to invalidate the previously downloaded artifact when we update the source to bootstrap from a new compiler. Previously I believe that happened because we'd download a new checksum file, but now it looks like we may cache the old checksum fail and forget to see the update?

@hasufell
Copy link
Author

hasufell commented Aug 29, 2016

This looks pretty good to me, but I think that with normal use the foo.sha256 file never gets moved to the final location, right? That is, if foo.sha256 doesn't exist ahead of time, it's not created even though we download it?

Hm, I'm unable to follow here. What do you mean with with normal use? If the user does not manually download the file and put it into the dl/ folder (typically only done by packagers), then the script will auto-download the sha256 file into a temporary file, just like before.

@hasufell
Copy link
Author

hasufell commented Aug 29, 2016

but now it looks like we may cache the old checksum fail and forget to see the update?

The checksum file has the same filename as the tarball, except for the file extension:

So when a new bootstrap compiler is used, both the tarball of the compiler and the sha256 file will have different file names and so the old ones will not be used.

@alexcrichton
Copy link
Member

Hm so yeah I think my concern isn't relevant because of the first piece of behavior, never caching sha256 by default. My latter concern stems from the fact that filenames don't always change, for example the master branch bootstraps from the beta compiler which always has the same filename.

I guess that not caching the sha256 is fine, but this seems like very magical behavior where "if you happen to place some magical files here we'll avoid downloading things." As in, that's a code path which isn't tested but us at all, so it'd be very likely to break :(

Another point is that for offline builds I was under the impression that's what --local-rust was used for to the ./configure script. Does that not work with rustbuild though? (if so that's a bug)

@hasufell
Copy link
Author

hasufell commented Aug 29, 2016

Another point is that for offline builds I was under the impression that's what --local-rust was used for to the ./configure script. Does that not work with rustbuild though? (if so that's a bug)

I only see --enable-local-rust and --local-rust-root=[/usr/local] neither of which do what this is for: downloading the bootstrapping stages before compile time, which is a requirement for several source distros.

My latter concern stems from the fact that filenames don't always change, for example the master branch bootstraps from the beta compiler which always has the same filename.

Actually, the .sha256 file is not cached (whether it's a temporary file or in the dl/ folder), see https://github.com/hasufell/rust/blob/49beda483eb75225795117e0a22d339fb7c26da3/src/bootstrap/bootstrap.py#L51

but the tarball is cached, because of this return: https://github.com/hasufell/rust/blob/49beda483eb75225795117e0a22d339fb7c26da3/src/bootstrap/bootstrap.py#L41

The latter was already the case before this patch, so I don't know what the desired behavior is here.

So currently:

  • tarball is cached (as before)
  • sha256 file is not cached (as before, but also for the manually downloaded one)

@alexcrichton
Copy link
Member

Ok sorry for the runaround a bit, but I think this is starting to make sense to me. It sounds like you're doing something that unfortunately is not supported by us or our build system, which is attempting to pre-download a snapshot compiler and then hope the build system doesn't try to do it again.

For distros it is our assumption that this logic will never be used, but rather the --local-rust-root and/or --enable-local-rust options are used instead. That, paired with the fact that we guarantee that releases can bootstrap from the previous release, was thought to be sufficient in the past. Is that not the case, though? Will those two options not work for you?

@hasufell
Copy link
Author

Will those two options not work for you?

no, how would I bootstrap without a local rust?

@alexcrichton
Copy link
Member

Yes to bootstrap the compiler you need to download something from somewhere. It sounds like you're arranging for some bootstrap compiler to be available, but you can just pass --local-rust-root pointing at that instead of trying to convince the download script to not download it

@hasufell
Copy link
Author

Yes to bootstrap the compiler you need to download something from somewhere. It sounds like you're arranging for some bootstrap compiler to be available, but you can just pass --local-rust-root pointing at that instead of trying to convince the download script to not download it

So you are basically telling me to recreate the logic of your bootstrapping scripts manually? That's really distro friendly.

@alexcrichton
Copy link
Member

@hasufell oh no, not at all! That would indeed be quite difficult! Instead what you can do is the equivalent of:

# Download the previous compiler, e.g. 1.12.0 from somewhere
$ curl https://sh.rustup.rs | sh -s -- --default-toolchain 1.12.0

# Configure rust to build with that compiler
$ ./configure --local-rust-root=$(rustc --print sysroot)
$ make

That is, any standard layout of the compiler works just fine, you just need to point the build system at it.

@hasufell
Copy link
Author

What your example suggests is:

  1. downloading an unversioned script from the internet -- no
  2. running an interactive shell script -- no, never
  3. STILL fetching during build-time -- no, avoiding that is the whole point of this PR

You do not seem familiar with packaging rules/mechanisms and I don't blame you. But this debate is getting annoying. The current bootstrapping mechanisms of rust are fine, except for this tiny bit that can be easily fixed with the provided patch. I'm starting to feel my time is being wasted here.

@alexcrichton
Copy link
Member

@hasufell

Sorry I think I may be failing to explain something here, let me see if I can try again!

So the constraint here, as I understand it, is that a distro wants to be able to build Rust without the build process itself downloading anything. Right now it sounds like you're doing this by placing a tarball in the right place it "works" except for the fact that we download a sha256 checksum file. My point, however, is that this is not the interface we support for avoiding downloads as part of the build process.

What's happening here is that you're arranging for the bootstrap compiler to be available ahead of time through your own build system. Instead, however, you'll need to just arrange for the bootstrap compiler to be unpacked somewhere. Once that's done, you can pass the --enable-local-rust or --local-rust-root options to tell the build system to use that compiler you've arranged to be available. We then also guarantee that the 1.X release can be bootstrapped with the 1.(X-1) release.

Once you've configured with --enable-local-rust and --local-rust-root then the build system should not attempt to download anything, but if it does that's a bug!

Does that make sense? I believe that's the strategy that other distributions like Debian and Fedora are using as well, and we were at least under the impression that if it works there it should work for everyone, but that may not be the case!

@hasufell
Copy link
Author

hasufell commented Sep 1, 2016

I believe that's the strategy that other distributions like Debian and Fedora are using as well

No, they are not, debian is even applying something similar to this PR:
https://anonscm.debian.org/cgit/pkg-rust/rust.git/tree/debian/patches/dont-download-stage0.diff

Fedora is also applying bootstrapping patches: http://pkgs.fedoraproject.org/cgit/rpms/rust.git/tree/

So your information is clearly wrong.

In addition, you are comparing binary distos to source distros, which does not make sense. They have different set of rules. Source distros don't ship binaries and as such, bootstrapping happens on the user machine.

and we were at least under the impression that if it works there it should work for everyone, but that may not be the case!

It doesn't even work cleanly on debian and fedora.

So, can you present a proper argument without guessing what distros do or what problems distros have?

@brson
Copy link
Contributor

brson commented Sep 2, 2016

It looks to me like there's a disconnect between how we expected packagers to use the build system for bootstrapping and how they are in practice.

If you look at Debian's build script, you can see that they do use --enable-local-rust, but only in the case when they are rebootstrapping from the distro's toolchain. When they are doing the original bootstrap from upstream they are not using --enable-local-rust; instead they are "bootstrap[ing] from the stage0 given in that tarball". So it appears that during the original bootstrap they are including appropriate binaries in build/cache and using those as the bootstrap compiler; of course to do that one needs a patch like in this PR.

@hasufell Does that sound like a correct assessment of the situation to you?

From the way the build system is constructed today, we would instead expect those tarballs that are being placed in build/cache to: be located somewhere custom to the distro's packaging solution; be extracted to a directory; and that directory passed to --local-rust-root. As an example, based on what I see in Debian's scripts, they could place the bootstrap rustc/cargo tarballs in a ./bootstrap-tarballs directory; extract them to ./bootstrap-install; then call ./configure --enable-local-rust --local-rust-root=./bootstrap-install.

But they are not doing that, and @hasufell is disinclined to do that as well. So the obvious question is "why"? Just from the description of the process I gave, I might expect that the answer is "because the way they are doing it is easier", but we should probably go to the source and ask.

@infinity0 @sylvestre I wonder if you could say something about how Debian is doing their initial bootstrap from upstream, and why. As described in this comment and previous in the thread, we expect that bootstrap to be possible via --enable-local-rust, but Debian is not using that flag and is instead relying on the stage0 compiler in build/cache, by short-circuiting the checksum verification.

My guess is that using the existing stage0 cache structure is just more convenient to do, and if so we should be providing more facilities in-tree for creating this arrangement. Just as one idea, we could create a build system command for creating on offline-bootstrap tarball, that produced this same effect, but using a different short-circuiting mechanism that avoids the problems @acrichto is concerned with.

@hasufell Are you doing packaging for a distro? If so, do you mind saying which and linking to the scripts you have so we can compare to what Debian is doing? The language in your op, "this also fixes packaging for source distros", sounds as though you may have multiple motivations for wanting this short-circuiting. If that's the case, do you mind expanding on what use-cases this patch supports beyond bootstrapping for distros?

@infinity0
Copy link
Contributor

Rust's build scripts unconditionally download the hash file, because (I guess) you guys want the freedom to re-define the bootstrap compiler during development, and you want to make sure old bootstraps are always obsoleted when updating a git checkout. This does indeed require network access.

For Debian, as you noticed, we are patching away this "unconditionally download" and instead making the build scripts use what's already there. However, this is logically contradictory to the requirement mentioned above, and that is why I marked the patch as Forwarded: not-needed.

I personally am OK with rust not accepting this PR and us at Debian patching it away specifically like we're currently doing. Another option would be to make this PR dependant on an environment variable, then we at Debian (as well as the OP) could set this during the build.

@infinity0
Copy link
Contributor

Or, instead of requiring an envvar, detect the existence of ".git", and change the behaviour (always-download-the-hash vs reuse-existing-hash-on-filesystem) based on this.

@hasufell
Copy link
Author

hasufell commented Sep 7, 2016

This discussion is useless and wasting my time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants