Support for pre-built dependencies #1139

Open
marcbowes opened this Issue Jan 9, 2015 · 21 comments

marcbowes commented Jan 9, 2015

Currently you can add dependencies using path or git. Cargo assumes this points to source code, which it will then proceed to build.

My use-case stems from integrating Cargo into a private build and dependency management system. I need to be able to tell Cargo to only worry about building the current package. That is, I will tell it where the other already-built libraries are.

Consider two projects: a (lib) and b (bin) such that b depends on a:

[package]
name = "b"
version = "0.0.1"
authors = ["me <plain@old.me>"]

[dependencies.a]
path = "/tmp/rust-crates/a"

A clean build will output something like:

> cargo build -v
   Compiling a v0.0.1 (file:///private/tmp/rust-crates/b)
     Running `rustc /tmp/rust-crates/a/src/lib.rs --crate-name a --crate-type lib -g -C metadata=10d34ebdfa7a5b84 -C extra-filename=-10d34ebdfa7a5b84 --out-dir /private/tmp/rust-crates/b/target/deps --emit=dep-info,link -L dependency=/private/tmp/rust-crates/b/target/deps -L dependency=/private/tmp/rust-crates/b/target/deps`
/tmp/rust-crates/a/src/lib.rs:1:1: 3:2 warning: function is never used: `it_works`, #[warn(dead_code)] on by default
/tmp/rust-crates/a/src/lib.rs:1 fn it_works() {
/tmp/rust-crates/a/src/lib.rs:2     println!("a works");
/tmp/rust-crates/a/src/lib.rs:3 }
   Compiling b v0.0.1 (file:///private/tmp/rust-crates/b)
     Running `rustc /private/tmp/rust-crates/b/src/lib.rs --crate-name b --crate-type lib -g -C metadata=429959f67e51bc23 -C extra-filename=-429959f67e51bc23 --out-dir /private/tmp/rust-crates/b/target --emit=dep-info,link -L dependency=/private/tmp/rust-crates/b/target -L dependency=/private/tmp/rust-crates/b/target/deps --extern a=/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib`
/private/tmp/rust-crates/b/src/lib.rs:1:1: 3:2 warning: function is never used: `it_works`, #[warn(dead_code)] on by default
/private/tmp/rust-crates/b/src/lib.rs:1 fn it_works() {
/private/tmp/rust-crates/b/src/lib.rs:2     println!("b works");
/private/tmp/rust-crates/b/src/lib.rs:3 }

Importantly:

--extern a=/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib

Would it make sense to expose an extern option (in dependencies.a) for low-level customization?

[dependencies.a]
extern = "/private/tmp/rust-crates/b/target/deps/liba-10d34ebdfa7a5b84.rlib"

This can be worked around by using a build script along the lines of:

use std::fs;

fn main() {
    // Copy the pre-built rlib into a directory we control.
    let from = "/tmp/rust-crates/a/target/liba-b2092cdbfc1953bd.rlib";
    let to = "/tmp/rust-crates/b/blah/liba-b2092cdbfc1953bd.rlib";
    fs::copy(from, to).unwrap();
    // Tell Cargo to add that directory to rustc's library search path.
    println!("cargo:rustc-flags=-L /tmp/rust-crates/b/blah");
}

But it is not ideal to have to do this with every project.

steveklabnik commented Jan 9, 2015

This is a Rust problem even more than a Cargo problem. You can't guarantee that a pre-built Rust library will work unless it's built with the exact same SHA of the compiler.

alexcrichton commented Jan 9, 2015

Yes, unfortunately this would require changes to rustc itself, so it can't be tackled at this time.

The specific restriction I'm referring to is that you're basically limited to only working with binaries generated by the exact revision of the compiler you're using, as well as the exact same set of dependencies.

marcbowes commented Jan 9, 2015

Is there something I can read/follow that explains the issues relating to why this requirement is so strict? Is this expected to change over time?

Regardless, assume I can meet the requirement of providing a set of prebuilt libraries with the exact same SHA. Is this a reasonable feature? Even something like letting the build script emit --extern as part of the whitelisted flags that cargo:rustc-flags can configure would help me out (assuming that is easier to implement than another top-level dependency option).
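
For concreteness, a hypothetical build script along those lines could look like the sketch below. To be clear, Cargo does not currently whitelist --extern in cargo:rustc-flags, and the rlib path here is made up for illustration:

// Hypothetical sketch only: Cargo does not accept `--extern` via
// `cargo:rustc-flags` today, and the path below is illustrative.
fn main() {
    // If `--extern` were whitelisted, a build script could point the rustc
    // invocation directly at a pre-built rlib:
    println!("cargo:rustc-flags=--extern a=/tmp/rust-crates/prebuilt/liba.rlib");
}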

marcbowes commented Jan 9, 2015

/cc @aturon: I had a chat with Steve on IRC about this and he suggested getting your input.

One big reason I want to avoid building dependencies over and over is that our build system rebuilds consumers - in my example, a change to a would trigger a rebuild of b (including tests) and if b failed, the new version of a would not be released. The implication of this is that when building b, a would be rebuilt a second time. This becomes really wasteful.

I'm happy to contribute the change required to implement this if the team feels it is a worthwhile feature. I imagine there are plenty of companies out there with their own in-house build systems, so something like this could be an adoption blocker.

steveklabnik commented Jan 9, 2015

I actually meant @alexcrichton not @aturon :)

alexcrichton commented Jan 9, 2015

Is there something I can read/follow that explains the issues relating to why this requirement is so strict?

Unfortunately no :(. We don't have a ton of documentation in this area, just a bunch of cargo-culted knowledge. In general, though, this largely comes down to two reasons (that I can think of):

  1. The ABI for a library is not stable between compilations, even when theoretical ABI-compatible modifications are made.
  2. The metadata format for libraries, while extensible, is not currently used in an extensible way, as it regularly breaks backwards compatibility.

Is this expected to change over time?

Certainly! We probably won't invest too much energy into it before 1.0, but I'd love to see progress in this area!

Regardless, assume I can meet the requirement of providing a set of prebuilt libraries with the exact same SHA. Is this a reasonable feature?

I suppose it depends on how much cargo integration you want. In the example you gave in the second comment, the manifest probably says that b depends on a, in which case cargo will already pass --extern for a when it compiles b. Cargo would not only have to forward your --extern flags, it would also have to know to turn off its own --extern. Additionally, it would then have to cut a out entirely from the dependency graph.

In principle allowing --extern from rustc-flags would be possible, but it may have surprising results!

I imagine there are plenty of companies out there with their own in-house build systems, so something like this could be an adoption blocker.

I agree this would definitely be bad! I'd like to hone in on what's going on here first though.

My first question would be: Does Cargo suffice? If you're using cargo build, then Cargo won't build a if it hasn't changed and you've already built it, but it sounds like you're not using Cargo to build libraries?

I suppose my other questions would be based on that answer, so I'll hold off for that :)

marcbowes commented Jan 9, 2015

Thanks for the detailed answer Alex!

My first question would be: Does Cargo suffice? If you're using cargo build, then Cargo won't build a if it hasn't changed and you've already built it, but it sounds like you're not using Cargo to build libraries?

Imagine a is built by Travis. It outputs liba, documentation and so forth - a collection of artifacts. People mostly discard these artifacts in practice, but you might imagine a system where those artifacts are retained. I'm sure this is not conceptually dissimilar to what your build bots do - you get some named and versioned output that you can later use either for development (a la rustup) of yet more projects or for deployment purposes.

To integrate with this build system, one only needs to implement the simple contract: provide something that can be executed that will produce build artifacts. This is just a one-line shell script that turns around and calls cargo build, and we're done.

Along comes project b. It starts off the same as a until we decide to use some of the functionality that a provides. In cargo, you just add the dependency to the manifest - name and version. This build system works in the same way, so we add it to its manifest too. The build system uses this manifest to provide the build artifacts of a for b at compile time. Now the only thing left to do is adjust the path attribute under [dependencies.a] to point to the build artifacts ($A_ARTIFACTS/src, if you will) and we're golden.

(We now have the dependency declared in two places. We can either live with the duplication, or adjust our build script to copy them from one into the other.)

However, we've just hit the first real problem: b needs the source code of a to compile, but this doesn't really fit in with the concept of a build artifact. We can cheat by adjusting our shell script to also copy the src of a into its build directory.

Hopefully, at this point, I've answered the latter question: the intention is to use cargo to build libraries such as a or binaries such as b. The reasons:

  • it does pretty much everything we'd otherwise have to implement (w.r.t. rustc)
  • Cargo.toml is nicer than other interfaces like make for customization
  • it's good to keep things similar to the way the rest of the world does it
  • it makes importing third-party projects easier

The question then, is what happens when a changes? At a high level, the system tracks dependencies, rebuilds them according to the graph and fails if something in the build breaks. This means that b would be rebuilt against the new artifacts of a. If we change our shell script to also execute cargo test, this means that b gets a chance to veto the build as a whole if the change breaks it in some way.

And this brings us to the second problem. a is being built twice. If c depends on b, then the build will build a three times and b twice. This becomes incredibly wasteful pretty quickly. In the context of this specific system, it is also redundant because any changes to a will trigger b to be rebuilt; whereas in the "normal" world, b will be rebuilt if a changes but only when b is explicitly built.

As I mentioned in the issue overview, I can work around this by using a cargo build script that adds a -L option to rustc, provided I just dump the libs output by the builds of all of its (recursive) dependencies. This works and completely solves my problem. Incidentally, it removes the need for the other changes to the build script (don't need to copy source code in, don't need to declare dependencies in Cargo.toml).
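
A minimal sketch of that workaround build script, assuming the external build system has already dumped all of the needed rlibs into a single directory (the directory name here is hypothetical):

fn main() {
    // Hypothetical directory populated by the external build system with the
    // rlibs of all (recursive) dependencies.
    let prebuilt = "/tmp/rust-crates/b/prebuilt-deps";
    // Add it to rustc's library search path so that `extern crate` declarations
    // can be resolved against the pre-built rlibs found there.
    println!("cargo:rustc-flags=-L {}", prebuilt);
}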

But then we hit the problem of ABI compatibility, for which it sounds like there is no solution yet. This means I'll need to find a way of (effectively) adding the rustc SHA to the tuple that identifies an artifact (similar to disambiguating 32/64 bit builds). Or just going with the aforementioned option of building all dependencies for each consumer.
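
One possible way to capture the compiler identity for that tuple (just a sketch; the actual artifact naming scheme is up to the build system) is to parse the commit-hash line that rustc --version --verbose prints:

use std::process::Command;

// Sketch: read the compiler's commit hash so it can become part of the tuple
// that identifies a pre-built artifact (alongside crate name, version and
// target triple).
fn rustc_commit_hash() -> Option<String> {
    let out = Command::new("rustc")
        .args(&["--version", "--verbose"])
        .output()
        .ok()?;
    let text = String::from_utf8(out.stdout).ok()?;
    text.lines()
        .find(|line| line.starts_with("commit-hash: "))
        .map(|line| line["commit-hash: ".len()..].to_string())
}

fn main() {
    println!("rustc commit hash: {:?}", rustc_commit_hash());
}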

A question you might also ask is: "would hosting a crates.io mirror help with this?". It doesn't. Not because it doesn't "work", but because it only meets some of the requirements (such as private code, not having direct dependencies on external sources for security reasons).

One huge benefit we get out of a single extensible build system is that adding a dependency on a Rust package is no different to adding a dependency on a C, Ruby, Java, Python or Haskell package - they're just named and versioned artifacts. A big use case for me is going to be enabled by that: for example, authoring Rubygems in Rust to speed up performance-critical code paths.

I hope this detail makes my initial question more clear: cargo does things I'd otherwise have to implement myself, but it also does things I'd like to skip. Specifically, I'd like to be able to use something like path for exact control over where the dependency lives, but I don't want cargo to try to build it.

FWIW, I'm probably going to go with:

  • include source code in build artifacts
  • rebuild dependencies for each consumer when that consumer is built
  • the build system's build script should copy in Rust dependencies from the build system's manifest to Cargo.toml and specify the path

marcbowes commented Jan 9, 2015

Alex points out that http://doc.crates.io/build-script.html#overriding-build-scripts could be extended to support overriding rust crates. We could then generate .cargo/config files on the fly.

alexcrichton commented Jan 11, 2015

Alright, after reading that over (thanks for taking the time to write it up!) it sounds like what we discussed on IRC is the best way to move forward with this. Specifically I'd be thinking of something like:

# .cargo/config
[target.$triple.rust.foo]
libs = ["path/to/libfoo.rlib", "path/to/libfoo.so"]
dep_dirs = [ ... ]

Note that the current overrides (target.$triple.$lib) may, I think, want to be renamed to target.$triple.native.$lib to give us some more leeway. When cargo detects this form of override, however, it will not build libfoo but will instead just pass --extern foo=... for the paths listed and -L dependency=... for all of the values in dep_dirs.

One problem I can foresee, however, is that you mentioned not wanting to share the source code between projects. Cargo would still need the source code to read data such as the Cargo.toml. Cargo doesn't actually need the entire source code base, but it'll need at least that much.

Does that sound like what would work for you?

C-Bouthoorn commented May 27, 2017

bump?

Boscop commented Jun 1, 2017

Please support this, it's frustrating that it doesn't work yet.
I have to prebuild the ring crate because, on the server where I don't have root, the GCC version is too old to build ring, so I build it on a server where I'm root and copy it over.
Is there any way right now to use the prebuilt rlib instead of compiling it from crates.io? Maybe with a build.rs script?

aturon reopened this Jul 12, 2017

aturon added the I-nominated label Jul 12, 2017

aturon commented Jul 12, 2017

Nominated for discussion at the Cargo team meeting.

Twey commented Oct 24, 2017

Was progress made on this at the meeting?

Popog commented Oct 27, 2017

I created DHL as a workaround for this issue.

joshtriplett commented Oct 30, 2017

I don't know about "pre-built" dependencies, but it'd be nice to be able to build a variety of leaf crates without building the dependency crates more than once, if the leaf crates request the same features from the dependencies.

Twey commented Mar 23, 2018

Bump?

dwijnand commented Apr 25, 2018

Probably not: still nominated...

Twey commented Jun 29, 2018

I'm still a bit in the dark about what happened here. Was anything discussed at the team meeting?

mitchmindtree commented Jul 6, 2018

TL;DR: Would just like to add another +1 for support for pre-built binaries. It would be great to get a follow-up on what was discussed at the meeting.

Motivation Story

Last night we ran a workshop on the nannou creative coding framework. Seeing as nannou supports audio, graphics, lasers, etc. along with quite a high-level API in a cross-platform manner, it has a lot of dependencies. It took between 5 minutes and 25 minutes (depending on the user's machine) for users just to build nannou and all of its dependencies for the first time in order for us to begin working through the examples together. Ideally, in the future we would write a build script that attempted to first fetch pre-built dependencies before falling back to building from source. It seems like the feature described within this issue would help to simplify this.
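
As a rough sketch of what such a fallback build script might do, assume a hypothetical NANNOU_PREBUILT_DIR environment variable naming a directory of pre-fetched rlibs. Note that, as discussed earlier in this thread, Cargo would still build any dependencies declared in Cargo.toml, which is exactly the gap this issue is about:

use std::env;
use std::path::Path;

// Rough sketch only. NANNOU_PREBUILT_DIR is a hypothetical variable naming a
// directory of pre-fetched rlibs; if it is absent the script emits nothing and
// everything is built from source as usual.
fn main() {
    if let Ok(dir) = env::var("NANNOU_PREBUILT_DIR") {
        if Path::new(&dir).is_dir() {
            // Add the pre-built libraries to rustc's search path.
            println!("cargo:rustc-flags=-L {}", dir);
        }
    }
    // Re-run this script when the variable changes.
    println!("cargo:rerun-if-env-changed=NANNOU_PREBUILT_DIR");
}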

robclouth commented Jul 7, 2018

What about some global cache on disk for both the downloaded source code and the built binaries, so that a rebuild can be avoided if the compiler and crate version match? Similar to yarn.

Twey commented Jul 10, 2018

This is also what Nix wants to do. There won't be a compiler mismatch for packages built with Nix, because Nix will use the compiler as a build input to the package.
