Support a first-class format for declaring external dependencies #3816

Open
alexcrichton opened this Issue Mar 10, 2017 · 9 comments

8 participants
@alexcrichton
Member

alexcrichton commented Mar 10, 2017

This is an issue extracted from the discussion on rust-lang/rust-roadmap-2017#12. The high-level idea is that Cargo should support a first-class method of declaring dependencies on external artifacts in a structured format. When combined with #3815, this would allow external build systems to resolve these dependencies to internal rules known to those build systems. For example, Buck/Bazel may have their own copy of OpenSSL compiled, and the openssl-sys crate should be connected to that copy (both literally at compile time and in the dependency graph).

The purpose of this support is to allow the majority of build scripts in the ecosystem to be overridden and avoided at compile time. Build scripts tend to be difficult to wrangle in restrictive build systems, as they can have an unpredictable set of inputs (for an arbitrary build script) and are difficult to audit one by one (for any particular build script). A first-class description of what the build script would otherwise do lets external build systems assume by default that a build script need not be run.

@alexcrichton

Member

alexcrichton commented Mar 10, 2017

Note that there's some prior work here to draw from as well:

  • Cargo supports overriding build scripts, preventing their execution. This feature is somewhat underdeveloped, as it has yet to see much adoption in the community, but it may be a good starting point!

  • Crates like metadeps support structured configuration in Cargo.toml that is read at build time (by the build script). In this case the configuration is specifically for calling pkg-config and, I believe, for integrating with distro builds.
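
To illustrate why structured declarations help external tools, here is a minimal sketch, not the metadeps API. The `DeclaredDep` type and `pkg_config_requirements` function are hypothetical names, and the data would really come from a table in Cargo.toml rather than being built by hand. The point is only that a tool can list the requirements without running any build script.

```rust
/// A native dependency as it might be declared in structured metadata.
/// Hypothetical type for illustration; not part of metadeps or Cargo.
#[derive(Debug, Clone, PartialEq)]
pub struct DeclaredDep {
    /// pkg-config module name, e.g. "openssl".
    pub module: String,
    /// Minimum version the crate requires, e.g. "1.0.2".
    pub min_version: String,
}

/// Render each declared dependency as a "module >= version" requirement,
/// the comparison form that pkg-config accepts on its command line.
pub fn pkg_config_requirements(deps: &[DeclaredDep]) -> Vec<String> {
    deps.iter()
        .map(|d| format!("{} >= {}", d.module, d.min_version))
        .collect()
}
```

A distro packaging tool could feed these strings to its own resolver, while the crate's build script passes the same data to pkg-config.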

@bennofs

Contributor

bennofs commented Apr 22, 2017

I think completely ignoring the build.rs script when metadata is present would be suboptimal, as it is then an all or nothing choice. Some crates may be able to specify 99% of their deps as structured deps but still need a custom build.rs script for some remaining, not yet supported option.

I like the metadeps solution much more, where the build.rs script still gets run but behaves very predictably, so that, for example, a package manager can parse the pkg-config requirements and then be sure that build.rs will run successfully.

@Ericson2314

Contributor

Ericson2314 commented Apr 22, 2017

Ignoring build.rs is like "jailbreaking" a dependency version bound to force a disallowed version -- useful sledgehammer but best used as backup.

The metadeps approach is nice for modularity. If such libraries could work both at build time (build.rs) and at plan-extraction time (#3815), that would be especially cool.

@joshtriplett

Member

joshtriplett commented Jul 5, 2017

As one possibility: I'd love to see a standardized mechanism to extend the build metadata without writing an explicit build.rs, in a way that allows multiple such extensions to combine. Suppose you have multiple crates like metadeps, which parse all their information from Cargo.toml, need no inputs, and provide outputs directly to Cargo; they only need to provide metadata on success, or an error message otherwise. What if you listed those crates in a buildext or similar key in Cargo.toml, and Cargo automatically built them as dependencies and invoked them through a standard interface at the time that it would otherwise invoke build.rs?

That would allow using the crate ecosystem to standardize key portions of build systems into declarative metadata, without privileging any particular implementation or limiting additions. And crates wouldn't need a programmatic build.rs file unless they wanted to do something not supported by such build-extension crates.

So, for instance, a project written in Rust, to bind to a C library, using bindgen at build time, and expose a Python interface could have buildext = ["metadeps", "metabind", "pyinterface"], and if that covers all your requirements, you wouldn't need build.rs at all. Cargo would build all three of those crates, invoke them, incorporate the additional metadata they emit, and then build the crate at hand.
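
A rough sketch of what such a standard interface might look like follows. Everything here is an assumption for illustration: the `BuildExt` trait, the `Manifest` alias, `run_extensions`, and the toy `LinkExt` extension are made-up names, and Cargo has no `buildext` key. Only the `cargo:rustc-link-lib=` output format is Cargo's existing build-script protocol.

```rust
use std::collections::BTreeMap;

/// Stand-in for the parsed Cargo.toml metadata an extension may read.
pub type Manifest = BTreeMap<String, String>;

/// Hypothetical standard interface for a build-extension crate.
pub trait BuildExt {
    /// Name used when reporting errors.
    fn name(&self) -> &str;
    /// Return Cargo directives on success, or an error message otherwise.
    fn run(&self, manifest: &Manifest) -> Result<Vec<String>, String>;
}

/// Run every extension in order and merge their output,
/// stopping at the first extension that reports an error.
pub fn run_extensions(
    exts: &[Box<dyn BuildExt>],
    manifest: &Manifest,
) -> Result<Vec<String>, String> {
    let mut directives = Vec::new();
    for ext in exts {
        let lines = ext
            .run(manifest)
            .map_err(|e| format!("{}: {}", ext.name(), e))?;
        directives.extend(lines);
    }
    Ok(directives)
}

/// Toy extension: links the library named under the "link" key.
pub struct LinkExt;

impl BuildExt for LinkExt {
    fn name(&self) -> &str {
        "linkext"
    }

    fn run(&self, manifest: &Manifest) -> Result<Vec<String>, String> {
        match manifest.get("link") {
            Some(lib) => Ok(vec![format!("cargo:rustc-link-lib={}", lib)]),
            None => Err("missing `link` key".to_string()),
        }
    }
}
```

Because each extension only maps metadata to directives or an error, several of them compose by simple concatenation, which is the combining property described above.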

How does that sound?

@kornelski

Contributor

kornelski commented Jul 5, 2017

I've looked at my build.rs scripts, and in order to replace them, I'd need these features:

  • Ability to configure which dependencies (including indirect dependencies-of-dependencies) are linked statically or dynamically, per project, per OS.

    • For example on macOS I have to link to libpng statically, but I should link to zlib dynamically. On Windows everything static. On Linux it's distro-dependent.
  • Run bindgen for libraries that have version-dependent ABI. For example using system-wide libvpx requires using the same .h version as installed on the system.

    • but compiling bindgen is painfully slow, so I bundle several versions of pre-compiled ffi.rs and use a build script to pick one or fall back to bindgen only when I don't have the right version already.
  • Windows-compatible pkg-config alternative. On macOS and Linux the pkg-config crate works for 80% of cases, but on Windows packaging is a hopeless mess and I build from source instead :(

  • Compatibility with cmake and autotools to build a complex library that is not just in C, but also builds architecture-dependent assembly files.
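
As a sketch of the first requirement in the list above, the per-OS static/dynamic choice could be expressed as data plus one small function. The `LinkKind` enum and both function names are hypothetical, and the rules encode only the macOS and Windows examples given here. The emitted `cargo:rustc-link-lib=KIND=NAME` line is Cargo's existing directive format.

```rust
/// How a native library should be linked.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LinkKind {
    Static,
    Dynamic,
}

/// Pick a link kind for `lib` on `target_os`, following the examples above.
/// Returns `None` where the choice is left open (e.g. distro-dependent Linux).
pub fn link_kind(lib: &str, target_os: &str) -> Option<LinkKind> {
    match (target_os, lib) {
        ("windows", _) => Some(LinkKind::Static),
        ("macos", "png") => Some(LinkKind::Static),
        ("macos", "z") => Some(LinkKind::Dynamic),
        _ => None,
    }
}

/// Format the directive a build script prints to request that link kind.
pub fn link_directive(lib: &str, kind: LinkKind) -> String {
    let kind = match kind {
        LinkKind::Static => "static",
        LinkKind::Dynamic => "dylib",
    };
    format!("cargo:rustc-link-lib={}={}", kind, lib)
}
```

In a real build script the target OS would be read from the `CARGO_CFG_TARGET_OS` environment variable that Cargo sets, and the directive would be printed to stdout.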

@luser

Contributor

luser commented Jul 6, 2017

This issue came up in discussions the Firefox build team had with @alexcrichton last week. One thing that "build scripts as black boxes" makes difficult is caching build outputs with sccache. Currently we can cache the outputs of rustc invocations, but we still have to run all the build scripts, and some of them spend a lot of time doing things like invoking third-party build systems. If we had a more declarative syntax we could cache the output of build scripts and avoid unnecessary work. For this to work we'd need a full list of inputs and outputs for the build script up-front. I could imagine this being a little tricky for build scripts that are doing things like "build a project using the cmake crate to invoke its cmake build system" (servo-freetype-sys is one example I know of that does this.)
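
To make the caching idea concrete, here is a minimal sketch of deriving a cache key from a declared input list. It is not how sccache works; `DeclaredInputs` and `cache_key` are hypothetical names. It also uses the standard library's `DefaultHasher`, whose algorithm is not guaranteed to be stable across Rust releases, so a real cache would want a stable content hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Everything a build script declares it depends on (hypothetical shape).
pub struct DeclaredInputs {
    /// Input file path -> file contents.
    pub files: BTreeMap<String, Vec<u8>>,
    /// Environment variable name -> value (`None` if unset).
    pub env: BTreeMap<String, Option<String>>,
}

/// Derive a cache key from the declared inputs. Two builds with identical
/// declared inputs get the same key, so the cached outputs can be reused
/// instead of running the build script again.
pub fn cache_key(inputs: &DeclaredInputs) -> u64 {
    let mut hasher = DefaultHasher::new();
    // BTreeMap iterates in sorted key order, so the hash does not depend
    // on the order in which inputs were inserted.
    inputs.files.hash(&mut hasher);
    inputs.env.hash(&mut hasher);
    hasher.finish()
}
```

The key is only as good as the declaration: any input the script reads but does not declare would produce stale cache hits, which is why the full list is needed up-front.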

@kornelski

Contributor

kornelski commented Jul 6, 2017

Another problem is that for build scripts (and Make, etc.) environment variables are "dependencies" too.

It would be good if env vars could be declared as well, so that Cargo would correctly rebuild projects when env vars change.
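
For reference, Cargo's build-script protocol has a `cargo:rerun-if-env-changed=NAME` directive that a script can print to declare an environment variable as an input. A small sketch of emitting it follows; the helper name `rerun_if_env_changed` is made up for illustration.

```rust
/// Build the directives that declare environment variables as inputs,
/// so that Cargo reruns the build script when any of them changes.
pub fn rerun_if_env_changed(vars: &[&str]) -> Vec<String> {
    vars.iter()
        .map(|v| format!("cargo:rerun-if-env-changed={}", v))
        .collect()
}
```

A build script would print each returned line to stdout, for example for `CC` and `CFLAGS`.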

@jmesmon

Contributor

jmesmon commented Jul 6, 2017

"For this to work we'd need a full list of inputs and outputs for the build script up-front"

Even knowing the list of inputs and outputs after a single invocation of the build script (i.e., not up-front) should allow some improvement here, as one would be able to avoid running the build script again.

Another alternative is to provide a way to ask a build script to "tell me what you need, but avoid doing any work" (though that may not be feasible for a build script to provide).

@alexcrichton

Member

alexcrichton commented Sep 2, 2017

There's now an RFC for build system integration up at rust-lang/rfcs#2136.
