New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hermeticize cargo build. #5742

Merged
merged 9 commits into from May 20, 2018

Conversation

Projects
None yet
3 participants
@jsirois
Copy link
Member

jsirois commented Apr 24, 2018

The cargo build is now hermetic in order to be transparent to
build tools like IDEA. Using the links in build-support/bin/native
like cargo, rustdoc and rustfmt to operate directly on the
modules in src/rust/engine works as one would expect of the
corresponding cargo tools.

@jsirois jsirois force-pushed the jsirois:issues/4558/rust_toolchain branch 3 times, most recently from 20e6d51 to 8c455fc Apr 24, 2018

@jsirois jsirois requested review from illicitonion , dotordogh , ity and baroquebobcat and removed request for dotordogh Apr 27, 2018

-name '*.rs' -not -wholename '*/bazel_protos/*' -not -wholename '*/target/*')
"${NATIVE_ROOT}/process_execution/bazel_protos/src/verification.rs"
"${NATIVE_ROOT}/process_execution/bazel_protos/build.rs"
${REPO_ROOT}/build-support/bin/native/cargo fmt --all --

This comment has been minimized.

@jsirois

jsirois Apr 27, 2018

Member

NB: Alot of dancing here due to the switch from use of rustfmt <files> directly to cargo fmt --all. It think this was worth it, cargo has comprehension of our rust project; ie: it knows we generate proto files and skips trying to format those as you'd hope. The sticky wicket was cargo fmt acting on our own src/rust/engine/process_execution/bazel_protos/src/lib.rs which declares modules for that generated code which might not be present. More comments in src/rust/engine/process_execution/bazel_protos/.gitignore explain the rest of the dance along with changes in build-support/bin/check_rust_formatting.sh to deal with exit code 2.

}

# Echos directories to add to $PATH.
function ensure_native_build_prerequisites() {

This comment has been minimized.

@jsirois

jsirois Apr 27, 2018

Member

This broke out into bootstrap_rust.sh with changes to produce a cargo tools binary symlink farm under build-support/bin/native.

echo "${NATIVE_ROOT}/target/${MODE}/libengine.${LIB_EXTENSION}"
}

function bootstrap_native_code() {

This comment has been minimized.

@jsirois

jsirois Apr 27, 2018

Member

This broke out into bootstrap_code.sh which uses the broken-out bootstrap_rust.sh.

@@ -1,20 +0,0 @@
# coding=utf-8

This comment has been minimized.

@jsirois

@jsirois jsirois requested a review from cosmicexplorer Apr 27, 2018

@illicitonion
Copy link
Contributor

illicitonion left a comment

Thanks for putting this together, it generally looks great, and a lot better than what was there before :)

Could you add a README.md in build-support/bin/native, or add to the one in src/rust/engine, which explains roughly what is now found in there and why? I can see someone wanting to quickly work out what's going on there, and reading through all of the scripts will be a lot of context gathering to answer simple questions.

I'm not entirely convinced by the changes to protobuf generation. I think what you've got here is a lot harder to read and reason about than what we currently have. Can you talk through the concrete benefits it provides?

)
)
case $? in
2)

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

I'd be tempted to throw in a cargo check -p bazel_protos before the formatting check to ensure that the files exist, mostly just to make this script simpler, but I don't feel strongly

This comment has been minimized.

@jsirois

jsirois May 1, 2018

Member

I don't think that helps. Fundamentally, without an isolated crate for all the generated code and just the generated code, there will be at least one file (lib.rs here) that links to modules that won't exist in a fresh workspace in which one could run this script (and CI does exactly this in one shard). Even using cargo +nightly and the ignore option for rustfmt, that file will get parsed and lead to an exit code of 2. See the comments in the bazel_protos .gitignore.

This comment has been minimized.

@illicitonion

illicitonion May 2, 2018

Contributor

I must be missing something... The cargo check -p bazel_protos would force the code to be generated, so by doing it in this script, we would guarantee that the generated code was present before we tried to lint.

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

Gotcha, and agreed.

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

Although, more shenanigans to make this ~bulletproof. Much easier understood shenanigans and self-contained in build.rs.

# This is also fine since any real syntax errors will be caught by compile checks!
echo >&2 "NB: Skipped formatting some files due to syntax errors."
if [[ -z "${CHECK_RUST_FORMATTING_DEBUG}" ]]; then
echo >&2 "To see the errors, run:"

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

Indent

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

Obsoleted.

echo >&2 "Usage: $0 host util_name version [filename]"
echo >&2 "Example: $0 binaries.pantsbuild.org go 1.7.3 go.tar.gz"
exit 1
REPO_ROOT=$(cd $(dirname "${BASH_SOURCE[0]}") && cd ../.. && pwd -P)

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

We use this script internally in Twitter's monorepo, so if we can avoid hard-coding assumptions about the source tree layout, that would be handy; particularly, if we're just using this to look up the CACHE_ROOT...

This comment has been minimized.

@jsirois

jsirois May 1, 2018

Member

REPO_ROOT need not be the real repo root, its only used to calculate a path relative to the Pants repo root. So, unless your process moves Pants repo files around, this should still work unless I'm missing something.

This comment has been minimized.

@illicitonion

illicitonion May 2, 2018

Contributor

It currently doesn't replicate the whole build-support tree, just the bin directory therein. But this is fixable in the twitter repo :)

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

Aha. OK - I'm happy to let Twitter fix that then. Pants support scripts really shouldn't be bending over backwards to support hidden use-cases.


source "${REPO_ROOT}/build-support/pants_venv"

if (( $# != 1 )); then

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

I'm not familiar with (( ... )) - what does this do? I would have expected if [[ $# -ne 1 ]]; then here

This comment has been minimized.

@jsirois

jsirois May 1, 2018

Member

In bash (( ... )) is a mathematics-aware version of test, so you can un-awkwardly say !=.

readonly NATIVE_ENGINE_BINARY="native_engine.so"
readonly NATIVE_ENGINE_RESOURCE="${REPO_ROOT}/src/python/pants/engine/${NATIVE_ENGINE_BINARY}"

# N.B. Set $MODE to "debug" to generate a binary with debugging symbols.

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

Our release mode binaries actually still have debug symbols in them (see a comment in src/rust/engine/Cargo.toml's profile.release section). --release just controls optimisation (and so compile/link times)

This comment has been minimized.

@jsirois

jsirois May 1, 2018

Member

I just carried this comment along in the refactor move, but I'll fix.

String::from_utf8(data).unwrap()
}

impl ExecutionError {

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

impl From<Output> for ExecutionError feels a little more idiomatic than a random method named from (and means that below you would just Err(output.into()))

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

Agreed, fixed.

@@ -0,0 +1,143 @@
use std::env;

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

Throughout this file: I tend to try to .unwrap() only when things are guaranteed not to fail, and instead to .expect("Message") when things may fail, so that you have a little context. I think that could be useful everywhere that you're .unwrap()ing in this file.

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

Obsoleted here, trimmed down API with the simplification of bazel_protos/build.rs now delegating to cargo.sh for protoc setup. That said, I followed this in cffi_build.rs.

pub_mod_stmts.sort();
let contents = format!(
"\
// This file is generated. Do not edit.

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

Include instructions for how to alter its generation, and how to manually force a re-generation.

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

Obsoleted since this is no longer checked in.

@@ -0,0 +1,11 @@
// This file is generated. Do not edit.

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

If it's generated, it probably shouldn't be checked in. Does it need to be?

This comment has been minimized.

@jsirois

jsirois May 1, 2018

Member

It does per the comments in .gitignore. See what you think of that explanation / state of affairs.

This comment has been minimized.

@illicitonion

illicitonion May 2, 2018

Contributor

I'm tempted, if we're going to start checking in generated code, to just check in the generated protobuf structs, and side-step all of these issues. Seems like it would probably prevent a lot of complexity and hassle. WDYT?

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

No longer checked in, embedding cargo check -p bazel_protos in check_rust_formatting.sh obviates the need for all this dance.

@@ -1,29 +1,99 @@
use std::env::home_dir;

This comment has been minimized.

@illicitonion

illicitonion Apr 30, 2018

Contributor

Ditto the unwrap vs expect comment above

This comment has been minimized.

@jsirois

jsirois May 2, 2018

Member

The only remaining unwraps after simplification are can't-happens afaict. IE known Pants repo path names being converted to UTF-8, listing of directories known to exist in the repo, etc. Let me know if we have different ideas about this though.

@jsirois

This comment has been minimized.

Copy link
Member

jsirois commented May 1, 2018

Could you add a README.md in build-support/bin/native ...

Good idea, will do.

I'm not entirely convinced by the changes to protobuf generation. ...

This may be another eye of the beholder. Prior to this change part of the knowledge of protbuf paths is in generate-grpc.sh and part in build.rs to trigger changed re-runs. In addition build.rs has no easy way to get CACHE_ROOT right when emitting changed lines for grpc_rust_plugin and protoc-gen-rust. Finally that mechanism required cargo-ensure-installed.

Here, as much of the logic as possible is self-contained with only the binary download of protoc shelled out. Its true the logic is still spread between multiple bits. Prior to this change, it was build.rs, generate-grpc.sh, download_binary.sh and bootstrap.sh, now it's build.rs, download_binary.sh and bootstrap_rust.sh. Things seem slightly simpler to follow to me anyway.

@illicitonion

This comment has been minimized.

Copy link
Contributor

illicitonion commented May 2, 2018

I'm not entirely convinced by the changes to protobuf generation. ...

This may be another eye of the beholder. Prior to this change part of the knowledge of protbuf paths is in generate-grpc.sh and part in build.rs to trigger changed re-runs. In addition build.rs has no easy way to get CACHE_ROOT right when emitting changed lines for grpc_rust_plugin and protoc-gen-rust. Finally that mechanism required cargo-ensure-installed.

Here, as much of the logic as possible is self-contained with only the binary download of protoc shelled out. Its true the logic is still spread between multiple bits. Prior to this change, it was build.rs, generate-grpc.sh, download_binary.sh and bootstrap.sh, now it's build.rs, download_binary.sh and bootstrap_rust.sh. Things seem slightly simpler to follow to me anyway.

Yeah, I can see where you're coming from :) Particularly with the codegen'd mod.rs.

I think the reason I found the status quo simpler is because the shell script (which I find pretty easy to just read and see what it does) encapsulates all of the "what am I doing" logic pretty cleanly, and the build.rs is just a tiny "run the shell script" wrapper, but I can see that the hidden "fetch and install some binaries" piece in bootstrapping, and the magical cargo:rerun-if-changed stuff in the build.rs is actually pretty magical. LGTM!

@jsirois jsirois force-pushed the jsirois:issues/4558/rust_toolchain branch from fcea963 to e987677 May 2, 2018

@jsirois

This comment has been minimized.

Copy link
Member

jsirois commented May 2, 2018

Thanks for the detailed review! All feedback addressed. PTAL.

@illicitonion
Copy link
Contributor

illicitonion left a comment

Looks great, thanks for the clean-up and improvement!

src/ - rust stdlib sources (generated)
<remaining symlinks> - symlinks to the remaining $CARGO_HOME/bin tools (generated)
```
Some explanation of the checked-in bootstrsap scripts is in order:

This comment has been minimized.

@illicitonion

illicitonion May 3, 2018

Contributor

typo: s/bootstrsap/bootstrap/

This comment has been minimized.

@jsirois

jsirois May 19, 2018

Member

Fixed


+ `cargo.sh`

These serve as a replacement for a vanilla cargo, wrapping execution of an underlying pristine

This comment has been minimized.

@illicitonion

illicitonion May 3, 2018

Contributor

These? Plural? But just talking about one file? Should the header be "cargo and cargo.sh"?

This comment has been minimized.

@jsirois

jsirois May 19, 2018

Member

It was exactly as you say, but kept the header as-is and singularfied.

function _build_native_code() {
# Builds the native code, and echos the path of the built binary.

"${REPO_ROOT}/build-support/bin/native/cargo" build ${MODE_FLAG} \

This comment has been minimized.

@illicitonion

illicitonion May 3, 2018

Contributor

I can get on board with that, but the --manifest-path arg doesn't look fixed :)

print!("{}", String::from_utf8(output.stdout).unwrap());
eprint!("{}", String::from_utf8(output.stderr).unwrap());
exit(output.status.code().unwrap());
let mut module = File::create(gen_dir.join("mod.rs")).unwrap();

This comment has been minimized.

@illicitonion

illicitonion May 3, 2018

Contributor

In production code I'd probably write this as:

File::create(gen_dir.join("mod.rs"))
  .and_then(|file| file.write_all(contents.as_bytes()))
  .expect("Failed to write mod.rs")

both because I don't care whether the create or write failed, and to give a useful error on failure, but in a build.rs I care less :)

This comment has been minimized.

@jsirois

jsirois May 19, 2018

Member

I like it, done.

@@ -76,7 +98,7 @@ fn mark_for_change_detection(path: PathBuf) -> PathBuf {
path
}

fn make_flags(env_script_path: &Path) -> Result<Vec<String>> {
fn make_flags(env_script_path: PathBuf) -> Result<Vec<String>> {

This comment has been minimized.

@illicitonion

illicitonion May 3, 2018

Contributor

This seems a strange change; this function doesn't appear to need ownership? (Not objecting, just curious :))

This comment has been minimized.

@jsirois

jsirois May 19, 2018

Member

Agreed, reverted.

@stuhood

stuhood approved these changes May 7, 2018

Copy link
Member

stuhood left a comment

This nuclear reactor passes my inspection.

@stuhood stuhood force-pushed the pantsbuild:master branch from b6bb42d to 9e2fdb5 May 11, 2018

@jsirois jsirois force-pushed the jsirois:issues/4558/rust_toolchain branch from 86ffcab to bc62e68 May 19, 2018

@jsirois

This comment has been minimized.

Copy link
Member

jsirois commented May 19, 2018

I can get on board with that, but the --manifest-path arg doesn't look fixed :)

Fixed for real now.

jsirois added some commits Apr 24, 2018

Hermeticize cargo build.
The cargo build is now hermetic in order to be transparent to
build tools like IDEA. Using the links in `build-support/bin/native`
like `cargo`, `rustdoc` and `rustfmt` to operate directly on the
modules in `src/rust/engine` works as one would expect of the
corresponding cargo tools.
Fixes prompted by review feedback:
+ Simplify format checks by running code gen minimally first.
+ Use rust better: leverage `From`/`Into` and `expect`.
+ Simplify bazel_protos/build.rs and, as a result, build_utils.
+ Quote paths consistently in shell scripts.
Add a link to rustup.
This can be useful for inspecting the Pants toolchain version,
opening the proper stdlib doc or the proper book.

@jsirois jsirois force-pushed the jsirois:issues/4558/rust_toolchain branch from bc62e68 to 6968480 May 20, 2018

@jsirois jsirois merged commit 195f4d5 into pantsbuild:master May 20, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@jsirois jsirois deleted the jsirois:issues/4558/rust_toolchain branch May 20, 2018

stuhood added a commit that referenced this pull request Jul 15, 2018

Fix local execution of hermetic integration tests (#6101)
### Problem

Post #5742, `def hermetic(cls): True` integration tests fail in environments where `PATH` or `PY` are not set due to a non-zero exit code from `which python2.7`, which causes a failure of the script due to `set -e`.

### Solution

Remove unused variable.

### Result

Hermetic integration tests work locally and/or on OSX.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment