Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support expanding locations in rustc_env and build_script_env #461

Closed
wants to merge 9 commits into from

Conversation

dae
Copy link
Contributor

@dae dae commented Oct 22, 2020

This makes it possible to pass in the path to generated files and
external tools.

This potentially closes #459, closes #454, and closes #79.

The docs seem to indicate there's precedent for this in rules_cc:
https://docs.bazel.build/versions/master/be/make-variables.html#predefined_label_variables

@google-cla google-cla bot added the cla: yes label Oct 22, 2020
@dae
Copy link
Contributor Author

dae commented Oct 22, 2020

I've pushed a fix for the failing clippy run; if there was a better way to address it please let me know.

@dae dae marked this pull request as draft October 22, 2020 05:42
@dae
Copy link
Contributor Author

dae commented Oct 22, 2020

Ok, this seems to be related to external repos. Changing to the examples folder and running 'bazel test env_locations/...' works, but it fails when running 'bazel test @examples//env_locations/...' from the top level folder. The issue seems to be limited to build scripts - it works correctly in both cases for normal rust_binaries. I'm not yet familiar enough with Bazel to know how to solve this properly - any suggestions would be appreciated.

- rustc_env now expands $(rootpath //package:target), for passing
the location of files or tools that are needed at compile time
- build_script_env now expands $(execpath //package:target), for
passing the location of files or tools that are needed at build
script runtime
- cargo_script_build() now passes the data argument to the build
script runner, so the data is available at runtime

Ideally both build and run would work with execpath/location, but
the approach of walking back to the root folder that cargo_build_script()
is using doesn't seem to be possible at compile time, as ${pwd} is being
substituted after the starlark code is executed. So at compile time,
we use the relative rootpath instead.
Rust on Windows seems to struggle with relative pathnames in this case
- I suspect due to the fact that symlinks are being used.

fs::read_to_string("../../../path/to/somewhere")

gives an error

as does calling .canonicalize() on the above. Switching to \ as a
path separator does not seem to help.

If the parent directory references are handled separately, eg

Path::new("../../..").canonicalize(), then the path/to/somewhere
can be joined onto the folder and the file can be read. So instead
of automatically adjusting the path, we just return execroot verbatim,
and rely on the build script to do their own path manipulation.
@dae
Copy link
Contributor Author

dae commented Oct 25, 2020

Ok, I've got something that is passing CI. Copying in the updated commit notes:

  • rustc_env now expands $(rootpath //package:target), for passing
    the location of files or tools that are needed at compile time
  • build_script_env now expands $(execpath //package:target), for
    passing the location of files or tools that are needed at build
    script runtime
  • cargo_script_build() now passes the data argument to the build
    script runner, so the data is available at runtime

Ideally both build and run would work with execpath/location, but
the approach of walking back to the root folder that cargo_build_script()
is using doesn't seem to be possible or necessary at compile time,
so at compile time, we use the relative rootpath instead.

And from the fix:

Rust on Windows seems to struggle with relative pathnames in this case

  • I suspect due to the fact that symlinks are being used.

fs::read_to_string("../../../path/to/somewhere")

gives an error

as does calling .canonicalize() on the above. Switching to \ as a
path separator does not seem to help.

If the parent directory references are handled separately, eg

Path::new("../../..").canonicalize(), then the path/to/somewhere
can be joined onto the folder and the file can be read. So instead
of automatically adjusting the path, we just return execroot verbatim,
and rely on the build script to do their own path manipulation.

@dae dae marked this pull request as ready for review October 25, 2020 13:02
@dae dae changed the title Support expanding locations in rustc_env Support expanding locations in rustc_env and build_script_env Oct 25, 2020
"CARGO_PKG_VERSION": "0.1.2",
},
# Optional environment variables passed during build.rs execution.
# The path will be relative to the folder above bazel-out, so your
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this general change of allowing location expansion is a really useful one, but I don't love the interactions around path lookups.

In particular, this suggestion to traverse parent directories until you happen to find the right one. The bazel-out heuristic is an internal implementation detail of how bazel lays out files, and actually only happens to be true for generated files - if you did $(execroot //some/checked/in:file) there wouldn't be a bazel-out. $(location) and friends were designed assuming actions run in the execroot (or a runfiles tree which mirrors the execroot).

I think a reasonable approach here would be for us to change build scripts so that they executed in the execroot as their working directory (which means that these injected env vars are valid relative paths), and to recommend that crates explicitly add a env!("CARGO_MANIFEST_DIR") prefix to any paths they try to read at build time. We changed where build scripts run recently in #427 so I don't feel too bad about changing it again... This approach seems fairly principled, and gives a very clear workaround for crates which face problems with this.

Any required fix pull requests may appear as slightly odd to crate authors (the author may think "the working directory is env!("CARGO_MANIFEST_DIR") - why would I need to be explicit?"), but hopefully they'd just find it odd (and understand that alternative build tools may behave differently), rather than objectionable...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it should also work on source files, as we're only referencing "bazel-out" in the build script, which is always generated. But I take your point about this being a bit of a hack, and an approach that doesn't require hunting for execroot would be both more ergonomic and less fragile.

Perhaps one way to solve this would be to add another argument to cargo_build_script() that controls whether the working dir is execroot or the manifest dir? Crates that have bought into the Bazel ecosystem could then consume the locations directly, and third-party crates that assume cargo will set the working dir to the manifest dir for them would continue to function without patching.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That feels like a reasonable compromise to me - we could perhaps also set an additional BAZEL_EXEC_ROOT env var in the manifest dir path case, to help folks who are trying to support both well to support conditional behaviour.

One challenge I've run into is crates which use prost-build, so it may be interesting to run through this as a concrete example:

With https://github.com/danburkert/prost/pull/324 (current unreleased, sadly) and this PR, you can set build_script_env to something like {"PROTOC": "$(execroot @com_google_protobuf//:protoc)"}:

In exec-root-as-working-directory mode, things should Just Work assuming all other paths (e.g. paths to the protobufs being codegen'd) are using the $CARGO_MANIFEST_DIR prefix properly.
In cargo-manifest-dir-working-directory mode, it would require the build script to prepend ${BAZEL_EXEC_ROOT} to PROTOC, which works but is a bit annoying... Alternatively, we could extend ${pwd}-substitution to the build script runner (which seems pretty reasonable), so that one could simply use:
{"PROTOC": "${pwd}/$(execroot @com_google_protobuf//:protoc)"} (or possibly {"PROTOC": "$${pwd}/$(execroot @com_google_protobuf//:protoc)"} - I always get lost in the escaping here!)
and the build script runner would do the right thing in both modes...

Actually, the more I talk it through, the more I'm convinced we don't need both modes, we just need ${pwd}-substitution everywhere that we support $(location) substitution - how does that sound to you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, solving the prost-build problem was one of the motivations for this PR, as I'm using it in my project too.

Substituting pwd sounds good, I'll look into it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of the latest push? It adds the execroot automatically rather than relying on the user to do so - the rationale being that it's consistent with the way things like genrule() behave. But I can change it to require an explicit ${pwd} reference if you'd prefer.

And if you'd like me to squash these commits to tidy things up, please let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I experimented with resolving execpath/${pwd} for rust_binary/rust_test(), but it didn't work - rustc_env is only available at compile time, and if we bake in the execroot, the execution stage is running in a different sandbox, and the path is no longer valid - so they will need to continue using rootpath instead)

@dae dae marked this pull request as draft October 25, 2020 22:12
dae added a commit to ankitects/rules_rust that referenced this pull request Oct 25, 2020
Copy link
Collaborator

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I just tested this out and it worked beautifully.

@dfreese @mfarrugi How do y'all feel about implicitly making $(execpath) into an absolute path? On the one hand, it's surprising and un-bazel-y, on the other hand, the fact that we're running actions not in the execpath is also surprising and un-bazel-y, and this feels like a reasonable way to compensate for that weirdness...

cargo/cargo_build_script.bzl Show resolved Hide resolved
@dfreese
Copy link
Collaborator

dfreese commented Oct 27, 2020

LGTM - I just tested this out and it worked beautifully.

@dfreese @mfarrugi How do y'all feel about implicitly making $(execpath) into an absolute path? On the one hand, it's surprising and un-bazel-y, on the other hand, the fact that we're running actions not in the execpath is also surprising and un-bazel-y, and this feels like a reasonable way to compensate for that weirdness...

I'm not thrilled by it, particularly since I consider prost-build to be a bad actor by requiring an executable path to be compiled in. (I ended up patching the env! macros out of the build script), but I don't know if I see a better way around it.

@illicitonion
Copy link
Collaborator

LGTM - I just tested this out and it worked beautifully.
@dfreese @mfarrugi How do y'all feel about implicitly making $(execpath) into an absolute path? On the one hand, it's surprising and un-bazel-y, on the other hand, the fact that we're running actions not in the execpath is also surprising and un-bazel-y, and this feels like a reasonable way to compensate for that weirdness...

I'm not thrilled by it, particularly since I consider prost-build to be a bad actor by requiring an executable path to be compiled in. (I ended up patching the env! macros out of the build script), but I don't know if I see a better way around it.

As of https://github.com/danburkert/prost/pull/324 it's at least a runtime rather than compile-time env var :)

@dae
Copy link
Contributor Author

dae commented Oct 29, 2020

A tidied up version of this is in #468 - I'll close this one in favour of the new one.

@dae dae closed this Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants