Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Local Remote Execution #471

Merged
merged 1 commit into from
Dec 14, 2023

Conversation

aaronmondal
Copy link
Contributor

@aaronmondal aaronmondal commented Dec 12, 2023

Provide generated toolchain configurations and a reproducible remote execution toolchain container with a C++ toolchain that can be used for fully hermetic integration tests.

Once it's stabilized this allows us to distribute application-specific toolchain containers and to share CI caches with contributors to eliminate redundant rebuilds.


This change is Reviewable

Copy link
Contributor

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 25 of 25 files at r1, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @aaronmondal, @allada, @blakehatch, and @MarcusSorealheis)


flake.nix line 38 at r1 (raw file):

          customStdenv = import ./tools/llvmStdenv.nix { inherit pkgs; };

          # TODO(aaronmondal): This doesn't work with rules_rust yet.

tioli: Can we github issue this for backlog and link issue id here instead?


local-remote-execution/generated/cc/BUILD line 95 at r1 (raw file):

    cpu = "k8",
    cxx_builtin_include_directories = [
        "/nix/store/i7sgy7izmjhdajnjmqzfp7v5j9jhy0qp-clang-wrapper-16.0.6/resource-root/include",

Are these hashes stable across machines?


local-remote-execution/generated/cc/BUILD line 113 at r1 (raw file):

        "-L/nix/store/1na9hqj99809qdydgqqkipr2b8gqprzb-libunwind-16.0.6/lib",
        "-lc++",
        "-Wl,-rpath,/nix/store/nn3wm6dxy3ps0835kgc9jx4601l1ai5q-libcxx-16.0.6/lib,-rpath,/nix/store/zb21z3mhidfrl3nw6i5a6hrzm84xx7jb-libcxxabi-16.0.6/lib,-rpath,/nix/store/1na9hqj99809qdydgqqkipr2b8gqprzb-libunwind-16.0.6/lib,-rpath,/nix/store/qn3ggz5sf3hkjs2c797xf7nan3amdxmp-glibc-2.38-27/lib",

Been awhile for me in regards to native linking, what is the general rule of thumb on ordering of libcxx/libcxxabi/libunwind/glibc?


local-remote-execution/generated/cc/BUILD line 149 at r1 (raw file):

        "-no-canonical-prefixes",
        "-Wno-builtin-macro-redefined",
        "-D__DATE__=\"redacted\"",

Why redacted and does this get defined later in the pipeline?

Copy link
Collaborator

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 12 of 25 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @aaronmondal, @blakehatch, and @MarcusSorealheis)


.bazelrc line 60 at r2 (raw file):

# Local remote execution flags.

build:lre --incompatible_enable_cc_toolchain_resolution

nit: Lets reference ticket:
bazelbuild/bazel#7260


.bazelrc line 61 at r2 (raw file):


build:lre --incompatible_enable_cc_toolchain_resolution
build:lre --define=EXECUTOR=remote

Lets reference this ticket:
bazelbuild/bazel#7254


.bazelrc line 62 at r2 (raw file):

build:lre --incompatible_enable_cc_toolchain_resolution
build:lre --define=EXECUTOR=remote
build:lre --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1

nit: Maybe use --repo_env instead?

Also maybe reference:
bazelbuild/bazel#19714 (comment)


local-remote-execution/README.md line 21 at r2 (raw file):

First, create an OCI image containing with the toolchains. Make sure to run this
command *not* from within this directory, but from the root of the nativelink

nit: Lets just say:

Make sure you run this command from the WORKSPACE root (project root).

The way it's currently worded is kinda weird due to the use of "not" (negative word).


local-remote-execution/README.md line 41 at r2 (raw file):

If you have the remote execution container deployed as a remote executor you can

nit:
as a remote executor -> as a worker

maybe?


local-remote-execution/README.md line 42 at r2 (raw file):

If you have the remote execution container deployed as a remote executor you can
use switch to remote execution. For instance when using the [Kubernetes

nit: remove use


local-remote-execution/rbe_configs_gen_skip_pull.diff line 9 at r2 (raw file):

 		dockerPath:     "docker",
 	}
-	if _, err := runCmd(d.dockerPath, "pull", d.containerImage); err != nil {

Is there an upstream bug for this, if so, can we add it as a commit message to this diff?


local-remote-execution/examples/BUILD.bazel line 6 at r2 (raw file):

    name = "hello_lre",
    srcs = ["hello.cpp"],
    # Enable verbose logging just for this target

nit: end in period.


local-remote-execution/generated/LICENSE line 1 at r2 (raw file):


nit: Is this required? I don't think we need to ship the apache license if we are bundling it with our own code. In this case anyway we are also Apache2 licensed, so it shouldn't be an issue.


local-remote-execution/generated/cc/cc_wrapper.sh line 25 at r2 (raw file):

# Call the C++ compiler
/nix/store/51cw7fa986m44q5rw0rxvl0pbskhccw8-customClang/bin/customClang "$@"

nit: remove set -eu and prefix this with exec. This will cause the command to "assume control" of the process reducing a fork call and one less process for the system to manage.

But if this is generated meh.

@MarcusSorealheis
Copy link
Collaborator

once the above comments are addressed, I will doin a pass.

Copy link
Contributor Author

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 20 of 25 files reviewed, 7 unresolved discussions (waiting on @adam-singer, @allada, @blakehatch, and @MarcusSorealheis)


.bazelrc line 61 at r2 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

Lets reference this ticket:
bazelbuild/bazel#7254

Done.


local-remote-execution/rbe_configs_gen_skip_pull.diff line 9 at r2 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

Is there an upstream bug for this, if so, can we add it as a commit message to this diff?

This is not a bug. We're disabling this because it lets us directly load the image into the docker-daemon instead of pulling it from an intermediary container registry. Skipping this hash is not a generally safe practice and only works here because our images are reproducible.

I plan to rewrite the entire rbe_configs_gen tool. In our case it's technically not necessary to use images at all and if we do want to generate toolchains from images we can use more modern/performant tools under the hood.


local-remote-execution/generated/LICENSE line 1 at r2 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: Is this required? I don't think we need to ship the apache license if we are bundling it with our own code. In this case anyway we are also Apache2 licensed, so it shouldn't be an issue.

Generated. I plan to rewrite the rbe_configs_gen tool entirely. Until then let's keep this around just to be sure. It should be safe to remove this again later on.


local-remote-execution/generated/cc/BUILD line 95 at r1 (raw file):

Previously, adam-singer (Adam Singer) wrote…

Are these hashes stable across machines?

Yes. This is the reason why LRE works. Since these packages are reprocudible in nix, this will have the same hash on every x86_64-linux system. This also acts as an important safeguard mechanism: If for whatever reason we can't reproduce this hash on a machins the toolchain will hard-error instead of using a potentiall not binary-identical tool that happens to live in e.g. /usr/bin.


local-remote-execution/generated/cc/BUILD line 113 at r1 (raw file):

Previously, adam-singer (Adam Singer) wrote…

Been awhile for me in regards to native linking, what is the general rule of thumb on ordering of libcxx/libcxxabi/libunwind/glibc?

Generally linking order should roughly reflect how "low level" a library is. So it would make sense to link libcxx before libcxxabi. It can get more tricky with less standard compilations like CUDA, HIP and OpenCL, and also when using sanitizers. But here it most likely doesn't matter. I'd be surprised if these libraries shared any symbols.

Glibc is usually linked last as it's the lowest level library.


local-remote-execution/generated/cc/BUILD line 149 at r1 (raw file):

Previously, adam-singer (Adam Singer) wrote…

Why redacted and does this get defined later in the pipeline?

These flags are generated. Every cc_toolchain_config should set this to prevent dates timestamps from leaking into the build and making it irreproducible. This will automatically be added to all cc_* build compiler invocations.


local-remote-execution/generated/cc/cc_wrapper.sh line 25 at r2 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: remove set -eu and prefix this with exec. This will cause the command to "assume control" of the process reducing a fork call and one less process for the system to manage.

But if this is generated meh.

Generated.

Copy link
Contributor

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 5 files at r3, all commit messages.
Reviewable status: 24 of 25 files reviewed, 5 unresolved discussions (waiting on @aaronmondal, @allada, @blakehatch, and @MarcusSorealheis)


local-remote-execution/generated/cc/BUILD line 149 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

These flags are generated. Every cc_toolchain_config should set this to prevent dates timestamps from leaking into the build and making it irreproducible. This will automatically be added to all cc_* build compiler invocations.

Makes sense. Was going to ask if "redacted" is commonly used as a value name, seems so https://github.com/search?q=-D__DATE__%3D%5C%22redacted%5C%22&type=code&p=1


tools/generate-toolchains.nix line 15 at r3 (raw file):

  set -xeuo pipefail

  cd $(git rev-parse --show-toplevel)

Capture as a bash variable and reuse?

SRC_ROOT=$(git rev-parse --show-toplevel)
cd "${SRC_ROOT}"
[...]
--output_src_root=${SRC_ROOT}

Copy link
Contributor

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 24 of 25 files reviewed, 7 unresolved discussions (waiting on @aaronmondal, @allada, @blakehatch, and @MarcusSorealheis)


local-remote-execution/README.md line 3 at r3 (raw file):

# Local Remote Execution

This provides rapidly upgradeable framework to construct and iterate on custom

This line is repeating the same information in the next line, maybe a copy error?


local-remote-execution/examples/BUILD.bazel line 6 at r3 (raw file):

    name = "hello_lre",
    srcs = ["hello.cpp"],
    # Enable verbose logging just for this target.

Probably can remove this comment, seem low value

Copy link
Contributor Author

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 20 of 25 files reviewed, 4 unresolved discussions (waiting on @adam-singer, @allada, @blakehatch, and @MarcusSorealheis)


tools/generate-toolchains.nix line 15 at r3 (raw file):

Previously, adam-singer (Adam Singer) wrote…

Capture as a bash variable and reuse?

SRC_ROOT=$(git rev-parse --show-toplevel)
cd "${SRC_ROOT}"
[...]
--output_src_root=${SRC_ROOT}

Oh yeah that's much better. Note that nix syntax requires the ${ to be escaped as ''${ and ${something} is shorthand for "nix eval something".

Copy link
Collaborator

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r3.
Reviewable status: 21 of 25 files reviewed, 2 unresolved discussions (waiting on @aaronmondal, @adam-singer, @blakehatch, and @MarcusSorealheis)


local-remote-execution/rbe_configs_gen_skip_pull.diff line 9 at r2 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

This is not a bug. We're disabling this because it lets us directly load the image into the docker-daemon instead of pulling it from an intermediary container registry. Skipping this hash is not a generally safe practice and only works here because our images are reproducible.

I plan to rewrite the entire rbe_configs_gen tool. In our case it's technically not necessary to use images at all and if we do want to generate toolchains from images we can use more modern/performant tools under the hood.

Can we write this as a commit message on this diff?

Copy link
Collaborator

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 21 of 25 files reviewed, 2 unresolved discussions (waiting on @aaronmondal, @adam-singer, @blakehatch, and @MarcusSorealheis)

Copy link
Contributor Author

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 20 of 25 files reviewed, 1 unresolved discussion (waiting on @adam-singer, @allada, @blakehatch, and @MarcusSorealheis)


local-remote-execution/rbe_configs_gen_skip_pull.diff line 9 at r2 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

Can we write this as a commit message on this diff?

Done.

Copy link
Contributor

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 4 files at r4, 1 of 1 files at r5, 2 of 2 files at r6, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @allada, @blakehatch, and @MarcusSorealheis)

Copy link
Contributor

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2, 1 of 5 files at r3.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @allada, @blakehatch, and @MarcusSorealheis)

Provide generated toolchain configurations and a reproducible remote
execution toolchain container with a C++ toolchain that can be used for
fully hermetic integration tests.

Once it's stabilized this allows us to distribute application-specific
toolchain containers and to share CI caches with contributors to
eliminate redundant rebuilds.
Copy link
Contributor

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r7, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @allada, @blakehatch, and @MarcusSorealheis)

@aaronmondal aaronmondal merged commit 449376b into TraceMachina:main Dec 14, 2023
13 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants