Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No way to use toolchains correctly in executable rule output #19645

Open
anguslees opened this issue Sep 27, 2023 · 39 comments
Open

No way to use toolchains correctly in executable rule output #19645

anguslees opened this issue Sep 27, 2023 · 39 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Configurability platforms, toolchains, cquery, select(), config transitions type: feature request

Comments

@anguslees
Copy link

anguslees commented Sep 27, 2023

Description of the feature request:

A common pattern1 I see in rules is toolchain declarations that effectively capture a tool path, and templated executable/test rules that resolve the toolchain and substitute the tool path into the executable file output.

Unfortunately this is incorrect! When the exec and target platforms differ, the toolchain will be resolved for the exec platform, but the executable/test output needs to be for the target platform.

As far as I can see, there is no way to make toolchain resolution / platform transitions do the correct thing here. Since executable rules are how bazel tests work, this is a pretty major gap in the toolchains/platforms story. I can't file bugs against rules, because there is no reasonable solution to suggest. This is a feature request for a solution that is (a) correct, and also ideally (b) easy/obvious.

One onerous current workaround is to declare two toolchains for all such cases. One that uses exec_compatible_with/target_compatible_with as usual, and is used when the toolchain is invoked by an action directly. And a second toolchain that uses target_compatible_with only, which can be used in executable rule output. This is moderately terrible, since this duplicates work for everybody, confuses the meaning of exec_compatible_with vs target_compatible_with, and toolchain authors can't anticipate when their toolchain needs to be used in this way - undoing the loose coupling between author and consumer that is kind of the point of toolchains.

An incorrect (afaics) version of the two-toolchains approach is to only declare target_compatible_with toolchain, then create a "current_foo_toolchain" rule that resolves the toolchain for a particular target, and returns the toolchain provider through a rule attribute - see eg python toolchain. This is incorrect in general because another toolchain that depends on this toolchain cannot be resolved at the same time for the same exec+target combination (ie: toolchains=[foo] with a dependency on another rule with toolchains=[bar], is not the same as toolchains=[foo, bar]).

As a possible implementation, I'd love to be able to do exec_group(cfg='target'), since I think ctx.exec_groups["target"].toolchains is an intuitive and 'obvious' way to address the above.

Another option is to not use toolchains for what are effectively artifacts. Rule authors obviously like the loose coupling between toolchain declaration and use however, so this potential path would require widespread advocacy/documentation, and perhaps a canonical meta-library for an "artifact toolchain".

See also slack thread: https://bazelbuild.slack.com/archives/CA31HN1T3/p1690176577746239

Which category does this issue belong to?

Core

What underlying problem are you trying to solve with this feature?

It should be possible to resolve and refer to the correct toolchain for an executable rule output.

Which operating system are you running Bazel on?

macos and Linux

What is the output of bazel info release?

6.3.2

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

Footnotes

  1. Examples of (incorrect) toolchains used for executable rules in rules_js, rules_docker, rules_oci, gazelle

@anguslees anguslees changed the title Wanted: Easy and correct way to use toolchains in executable rule output There is no way to use toolchains correctly in executable rule output Sep 27, 2023
@anguslees anguslees changed the title There is no way to use toolchains correctly in executable rule output No way to use toolchains correctly in executable rule output Sep 27, 2023
@Pavank1992 Pavank1992 added the team-Core Skyframe, bazel query, BEP, options parsing, bazelrc label Sep 27, 2023
@moroten
Copy link
Contributor

moroten commented Jan 25, 2024

Any updates on this?

@iancha1992 iancha1992 added the team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts label Jan 25, 2024
@comius comius added team-Configurability platforms, toolchains, cquery, select(), config transitions and removed team-Core Skyframe, bazel query, BEP, options parsing, bazelrc team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts labels Feb 8, 2024
@katre
Copy link
Member

katre commented Mar 13, 2024

I'm very confused about your use case, hopefully we can figure out what makes sense.

Typically, the intent of toolchains is that they are used during build actions: a c++ compiler is called, an analyzer is run, whatever.

It sounds like you want some files from a toolchain, but you want your build rule to bundle those into the outputs? We've used toolchains for that in a few cases (the Python runtime is probably the best example), but it's always been a tricky fit, because it's using the tools as an output to be passed along.

Maybe you don't want a toolchain, you want a direct (target configured) dependency on the tools, so that they can be bundled into your rule's output?

@anguslees
Copy link
Author

anguslees commented Mar 14, 2024

Maybe you don't want a toolchain, you want a direct (target configured) dependency on the tools, so that they can be bundled into your rule's output?

I agree, but see my numerous linked examples in the footnote of the opening post. This mis-pattern is quite widespread, because folks (apparently) want to use toolchains in tests or bazel-run scripts (ie: the output of executable rules).

This is a feature request for an improvement to bazel toolchains/exec_groups to make it possible to resolve toolchain for a rule's target/parent. Or alternatively we loudly declare this will 'never' be possible, and (as a community), we go and rewrite these examples to avoid toolchains.

I'm ok with either path to address this issue - right now I'm kind of stuck waiting for a decision. Folks "expect" this to work, but it doesn't - and if I'm going to delete their toolchain then I'd at least like to be able to link to this PR (or some other doc) as evidence for why that was the correct change to their rules.

@katre
Copy link
Member

katre commented Mar 21, 2024

Toolchains (uniquely) have two options for their dependencies:

  • Anything with cfg = "exec" is configured for the exec platform, which is the same as the requesting target's exec platform
  • Anything with cfg = "target" (which is the default if cfg isn't specified) is in the same configuration as the requesting target.

So part of the solution is probably to be more careful of the target/exec distinction, here. It's unfortunately easy to get this wrong in the default case, since the base case is that the host platform is used as both the only exec platform and as the default target platform.

I agree that there is probably more we can be doing to make this cleaner. I'm going to add this item to our backlog for tracking purposes. If you have any specific features you'd like to request, we'd be happy to see a proposal or further discussion to move this along.

@katre katre added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed more data needed untriaged labels Mar 21, 2024
@anguslees
Copy link
Author

anguslees commented Mar 23, 2024

So part of the solution is probably to be more careful of the target/exec distinction, here.

I think we're talking past each other. I (believe I am) familiar with cfg=target/cfg=exec and when to use them. This does not help with the problem described in this issue.

I'll explain one example here in detail, to see if that helps:

rules_oci has a target that can be used similar to bazel run //my/image:push to upload an artifact (it's an OCI image, but the specific file format is not relevant) to a repository. As with all other bazel-run targets, the rule for this //my/image:push target is an "executable" rule, which returns the file (commonly a script) that is going to be executed, but doesn't actually run the file itself.

This executable rule in this case is the following:

def _impl(ctx):
    crane = ctx.toolchains["@rules_oci//oci:crane_toolchain_type"]
    yq = ctx.toolchains["@aspect_bazel_lib//lib:yq_toolchain_type"]

    if not ctx.file.image.is_directory:
        fail("image attribute must be a oci_image or oci_image_index")

    _, _, _, maybe_digest, maybe_tag = util.parse_image(ctx.attr.repository)
    if maybe_digest or maybe_tag:
        fail("`repository` attribute should not contain digest or tag. got: {}".format(ctx.attr.repository))

    executable = ctx.actions.declare_file("push_%s.sh" % ctx.label.name)
    files = [ctx.file.image]
    substitutions = {
        "{{crane_path}}": crane.crane_info.binary.short_path,
        "{{yq_path}}": yq.yqinfo.bin.short_path,
        "{{image_dir}}": ctx.file.image.short_path,
        "{{fixed_args}}": " ".join(_quote_args(["--repository", ctx.attr.repository])),
    }
    if ctx.attr.remote_tags:
        files.append(ctx.file.remote_tags)
        substitutions["{{tags}}"] = ctx.file.remote_tags.short_path

    ctx.actions.expand_template(
        template = ctx.file._push_sh_tpl,
        output = executable,
        is_executable = True,
        substitutions = substitutions,
    )
    runfiles = ctx.runfiles(files = files)
    runfiles = runfiles.merge(yq.default.default_runfiles)
    runfiles = runfiles.merge(crane.default.default_runfiles)

    return DefaultInfo(executable = executable, runfiles = runfiles)

# Refactored slightly from the original for clarity
oci_push = rule(
    implementation = _impl,
    attrs = _attrs,
    toolchains = [
        "@rules_oci//oci:crane_toolchain_type",
        "@aspect_bazel_lib//lib:yq_toolchain_type",
    ],
    executable = True,
)

To summarise the relevant parts: this rule gets the paths for two tools (crane and yq) using bazel toolchains, substitutes those paths into a shell script template, and then returns that shell script as the executable output file that will eventually be invoked by bazel-run.

But this use of toolchains is incorrect when cross-compiling!

Specifically: When bazel host platform != exec platform (eg: run bazel on macos with a linux RBE cluster), the rule implementation above will run on the exec (linux) platform, but the output executable needs to work on the host (macos) platform. The toolchains will incorrectly resolve to the exec (linux) platform, and the executable output will be a script that points to linux tools that will fail when run on macos. The correct crane and yq tools to use here should be the tools for the target platform of this rule, and not the exec platform.

But there is NO WAY to resolve the toolchain for the target of a rule!

^^This is the bazel limitation described in this github issue. If there is anything I can do to expand or clarify the above explanation, please suggest it because it seems to be subtle/difficult for folks to recognise the issue here and I'd like to point folks to this comment for enlightenment.

Afaik cfg=target is not relevant here, since there is nowhere to put that that will affect toolchain resolution - the ctx.toolchains["@rules_oci//oci:crane_toolchain_type"] result will always be resolved for the exec platform.

Afaics this incorrect use (or misunderstanding) of toolchains is moderately common (see my examples in widely used rules), and can impact both bazel-run and bazel tests (both executable rules). Ideally, we improve bazel so that toolchains can be used in these important use cases.

@moroten
Copy link
Contributor

moroten commented Mar 26, 2024

This looks very related to #21805, where I suggest the --run_under target should be built for the host platform.

@anguslees
Copy link
Author

anguslees commented Apr 17, 2024

This looks very related to #21805

Related, but not the same - just to prevent an overzealous close-as-duplicate.

I agree --run_under (and run generally) should be built for host platform. Providing a better default --platform won't fix this no-way-to-resolve-toolchain-for-other-platform issue, however.

@EdSchouten
Copy link
Contributor

EdSchouten commented May 23, 2024

Hey @anguslees,

I did some experimenting, and it looks like what you want is possible by doing the following:

  1. You move the toolchains into a helper rule, which returns the toolchain information through a provider.
  2. You instantiate a target of that helper rule in some BUILD file.
  3. You let the original rule depend on the target of the helper rule, and extract the toolchains from the provider.
  4. You apply an outgoing edge transition on that dependency.

The outgoing edge dependency can make use of the following transition:

def _transition_exec_to_target_impl(settings, attr):
     return {
         "//command_line_option:extra_execution_platforms": settings["//command_line_option:platforms"],
     }

 transition_exec_to_target = transition(
     implementation = _transition_exec_to_target_impl,
     inputs = ["//command_line_option:platforms"],
     outputs = ["//command_line_option:extra_execution_platforms"],
 )

I will send out a PR against rules_oci soon to get oci_push() fixed.

That said, I do think that there is a bit of voodoo magic going on. Transitioning on command line flags is a bit funky. Especially because we're doing it on --extra_execution_platforms (which is fortunately documented as prepending to the list of registered platforms, which is exactly what we want). Maybe it makes sense to add something like this, either to @bazel_tools, @bazel_skylib, or @aspect_bazel_lib. Any preference for a place?

EdSchouten added a commit to EdSchouten/rules_oci that referenced this issue May 24, 2024
Bazel doesn't provide a very elegant way to specify that a toolchain
should be resolved in such a way that it can be executed on a *target*
platform:

bazelbuild/bazel#19645

To work around this, we can move the toolchains into a small helper rule
named oci_push_tools(). We can use an outgoing edge transition on the
oci_push() side to set --extra_execution_platforms equal to --platforms,
thereby causing the toolchains to be resolved the way we want.

Though I think using transitions for this is pretty slick, it's a shame
that this has to be done by patching up command line options.
@anguslees
Copy link
Author

anguslees commented May 27, 2024

Oh wow - I remember considering this, but I think I must have just incorrectly concluded that transitioning --extra_execution_platforms wouldn't work, without trying.

Fwiw on bazel 6.4.0, I needed to force the //command_line_option:platforms labels to strings first:

def _transition_exec_to_target_impl(settings, attr):
     return {
         "//command_line_option:extra_execution_platforms": [str(p) for p in settings["//command_line_option:platforms"]],
     }

Combined with the resolved_toolchain() idiom I've seen around a few places, this is actually not terrible. In fact I suspect many folks who have copied resolved_toolchain from rules_java - but are not actually portable across platforms - need to add this transition in order to be correct.

EdSchouten added a commit to EdSchouten/rules_rust that referenced this issue Jun 7, 2024
The toolchain_files() and current_rust_toolchain() rules return files
for the current 'exec' platform. This is perfecly reasonable if you want
to use these tools as part of ctx.actions.run(). But if you want to use
these as part of 'data = []' (runfiles), they must be built for the
target.

This change adds a helper rule, toolchain_files_for_target(), which can
take the aforementioned rules as an input, transitioning it so that it
yields files for the target platform. I'm not happy with how this rule
is implemented, but there is little we can do about that until
bazelbuild/bazel#19645 is addressed.
EdSchouten added a commit to EdSchouten/rules_rust that referenced this issue Jun 7, 2024
The toolchain_files() and current_rust_toolchain() rules return files
for the current 'exec' platform. This is perfecly reasonable if you want
to use these tools as part of ctx.actions.run(). But if you want to use
these as part of 'data = []' (runfiles), they must be built for the
target.

This change adds a helper rule, toolchain_files_for_target(), which can
take the aforementioned rules as an input, transitioning it so that it
yields files for the target platform. I'm not happy with how this rule
is implemented, but there is little we can do about that until
bazelbuild/bazel#19645 is addressed.
EdSchouten added a commit to EdSchouten/rules_rust that referenced this issue Jun 7, 2024
The toolchain_files() and current_rust_toolchain() rules return files
for the current 'exec' platform. This is perfecly reasonable if you want
to use these tools as part of ctx.actions.run(). But if you want to use
these as part of 'data = []' (runfiles), they must be built for the
target.

This change adds a helper rule, toolchain_files_for_target(), which can
take the aforementioned rules as an input, transitioning it so that it
yields files for the target platform. I'm not happy with how this rule
is implemented, but there is little we can do about that until
bazelbuild/bazel#19645 is addressed.
EdSchouten added a commit to EdSchouten/rules_rust that referenced this issue Jun 7, 2024
The toolchain_files() and current_rust_toolchain() rules return files
for the current 'exec' platform. This is perfecly reasonable if you want
to use these tools as part of ctx.actions.run(). But if you want to use
these as part of 'data = []' (runfiles), they must be built for the
target.

This change adds a helper rule, toolchain_files_for_target(), which can
take the aforementioned rules as an input, transitioning it so that it
yields files for the target platform. I'm not happy with how this rule
is implemented, but there is little we can do about that until
bazelbuild/bazel#19645 is addressed.
EdSchouten added a commit to EdSchouten/rules_rust that referenced this issue Jun 7, 2024
The toolchain_files() and current_rust_toolchain() rules return files
for the current 'exec' platform. This is perfecly reasonable if you want
to use these tools as part of ctx.actions.run(). But if you want to use
these as part of 'data = []' (runfiles), they must be built for the
target.

This change adds a helper rule, toolchain_files_for_target(), which can
take the aforementioned rules as an input, transitioning it so that it
yields files for the target platform. I'm not happy with how this rule
is implemented, but there is little we can do about that until
bazelbuild/bazel#19645 is addressed.
@BoleynSu
Copy link
Contributor

BoleynSu commented Jul 1, 2024

See bazel-contrib/bazel-lib#875 for how it can be simplified.

You're basically suggesting that every "tool" toolchain should declare/register two toolchains,

Not very toolchain, only those who want to be used in cfg=target. Note that a toolchain can have binary for both cfg=exec and cfg=target.

For example, python toolcahin can have python-bytecode-compiler with cfg=exec and python-interpreter with cfg=target. For a py_binary rule, it use python-bytecode-compiler to generate bytecode and package a binary that runs python-interpreter with the generated bytecode. How do you handle this case without having seperate toolchain for python-bytecode-compiler and python-interpreter?

@EdSchouten's transition approach although seems to work but it does not really. There are cases it won't work. See bazel-contrib/rules_oci#590 (comment)

@anguslees
Copy link
Author

anguslees commented Jul 7, 2024

Not very toolchain, only those who want to be used in cfg=target

As described earlier in this github comment thread, the toolchain author can't know if their toolchain is going to be used in a test/run (cfg=target), or used directly in an action (cfg=exec). So yes, we would have to pre-emptively declare and configure every 'tool' toolchain twice, just in case.

For a py_binary rule, it use python-bytecode-compiler to generate bytecode and package a binary that runs python-interpreter with the generated bytecode. How do you handle this case without having seperate toolchain for python-bytecode-compiler and python-interpreter?

This is ~exactly what transition and toolchains were designed to support aiui! The classic case is a C compiler, that simultaneously needs a compiler for 'exec' platform, and a libc for 'target'. This is why a single toolchain resolution matches exec and target simultaneously, and the idea is that the toolchain has some attributes with cfg=exec (for files that are executed as part of build), and cfg=target (for files that are copied opaquely into the output).

I agree that this is not how rules_python uses toolchains. In an alternate reality, we would have a single python toolchain-type that matched on exec and target, and this toolchain would have a python interpreter for both bytecode generation at build-time (matching the 'exec' constraints) and another python interpreter for copying into runfiles (matching 'target' constraints).

I think we're talking in circles, and a github issue is perhaps not the best forum for a discussion. Happy to discuss further on bazel slack, or elsewhere.

@BoleynSu
Copy link
Contributor

BoleynSu commented Jul 14, 2024

As described #19645 (comment) in this github comment thread, the toolchain author can't know if their toolchain is going to be used in a test/run (cfg=target), or used directly in an action (cfg=exec). So yes, we would have to pre-emptively declare and configure every 'tool' toolchain twice, just in case.

A toolchain should have a very clear API. For example, if it provides a binary, it should be clearly documented in the API that is should be used as part of the build action or as part of the output artifact.

This is ~exactly what transition and toolchains were designed to support aiui!

The transition solution is completely wrong. See bazel-contrib/rules_oci#590 (comment) and bazel-contrib/rules_oci#590 (comment)

I think we are confused with two different concepts, 1) toolchain used to run as part of the build action; 2) a (possibly prebuilt) binary that is registered through toolchain.

For 1), suppose a toolchain provides

CcCompileInfo(compile = a function)

where

CcCompileInfo.compile(srcs = "a.cc") returns a DefaultInfo that is a binary.

When defining a cc_toolchain, you can provide a exec binary, e.g., gcc and implment compile as using gcc to compile the source code to an executable. Or you can provide a exec binary, e.g., clang and a target binary lli (https://llvm.org/docs/CommandGuide/lli.html) and implement compile as using clang to compile the source code to LLVM IR and package the lli with the generated IR as the binary. Note that either way, the binaries are never exposed as part of the toolchain's API. The only API is the compile function.

For 2), we basically want a magic binary that works on very supported platform. One can acheive it by using

alias(name="cc-compiler", actual = select(...))

However, this is not flexible enough as one may want to register their own compiler. Then we can use toolchain that accept a cfg=target binary and a special rule to use this toolchain to produce the binary for the target platform.

toolchain_type(name="binary")
toolchain_def(name="def-{platfrom}", binary = "{platfrom}_gcc", target_compatible_with={platfrom})
toolchain(name="cc-{platfrom}", toolchain="def-{platfrom}", toolchain_type="binary")

resolved_cc_binary(name="resolved_cc") # resolved_cc_binary use toolchain_type binary to retrive the binary.

Then the user should just directly use resolved_cc however they want.
If they want to use it in build action

_tool = attr.label(cfg="exec", executable=True, default = ":resolved_cc")

If they want to use it in the output artifact

_runner = attr.label(cfg="target", executable=True, default = ":resolved_cc")

@lberki
Copy link
Contributor

lberki commented Sep 3, 2024

Let me see if I understand this correctly: in the current model of Bazel, a toolchain has two configurations: the one it runs on and the one it targets (e.g. a Rust compiler can run on x64 and produce ARM binaries). So each toolchain has to be tagged with these two configurations and Bazel selects one that matches the exec and target configurations and it assumes that the reason why one uses a toolchain is to run it during the build and consume its output.

However, some people want to use the loose coupling afforded by toolchains to choose what to incorporate into their actual outputs: say, one may want to produce an SDK that runs on x64 and package an x64 to ARM compiler into it. In this case, the right thing to do is to match the "target" configuration of Bazel to the "exec" configuration of the toolchain. Bazel currently can't to that.

The "two toolchain" workaround is to wrap every toolchain in two toolchain rules: one that's "honest" and one that "lies", in the above case saying that the target configuration of the compiler is x64. This is obviously wrong because, well, that's not what the toolchain does.

@katre any clever ideas here? I can imagine a lot of things but you have a lot more state about this in your head than I do.

@BoleynSu
Copy link
Contributor

BoleynSu commented Sep 3, 2024

The "two toolchain" workaround is to wrap every toolchain in two toolchain rules: one that's "honest" and one that "lies", in the above case saying that the target configuration of the compiler is x64. This is obviously wrong because, well, that's not what the toolchain does.

There is nothing wrong and no lies. See my above comment. It is totally fine for a toolchain to provides binaries for target and/or exec. Normally, compiler toolchain provides exec binaries to run at build time and runtime toolchain provides target binaries to use at runtime. The issue is that people misunderstood the concept such that they use the wrong binary. If you use a java runtime toolchain, you should not run java at build time. If you use a java toolchain, you should not package javac into your output and run it at runtime.

@lberki
Copy link
Contributor

lberki commented Sep 3, 2024

For Java, that's arguably the case since the JDK and the JVM are different, but what about a C++ compiler or just a regular tool? If you accept the above SDK use case as valid, Bazel currently forces you to put the exec configuration of the C++ toolchain into its target_compatible_with attribute for Bazel to do the right thing.

I think the conceptual difference between us is that you consider "toolchain to be used in the build" and "toolchain to be packaged as the result of the build" different things; in my mind, they are just one "thing" but two different use cases: it's true that one needs to provide much more rich information about the toolchain to be invoked during the build than just to package it, but a list of files the toolchain is comprised of is necessary for both and it feels silly to duplicate the exec platform compatibility information and the set of files like this.

It's a reasonable workaround, but it requires the owner of every toolchain to cooperate and very public interfaces should be very narrow.

@fmeum
Copy link
Collaborator

fmeum commented Sep 3, 2024

I agree with @BoleynSu here. The current approach is conceptually sound, the problem is that it's apparently too verbose or too hard to use correctly for simple cases that don't need the full flexibility afforded by toolchains.

That's where the resolved_X_binary rule mentioned in #19645 (comment) can come in to help. Maybe we just need to make this pattern less verbose by abstracting it into a shared helper macro?

@lberki
Copy link
Contributor

lberki commented Sep 3, 2024

resolved_*_binary() isn't a full solution, is it? In particular, it allows one to decide what toolchain to use based on either the exec platform or the target platform, but not both (i.e. it doesn't support the "this compiler runs on x64 and emits ARM code"). So resolved_*_binary() helps, but it needs to be used in addition to the "regular" toolchain things.

I don't think Bazel allows one to match the constraints of the target platform to the exec constraints of a toolchian, does it? (except by the awkward transition above #19645 (comment) )

There is some merit in the argument that "toolchain as functionality" and "toolchain as bunch of files" are different, though. The awkwardness here is that if it's to work, every author of a toolchain for a particular type must cooperate and it's extra boilerplate for all of them.

@EdSchouten
Copy link
Contributor

EdSchouten commented Sep 3, 2024

resolved_*_binary() isn't a full solution, is it? In particular, it allows one to decide what toolchain to use based on either the exec platform or the target platform, but not both (i.e. it doesn't support the "this compiler runs on x64 and emits ARM code"). So resolved_*_binary() helps, but it needs to be used in addition to the "regular" toolchain things.

Exactly. This is also the reason I've been highly skeptical of that approach. There won't be a way you can use your Mac to craft a Linux x86 SDK that targets some RTOS for ARM.

@BoleynSu
Copy link
Contributor

BoleynSu commented Sep 3, 2024

@lberki Firstly, note that the transition approach is broken.

resolved_*_binary() is to suppose simple use case where the tool's output is platform independent, e.g. jq/yq/protoc/wc/cut/sort/uniq.

but it needs to be used in addition to the "regular" toolchain things.

For your use case, I think you will need define a proper toolchain by bundling all the binaries(compiler/linker), configs(flags/config files), artifacts(static libraries/dynamtic libraries/source code libraries) instead of relying on the binaries provided by some other toolchain.

@EdSchouten
What you want should already be acheivable with exec-groups I believe. There will be some boilerplate though.

def toolchain_with_exec_platform(impl, toolchain_type, platform_constraints):
  return rule(
    impl,
    exec_groups = {
        “host”: exec_group(
            exec_compatible_with = platform_constraints,
            toolchains = toolchain_type,
        ),
    },
    attrs = {
        "tool": attr.label(cfg = config.exec("host"))
    },
  )

mac_cc_toolchain = toolchain_with_exec_platform(impl_to_pakcage_the_everything, cc_tolchain_type, //os:mac)
mac_cc_toolchain(name = "mac_cc_toolchain")

platform_transition_filegroup( # https://docs.aspect.build/rulesets/aspect_bazel_lib/docs/transitions/
name = "mac_cc_toolchain_targeting_rtos",
target_platform = rtos,
srcs = [":mac_cc_toolchain"]
)
bazel build :mac_cc_toolchain_targeting_rtos

@katre
Copy link
Member

katre commented Sep 4, 2024

This is a pretty interesting discussion, and the distinction between "toolchain which provides a binary as a tool for actions" and "toolchain which provides a binary to be packaged in an artifact" is a pretty subtle one.

I think there are several patterns in this thread that could be applied to rules, using the current APIs. So, I think the next step is to figure out what is the correct outcome of this issue?

Idea 1: Someone (possibly @BoleynSu?) could write up some documentation for bazel.build describing when this pattern is useful, and what rule authors need to do to use it.

Idea 2: Possibly simultaneously, someone (@BoleynSu? @fmeum?) could write some helper macros for constructing these distinct varieties of toolchains (updating the documentation from the step above).

Question 1: Is there an actual request for changes in Bazel (presumably in toolchain resolution) here?

From the discussion so far, this feels like a problem in 1) documenting the underlying patterns and when they are useful, and 2) convincing rule authors to use them to make these available to users. Is that a reasonable summary?

@lberki
Copy link
Contributor

lberki commented Sep 5, 2024

The only goal I have here is that this use case is doable somehow. I don't have a preference towards changing Bazel or providing libraries to do so in principle, however, in this case, the former seems more reasonable because (in order of what I think is the strongest to the weakest argument):

  1. In order for a toolchain type to be usable this way, all the toolchains that implement it would have to use whatever wrapper we'd cook up
  2. The necessary information is already there in Bazel, but not usable by users
  3. I make sense conceptually to represent a toolchain as one "thing" and all proposed wrapper code breaks this

@BoleynSu
Copy link
Contributor

BoleynSu commented Sep 7, 2024

@katre @lberki I would prefer to make no changes to Bazel if these can be implemented as library (to make Bazel less complex). I can write some docs and would be great if @fmeum can also help.

There is no need to ask very toolchain to be usable with the wrapper. For example, if I define a cc_toolchain, I may or may not want to expose the compiler to user of the toolchain, i.e., CcToolchainInfo(compile=a function to use the binary). Only if I do want to expose the binary, we can register the compiler binary to "//gcc:toolchain_type" and use "//gcc:resolved_binary" with cfg="exec" in cc_toolchain.

Also I want to emphasis that the transition_exec_to_target approach is broken. See bazel-contrib/rules_oci#590 (comment)

There are aleady a few PRs based on such hack merged and some pendding. @EdSchouten could you hold on pending PRs?

@lberki
Copy link
Contributor

lberki commented Sep 9, 2024

I'm not sure "making Bazel less complex" is the right thing to optimze for here.

It's a good idea to optimize for "less complexity is seen by the people who use Bazel", but that complexity is not just that of Bazel, but also that of all external Bazel code they depend on and that includes toolchain type definitions and their implementations. IOW, if we can buy simplicity in many toolchains with a little additional wrinkle in Bazel, that's a good deal.

That said, your argument that toolchains are not necessarily single groups of files is a good one for arguing against finding a solution in Bazel (for example, even the C++ compiler is not just a single group of file because Bazel has the ability to take different sets of inputs for e.g. compile and link actions).

@katre
Copy link
Member

katre commented Sep 17, 2024

Coming back to this: @lberki, I can see your point, but I'm very unclear if there's an actual request besides "make this easier".

I'm happy to review and discuss a concrete proposal, but I'm not sure the Configurability team can actually spend any time on this in the near future.

@lberki
Copy link
Contributor

lberki commented Sep 18, 2024

@katre fair enough; I think the insight above that a toolchain doesn't even necessarily need to provide any files to be a toolchain (which manifests in the code as toolchains being free to provider any provider, DefaultInfo is not necessarily needed) makes it undesirable that we provide a general "put the files comprising this toolchain on the inputs of that action" functionality.

I think what may make sense is being able to ask Bazel "give me a toolchain whose host configuration matches my target configuration", but given that it has a workaround in Starlark (declaring two toolchains and two toolchain types) and that we'd have to decide what target configuration that toolchain should have, I don't think we should treat this as a priority.

I guess if there was a concrete proposal to do so, @katre would entertain it, wouldn't he?

@katre
Copy link
Member

katre commented Sep 18, 2024

I would be very happy to read and discuss a concrete proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Configurability platforms, toolchains, cquery, select(), config transitions type: feature request
Projects
None yet
Development

No branches or pull requests