host_platform, platforms and toolchains are incompatible with select #10396

Closed
ozio85 opened this issue Dec 10, 2019 · 17 comments
Labels
more data needed team-Configurability Issues for Configurability team

Comments

@ozio85

ozio85 commented Dec 10, 2019

Description of the problem / feature request:

I have started to use cc_toolchain resolution via toolchains by enabling:
--incompatible_enable_cc_toolchain_resolution
and, instead of using cc_toolchain_suite, registering all toolchains via "register_toolchains".

  • This means I use --host_platform and --platforms to select which platform I compile for.
  • I have about 10 different compilers
  • I download the compilers as external dependencies

The selection works fine. However, I want to avoid downloading all external compiler dependencies, so I would like to use select on the compiler dependencies. This is currently not possible:

First the constraint_settings are set to the target platform, controlled by the "--platforms" argument.
Then the constraint_settings are set to the host platform, controlled by the "--host_platform" argument.
THEN the compiler files are resolved, which means they are always resolved to the "host_platform".

This means the target compiler files are never available in the sandboxed actions; instead the host compiler files are available (which they should not be).

Feature requests: what underlying problem are you trying to solve with this feature?

Toolchains should use config_settings and build_settings to determine toolchains, not constraint_settings and constraint_values.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Use select to resolve compiler files.

What operating system are you running Bazel on?

Windows and Linux

What's the output of bazel info release?

1.2.0

Have you found anything relevant by searching the web?

No

@iirina added the team-Configurability and untriaged labels Dec 16, 2019
@katre
Member

katre commented Dec 16, 2019

First, I'd like to apologize that this wasn't responded to sooner: most of us just got back from BazelCon, but we should have been checking new issues, so I am sorry.

Secondly, your actual issue:

  1. Why are you setting --host_platform? Obviously you need --platforms to control your compilation target, but why is the host platform detection not working for you?
  2. I don't understand your distinction between "target compiler files" and "host compiler files". When the --incompatible_enable_cc_toolchain_resolution flag is set, the cc toolchains themselves are configured for the execution platform (which is typically the host platform), but the selection uses the target_compatible_with and exec_compatible_with attributes from the toolchain target. You should be able to use this to select the proper compiler for each target platform without worrying about the host platform.

I think I'm not understanding your system, is there any way you can let me take a look at your BUILD files, or at a simple example of the same setup?

@katre katre self-assigned this Dec 16, 2019
@ozio85
Author

ozio85 commented Dec 16, 2019


# I use select to avoid downloading all files, since register_toolchains seems to
# want to download all its compiler dependencies.
filegroup(
    name = "compiler_files",
    srcs = select({
        "@MyPlatforms//:aarch64_gcc": [
            "//Tools:aarch64_tools",
            "@MyAArch64Compiler//:linux_all",  # A lot of files I don't want to download
        ],
        "@MyPlatforms//:arm32_gcc": [
            "//Tools:arm32_tools",
            "@MyGccArmCompiler//:linux_all",  # A lot of files I don't want to download
        ],
        "@MyPlatforms//:mingw32": [
            "//Tools:mingw_tools",
            "@MyMingw32Compiler//:windows_all",  # A lot of files I don't want to download
        ],
        "@MyPlatforms//:mingw64": [
            "//Tools:mingw_tools",
            "@MyMingw64Compiler//:windows_all",  # A lot of files I don't want to download
        ],
        "@MyPlatforms//:msvc64": [
            "//Tools:msvc64_tools",
            "@MyMsvc64Compiler//:all_x64",  # A lot of files I don't want to download
            "@WindowsKits_10//:all_x64",  # A lot of files I don't want to download
        ],
        "@MyPlatforms//:gcc": [
            "//Tools:pclinux64_tools",
        ],
        "//conditions:default": [],
    }),
)

cc_toolchain(
    name = "gcc_linux_toolchain",
    toolchain_identifier = "gcc_linux_toolchain",
    toolchain_config = ":gcc_linux_toolchain_config",

    supports_param_files = 1,

    all_files = ":compiler_files",
    ar_files = ":compiler_files",
    compiler_files = ":compiler_files",
    dwp_files = ":empty",
    linker_files = ":compiler_files",
    objcopy_files = ":empty",
    strip_files = ":empty",
)

toolchain(
    name = "cc-toolchain-x64-linux",
    exec_compatible_with = [
        "@MyPlatforms//Cpu:x86_64",
        "@MyPlatforms//Os:linux",
    ],
    target_compatible_with = [
        "@MyPlatforms//Cpu:x86_64",
        "@MyPlatforms//Os:linux",
        "@MyPlatforms//Compiler:gcc",
    ],
    toolchain = ":gcc_linux_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)

# This is the problematic toolchain..
# host_platform is set to:
#  "@MyPlatforms//Cpu:x86_64",
#  "@MyPlatforms//Os:linux",
# platforms is set to:
#  "@MyPlatforms//Cpu:arm",
#  "@MyPlatforms//Compiler:arm_gcc",
# 
# BUT the only files that will be available are the Linux gcc files!

cc_toolchain(
    name = "arm32_linux_toolchain",
    toolchain_identifier = "arm32_linux_toolchain",
    toolchain_config = ":arm32_linux_toolchain_config",

    supports_param_files = 1,

    all_files = ":compiler_files",
    ar_files = ":compiler_files",
    compiler_files = ":compiler_files",
    dwp_files = ":empty",
    linker_files = ":compiler_files",
    objcopy_files = ":empty",
    strip_files = ":empty",
)

toolchain(
    name = "cc-toolchain-x32-arm",
    exec_compatible_with = [
        "@MyPlatforms//Cpu:x86_64",
        "@MyPlatforms//Os:linux",
    ],
    target_compatible_with = [
        "@MyPlatforms//Cpu:arm",
        "@MyPlatforms//Compiler:arm_gcc",
    ],
    toolchain = ":arm32_linux_toolchain",
    toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)

# more compilers
...

@katre
Member

katre commented Dec 16, 2019

Using register_toolchains shouldn't cause the actual cc toolchains to be resolved and downloaded unless they are being used. Are you passing the labels for only the toolchain targets, or the toolchain and cc_toolchain targets?
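
As a minimal sketch of the distinction being asked about, a WORKSPACE that registers only the toolchain() wrapper targets (and not the cc_toolchain targets they point to) might look like the following. The target names are taken from the BUILD file earlier in this thread; the //Toolchains package path is an assumption made for illustration.

```starlark
# WORKSPACE (sketch; the //Toolchains package path is hypothetical)

# Register only the lightweight toolchain() targets. The cc_toolchain
# targets they reference via the `toolchain` attribute are not listed here.
register_toolchains(
    "//Toolchains:cc-toolchain-x64-linux",
    "//Toolchains:cc-toolchain-x32-arm",
)
```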

@ozio85
Author

ozio85 commented Dec 16, 2019

I'm only registering toolchain targets

@ozio85
Author

ozio85 commented Dec 16, 2019

But the thing is that I can see that the select in compiler_files first gets set to "platforms", then to "host_platform", and then the files are resolved.

@katre
Member

katre commented Dec 18, 2019

Many targets, in both the target and host configuration, require a cc toolchain, so it's not surprising that you will see your toolchains loaded in both configurations. I'm not sure the debugging you are seeing is showing what you think you see.

Can we take a step back? What is the actual error you are seeing in your builds?

@ozio85
Author

ozio85 commented Dec 21, 2019

Sorry for the late reply. Yes, it is very likely that I have done something wrong, but I know what fixes the problem :)

So this is my host_platform:

platform(
    name = "linux_x86_64",
    constraint_values = [
        "@MyPlatforms//Os:linux",
        "@MyPlatforms//Cpu:x86_64",
        "@MyPlatforms//Compiler:gcc",
    ],
)

And my target --platforms:

platform(
    name = "linux_arm_gcc",
    constraint_values = [
        "@MyPlatforms//Os:linux",
        "@MyPlatforms//Cpu:arm",
        "@MyPlatforms//Compiler:arm_gcc",
    ],
)

So now I expect my host tools to be compiled with GCC (which they are), and my target to be compiled with the ARM compiler, which it TRIES to do. But in the sandbox the GCC files are made available, not the ARM files, so it simply cannot find any ARM resources.

So what fixes the problem:

Splitting the filegroup "compiler_files" (see above) into one filegroup per compiler, with no select.
Now the compiler files are found and everything works.

Now for the big BUT:
This causes all files to be downloaded for ALL compilers, which adds a 20 GB download for each user.
If I can avoid this, I can skip the select.
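
For readers following along, the split described here could be sketched roughly as below. This is only an illustration under assumptions: the filegroup names gcc_compiler_files and arm32_compiler_files are hypothetical, while the srcs labels and cc_toolchain attributes are copied from the BUILD file posted earlier in the thread.

```starlark
# One filegroup per compiler, no select (filegroup names are hypothetical).
filegroup(
    name = "gcc_compiler_files",
    srcs = ["//Tools:pclinux64_tools"],
)

filegroup(
    name = "arm32_compiler_files",
    srcs = [
        "//Tools:arm32_tools",
        "@MyGccArmCompiler//:linux_all",
    ],
)

# The ARM cc_toolchain now depends directly on the ARM files, so they are
# staged in the sandbox regardless of which configuration evaluates it.
cc_toolchain(
    name = "arm32_linux_toolchain",
    toolchain_identifier = "arm32_linux_toolchain",
    toolchain_config = ":arm32_linux_toolchain_config",
    supports_param_files = 1,
    all_files = ":arm32_compiler_files",
    ar_files = ":arm32_compiler_files",
    compiler_files = ":arm32_compiler_files",
    dwp_files = ":empty",
    linker_files = ":arm32_compiler_files",
    objcopy_files = ":empty",
    strip_files = ":empty",
)
```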

@katre
Member

katre commented Dec 21, 2019 via email

@ozio85
Author

ozio85 commented Dec 28, 2019

OK, I have made a repro. Note that the toolchains are NOT complete:
repro.zip

But it is still enough to show the error, the point is that "arm_tool.sh" is missing in the sandbox, even though the correct toolchain was selected:

> INFO: Analyzed 11 targets (1 packages loaded, 18 targets configured).
> INFO: Found 11 targets...
> ERROR: /mnt/d/repro/BUILD:2:1: C++ compilation of rule '//:fake_c' failed (Exit 1) arm_tool.sh failed: error executing command Toolchains/Tools/arm/arm_tool.sh -MD -MF bazel-out/k8-fastbuild/bin/_objs/fake_c/fake.d '-frandom-seed=bazel-out/k8-fastbuild/bin/_objs/fake_c/fake.o' -iquote . -iquote bazel-out/k8-fastbuild/bin -c ... (remaining 3 argument(s) skipped)
> 
> Use --sandbox_debug to see verbose messages from the sandbox
> src/main/tools/process-wrapper-legacy.cc:58: "execvp(Toolchains/Tools/arm/arm_tool.sh, ...)": No such file or directory
> INFO: Elapsed time: 0.332s, Critical Path: 0.05s
> INFO: 0 processes.
> FAILED: Build did NOT complete successfully

@katre
Member

katre commented Jan 2, 2020

Thank you for the reproduction case. It looks to me like both of your problems stem from the use of the single filegroup with the select. I see from the comments you have already noticed this.

Problem 1: The arm_tool.sh file is not found.

It appears that the evaluation of the select inside the filegroup is selecting the "gcc" case, not the "arm" case. I plan to debug this further to see if this is a bug or some complication with how cc toolchains are resolved, and I will report back. Changing the cc_toolchain targets to use the correct filegroup fixes this issue, as the arm toolchain correctly depends on the arm files and they exist in the sandbox.

Problem 1.B: The arm_tool.sh file is not executable.

The second error I saw was this:

ERROR: /usr/local/google/home/jcater/repos/select/BUILD:2:1: Couldn't build file _objs/fake_c/fake.o: C++ compilation of rule '//:fake_c' failed (Exit 1) arm_tool.sh failed: error executing command Toolchains/Tools/arm/arm_tool.sh -MD -MF bazel-out/k8-fastbuild/bin/_objs/fake_c/fake.d '-frandom-seed=bazel-out/k8-fastbuild/bin/_objs/fake_c/fake.o' -iquote . -iquote bazel-out/k8-fastbuild/bin -c ... (remaining 3 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
src/main/tools/linux-sandbox-pid1.cc:427: "execvp(Toolchains/Tools/arm/arm_tool.sh, 0x2280ab0)": Permission denied
Target //:fake_c failed to build

This indicates that bazel tried to execute arm_tool.sh but due to the missing exec bit, that failed. Changing the tools to be executable fixed that, and the further errors are due to the fake.c file being empty.

Problem 2: Downloading unneeded dependencies.

I think this is also due to the problem with the filegroup-and-select. When you use dedicated filegroups, do you still see this problem?

@katre
Member

katre commented Jan 2, 2020

Okay, I investigated the problem with filegroup and select further. It is caused by the lack of toolchain transitions: currently, toolchains are configured in the execution configuration, not the target configuration, so the select in the filegroup sees the platform as //Platforms/Host:linux_x86_64 and selects incorrectly.
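
To make the mechanism concrete, here is a hedged sketch of how a select key such as "@MyPlatforms//:arm32_gcc" could be defined. The thread never shows this definition, so the config_setting below is an assumption; only the constraint value labels come from the posted platform targets.

```starlark
# Assumed definition in the root package of @MyPlatforms (not shown in the thread).
config_setting(
    name = "arm32_gcc",
    constraint_values = [
        "@MyPlatforms//Cpu:arm",
        "@MyPlatforms//Compiler:arm_gcc",
    ],
)

# A config_setting with constraint_values is matched against the platform of
# the configuration in which the select is evaluated.
#   - Target configuration (--platforms=...linux_arm_gcc): this would match.
#   - Execution configuration (platform linux_x86_64 with Compiler:gcc):
#     this does not match, and the "gcc" branch is chosen instead.
# Because the cc_toolchain's filegroup is evaluated in the execution
# configuration, the ARM branch is never taken.
```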

@ozio85
Author

ozio85 commented Jan 2, 2020

About problem 1b: yes, I used WSL to test, so my bad :) Still, the problem is illustrated.
Edit: Actually, I think it is the zip that drops the permissions. Does zip have support for Linux permission flags? I have seen a lot of problems where Bazel cannot handle zip files on Linux.

About problem 2: Well, I tried to extend the example to also catch this problem, but I need a bit more time to reproduce it in a concise way.

I tried to add a third toolchain, which is not used, but it is not enough to trigger any problems (I just need to set tags = ["manual"] to avoid the cc_toolchain being resolved when building "//...").

I need to get back to you next week.

@katre
Member

katre commented Jan 6, 2020

Tracking issue for implementing toolchain transitions: #10523.

@ozio85
Author

ozio85 commented Jan 13, 2020

Seems like the problem was that I had missed setting tags = ["manual"] on all filegroups and cc_toolchains, so with this change I have switched to using the separated filegroups. This issue is still valid, but of lesser importance.

@katre
Member

katre commented Jan 22, 2020

I see. Yes, if you don't have the targets tagged as manual, and you are building "//...", they will be configured and dependencies downloaded. Should we provide more documentation of this?
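
A minimal sketch of that tagging, assuming a dedicated per-compiler filegroup (the name msvc64_compiler_files, the cc_toolchain name and its toolchain_config label are hypothetical; the srcs labels come from the BUILD file earlier in the thread):

```starlark
# tags = ["manual"] excludes a target from wildcard patterns such as //...,
# so it is only configured when something that is actually built depends on it.
filegroup(
    name = "msvc64_compiler_files",  # hypothetical name
    srcs = [
        "//Tools:msvc64_tools",
        "@MyMsvc64Compiler//:all_x64",
        "@WindowsKits_10//:all_x64",
    ],
    tags = ["manual"],
)

cc_toolchain(
    name = "msvc64_toolchain",  # hypothetical name
    toolchain_identifier = "msvc64_toolchain",
    toolchain_config = ":msvc64_toolchain_config",  # hypothetical label
    supports_param_files = 1,
    all_files = ":msvc64_compiler_files",
    ar_files = ":msvc64_compiler_files",
    compiler_files = ":msvc64_compiler_files",
    dwp_files = ":empty",
    linker_files = ":msvc64_compiler_files",
    objcopy_files = ":empty",
    strip_files = ":empty",
    tags = ["manual"],
)
```

With both targets tagged this way, building "//..." on a Linux host should not by itself pull in the MSVC repositories; they would be fetched only if toolchain resolution selects this toolchain for a target being built.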

@ozio85
Author

ozio85 commented Jan 22, 2020

If it is standard procedure to do this, I think one or two lines would not hurt.

Right now it's more of a "recommendation" or best practice that "//..." should work to build. I think it should be a requirement, but then the tags might need an overhaul. (Currently the only option is to tag it manual, but some items are linked to a platform.)

I am still missing some of the features that environment_groups provide, to be able to filter groups of items, but I am hoping that platforms or build_settings have that on the roadmap.

We are moving to transitions, but we are still doing some initial development. We will probably put something on Bazel discuss when we are ready. (A totally self-contained multi-repo structure (including Python, compilers and other tools), supporting a multi-variant build using transitions, solved completely with Bazel.)

@katre
Member

katre commented Jan 23, 2020

Okay, I've filed #10641 to track clarifying the docs. I'm going to close this now, feel free to file a new issue if you run into further confusing or poorly documented areas of toolchains or transitions.

@katre katre closed this as completed Jan 23, 2020