[Bug]: js_binary launcher script not portable from host to exec platform #1168

Open
alexeagle opened this issue Jul 20, 2023 · 5 comments
Labels
bug Something isn't working investigation needed Investigation required to proceed further

Comments

@alexeagle
Member

What happened?

When RBE is used for cross-compilation, the host platform may be different from the exec platform.

In my case I have a linux_x86 host platform, so the launcher created by ctx.actions.expand_template here

ctx.actions.expand_template(
    template = ctx.file._launcher_template,
    output = launcher,
    substitutions = launcher_subst,
    is_executable = True,
)

will create a file with a node path pointing to the host-resolved toolchain, with linux_x86 arch.

Now I enable RBE, and the exec platform is linux_arm64. The launcher script is copied to the remote executor and tries to spawn node for the wrong architecture, which fails with an executable format error: `cannot execute binary file ...nodejs_linux_amd64...`
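
A minimal sketch of why this breaks, written as a plain Python model rather than Bazel code. The one-line template, the `expand_template` helper, and the `run_on` check are hypothetical simplifications. Only the substituted path is taken from the aquery output in this thread. The point it illustrates is that the node path is baked into the launcher text when the template is expanded, so it does not change when the script later runs on a different architecture.

```python
# Illustrative model only: plain Python, not Bazel's implementation.

# Hypothetical, heavily simplified stand-in for the real launcher template.
LAUNCHER_TEMPLATE = 'exec "{{node}}" "$@"\n'


def expand_template(template, substitutions):
    """Replace every key with its value using plain string substitution."""
    for key, value in substitutions.items():
        template = template.replace(key, value)
    return template


# The path is chosen once, when the launcher is generated.
launcher = expand_template(
    LAUNCHER_TEMPLATE,
    {"{{node}}": "my_workspace/../nodejs_linux_amd64/bin/nodejs/bin/node"},
)


def run_on(arch, script):
    """Hypothetical check: the baked-in node binary must match the machine."""
    if "nodejs_linux_" + arch in script:
        return "ok"
    return "cannot execute binary file"


print(run_on("amd64", launcher))
print(run_on("arm64", launcher))
```

In this model the launcher works only on the architecture whose toolchain path was substituted in, which mirrors the failure described above.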

Version

Bazel 6.2.1, latest of rules_js

How to reproduce

Tricky since you need an RBE setup with alternate architecture.

Any other information?

No response

@alexeagle alexeagle added the bug Something isn't working label Jul 20, 2023
@github-actions github-actions bot added the untriaged Requires traige label Jul 20, 2023
@alexeagle
Member Author

alexeagle commented Jul 20, 2023

Studied this with @gregmagolan

Let's look at the output of bazel aquery //some:build_smoke_test --config=rbe --config=aarch64 where those config flags enable the cross-platform RBE behavior:

action 'Expanding template some/build_smoke_test.sh'
  Mnemonic: TemplateExpand
  Configuration: k8-fastbuild-aarch64
  Execution platform: //tools/platforms:linux_x86_jetpack5
...
  Substitutions: [
    {{{node}}: my_workspace/../nodejs_linux_amd64/bin/nodejs/bin/node}

...

runfiles for //some:build_smoke_test
  Mnemonic: Middleman
  Target: //some/aerial/frontend:build_smoke_test
  Configuration: k8-fastbuild-aarch64
  Execution platform: //tools/platforms:linux_x86_jetpack5
  ActionKey: 709e80c88487a2411e1ee4dfb9f22a861492d20c4765150c0c794abd70f8147c
  Inputs: [..., external/nodejs_linux_amd64/bin/nodejs/bin/node]

action 'Testing //some:build_smoke_test'
  Mnemonic: TestRunner
  Target: //some:build_smoke_test
  Configuration: k8-fastbuild-aarch64
  Execution platform: //tools/platforms:linux_aarch64_jetpack5
  Command Line: (exec external/bazel_tools/tools/test/test-setup.sh \

What we see here is that even if we fixed the {{node}} template variable we put in the launcher, we would still have the wrong nodejs executable in the runfiles for the test, because the "middleman" action which generates the runfiles has an x86 exec platform. That makes this seem like a Bazel limitation with cross-platform RBE.

@alexeagle
Member Author

I think it's just a general problem that cross-platform RBE doesn't work with platform-specific inputs that come from runfiles.

@fmeum
Contributor

fmeum commented Jul 24, 2023

I don't think the execution platform for the middleman action matters: copying runfiles into the final location is a completely platform-independent action. At first glance this looks like https://bazelbuild.slack.com/archives/CA31HN1T3/p1690184400360329?thread_ts=1690176577.746239&cid=CA31HN1T3: you may need to define an additional toolchain that matches on the target platform, not the exec platform.

@alexeagle
Member Author

How is the target platform relevant here? This is a script used in a build action.

@fmeum
Contributor

fmeum commented Jul 25, 2023

It's a script that references a binary obtained from the toolchain, both by substituting in its path and adding its files to runfiles. But as far as I can tell, there are always two Node toolchains of the same type, one with a target constraint and one with an exec constraint: https://github.com/bazelbuild/rules_nodejs/blob/cc742d3b02c95eb56fce241c8fff6605d9e9c315/nodejs/private/toolchains_repo.bzl#L105-L116

This can cause this problem if the exec platform is linux_arm64 and a js_binary is built in the exec configuration (that is, for linux_arm64), as then the linux_amd64 toolchain with the exec constraint for linux_amd64 can end up being selected.

This could be solved by having a second, distinct toolchain type for Node runtimes with target constraints, similar to what the native Java toolchains do.
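
The selection problem and the proposed fix can be sketched with a small Python model. This is a simplified approximation of toolchain resolution, not Bazel's actual algorithm. The `Toolchain` class, the `resolve` function, the toolchain names, and the type strings `"node"` and `"node_runtime"` are all hypothetical. It assumes that an unset constraint matches any platform and that an amd64 exec platform is considered before the arm64 one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Toolchain:
    name: str
    toolchain_type: str
    exec_constraint: Optional[str] = None    # None matches any exec platform
    target_constraint: Optional[str] = None  # None matches any target platform


def resolve(toolchains, toolchain_type, target, exec_platforms):
    """Return (exec_platform, toolchain) for the first match, or None."""
    for exec_platform in exec_platforms:
        for tc in toolchains:
            if tc.toolchain_type != toolchain_type:
                continue
            if tc.exec_constraint not in (None, exec_platform):
                continue
            if tc.target_constraint not in (None, target):
                continue
            return exec_platform, tc
    return None


ARCHES = ["linux_amd64", "linux_arm64"]
EXEC_PLATFORMS = ["linux_amd64", "linux_arm64"]  # assumed order

# One shared type: per arch, an exec-constrained and a target-constrained entry.
shared = []
for arch in ARCHES:
    shared.append(Toolchain(arch + "_toolchain", "node", exec_constraint=arch))
    shared.append(Toolchain(arch + "_toolchain_target", "node", target_constraint=arch))

# A js_binary configured for linux_arm64 still gets the amd64 toolchain,
# because the exec-constrained entry places no constraint on the target.
_, picked = resolve(shared, "node", "linux_arm64", EXEC_PLATFORMS)

# A distinct type that only carries target constraints avoids the ambiguity.
runtime = [Toolchain(a + "_runtime", "node_runtime", target_constraint=a) for a in ARCHES]
_, fixed = resolve(runtime, "node_runtime", "linux_arm64", EXEC_PLATFORMS)

print(picked.name)
print(fixed.name)
```

Under these assumptions the shared type resolves to the amd64 toolchain for an arm64 target, while the separate runtime type resolves to the arm64 one. A rule would then request the runtime type for the binary it places in runfiles, and keep the existing type for tools that run during build actions.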

@alexeagle alexeagle added investigation needed Investigation required to proceed further and removed untriaged Requires traige labels Aug 3, 2023
2 participants