Bazel 0.21.0 flipped --incompatible_strict_action_env to true and broke itself #7026

philwo · 2019-01-02T15:37:43Z

Description of the problem / feature request:

Bazel 0.21.0 flipped --incompatible_strict_action_env to true via #6648, which broke CI on macOS and Windows CI. On macOS we can no longer find "md5" (which is in /sbin) and on Windows we can no longer find Python, PowerShell and other tools.

We didn't discover these breakages during testing, because apparently we don't actually test Bazel itself in the "Bazel@HEAD + Downstream Projects".

We'll revert this via a flag in the CI config files, but flag owners should work on fixing our tests (and taking user feedback into account whether this new behavior is actually working well for others) so that we can eventually remove the config entry again.

Relevant logs:

Shell tests on macOS can't find "md5": https://storage.googleapis.com/bazel-buildkite-artifacts/08029882-a54b-44b4-bb8c-01d3fca82488/src/test/shell/shell_utils_test/attempt_1.log
Tests on Windows can't find Python: https://storage.googleapis.com/bazel-buildkite-artifacts/31d11339-45e6-4efa-8eb1-b61f7db6eba8/src%5Ctest%5Cjava%5Ccom%5Cgoogle%5Cdevtools%5Cbuild%5Clib%5Cblackbox%5Ctests%5CPythonBlackBoxTest%5Cattempt_1.log
Tests on Windows can't find Git: https://storage.googleapis.com/bazel-buildkite-artifacts/31d11339-45e6-4efa-8eb1-b61f7db6eba8/src%5Ctest%5Cshell%5Cbazel%5Cgit_repository_test%5Cattempt_1.log
Tests on Windows can't find Powershell: https://storage.googleapis.com/bazel-buildkite-artifacts/31d11339-45e6-4efa-8eb1-b61f7db6eba8/src%5Ctest%5Cjava%5Ccom%5Cgoogle%5Cdevtools%5Cbuild%5Clib%5Cblackbox%5Cframework%5Cframework_tests%5Cattempt_1.log

laszlocsomor · 2019-01-02T15:48:44Z

Philipp, thanks for finding, filing, and working around this bug.

We (@philwo , @hlopko , and myself) argued about this flag today and realized that, while sh_* rules should probably not run python directly (why would a shell script need to do that?), they should be able to run their py_binary data-dependencies, which rightfully expect to have python / python.exe on the PATH.

It's unclear to me what should Bazel do. It seems that PATH's whole purpose is to contain the locations of these common tools, in which case PATH should be exempt from the "strict" env and be available to the action. On the other hand, PATH is inherently machine-specific and as such it poisons remote caches.

Another important observation @philwo made is that the "strict" action env is just as non-hermetic as the non-strict one; we just replace the clientenv's PATH with a hardcoded PATH string, but whatever is in those directories is out of our control (and not tracked by Bazel). And on one machine they might have python (and point to python2) but on another they might have no python at all or have python pointing to python3.

benjaminp · 2019-01-02T16:23:13Z

One option would be to make PATH work more like TMPDIR, where it's not part of the action key and injected by the spawn runner at the last moment. That would be a step towards fixing #6392, too.

This will help prevent bazelbuild/bazel#7026

meteorcloudy · 2019-01-02T16:25:14Z

To prevent this from happening again, I filed bazelbuild/continuous-integration#434

…eakage. Context: #7026 PiperOrigin-RevId: 227524684

This will help prevent bazelbuild/bazel#7026

brandjon · 2019-01-04T19:49:58Z

they should be able to run their py_binary data-dependencies, which rightfully expect to have python / python.exe on the PATH. [...] On the other hand, PATH is inherently machine-specific and as such it poisons remote caches.

I can't speak for PATH in general, but from the point of view of the Python interpreter: The long-term solution is to define the interpreter as a toolchain rule, where a particular toolchain target is appropriate for a specific system / path installation. The toolchain target for the current host system could be generated by a repository rule.

Currently py_runtime is the closest thing we have to a toolchain. The Python stub script only runs the python on PATH if no py_runtime was specified via --python_top.

laszlocsomor · 2019-01-07T09:56:18Z

@brandjon : Can you explain how executing a py_binary-generated target would work if the python interpreter is a toolchain? As I see it, running bazel-bin/foo/py/bin (which is currently a symlink to the stub python script, whose shebang line is #!/usr/bin/env python) requires an interpreter on the PATH.

laszlocsomor · 2019-01-07T09:58:03Z

And this is only the Linux/macOS stub script. On Windows we have another stub that is a C++ program (called the "launcher"). @meteorcloudy could correct me on this, but IIRC the local python interpreter's path is embedded in the launcher.

meteorcloudy · 2019-01-07T10:29:10Z

@laszlocsomor The native launcher also only uses python on PATH if no py_runtime was specified, basically it always uses the same python binary as the python stub template, see

bazel/src/main/java/com/google/devtools/build/lib/bazel/rules/python/BazelPythonSemantics.java

Line 200 in f451cd8

return createWindowsExeLauncher(ruleContext, pythonBinary, executable, true);

That said, the python binary (including the native launcher for python) won't be a problem if python path is explicitly specified even with --incompatible_strict_action_env

ulfjack · 2019-01-07T12:48:54Z

I think we need to change the default PATH on macos and Windows. It might be worthwhile to roll back the flag flip, except it may be too late now?

laszlocsomor · 2019-01-07T12:54:41Z

@meteorcloudy : Thanks, that's comforting to know.
@ulfjack : I agree that it's probably too late to roll back the flag flip commit now. What do you think the default PATH should contain on macOS and Windows?

filipesilva · 2019-01-07T14:33:14Z

Regarding Windows:

Since the Bazel windows prereq page lists the minimum Windows version as Windows Server 2008 R2, and that version includes powershell 2.0, it's reasonable to add it on the path.
The prereq page doesn't list python though. It should only be added if it's a prereq.

ulfjack · 2019-01-07T15:08:55Z

I don't have a mac available right now, but I take it /sbin is on the PATH by default, so we should add it to the default path in Bazel as well.

It's
/Users/[user]/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:
according to https://www.reddit.com/r/MacOS/comments/5zabo3/question_what_is_the_default_path_for_macos/, although we should leave out the users home bin directory for hermeticity.

ulfjack · 2019-01-07T15:10:13Z

For a specific proposal, I'd say we should at least set it to /usr/bin:/bin:/usr/sbin:/sbin on macOS.

meteorcloudy · 2019-01-07T15:28:09Z

I assume the strict PATH env should contain the most common native tools available on that platform. For Windows, I think they should be:

C:\Windows and C:\Windows\System32, for common Windows cmd commands, eg. findstr.exe, whoami.exe, where.exe etc.
C:\Windows\System32\WindowsPowerShell\v1.0, for powershell.exe. @philwo pointed out different powershell versions are all installed in this same path, even if v1.0 is in the path.
C:\tools\msys64\usr\bin and C:\tools\msys64\bin, for bash bin tools. This will change if users specify a different bash executable by --shell_executable or shell toolchain.

But there are two issues to be considered on Windows:

System root could change. Although the most common system root is C:\Windows, users could have different system root. If that's not C:\Windows, the default PATH env is wrong. One possible solution is to replace C:\Windows with %SYSTEMROOT%. We always set %SYSTEMROOT% for every action on Windows and it's a special env var set before launching the process, which won't affect any action cache. See

bazel/src/main/java/com/google/devtools/build/lib/windows/WindowsSubprocessFactory.java

Line 121 in bd78bc5

String[] systemEnvironmentVars = {"SYSTEMROOT", "SYSTEMDRIVE"};
File paths are case insensitive on Windows. Currently, if you change PATH from C:\foo\bar to C:\Foo\bar, the action will be re-executed, I think we should fix this.

ulfjack · 2019-01-08T10:44:07Z

We need to achieve three things here:

Have a default that's generally usable for local builds
The default must work with remote execution (e.g., no local usernames, machine-specific directories, etc.); this is to avoid cache thrashing / cross-user cache misses by default
Be easy to override as needed (with the caveat that different settings across users can cause cache misses / thrashing)

I think the images that will be built / are already being built for remote execution will largely put binaries in the locations matching the default PATH.

(I don't think I'm saying anything new, just trying to clarify the situation. Using %SYSTEMROOT% in PATH should be fine; however, we do need to figure out how to handle %SYSTEMROOT% with remote execution to prevent remote cache thrashing.)

laszlocsomor · 2019-01-08T15:51:42Z

Re @meteorcloudy :
I agree those paths (powershell, system32, msys2 bin and usr/bin) should be on the strict PATH, and nothing more.
Challenge 1 seems to be workaround-able.
I don't see why challenge 2 is a problem in the strict env case (or did you mean this for non-strict env builds?) What am I missing?

Also, it's not clear to me anymore who's waiting on whose input and who are decision makers on this bug. How can we move forward?

I'm happy to volunteer to add an incompatible flag enabling the strict PATH that @meteorcloudy proposed.

alexeagle · 2019-09-10T13:24:44Z

Is anyone working to roll this flag forward again?

In node, it's common to run Bazel underneath some other workflow tool, and these can set a different environment every time (yarn sets $PATH with a new temp directory for each run). Users will observe non-incremental builds until they discover the need for this flag.

laszlocsomor · 2019-09-13T07:45:37Z

If anyone is, it's @buchgr or @ishikhman .
(@buchgr is out for this week and the next though.)

ishikhman · 2019-09-13T11:42:32Z

there were some discussions around this. @buchgr will be able to give a better update when he is back in a week.

'--experimental_strict_action_env' was a Bazel 0.20.0 option which got renamed to '--incompatible_strict_action_env' and was planned to be the default as of 0.21.0 [1]. However, this planned flip still hasn't happened for at least the baseline of 1.0.0 (as of rc3), so we can't remove it as of now [2]. In general, we should try to avoid having --incompatible_* or --experimental_* flags in the bazelrc, as per explanation in Issue 11122 [3] as well as on the 'Backward Compatibility' page of Bazel [4]: > Users should never run their production builds with --experimental_* > or --incompatible_* flags. However, for this case we have a compelling reason to have '--incompatible_strict_action_env'. [1]: bazelbuild/bazel#6648 [2]: bazelbuild/bazel#7026 [3]: https://bugs.chromium.org/p/gerrit/issues/detail?id=11122 [4]: https://docs.bazel.build/versions/0.29.1/backward-compatibility.html Change-Id: Ie78708ffaeb1bfe9ebceb924939833f7c30eaeaf

UebelAndre · 2020-11-11T20:41:37Z

Any updates here? Can this be flipped again?

philwo · 2020-11-11T21:12:00Z

Unfortunately I’m not aware of any further work that has happened in this area. I think the situation is still unchanged from a year ago.

UebelAndre · 2020-11-12T03:17:28Z

is there an RFC for how this might be addressed?

alexeagle · 2022-01-02T19:50:49Z

Users continue to trip over this, for example
https://stackoverflow.com/questions/70385847/build-docker-images-with-bazel-targets-built/70522042

I created this PR to trigger our CI and see if Bazel still fails to build without it. See bazelbuild#7026 for context.

I created this PR to trigger our CI and see if Bazel still fails to build with this flag. See bazelbuild#7026 for context.

github-actions · 2023-06-13T01:33:11Z

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

cpsauer · 2023-06-14T04:58:33Z

@bazelbuild/triage, this issue is definitely still relevant, IMO, per Alex's comment above. Seems like this default should really get flipped.

(@bazelbuild/triage didn't seem to create a tag last I tried (no link) so I'm going to also tag @sgowroji and @Pavank1992 manually. Please coach me if you'd have preferred otherwise--and maybe update the bot's instructions)

philwo added P1 I'll work on this now. (Assignee required) breakage labels Jan 2, 2019

philwo assigned buchgr, ulfjack and laszlocsomor Jan 2, 2019

laszlocsomor assigned hlopko Jan 2, 2019

laszlocsomor mentioned this issue Jan 2, 2019

[singlejar] Fix Windows long path issue #6992

Closed

meteorcloudy added a commit to bazelbuild/continuous-integration that referenced this issue Jan 2, 2019

Add Bazel itself in downstream project

5b51ae2

This will help prevent bazelbuild/bazel#7026

meteorcloudy mentioned this issue Jan 2, 2019

Add Bazel itself in downstream project bazelbuild/continuous-integration#434

Merged

bazel-io pushed a commit that referenced this issue Jan 2, 2019

Use --noincompatible_strict_action_env on macOS and Windows due to br…

6e20cd2

…eakage. Context: #7026 PiperOrigin-RevId: 227524684

philwo pushed a commit to bazelbuild/continuous-integration that referenced this issue Jan 3, 2019

Add Bazel itself in downstream project (#434)

8910fa3

This will help prevent bazelbuild/bazel#7026

laszlocsomor mentioned this issue Jan 3, 2019

0.21.0 --incompatible_strict_action_env and unzip [duplicate of #7026] #6999

Closed

meteorcloudy mentioned this issue Jan 4, 2019

yarn bazel does not use cached builds, but bazel does angular/angular#27514

Closed

brandjon mentioned this issue Jan 4, 2019

Document effect of --incompatible_strict_action_env on autodetecting the Python interpreter #6817

Closed

ulfjack removed their assignment Jan 7, 2019

ozio85 mentioned this issue Oct 2, 2019

No way to unset path on Windows for all actions #9467

Closed

msarahan mentioned this issue Oct 17, 2019

FAILED: Build did NOT complete successfully tensorflow/tensorboard#1434

Closed

benjaminp mentioned this issue Nov 19, 2019

--incompatible_strict_action_env #6648

Closed

mariusgrigoriu mentioned this issue Nov 28, 2019

Bazel rebuilds host-configured targets when $PATH changes even with --action_env=PATH=x #10291

Closed

buchgr removed their assignment Jan 9, 2020

BenTheElder mentioned this issue Mar 3, 2020

Update repo-infra dependencies kubernetes/repo-infra#178

Merged

uri-canva mentioned this issue Mar 22, 2020

strict_action_env equivalent for repository rules #10996

Open

iphydf mentioned this issue Apr 21, 2020

--incompatible_strict_action_env breaks msan builds GoogleCloudPlatform/layer-definitions#810

Open

jin added the type: process label May 19, 2020

UebelAndre mentioned this issue Nov 11, 2020

Hermetic build.rs bazelbuild/rules_rust#490

Closed

aherrmann mentioned this issue Nov 17, 2021

Bazel: PATH differs between ctx.actions.run and ctx.actions.run_shell NixOS/nixpkgs#94222

Open

alexeagle mentioned this issue Mar 22, 2022

Flip default for --incompatible_strict_action_env bazel-contrib/SIG-rules-authors#42

Open

tjgq added a commit to tjgq/bazel that referenced this issue Jun 1, 2022

DO NOT MERGE: turn on --incompatible_strict_action_env

be0a1c3

I created this PR to trigger our CI and see if Bazel still fails to build without it. See bazelbuild#7026 for context.

tjgq added a commit to tjgq/bazel that referenced this issue Jun 1, 2022

DO NOT MERGE: turn on --incompatible_strict_action_env

c82cbdb

I created this PR to trigger our CI and see if Bazel still fails to build with this flag. See bazelbuild#7026 for context.

tjgq mentioned this issue Jun 1, 2022

DO NOT MERGE: turn on --incompatible_strict_action_env #15602

Open

jvolkman mentioned this issue Jun 1, 2022

Add bazel init to setup new workspaces #15604

Open

sgowroji mentioned this issue Jun 27, 2022

Set --incompatible_strict_action_env to true by default #15743

Open

github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Jun 13, 2023

Pavank1992 added not stale Issues or PRs that are inactive but not considered stale and removed stale Issues or PRs that are stale (no activity for 30 days) labels Jun 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bazel 0.21.0 flipped --incompatible_strict_action_env to true and broke itself #7026

Bazel 0.21.0 flipped --incompatible_strict_action_env to true and broke itself #7026

philwo commented Jan 2, 2019 •

edited by hlopko

Loading

laszlocsomor commented Jan 2, 2019 •

edited

Loading

benjaminp commented Jan 2, 2019

meteorcloudy commented Jan 2, 2019

brandjon commented Jan 4, 2019

laszlocsomor commented Jan 7, 2019

laszlocsomor commented Jan 7, 2019

meteorcloudy commented Jan 7, 2019

ulfjack commented Jan 7, 2019

laszlocsomor commented Jan 7, 2019

filipesilva commented Jan 7, 2019

ulfjack commented Jan 7, 2019

ulfjack commented Jan 7, 2019

meteorcloudy commented Jan 7, 2019 •

edited

Loading

ulfjack commented Jan 8, 2019

laszlocsomor commented Jan 8, 2019

alexeagle commented Sep 10, 2019

laszlocsomor commented Sep 13, 2019

ishikhman commented Sep 13, 2019

UebelAndre commented Nov 11, 2020

philwo commented Nov 11, 2020

UebelAndre commented Nov 12, 2020

alexeagle commented Jan 2, 2022

github-actions bot commented Jun 13, 2023

cpsauer commented Jun 14, 2023

Bazel 0.21.0 flipped --incompatible_strict_action_env to true and broke itself #7026

Bazel 0.21.0 flipped --incompatible_strict_action_env to true and broke itself #7026

Comments

philwo commented Jan 2, 2019 • edited by hlopko Loading

Description of the problem / feature request:

laszlocsomor commented Jan 2, 2019 • edited Loading

benjaminp commented Jan 2, 2019

meteorcloudy commented Jan 2, 2019

brandjon commented Jan 4, 2019

laszlocsomor commented Jan 7, 2019

laszlocsomor commented Jan 7, 2019

meteorcloudy commented Jan 7, 2019

ulfjack commented Jan 7, 2019

laszlocsomor commented Jan 7, 2019

filipesilva commented Jan 7, 2019

ulfjack commented Jan 7, 2019

ulfjack commented Jan 7, 2019

meteorcloudy commented Jan 7, 2019 • edited Loading

ulfjack commented Jan 8, 2019

laszlocsomor commented Jan 8, 2019

alexeagle commented Sep 10, 2019

laszlocsomor commented Sep 13, 2019

ishikhman commented Sep 13, 2019

UebelAndre commented Nov 11, 2020

philwo commented Nov 11, 2020

UebelAndre commented Nov 12, 2020

alexeagle commented Jan 2, 2022

github-actions bot commented Jun 13, 2023

cpsauer commented Jun 14, 2023

philwo commented Jan 2, 2019 •

edited by hlopko

Loading

laszlocsomor commented Jan 2, 2019 •

edited

Loading

meteorcloudy commented Jan 7, 2019 •

edited

Loading