In-process runfile tree creation from SymlinkTreeAction fails for some consecutive builds #20266

Closed
Andrius-B opened this issue Nov 20, 2023 · 4 comments
Labels
P1 I'll work on this now. (Assignee required) team-Rules-Server Issues for serverside rules included with Bazel type: bug

Comments

@Andrius-B

Description of the bug:

We are facing an interesting hermeticity issue when creating runfiles locally. It can be worked around by running bazel clean, but it should preferably be fixed. The crux of the bug is that in-process runfiles tree creation does not handle every possible state of the previous runfiles tree.
With the flags

build --enable_runfiles
build --experimental_inprocess_symlink_creation

the in-process symlink creation will be used and thus the bug can be triggered.
In the example reproduction case, we have a target output that can be produced from two different inputs depending on a flag:

config_setting(
    name = "use_mode",
    flag_values = {"//:use": "True"},
)

make_folder(name = "make_folder")

use_folder(
    name = "use_folder",
    srcs = glob(["prepared/**/*"]),
)

output(
    name = "output",
    src = select({
        "//:use_mode": ":use_folder",
        "//conditions:default": ":make_folder",
    }),
)

The output target is then built twice, each time with a different input rule. The make_folder rule produces the files by running a shell script, while use_folder produces the same files by globbing them from the sources. Both inputs are linked via runfiles to the same output path, albeit slightly differently:

  • make_folder using ctx.runfiles(symlinks = {"lib": directory})
  • use_folder using ctx.runfiles(symlinks = symlinks) with a complete dict for each file.

The bug is triggered by building the output target twice in a row using different configurations, therefore forcing the runfiles tree builder into an edge case.
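To make the edge case concrete, here is a minimal Python sketch (not Bazel code) that models the two runfiles symlink mappings as plain dicts and reports which paths change kind between builds. The function names, the target strings, and `sample2.txt` are hypothetical and chosen only for illustration; only `lib` and `sample1.txt` come from the report.

```python
import posixpath


def implied_dirs(mapping):
    """Return every directory that must exist to hold the mapping's symlinks."""
    dirs = set()
    for path in mapping:
        parent = posixpath.dirname(path)
        while parent:
            dirs.add(parent)
            parent = posixpath.dirname(parent)
    return dirs


def kind_changes(old, new):
    """Compare two runfiles mappings (runfiles-relative path -> target).

    Returns (link_to_dir, dir_to_link): paths that were symlinks and must
    become directories, and paths that were directories and must become
    symlinks.
    """
    link_to_dir = sorted(set(old) & implied_dirs(new))
    dir_to_link = sorted(implied_dirs(old) & set(new))
    return link_to_dir, dir_to_link


# Hypothetical stand-ins for the two rules' symlink dicts.
make_folder = {"lib": "<generated directory>"}
use_folder = {
    "lib/sample1.txt": "prepared/sample1.txt",
    "lib/sample2.txt": "prepared/sample2.txt",  # hypothetical second file
}

print(kind_changes(make_folder, use_folder))
print(kind_changes(use_folder, make_folder))
```

In both build orders the path `lib` has to switch between being a symlink and being a real directory, which is the transition the tree builder has to handle.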
Depending on the build order, the error is slightly different:

$ ./repro1.sh
[...]
ERROR: /private/tmp/repr/BUILD.bazel:21:7: Creating runfiles tree bazel-out/darwin_arm64-fastbuild/bin/exec_output.runfiles failed: java.io.IOException: <root>/execroot/repo/bazel-out/darwin_arm64-fastbuild/bin/exec_output.runfiles/repo/lib (File exists)
Target //:output failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.105s, Critical Path: 0.01s
INFO: 4 processes: 4 internal.
FAILED: Build did NOT complete successfully

From debugging, it seems that the FileSystemUtils.ensureSymbolicLink function does not do enough to ensure a symbolic link is created (i.e. if there is a folder where the symlink is supposed to go). This happens in the repro1.sh case.
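The "File exists" failure for a directory sitting where a symlink should go can be reproduced outside Bazel. The following is a minimal Python sketch for a POSIX system, not Bazel's implementation: `naive_symlink` and `replace_with_symlink` are hypothetical helpers, and the directory layout is invented for the demo.

```python
import os
import shutil
import tempfile


def naive_symlink(target, link):
    """Create a symlink without inspecting what already occupies `link`."""
    os.symlink(target, link)


def replace_with_symlink(target, link):
    """Hypothetical helper: clear whatever is at `link`, then create the symlink."""
    if os.path.islink(link) or os.path.isfile(link):
        os.unlink(link)
    elif os.path.isdir(link):
        shutil.rmtree(link)
    os.symlink(target, link)


def demo():
    root = tempfile.mkdtemp()
    try:
        # Generated directory that the new build wants `lib` to point at.
        generated = os.path.join(root, "out", "lib")
        os.makedirs(generated)

        # Stale runfiles tree from the previous build: `lib` is a real
        # directory that holds one per-file symlink.
        link = os.path.join(root, "runfiles", "repo", "lib")
        os.makedirs(link)
        os.symlink("/nonexistent/sample1.txt", os.path.join(link, "sample1.txt"))

        try:
            naive_symlink(generated, link)
            naive_failed = False
        except FileExistsError:
            naive_failed = True

        replace_with_symlink(generated, link)
        return naive_failed, os.path.islink(link)
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

The naive call fails because a directory already exists at the link path; removing the stale directory first lets the symlink be created.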

In the second case, repro2.sh, it seems that a symlink is followed where it should not be:

$ ./repro2.sh
[...]
ERROR: /private/tmp/repr/BUILD.bazel:21:7: Creating runfiles tree bazel-out/darwin_arm64-fastbuild/bin/exec_output.runfiles failed: java.io.IOException:<root>/execroot/repo/bazel-out/darwin_arm64-fastbuild/bin/exec_output.runfiles/repo/lib/sample1.txt (File exists)
Target //:output failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.088s, Critical Path: 0.00s
INFO: 3 processes: 3 internal.
FAILED: Build did NOT complete successfully

The file that the SymlinkTreeAction is trying to overwrite is actually outside the runfiles root:

$ realpath <root>/execroot/repo/bazel-out/darwin_arm64-fastbuild/bin/exec_output.runfiles/repo/lib/sample1.txt
<root>/execroot/repo/bazel-out/darwin_arm64-fastbuild/bin/lib/sample1.txt
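The same effect can be shown outside Bazel. This is a minimal Python sketch for a POSIX system under assumed conditions, not Bazel's code: the stale tree has `lib` as a symlink to a generated directory, and the new build tries to create a per-file symlink at `lib/sample1.txt`. The directory names are invented for the demo.

```python
import os
import shutil
import tempfile


def demo():
    # Resolve the temp dir up front so later realpath comparisons are stable.
    root = os.path.realpath(tempfile.mkdtemp())
    try:
        # Generated directory outside the runfiles tree, with one file.
        out_lib = os.path.join(root, "bin", "lib")
        os.makedirs(out_lib)
        with open(os.path.join(out_lib, "sample1.txt"), "w") as f:
            f.write("generated\n")

        # Source file that the new build wants to link instead.
        src = os.path.join(root, "prepared", "sample1.txt")
        os.makedirs(os.path.dirname(src))
        with open(src, "w") as f:
            f.write("prepared\n")

        # Stale runfiles tree: `lib` is a symlink to the generated directory.
        runfiles = os.path.join(root, "bin", "output.runfiles")
        repo = os.path.join(runfiles, "repo")
        os.makedirs(repo)
        os.symlink(out_lib, os.path.join(repo, "lib"))

        # New build: ensure the parent directory exists, then link the file.
        link = os.path.join(repo, "lib", "sample1.txt")
        os.makedirs(os.path.dirname(link), exist_ok=True)  # passes through the symlink
        try:
            os.symlink(src, link)
            failed = False
        except FileExistsError:
            failed = True

        resolved = os.path.realpath(link)
        outside_runfiles = not resolved.startswith(runfiles + os.sep)
        hits_generated = resolved == os.path.join(out_lib, "sample1.txt")
        return failed, outside_runfiles, hits_generated
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

Because the parent `lib` is itself a symlink, the "directory exists" check succeeds and the file-level link creation collides with the generated file outside the runfiles root.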

In practice, we ran into this when migrating from rules_node to rules_js and when trying to optionally pull out some pre-built artifacts.

Which category does this issue belong to?

Core

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I have attached a zip file with a minimal reproduction example. Two different errors can be produced by running the repro1.sh or repro2.sh scripts.

Which operating system are you running Bazel on?

MacOS

What is the output of bazel info release?

release 6.4.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

I have tested with Bazel 6.3.0, 6.4.0 and latest master (sha 487bb42410488a3919e2b4d0c7fd2e37efe8e830). The bug is reproducible on all tested versions.

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@sgowroji sgowroji added the team-Rules-Server Issues for serverside rules included with Bazel label Nov 20, 2023
@comius comius added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed untriaged labels Dec 27, 2023
@carpenterjc

This sounds like it's in a similar area to #19018, where the symlink action can happen before the action that creates the linked target is executed.

@tjgq
Contributor

tjgq commented Jan 31, 2024

Note that this issue is specifically about --experimental_inprocess_symlink_creation, which is not the default. (In addition, in-process just changes the mechanism through which symlinks are created; it doesn't change when they're created relative to other parts of the build.)

@fmeum fmeum self-assigned this Apr 23, 2024
@fmeum fmeum added P1 I'll work on this now. (Assignee required) and removed P3 We're not considering working on this, but happy to review a PR. (No assignee) labels Apr 23, 2024
bazel-io pushed a commit to bazel-io/bazel that referenced this issue Apr 25, 2024
`SymlinkTreeUpdater` now correctly replaces symlinks with directories and vice versa in runfiles trees.

Fixes bazelbuild#20266

Closes bazelbuild#22087.

PiperOrigin-RevId: 628028591
Change-Id: Id94cea42f9cdfab001e56242a329b8d6d253ba29
github-merge-queue bot pushed a commit that referenced this issue Apr 29, 2024
…2116)

`SymlinkTreeUpdater` now correctly replaces symlinks with directories
and vice versa in runfiles trees.

Fixes #20266

Closes #22087.

PiperOrigin-RevId: 628028591
Change-Id: Id94cea42f9cdfab001e56242a329b8d6d253ba29

Commit
4b18dbe

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>
Kila2 pushed a commit to Kila2/bazel that referenced this issue May 13, 2024
`SymlinkTreeUpdater` now correctly replaces symlinks with directories and vice versa in runfiles trees.

Fixes bazelbuild#20266

Closes bazelbuild#22087.

PiperOrigin-RevId: 628028591
Change-Id: Id94cea42f9cdfab001e56242a329b8d6d253ba29
@iancha1992
Member

A fix for this issue has been included in Bazel 7.2.0 RC1. Please test out the release candidate and report any issues as soon as possible.
If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=7.2.0rc1. Thanks!

@Andrius-B
Author

I tried 7.2.0rc1 and the issue that we were experiencing is indeed fixed there. Thank you!
