Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize unresolved symbolic links (--experimental_allow_unresolved_symlinks) #10298

Closed
jmillikin opened this issue Nov 24, 2019 · 43 comments
Closed
Labels
P1 I'll work on this now. (Assignee required) team-Core Skyframe, bazel query, BEP, options parsing, bazelrc type: feature request

Comments

@jmillikin
Copy link
Contributor

Description of the problem / feature request:

#9075 added support for unresolved symlinks, via the actions.declare_symlink() and actions.symlink() functions. That functionality is currently gated by --experimental_allow_unresolved_symlinks.

I've discovered that unresolved symlinks are critical for supporting hermetic Node.JS builds. Specifically, Node's use of symlinks for module resolution semantics mean that we need to be able to create symlink artifacts that reference dependencies and can be used as runfiles, but exist outside the runfiles root.

What are the current blockers to stabilizing unresolved symlinks? Is there a list of known remaining work to get them enabled by default?

cc @lberki @laszlocsomor

@lberki lberki changed the title Stablize unresolved symbolic links (--experimental_allow_unresolved_symlinks) Stabilize unresolved symbolic links (--experimental_allow_unresolved_symlinks) Nov 25, 2019
@lberki lberki changed the title Stabilize unresolved symbolic links (--experimental_allow_unresolved_symlinks) Stabilize unresolved symbolic links (--experimental_allow_unresolved_symlinks) Nov 25, 2019
@lberki
Copy link
Contributor

lberki commented Nov 25, 2019

@laurentlb , @laszlocsomor , what do you think?

The main blockers I see is that RBE doesn't support this yet (the protocol itself does, but we haven't wired it up yet within Bazel) and that it probably fails on Windows in all sorts of interesting ways.

Neither of these are insurmountable problems and if I squinted the right way, I could even argue that they are not necessary for declaring the functionality stable, but I just can't sign up for doing them at the moment.

@lberki lberki added the team-Core Skyframe, bazel query, BEP, options parsing, bazelrc label Nov 25, 2019
@laszlocsomor
Copy link
Contributor

@lberki : Why do you suppose it fails on Windows?
Is there a test case I could try on Windows?

@lberki
Copy link
Contributor

lberki commented Nov 25, 2019

Try //third_party/bazel/src/test/shell/bazel:bazel_unresolved_symlink_test.

I admit I haven't tried it on Windows. There is a chance that it works since all that's needed is ActionMetadataHandler#fileArtifactValueFromArtifact() to work which in turn only needs Path#statIfFound(Symlinks.NOFOLLOW) and Path#readSymbolicLink() .

@lberki
Copy link
Contributor

lberki commented Nov 25, 2019

...oh, and Path#createSymbolicLink() for SymlinkAction#execute().

@laszlocsomor
Copy link
Contributor

Thanks! The test fails here:

bazel build --experimental_allow_unresolved_symlinks //a:bsg && fail "build succeeded"

...because the build succeeds though it shouldn't. (Note: I edited the test to use c:/badsymlink instead of /badsymlink.)

@laszlocsomor
Copy link
Contributor

laszlocsomor commented Nov 25, 2019

The culprit is I think the culprit is that WindowsFileSystem does not create file symlinks:

if (!target.toFile().exists() || target.toFile().isDirectory()) {
WindowsFileOperations.createJunction(link.toString(), target.toString());
} else {
Files.copy(target, link);
}

@laszlocsomor
Copy link
Contributor

@lberki : test_file_instead_of_symlink breaks on Linux if I add --spawn_strategy=standalone to the bazel calls.

@AlessandroPatti
Copy link
Contributor

This seems to be failing on macOS if paired with remote caching;

java.lang.IllegalStateException: Encountered symlink input '<some nice path here>', but all symlinks should have been resolved by SkyFrame. This is a bug.
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.fromActionInputs(DirectoryTreeBuilder.java:106)
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.fromActionInputs(DirectoryTreeBuilder.java:47)
	at com.google.devtools.build.lib.remote.merkletree.MerkleTree.build(MerkleTree.java:123)
	at com.google.devtools.build.lib.remote.RemoteSpawnCache.lookup(RemoteSpawnCache.java:121)
	at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:119)
	at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:94)
	at com.google.devtools.build.lib.actions.SpawnStrategy.beginExecution(SpawnStrategy.java:39)
	at com.google.devtools.build.lib.exec.ProxySpawnActionContext.beginExecution(ProxySpawnActionContext.java:60)
	at com.google.devtools.build.lib.rules.cpp.CppCompileAction.beginExecution(CppCompileAction.java:1381)
	at com.google.devtools.build.lib.actions.Action.execute(Action.java:124)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$4.execute(SkyframeActionExecutor.java:966)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:1114)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1085)
	at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:136)
	at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:80)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:612)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:907)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:297)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:438)

@pauldraper
Copy link
Contributor

pauldraper commented Nov 1, 2021

AFAIK this has been experimental since Bazel 1.0.

Is progress likely?

@bjulian5
Copy link

bjulian5 commented Apr 5, 2022

Any updates?

@gregmagolan
Copy link
Contributor

gregmagolan commented Apr 12, 2022

I hit this recently as well. I was using the feature to break circular deps between npm packages (which sadly exist in some 3rd party packages). Confirmed that it does not work with remote execution.

I also hit a bug in bazel itself related to this where the --experimental_allow_unresolved_symlinks configuration setting is not propagated to the host and exec configs so when building a target under either of those configs that calls actions.create_symlink, bazel errors out with

.

@lberki
Copy link
Contributor

lberki commented Apr 19, 2022

Looking at the history of this issue, the main blockers seem to be Windows and RBE support. Once those are fixed (or we decide that they don't matter), we can move this feature out of the experimental domain.

@coeuvre : What do you think about updating our RBE support to support unresolved symbolic links? I realize you are busy so I was thinking about soliciting pull requests from the community.

@meteorcloudy : given that symlinks are not a very common thing on Windows, can we get away without supporting unresolved symlinks? I don't even know if NTFS supports dangling symlinks; if it doesn't, then the decision is easy because then unresolved symlinks don't make all that much sense, do they?

@meteorcloudy
Copy link
Member

@lberki My guess is that it will work correctly [--windows_enable_symlinks](with https://bazel.build/reference/command-line-reference#flag--windows_enable_symlinks), NTFS does support dangling symlinks.

Is there a simple test that I can run to verify?

@lberki
Copy link
Contributor

lberki commented Apr 19, 2022

test_smoke in bazel_symlink_test?

@meteorcloudy
Copy link
Member

meteorcloudy commented Apr 19, 2022

This test passes on Windows
https://buildkite.com/bazel/bazel-bazel/builds/19324#75eafab1-c5ae-4830-add5-fbe84692c31e

//src/test/shell/bazel:bazel_symlink_test                                PASSED in 29.4s

Oh, it passes because it's skipped on windows..

@coeuvre
Copy link
Member

coeuvre commented Apr 20, 2022

@lberki I am happy to review PRs :)

cc @tjgq

@meteorcloudy
Copy link
Member

@lberki
test_smoke passed on Windows with --windows_enable_symlinks after fixing the path style for Windows (/nonexistent => C:/nonexistent)

** test_smoke ******************************************************************
$TEST_TMPDIR defined: output root default is 'c:\b\erp625is\execroot\io_bazel\_tmp\f8978fe5636e0739a65f37d28c69faa9' and max_idle_secs default is '15'.
Loading: 
Loading: 0 packages loaded
Analyzing: target //a:a (1 packages loaded, 0 targets configured)
INFO: Analyzed target //a:a (4 packages loaded, 6 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //a:a up-to-date:
  bazel-bin/a/a
INFO: Elapsed time: 0.597s, Critical Path: 0.01s
INFO: 2 processes: 2 internal.
INFO: Build completed successfully, 2 total actions
INFO: Build completed successfully, 2 total actions
total 0
lrwxrwxrwx 1 b None 14 Apr 20 09:21 a -> /c/nonexistent
INFO[bazel_symlink_test 2022-04-20 09:21:04 (+0000)] Cleaning up workspace

�[0;32mPASSED�[0m: test_smoke

I believe Windows isn't really a blocker, dangling symlinks and dangling junctions both work.

@lberki
Copy link
Contributor

lberki commented Apr 20, 2022

@meteorcloudy then that leaves RBE support, I guess.

(plus some glue logic to hard-wire the value of this flag to false within Google, but that, I can throw in since contributions don't work for that one)

@pauldraper
Copy link
Contributor

Is Alessandro's comment about remote caching on MacOS still relevant?

#10298 (comment)

@tjgq
Copy link
Contributor

tjgq commented Sep 16, 2022

@gregmagolan Could you confirm whether remote execution support for actions that produce unresolved symlinks is a blocker for rules_nodejs? (This doesn't currently work in Bazel, even after @fmeum's changes; consuming unresolved symlinks remotely, on the other hand, does work.)

copybara-service bot pushed a commit that referenced this issue Sep 16, 2022
Unresolved symlink artifacts created with `ctx.actions.symlink(target_path = ...)` are now created without making the target path absolute by prepending the exec root, which diverged from the documentation and intended goal and also gave rise to hermeticity issues as such symlinks would regularly resolve outside the sandbox.

Furthermore, in order to bring local execution in line with other execution types, the runfiles manifest entry (and thus the runfiles directory contents) for symlink artifacts are now their target paths rather than their exec paths, which along the way resolves another soure of non-hermetic resolution outside the runfiles directory.

This requires making symlink artifacts (but not artifacts representing regular files) inputs of `SourceManifestAction`, which previously had no inputs. The change flattens a nested set in `getInputs`, but benchmarks confirm that this does not result in a performance regression. Alternatives considered either result in a substantially worse Starlark API or wouldn't work for symlinks created by spawns.

Work towards #10298

Closes #16272.

PiperOrigin-RevId: 474784371
Change-Id: I15d318c30542c1da54d86d9b1ae769fe2a0ec970
tjgq added a commit to tjgq/bazel that referenced this issue Sep 16, 2022
The tests don't currently pass due to bazelbuild#16290 and bazelbuild#16289. Making them pass might
be considered a blocker for bazelbuild#10298 (declaring unresolved symlinks stable).
@gregmagolan
Copy link
Contributor

gregmagolan commented Sep 16, 2022

@gregmagolan Could you confirm whether remote execution support for actions that produce unresolved symlinks is a blocker for rules_nodejs? (This doesn't currently work in Bazel, even after @fmeum's changes; consuming unresolved symlinks remotely, on the other hand, does work.)

@tjgq Thanks for checking. Actions that output unresolved symlinks running in RBE are not needed. Only that RBE can handle unresolved symlink inputs.

@larsrc-google
Copy link
Contributor

Does anyone have a simple-to-reproduce case where dangling symlinks work with standalone execution but not with sandboxed execution? The cases I thought we have actually fail without --experimental_allow_unresolved_symlinks, so they're not useful for this.

@larsrc-google
Copy link
Contributor

I have resolved a major bug in my understanding of the symlink handling, and believe that with #15982 there are no more blockers in the way symlinks are handled for sandboxes.

@lberki
Copy link
Contributor

lberki commented Oct 4, 2022

@larsrc-google do I understand correctly that the last blocker was #16290 which just got fixed?

@larsrc-google
Copy link
Contributor

No more blockers from my side. @tjgq is tracking the RBE-related bits.

@lberki
Copy link
Contributor

lberki commented Oct 4, 2022

...and @meisterT says that it's all good on the remote execution side, so then I assume all bits are ready, except for flag administration (we don't want this to be enabled within Google)

@meteorcloudy
Copy link
Member

Can someone clarify what else action items are needed for this issue? If none, let's close this one to reduce the number of release blockers for 6.0.

@larsrc-google
Copy link
Contributor

I do believe that covers everything needed for 6.0.

@fmeum
Copy link
Collaborator

fmeum commented Oct 5, 2022

@larsrc-google Aren't we missing a commit that removes the experimental prefix, enables the flag by default and disables it in Google's .bazelrc?

@meteorcloudy meteorcloudy reopened this Oct 5, 2022
@larsrc-google
Copy link
Contributor

Arg. Yes, you're right. Starting on that.

aiuto pushed a commit to aiuto/bazel that referenced this issue Oct 12, 2022
Also adds support for such symlinks to the remote worker implementation.

Work towards bazelbuild#10298

Closes bazelbuild#15781.

PiperOrigin-RevId: 469196671
Change-Id: I441da6d0a658d142153ada6cc0f836bf2ed3bcff
aiuto pushed a commit to aiuto/bazel that referenced this issue Oct 12, 2022
Unresolved symlink artifacts created with `ctx.actions.symlink(target_path = ...)` are now created without making the target path absolute by prepending the exec root, which diverged from the documentation and intended goal and also gave rise to hermeticity issues as such symlinks would regularly resolve outside the sandbox.

Furthermore, in order to bring local execution in line with other execution types, the runfiles manifest entry (and thus the runfiles directory contents) for symlink artifacts are now their target paths rather than their exec paths, which along the way resolves another soure of non-hermetic resolution outside the runfiles directory.

This requires making symlink artifacts (but not artifacts representing regular files) inputs of `SourceManifestAction`, which previously had no inputs. The change flattens a nested set in `getInputs`, but benchmarks confirm that this does not result in a performance regression. Alternatives considered either result in a substantially worse Starlark API or wouldn't work for symlinks created by spawns.

Work towards bazelbuild#10298

Closes bazelbuild#16272.

PiperOrigin-RevId: 474784371
Change-Id: I15d318c30542c1da54d86d9b1ae769fe2a0ec970
@brentleyjones
Copy link
Contributor

@meisterT what about the rename?

@meisterT
Copy link
Member

See my comment here: #16458 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) team-Core Skyframe, bazel query, BEP, options parsing, bazelrc type: feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.