Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support symbolic links inside TreeArtifact to point out of the TreeArtifact #20891

Closed
thesayyn opened this issue Jan 13, 2024 · 12 comments
Closed
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts type: feature request

Comments

@thesayyn
Copy link
Contributor

Description of the feature request:

Currently, Bazel disallows symlinks that contains .. (relative path) inside of a TreeArtifact exactly at this line

if (intermediatePath.containsUplevelReferences()) {
String errorMessage =
String.format(
"A TreeArtifact may not contain relative symlinks whose target paths traverse "
+ "outside of the TreeArtifact, found %s pointing to %s.",
path, linkTarget);
throw new IOException(errorMessage);

Which category does this issue belong to?

No response

What underlying problem are you trying to solve with this feature?

I am trying to symlink files from other actions into TreeArtifact to improve performance in rules_oci.

Which operating system are you running Bazel on?

No response

What is the output of bazel info release?

No response

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@thesayyn
Copy link
Contributor Author

Probably related to #15454

@tjgq
Copy link
Contributor

tjgq commented Jan 15, 2024

Note: #15454 is about permitting a tree artifact to contain symlinks with "unresolved" semantics (i.e., Bazel doesn't care what the symlink points to).

I suspect what you want here is to relax the current restrictions on symlinks inside tree artifacts, but keep their "resolved" semantics (i.e., Bazel dereferences the symlink, verifies that the target path exists, and includes its contents in the tree artifact digest) - am I correct?

@thesayyn
Copy link
Contributor Author

In my particular case, I don't care if the target file is tracked by Skyframe but the rest is what i need.

@thesayyn
Copy link
Contributor Author

Currently, if i patch bazel removing

PathFragment intermediatePath = subDir;
for (String pathSegment : linkTarget.segments()) {
intermediatePath = intermediatePath.getRelative(pathSegment);
if (intermediatePath.containsUplevelReferences()) {
String errorMessage =
String.format(
"A TreeArtifact may not contain relative symlinks whose target paths traverse "
+ "outside of the TreeArtifact, found %s pointing to %s.",
path, linkTarget);
throw new IOException(errorMessage);
}
}
it seem to work but I am not sure if that's enough in this case. I am assuming no as no information about the symlink is being tracked.

@sgowroji sgowroji added the team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts label Jan 16, 2024
@tjgq
Copy link
Contributor

tjgq commented Jan 16, 2024

I think we generally have concerns about allowing symlinks because they let actions do non-hermetic things (i.e., access files that aren't in their inputs), but since we already allow absolute symlinks, I see little point in disallowing relative symlinks pointing outside of the tree artifact. I'm tempted to say the check can be removed. @justinhorvitz, WDYT? Is there a deep reason why this form of symlink should be forbidden?

@justinhorvitz
Copy link
Contributor

justinhorvitz commented Jan 17, 2024

I don't have a concrete case where allowing this would cause issues. b/31654651 (which predates my time) has some history about why the current policy was chosen. Absolute symlinks were allowed with the expectation that downstream actions would fail if necessary.

It's kind of a moot point internally, since tree artifact symlinks created on <redacted name of internal remote execution service> are ignored. So I would say, feel free to proceed but with caution.

@thesayyn
Copy link
Contributor Author

thesayyn commented Jan 18, 2024

Okay, i'll make a PR for it with tests. Now the question is should it still allow any kind of symlinks, eg dangling, or the symlink always has to point to a known file by bazel?

@comius comius added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) and removed untriaged labels Jan 23, 2024
@comius
Copy link
Contributor

comius commented Jan 23, 2024

Could you please provide motivation for introducing this? Otherwise P4. We can't accepting new things, even if they work, without motivation.

@thesayyn
Copy link
Contributor Author

It's identical to why would anyone use symlinks, ctx.actions.symlink, avoid copying large from other actions that needs to be in a special place for each action to work on. For example tar actions, npm packages, in my particular case: oci-layout. I need to symlink large artifacts from other actions into a treeartifact (oci-layout) to make external tooling happy.

In our case, this makes oci_image 4x faster comparing to copying large outputs from other actions.

It's weird to allow this type of symlinks in other contexes but limit it in the treeartifacts. the change is small enough to not take too much time if everybody believes it wouldn't create reproducibility and determinism problems.

@comius comius added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) labels Jan 23, 2024
@comius
Copy link
Contributor

comius commented Jan 23, 2024

Thanks, that's a good case!

@tjgq
Copy link
Contributor

tjgq commented Jan 28, 2024

Now the question is should it still allow any kind of symlinks, eg dangling, or the symlink always has to point to a known file by bazel?

From Bazel's perspective, a tree artifact is defined to contain regular files, not symlinks (the symlink is just a filesystem layout optimization, if you will). So when constructing a TreeArtifactValue we do need to ensure that a file exists at the other end of the symlink. (Permitting dangling symlinks in a tree artifact is #15454, which is not what we're discussing here.)

I believe the code you linked to already does this correctly; this FR should be only a matter of removing the check that forbids this kind of symlink and adding a test to cover the new case.

bazel-io pushed a commit to bazel-io/bazel that referenced this issue Feb 21, 2024
…not point outside the tree.

As discussed in bazelbuild#20891, the restriction is pointless, as it can already be bypassed with an absolute symlink. Note that the symlinks are still required not to dangle in all cases.

Also rename and reorganize the tests in TreeArtifactBuildTest in a more logical manner.

Fixes bazelbuild#20891.

Closes bazelbuild#21263.

PiperOrigin-RevId: 608926604
Change-Id: I967a383b9891360700f868abd5c2d292e0e7974e
github-merge-queue bot pushed a commit that referenced this issue Feb 21, 2024
…act may not point outside the tree. (#21449)

As discussed in #20891, the
restriction is pointless, as it can already be bypassed with an absolute
symlink. Note that the symlinks are still required not to dangle in all
cases.

Also rename and reorganize the tests in TreeArtifactBuildTest in a more
logical manner.

Fixes #20891.

Closes #21263.

Commit
5506a0f

PiperOrigin-RevId: 608926604
Change-Id: I967a383b9891360700f868abd5c2d292e0e7974e

Co-authored-by: thesayyn <thesayyn@gmail.com>
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 7.1.0 RC1. Please test out the release candidate and report any issues as soon as possible. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts type: feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants