Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TreeArtifactValue collects a symlink to a directory as a TreeFileArtifact with a DirectoryArtifactValue #20418

Closed
tjgq opened this issue Dec 3, 2023 · 0 comments
Labels
team-Core Skyframe, bazel query, BEP, options parsing, bazelrc type: feature request untriaged

Comments

@tjgq
Copy link
Contributor

tjgq commented Dec 3, 2023

Consider a tree artifact containing a symlink to a directory, as e.g. would be produced by the following action:

d = ctx.actions.declare_directory(...)
ctx.actions.run_shell(
  outputs = [d],
  command = "cd $1 && mkdir dir && touch dir/file.txt && ln -s dir sym",
  arguments = [d.path],
)

Currently, this will result in ActionOutputMetadataStore#constructTreeArtifactValueFromFilesystem collecting a TreeArtifactValue with two entries: dir/file.txt, mapping to a FileArtifactValue, and sym, mapping to a DirectoryArtifactValue.

In addition to DirectoryArtifactValue being unsound, the mismatch between the "artifact type" and "metadata type" is a source of bugs (#20415 is a recent example). I think it would be better for the symlink to be transparently followed (i.e., collect a TreeFileArtifact+FileArtifactValue for sym/file.txt).

One thing we need to be careful about is that the symlink could point outside of the tree artifact (which is explicitly permitted), and in this case, while we still want to collect the metadata, we should avoid calling chmod on the subtree. (Alternatively, we could disallow that form of symlink, but I suspect our hands might already be tied by existing use cases.)

@tjgq tjgq added team-Core Skyframe, bazel query, BEP, options parsing, bazelrc untriaged labels Dec 3, 2023
tjgq added a commit to tjgq/bazel that referenced this issue Dec 3, 2023
When a tree artifact contains a symlink to a directory, the TreeArtifactValue
contains a TreeFileArtifact mapping to a DirectoryArtifactValue (see bazelbuild#20418).

Because the file type to be uploaded to the BES is determined solely by the
Artifact type, this causes the uploader to attempt to upload a directory as if
it were a file. This fails silently when building with the bytes; when building
without the bytes, it crashes when attempting to digest the (in-memory)
directory, which has no associated inode (see bazelbuild#20415).

This PR fixes the file type computation to take into account both the artifact
and its metadata, preferring the latter when available.

Fixes bazelbuild#20415.
tjgq added a commit to tjgq/bazel that referenced this issue Dec 3, 2023
…gree.

When a tree artifact contains a symlink to a directory, the TreeArtifactValue
contains a TreeFileArtifact mapping to a DirectoryArtifactValue (see bazelbuild#20418).

Because the file type to be uploaded to the BES is determined solely by the
Artifact type, this causes the uploader to attempt to upload a directory as if
it were a file. This fails silently when building with the bytes; when building
without the bytes, it crashes when attempting to digest the (in-memory)
directory, which has no associated inode (see bazelbuild#20415).

This PR fixes the file type computation to take into account both the artifact
and its metadata, preferring the latter when available.

Fixes bazelbuild#20415.
tjgq added a commit to tjgq/bazel that referenced this issue Dec 3, 2023
…gree.

When a tree artifact contains a symlink to a directory, the TreeArtifactValue
contains a TreeFileArtifact mapping to a DirectoryArtifactValue (see bazelbuild#20418).

Because the file type to be uploaded to the BES is determined solely by the
Artifact type, this causes the uploader to attempt to upload a directory as if
it were a file. This fails silently when building with the bytes; when building
without the bytes, it crashes when attempting to digest the (in-memory)
directory, which has no associated inode (see bazelbuild#20415).

This PR fixes the file type computation to take into account both the artifact
and its metadata, preferring the latter when available.

Fixes bazelbuild#20415.
tjgq added a commit to tjgq/bazel that referenced this issue Dec 3, 2023
…gree.

When a tree artifact contains a symlink to a directory, the TreeArtifactValue
contains a TreeFileArtifact mapping to a DirectoryArtifactValue (see bazelbuild#20418).

Because the file type to be uploaded to the BES is determined solely by the
Artifact type, this causes the uploader to attempt to upload a directory as if
it were a file. This fails silently when building with the bytes; when building
without the bytes, it crashes when attempting to digest the (in-memory)
directory, which has no associated inode (see bazelbuild#20415).

This PR fixes the file type computation to take into account both the artifact
and its metadata, preferring the latter when available.

Fixes bazelbuild#20415.
tjgq added a commit to tjgq/bazel that referenced this issue Dec 3, 2023
…gree.

When a tree artifact contains a symlink to a directory, the TreeArtifactValue
contains a TreeFileArtifact mapping to a DirectoryArtifactValue (see bazelbuild#20418).

Because the file type to be uploaded to the BES is determined solely by the
Artifact type, this causes the uploader to attempt to upload a directory as if
it were a file. This fails silently when building with the bytes; when building
without the bytes, it crashes when attempting to digest the (in-memory)
directory, which has no associated inode (see bazelbuild#20415).

This PR fixes the file type computation to take into account both the artifact
and its metadata, preferring the latter when available.

Fixes bazelbuild#20415.
copybara-service bot pushed a commit that referenced this issue Dec 4, 2023
…gree.

When a tree artifact contains a symlink to a directory, the TreeArtifactValue contains a TreeFileArtifact mapping to a DirectoryArtifactValue (see #20418).

Because the file type to be uploaded to the BES is determined solely by the Artifact type, this causes the uploader to attempt to upload a directory as if it were a file. This fails silently when building with the bytes; when building without the bytes, it crashes when attempting to digest the (in-memory) directory, which has no associated inode (see #20415).

This PR fixes the file type computation to take into account both the artifact and its metadata, preferring the latter when available.

Fixes #20415.

Closes #20419.

PiperOrigin-RevId: 587654070
Change-Id: Ib62bbaaed62b00c12bcabdc2bc9bee57aee0bcef
bazel-io pushed a commit to bazel-io/bazel that referenced this issue Dec 4, 2023
…gree.

When a tree artifact contains a symlink to a directory, the TreeArtifactValue contains a TreeFileArtifact mapping to a DirectoryArtifactValue (see bazelbuild#20418).

Because the file type to be uploaded to the BES is determined solely by the Artifact type, this causes the uploader to attempt to upload a directory as if it were a file. This fails silently when building with the bytes; when building without the bytes, it crashes when attempting to digest the (in-memory) directory, which has no associated inode (see bazelbuild#20415).

This PR fixes the file type computation to take into account both the artifact and its metadata, preferring the latter when available.

Fixes bazelbuild#20415.

Closes bazelbuild#20419.

PiperOrigin-RevId: 587654070
Change-Id: Ib62bbaaed62b00c12bcabdc2bc9bee57aee0bcef
keertk pushed a commit that referenced this issue Dec 4, 2023
…act containing symlink to directory + remote cache + BES (#20424)

When a tree artifact contains a symlink to a directory, the
TreeArtifactValue contains a TreeFileArtifact mapping to a
DirectoryArtifactValue (see #20418).

Because the file type to be uploaded to the BES is determined solely by
the Artifact type, this causes the uploader to attempt to upload a
directory as if it were a file. This fails silently when building with
the bytes; when building without the bytes, it crashes when attempting
to digest the (in-memory) directory, which has no associated inode (see
#20415).

This PR fixes the file type computation to take into account both the
artifact and its metadata, preferring the latter when available.

Fixes #20415.

Closes #20419.

Commit
bb5ff63

PiperOrigin-RevId: 587654070
Change-Id: Ib62bbaaed62b00c12bcabdc2bc9bee57aee0bcef

Co-authored-by: Tiago Quelhas <tjgq@google.com>
bazel-io pushed a commit to bazel-io/bazel that referenced this issue Feb 19, 2024
As explained in bazelbuild#20418, when a tree artifact contains a symlink to a directory, it is collected as a single TreeFileArtifact with DirectoryArtifactValue metadata. With this change, the symlink is followed and the directory expanded into its contents, which is more incrementally correct and removes a special case that tree artifact consumers would otherwise have to be aware of.

This also addresses bazelbuild#21171, which is due to the metadata for the directory contents not being available when building without the bytes, causing the input Merkle tree builder to fail. (While I could have fixed this by falling back to reading the directory contents from the filesystem, I prefer to abide by the principle that input metadata should be collected before execution; source directories are the other case where this isn't true, which I also regard as a bug.)

Fixes bazelbuild#20418.
Fixes bazelbuild#21171.

PiperOrigin-RevId: 608389141
Change-Id: I956f3f8a4b1bfd279091e179d1cba3cdd0e5019b
github-merge-queue bot pushed a commit that referenced this issue Feb 20, 2024
…actValue. (#21418)

As explained in #20418, when a
tree artifact contains a symlink to a directory, it is collected as a
single TreeFileArtifact with DirectoryArtifactValue metadata. With this
change, the symlink is followed and the directory expanded into its
contents, which is more incrementally correct and removes a special case
that tree artifact consumers would otherwise have to be aware of.

This also addresses #21171,
which is due to the metadata for the directory contents not being
available when building without the bytes, causing the input Merkle tree
builder to fail. (While I could have fixed this by falling back to
reading the directory contents from the filesystem, I prefer to abide by
the principle that input metadata should be collected before execution;
source directories are the other case where this isn't true, which I
also regard as a bug.)

Fixes #20418.
Fixes #21171.

Commit
4247c20

PiperOrigin-RevId: 608389141
Change-Id: I956f3f8a4b1bfd279091e179d1cba3cdd0e5019b

Co-authored-by: Googler <tjgq@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team-Core Skyframe, bazel query, BEP, options parsing, bazelrc type: feature request untriaged
Projects
None yet
Development

No branches or pull requests

2 participants