add directory support for remote caching and execution #4011
Can one of the admins verify this patch?
This commit makes artifact directories usable in a remote context, both for caching and execution. I realise this is quite an unwieldy commit, but it resolves all the TODOs I found in the code revolving around directories in com.google.devtools.build.lib.remote. I also took this as an opportunity to add some unit tests for the SimpleBlobStoreActionCache, which this commit modifies. Basically, this is simply an implementation of the approach hinted at by @buchgr, augmented by some changes in TreeNodeRepository. One thing I am not happy with is the necessity of stat-ing all the input files to check whether they are directories or not, since to my knowledge this information is not provided anywhere else (some TODOs point at improvements to the diffing algorithm, but that is out of the scope of this commit). I also don't have enough knowledge of the code base to know whether everything is as optimized as it could be.
wooohoooo 🎆 ... let me review this 😀
OK to test
Test //src/test/shell/bazel:empty_package_test failed on Darwin because: "Xcode version must be specified to use an Apple CROSSTOOL". Test //scripts/release:release_test timed out. Both pass locally.
}
} else {
downloadBlob(digest, path);
System.err.println("DOWNLOADED path " + path + ": " +
Was testing this because I need it too :) I think you want to pull this out.
thanks I completely forgot this one!
Also curious. It'd be great if this could land in 0.9 or 0.10!
@ulfjack @buchgr - I'm resubmitting bazelbuild#3984 on behalf of @mterring to get past CLA issues that are holding it up from merging. This is a temporary fix for the issue bazelbuild#3891 while we wait for bazelbuild#4011 to be reviewed and tested. Closes bazelbuild#4188. PiperOrigin-RevId: 177276751
I apologize, I lost track of it. I'll review it Monday. @rahul-malik @kamalmarhubi did you test it on your iOS project with remote caching? Does it work?
Hi, thanks for the update @buchgr. @rahul-malik @kamalmarhubi thanks for the interest. Currently we're using it in production for a workspace that uses lots of target directories, both for remote caching and remote execution. Using directories is the only sane way we found for dealing with external npm dependencies (JavaScript) for our frontend (whether it is a good idea or not can be discussed, but that is another subject :)), so we cache whole node_modules folders without problems. I followed Jakob's blueprint on how this should be implemented using Merkle trees and the remote API. I took this as an opportunity to increase test coverage and to deduplicate the remote caching and execution code a bit, so that the logic for dealing with directories does not live in multiple places. Things that might need some input:
Thinking about it, the change might be "big" enough to warrant writing an integration test in src/test/shell/bazel/remote_execution_test.sh. What do you think, @buchgr?
@hchauvin Yes please ... an integration test would be great :-)!
Looks good! This is great work @hchauvin! Thank you.
 * and {@link GrpcRemoteCache}.
 */
@ThreadSafety.ThreadSafe
public abstract class RemoteActionCacheBase implements RemoteActionCache {
It seems to me that you could just remove the RemoteActionCache interface and rename RemoteActionCacheBase to AbstractRemoteActionCache (it's a naming convention used throughout Bazel).
Having an abstract class with abstract upload/download methods seems like a good abstraction, that renders the interface unnecessary.
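To illustrate the suggestion, here is a minimal sketch of an abstract base class whose subclasses only supply the upload/download primitives. Everything in it is hypothetical: the method names, the `String` digests, and the in-memory backend are stand-ins for illustration, not Bazel's actual API.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the class discussed above; not Bazel's real signatures.
abstract class AbstractRemoteActionCache {
  // Backend-specific primitives that each concrete cache must provide.
  protected abstract void uploadBlob(String digest, byte[] data) throws IOException;

  protected abstract byte[] downloadBlob(String digest) throws IOException;

  // Shared logic written once in the base class: upload every blob of a directory.
  public final void uploadDirectory(Map<String, byte[]> blobsByDigest) throws IOException {
    for (Map.Entry<String, byte[]> entry : blobsByDigest.entrySet()) {
      uploadBlob(entry.getKey(), entry.getValue());
    }
  }

  // Shared logic written once in the base class: fetch blobs in the requested order.
  public final Map<String, byte[]> downloadDirectory(List<String> digests) throws IOException {
    Map<String, byte[]> result = new LinkedHashMap<>();
    for (String digest : digests) {
      result.put(digest, downloadBlob(digest));
    }
    return result;
  }
}

// Hypothetical in-memory backend, present only to exercise the base class.
class InMemoryRemoteActionCache extends AbstractRemoteActionCache {
  private final Map<String, byte[]> store = new HashMap<>();

  @Override
  protected void uploadBlob(String digest, byte[] data) {
    store.put(digest, data.clone());
  }

  @Override
  protected byte[] downloadBlob(String digest) throws IOException {
    byte[] data = store.get(digest);
    if (data == null) {
      throw new IOException("Missing blob for digest " + digest);
    }
    return data.clone();
  }
}
```

With this shape, callers depend on the abstract class directly, so a separate interface adds little.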
@@ -188,7 +188,14 @@ private ActionResult execute(Action action, Path execRoot)
FileSystemUtils.createDirectoryAndParents(file.getParentDirectory());
outputs.add(file);
}
// TODO(olaola): support output directories.
for (String output : action.getOutputDirectoriesList()) {
An integration test covering both the client and the remote worker implementation would be great!
if (digest == null) {
  // If the artifact does not have a digest, it is because it is a directory.
  // We get the digest from the set of Merkle hashes computed in this TreeNodeRepository.
  return Preconditions.checkNotNull(inputDirectoryDigestCache.get(input));
Can you please add a message stating what went wrong? It makes errors easier to debug.
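As a sketch of what such a message could look like, the example below uses the JDK's `Objects.requireNonNull` with a message supplier so that it runs without Guava; Guava's `Preconditions.checkNotNull` also has an overload that accepts an error message. The class name, the `String` keys, and the message wording are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical lookup keyed by input path; not Bazel's TreeNodeRepository.
class DirectoryDigestLookup {
  private final Map<String, String> inputDirectoryDigestCache = new HashMap<>();

  void put(String inputPath, String digest) {
    inputDirectoryDigestCache.put(inputPath, digest);
  }

  // Fails with a message naming the offending input instead of a bare NullPointerException.
  String getDirectoryDigest(String inputPath) {
    return Objects.requireNonNull(
        inputDirectoryDigestCache.get(inputPath),
        () -> "No Merkle digest cached for input directory '" + inputPath + "'");
  }
}
```

The supplier form builds the message only when the lookup fails, so the success path pays nothing for the string concatenation.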
@@ -105,15 +122,22 @@ public int hashCode() {
}

// Should only be called by the TreeNodeRepository.
private TreeNode(Iterable<ChildEntry> childEntries) {
this.actionInput = null;
private TreeNode(Iterable<ChildEntry> childEntries, ActionInput actionInput) {
nit: @Nullable
List<Dirent> sortedDirent = new ArrayList<>(path.readdir(symlinkPolicy));
sortedDirent.sort(Comparator.comparing(Dirent::getName));

List<TreeNode.ChildEntry> entries = new ArrayList<>();
nit: initialize with sortedDirent.size().
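For readers unfamiliar with why the entries are sorted before the tree node is built: a Merkle-style digest of a directory only stays stable if children are visited in a deterministic order. The sketch below illustrates the idea with `java.nio.file` and SHA-256. It is an assumption-laden illustration, not the PR's code: the class name and the byte encoding of entries are made up here, and this is not the encoding used by the remote execution API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: hashes a file by content and a directory by its sorted children.
class MerkleDirectoryDigest {
  static String digest(Path path) throws IOException {
    MessageDigest md = newSha256();
    if (!Files.isDirectory(path)) {
      md.update((byte) 'f'); // type marker so a file cannot collide with a directory
      md.update(Files.readAllBytes(path));
      return hex(md.digest());
    }
    List<Path> children = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(path)) {
      for (Path child : stream) {
        children.add(child);
      }
    }
    // Sort by name so the digest does not depend on file system listing order.
    children.sort(Comparator.comparing((Path p) -> p.getFileName().toString()));
    md.update((byte) 'd');
    for (Path child : children) {
      md.update(child.getFileName().toString().getBytes(StandardCharsets.UTF_8));
      md.update((byte) 0); // separator between the name and the child digest
      md.update(digest(child).getBytes(StandardCharsets.UTF_8));
    }
    return hex(md.digest());
  }

  private static MessageDigest newSha256() {
    try {
      return MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is required on every Java platform", e);
    }
  }

  private static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }
}
```

Two directories with identical names and contents produce the same digest regardless of creation order, and changing any nested file changes the root digest. Note that `Files.isDirectory` follows symbolic links by default, which a real implementation would need to decide on explicitly.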
@buchgr I took your comments, added 3 integration tests and rebased.
function test_directory_artifact() {
  set_directory_artifact_testfixtures

  bazel --host_jvm_args=-Dbazel.DigestFunction=SHA1 build \
this should no longer be needed after you rebase.
Add support for directory trees as artifacts
@buchgr got it. Rebased again and removed digest function.
Thanks again! It's in the internal review! Should be exported soon!
@buchgr - Did this pass internal review?
Sorry for the delay. It's merged finally! Will be in 0.10.0
Add support for directory trees as artifacts. Closes bazelbuild#4011. PiperOrigin-RevId: 179691001