Cache CAS interactions at lower layer to fix lost input handling#29551
Open
fmeum wants to merge 12 commits into
Open
Cache CAS interactions at lower layer to fix lost input handling#29551fmeum wants to merge 12 commits into
fmeum wants to merge 12 commits into
Conversation
8b0b556 to
1b38764
Compare
Collaborator
Author
|
@iancha1992 This would be another candidate for a 9.1.1 release. |
Collaborator
Author
|
@bazel-io flag |
Member
|
@bazel-io fork 9.1.1 |
Member
|
@bazel-io fork 9.2.0 |
This was referenced May 18, 2026
coeuvre
reviewed
May 21, 2026
Co-authored-by: Son Luong Ngoc <sluongng@gmail.com>
…calls Restores the dedup behavior the lower-layer CAS caching removed, by adding a second AsyncTaskCache<Digest, Boolean> that tracks in-flight findMissingDigests results. The actual upload deduplication still happens in casUploadCache (after the per-consumer isAvailableLocally check inside uploadFile), so each consumer's lost-input path remains distinct. Also wraps uploadBlob(byte[]), uploadVirtualActionInput, and the additionalInputs upload path with casUploadCache so directory blobs and action/command messages are deduped consistently with file uploads.
Converts RemoteCacheClient from interface to abstract class and moves the casUploadCache dedup logic out of CombinedCache and RemoteExecutionCache into the base class. Adds `force` overloads on the public upload methods. Concrete implementations now override the protected-flavored uploadFileImpl / uploadBlobImpl methods that perform the raw network call; the public uploadFile / uploadBlob wrappers dedupe via casUploadCache before invoking them. The previous double-wrap (one cache in CombinedCache, another in RemoteExecutionCache around the same digest) caused a lock-ordering deadlock once the client also had its own cache, so the wrappers in CombinedCache/RemoteExecutionCache are removed. AsyncTaskCache and RxFutures are extracted into a small `rx_helpers` java_library so the `common` package can depend on them without creating a cycle with `remote/util`. Tests that mocked the public upload methods are updated to mock the *Impl variants instead, and switched from mock() to spy() of InMemoryCacheClient where the deduplication wrapper needs to run.
- Extract a dedupedUpload() helper in RemoteCacheClient to remove the repeated casUploadCache.execute / RxFutures plumbing in uploadFile and uploadBlob(Blob). - Tighten visibility of CombinedCache.uploadFile(force) and uploadBlob(force) from protected to private; they are only used internally now that subclasses no longer wrap them. - Drop the second InMemoryCacheClient instance and findMissingDigests stub from upload_failedUploads_doNotDeduplicate; the spy can fall through via callRealMethod() for the success branch.
Before this commit, a digest that findMissingCache returned as "missing" stayed cached as missing forever, even after the triggered upload turned it present. After action rewinding (BuildWithoutTheBytesIntegrationTest) this caused an infinite rewind loop: the bar action's retry would still see the cached "missing", re-trigger the upload path, fail the isAvailableLocally check in BWoB mode, and ask for another rewind. Fix: after each upload attempt, update findMissingCache to reflect the actual remote state — replace with "present" on success, or invalidate on failure so the next caller re-queries. To enable doing this without the previous lock-ordering deadlock between findMissingCache.lock and casUploadCache.lock, AsyncTaskCache's `finished` map becomes a ConcurrentHashMap and the new put/invalidate operations on it don't take the cache's lock.
coeuvre
approved these changes
May 22, 2026
Wyverald
reviewed
May 22, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
The error handling for uploads in
UploadTaskdepends on the exec path of the uploaded file, which wasn't part of thecasUploadCachecache key. Fix this by moving deduplication logic for CAS uploads andFindMissingcalls intoRemoteCacheClient.Motivation
Avoids spurious Bazel failures caused by lost inputs that can't be recovered from via build or action rewinding due to
CacheNotFoundExceptionbeing marked with exec paths of inputs to concurrent actions (see the new test case).Build API Changes
No
Checklist
Release Notes
RELNOTES: Fixed an issue that caused Bazel to fail on a lost input even with build or action rewinding enabled.