Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[6.2.0]Automatically retry the build if encountered remote cache eviction error #17938

Closed

Conversation

ShreeM01
Copy link
Contributor

With #17358, Bazel will exit with code 39 if remote cache evicts blobs during the build. With #17462 and #17747, Bazel is able to continue the build without bazel clean or bazel shutdown.

However, even with #17639 and following changes to extend the lease, remote cache can still evict blobs in some rare cases.

Based on above changes, this PR makes bazel retry the invocation if it encountered the remote cache eviction error during previous invocation if --experimental_remote_cache_eviction_retries is set, or build rewinding.

$ bazel build --experimental_remote_cache_eviction_retries=5 ...
INFO: Invocation ID: b7348bfa-9446-4c72-a888-0a0ad012f225
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: .../workspace/a/BUILD:8:8: Executing genrule //a:bar failed: Failed to fetch blobs because they do not exist remotely: Missing digest: b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c/4
Target //a:bar failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.447s, Critical Path: 0.05s
INFO: 2 processes: 2 internal.
ERROR: Build did NOT complete successfully
Found remote cache eviction error, retrying the build...
INFO: Invocation ID: 983f60dc-8bb9-4b82-aa33-a378469ce140
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //a:bar up-to-date:
  bazel-bin/a/bar.out
INFO: Elapsed time: 0.866s, Critical Path: 0.35s
INFO: 3 processes: 1 internal, 1 processwrapper-sandbox, 1 remote.
INFO: Build completed successfully, 3 total actions
$

Part of #16660.

Closes #17711.
Commit: 24b458

PiperOrigin-RevId: 520610524
Change-Id: I20d43d1968767a03250b9c8f8a6dda4e056d4f52

With bazelbuild#17358, Bazel will exit with code 39 if remote cache evicts blobs during the build. With bazelbuild#17462 and bazelbuild#17747, Bazel is able to continue the build without bazel clean or bazel shutdown.

However, even with bazelbuild#17639 and following changes to extend the lease, remote cache can still evict blobs in some rare cases.

Based on above changes, this PR makes bazel retry the invocation if it encountered the remote cache eviction error during previous invocation if `--experimental_remote_cache_eviction_retries` is set, or **build rewinding**.

```
$ bazel build --experimental_remote_cache_eviction_retries=5 ...
INFO: Invocation ID: b7348bfa-9446-4c72-a888-0a0ad012f225
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
ERROR: .../workspace/a/BUILD:8:8: Executing genrule //a:bar failed: Failed to fetch blobs because they do not exist remotely: Missing digest: b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c/4
Target //a:bar failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.447s, Critical Path: 0.05s
INFO: 2 processes: 2 internal.
ERROR: Build did NOT complete successfully
Found remote cache eviction error, retrying the build...
INFO: Invocation ID: 983f60dc-8bb9-4b82-aa33-a378469ce140
Loading:
Loading:
Loading: 0 packages loaded
Analyzing: target //a:bar (0 packages loaded, 0 targets configured)
INFO: Analyzed target //a:bar (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
[0 / 2] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //a:bar up-to-date:
  bazel-bin/a/bar.out
INFO: Elapsed time: 0.866s, Critical Path: 0.35s
INFO: 3 processes: 1 internal, 1 processwrapper-sandbox, 1 remote.
INFO: Build completed successfully, 3 total actions
$
```

Part of bazelbuild#16660.

Closes bazelbuild#17711.

PiperOrigin-RevId: 520610524
Change-Id: I20d43d1968767a03250b9c8f8a6dda4e056d4f52
@ShreeM01 ShreeM01 closed this Mar 30, 2023
@ShreeM01 ShreeM01 deleted the ks/cherry-pick17711 branch March 31, 2023 18:50
@brentleyjones
Copy link
Contributor

Why was this closed? Too many other changes needed? On its own I assume this could be cherry-picked as it's gated behind a flag.

cc @coeuvre

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants