[build_manager] Raise exception on disk exhaustion instead of fatal exit #5270

Merged
PauloVLB merged 5 commits into master from build-manager-exception on May 13, 2026

@PauloVLB PauloVLB commented May 11, 2026

Collaborator

Context

When build setup cannot free enough disk space, it calls log_fatal_and_exit, which in turn calls sys.exit(). This silently crashes the bot in the middle of utask_main, so the task never reaches post-processing and the failure is hard to debug in the uworker model.

For more details, see b/506045195.
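
A minimal sketch of why sys.exit() is disruptive here, assuming the task runner guards build setup with an ordinary `except Exception` handler (the PR does not show the runner, so `run_task` and both `setup_build_*` functions are hypothetical stand-ins). `SystemExit` derives from `BaseException`, so such a handler never sees it:

```python
import sys


class BuildManagerError(Exception):
    """Stand-in for the exception introduced by this PR."""


def setup_build_fatal():
    # Old behaviour (sketch): sys.exit() raises SystemExit, which is a
    # BaseException and is NOT caught by `except Exception`.
    sys.exit(-1)


def setup_build_raising():
    # New behaviour (sketch): an ordinary exception the caller can handle.
    raise BuildManagerError('Failed to make space for build.')


def run_task(setup_build):
    """Hypothetical task runner; returns the steps that actually ran."""
    events = []
    try:
        try:
            setup_build()
        except Exception as error:  # Does not match SystemExit.
            events.append(f'handled: {error}')
        events.append('postprocess')
    except SystemExit:
        # In a real bot nothing catches this, and the process terminates.
        events.append('process would exit')
    return events


print(run_task(setup_build_fatal))
print(run_task(setup_build_raising))
```

With the fatal variant, the `postprocess` step is never recorded; with the raising variant, it is.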

Solution

  • Replaced the fatal log_fatal_and_exit calls with logs.error logging and a custom BuildManagerError exception in _download_and_open_build_archive and _unpack_build.
  • Updated CustomBuild._unpack_custom_build to return False on disk exhaustion, aligning its behavior with RegularBuild.

This lets a task that hits disk exhaustion fail gracefully and complete its post-processing (reporting the build setup failure back to the coordinator), and lets the bot continue running subsequent tasks without a process restart.
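
The log-then-raise pattern described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: `make_space`, `unpack_build`, `setup_build`, their byte-count parameters, and the archive URL are all hypothetical, and the standard `logging` module stands in for ClusterFuzz's `logs` module.

```python
import logging

logger = logging.getLogger('build_manager_sketch')


class BuildManagerError(Exception):
    """Raised when a build cannot be set up (stand-in for the PR's class)."""


def make_space(required_bytes, free_bytes):
    # Hypothetical check; the real logic also clears data directories.
    return free_bytes >= required_bytes


def unpack_build(archive_url, required_bytes, free_bytes):
    if not make_space(required_bytes, free_bytes):
        # Log an error and raise, instead of logging fatally and exiting.
        logger.error('Failed to make space for build.')
        raise BuildManagerError('Failed to make space for build.')
    return True


def setup_build(archive_url, required_bytes, free_bytes):
    """Returns False on disk exhaustion so the caller can keep running."""
    try:
        return unpack_build(archive_url, required_bytes, free_bytes)
    except BuildManagerError:
        # logger.exception records the message together with the traceback.
        logger.exception('Unable to unpack build archive %s', archive_url)
        return False


# Hypothetical archive URL, used only for illustration.
ok = setup_build('gs://example-bucket/build.zip', 10, 100)
failed = setup_build('gs://example-bucket/build.zip', 100, 10)
```

Returning `False` rather than exiting keeps the decision about what to do next with the caller, which is what allows post-processing to report the failure.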


Verification & Testing Logs

The following logs from a local test run illustrate the new execution flow:

  1. The first task runs, encounters simulated disk exhaustion, logs the error, raises the exception, and fails gracefully, still reaching postprocess.
  2. The second task runs on the same bot immediately afterwards, allocates space successfully, and runs to completion.
INFO:run_bot:>>> STARTING RUN 1: Executing task with FAILING _make_space mock <<<
INFO:run_bot:Starting utask_preprocess...
INFO:run_bot:Picked fuzz target webnn_graph_impl_fuzzer_WebNNGraphImplFuzzer_NPU_SingleOpLstm_fuzzer for fuzzing.
INFO:run_bot:Starting utask_main...
...
INFO:run_bot:Opening an archive over HTTP, skipping archive download.
ERROR:run_bot:Failed to make space for build. Cleared all data directories to free up space.
ERROR:run_bot:Unable to unpack build archive gs://chromium-browser-libfuzzer/linux-release-asan-schema-v1/libfuzzer-schema-v1-linux-release-1628559.zip: Failed to make space for build.
Traceback (most recent call last):
  File "src/clusterfuzz/_internal/build_management/build_manager.py", line 584, in _unpack_build
    raise BuildManagerError('Failed to make space for build.')
clusterfuzz._internal.build_management.build_manager.BuildManagerError: Failed to make space for build.
ERROR:run_bot:Failed to set up a build.
INFO:run_bot:Starting postprocess on trusted worker.
INFO:run_bot:Utask local: done.

INFO:run_bot:>>> RESTORED ORIGINAL build_manager._make_space BEHAVIOR <<<
INFO:run_bot:>>> STARTING RUN 2: Executing task with ORIGINAL DEFAULT _make_space behavior <<<
INFO:run_bot:Starting utask_preprocess...
INFO:run_bot:Picked fuzz target services_unittests_StructTraitsTest_BeginFrameAckFuzz_fuzzer for fuzzing.
...
INFO:run_bot:Build took 0.01 minutes to unpack.
INFO:run_bot:Retrieved build r1628559.
INFO:run_bot:Fuzzing round 0.
...
INFO:run_bot:All fuzzing rounds complete.
INFO:run_bot:Utask local: done.
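
The two-run flow in these logs can be approximated with `unittest.mock`. The sketch below is illustrative only: `FakeBuildManager` and `run_task` are hypothetical stand-ins, and the actual test harness used for the run above is not shown in this PR.

```python
from unittest import mock


class BuildManagerError(Exception):
    """Stand-in for the exception introduced by this PR."""


class FakeBuildManager:
    """Hypothetical, heavily simplified build manager."""

    def _make_space(self):
        return True  # Default behaviour: space is available.

    def setup(self):
        if not self._make_space():
            raise BuildManagerError('Failed to make space for build.')
        return 'build ready'


def run_task(manager):
    """Hypothetical task: a setup failure is reported, not fatal."""
    try:
        return manager.setup()
    except BuildManagerError as error:
        return f'build setup failed: {error}'


manager = FakeBuildManager()

# Run 1: simulate disk exhaustion by forcing _make_space to return False.
with mock.patch.object(FakeBuildManager, '_make_space', return_value=False):
    run1 = run_task(manager)

# Run 2: the patch is undone on exit, so the same process continues normally.
run2 = run_task(manager)

print(run1)
print(run2)
```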

As this error is difficult to safely reproduce in dev environments, we are deploying this change directly to production and will monitor bot metrics.

@PauloVLB PauloVLB marked this pull request as ready for review May 12, 2026 12:31
@PauloVLB PauloVLB requested a review from a team as a code owner May 12, 2026 12:31
@PauloVLB PauloVLB requested review from ViniciustCosta and decoNR May 12, 2026 12:31
Comment thread src/clusterfuzz/_internal/bot/tasks/task_types.py Outdated
@PauloVLB PauloVLB merged commit 0ac00ac into master May 13, 2026
10 checks passed
@PauloVLB PauloVLB deleted the build-manager-exception branch May 13, 2026 13:34