[CI] Reliability issues with lit timing file caching #162294

@boomanaiden154

https://github.com/llvm/llvm-project/actions/runs/18167747610/job/51714369096

Traceback (most recent call last):
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/blob.py", line 4401, in _prep_and_do_download
    self._do_download(
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/blob.py", line 1094, in _do_download
    response = download.consume(transport, timeout=timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/_media/requests/download.py", line 280, in consume
    return _request_helpers.wait_and_retry(retriable_request, self._retry_strategy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/_media/requests/_request_helpers.py", line 107, in wait_and_retry
    return func()
           ^^^^^^
  File "/home/gha/.local/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 294, in retry_wrapped_func
    return retry_target(
           ^^^^^^^^^^^^^
  File "/home/gha/.local/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 156, in retry_target
    next_sleep = _retry_error_helper(
                 ^^^^^^^^^^^^^^^^^^^^
  File "/home/gha/.local/lib/python3.12/site-packages/google/api_core/retry/retry_base.py", line 214, in _retry_error_helper
    raise final_exc from source_exc
  File "/home/gha/.local/lib/python3.12/site-packages/google/api_core/retry/retry_unary.py", line 147, in retry_target
    result = target()
             ^^^^^^^^
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/_media/requests/download.py", line 262, in retriable_request
    self._process_response(result)
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/_media/_download.py", line 232, in _process_response
    _helpers.require_status_code(
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/_media/_helpers.py", line 105, in require_status_code
    raise InvalidResponse(
google.cloud.storage.exceptions.InvalidResponse: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/gha/actions-runner/_work/llvm-project/llvm-project/.ci/cache_lit_timing_files.py", line 75, in <module>
    download_timing_files(storage_client, bucket_name)
  File "/home/gha/actions-runner/_work/llvm-project/llvm-project/.ci/cache_lit_timing_files.py", line 63, in download_timing_files
    future.get()
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/usr/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/gha/actions-runner/_work/llvm-project/llvm-project/.ci/cache_lit_timing_files.py", line 48, in _maybe_download_timing_file
    blob.download_to_filename(file_name)
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/blob.py", line 1403, in download_to_filename
    self._handle_filename_and_download(
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/blob.py", line 1276, in _handle_filename_and_download
    self._prep_and_do_download(
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/blob.py", line 4415, in _prep_and_do_download
    _raise_from_invalid_response(exc)
  File "/home/gha/.local/lib/python3.12/site-packages/google/cloud/storage/blob.py", line 4887, in _raise_from_invalid_response
    raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.NotFound: 404 GET https://storage.googleapis.com/download/storage/v1/b/llvm-premerge-cluster-us-central-object-cache-linux/o/lit_timing%2Fbuild%2Fruntimes%2Fruntimes-bins%2Fcompiler-rt%2Flib%2Fasan%2Ftests%2FX86_64LinuxConfig%2F.lit_test_times.txt?generation=1759332110900091&alt=media: No such object: llvm-premerge-cluster-us-central-object-cache-linux/lit_timing/build/runtimes/runtimes-bins/compiler-rt/lib/asan/tests/X86_64LinuxConfig/.lit_test_times.txt: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

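For some reason we are not handling this failure gracefully, though we should be: a 404 on a single timing file propagates out of future.get() in download_timing_files, and the whole script exits with the traceback above.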
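
A minimal sketch of what graceful handling might look like, assuming the worker only needs to skip a timing file that vanished between listing and download. The generation parameter in the failing URL hints that the object was replaced or deleted in that window, but that is a guess. The helper name try_download_timing_file is hypothetical and not taken from cache_lit_timing_files.py, and the fallback NotFound class exists only so the sketch runs without google-cloud-storage installed.

```python
import logging
import os

try:
    from google.api_core.exceptions import NotFound
except ImportError:
    # Fallback so this sketch runs without the library installed.
    class NotFound(Exception):
        """Stand-in for google.api_core.exceptions.NotFound."""


def try_download_timing_file(blob, file_name):
    """Download one timing file; return False instead of raising on a 404.

    `blob` is assumed to expose `name` and `download_to_filename`, as
    google.cloud.storage.Blob does. This helper is hypothetical.
    """
    try:
        blob.download_to_filename(file_name)
    except NotFound:
        logging.warning(
            "Timing file %s was not found at download time; skipping.",
            blob.name,
        )
        # Defensive cleanup in case an empty or partial file was left behind.
        if os.path.exists(file_name):
            os.remove(file_name)
        return False
    return True
```

Since the timing files are only a cache, skipping a missing one should cost at most a less optimal test order for that suite. Other exception types still propagate, so real outages would remain visible.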

Metadata

Labels

infrastructure: Bugs about LLVM infrastructure
