Bionic should retry GCS uploads/downloads if they time out #78

Open
jqmp opened this issue Jan 24, 2020 · 1 comment
Labels
bug (Something isn't working), enhancement (New feature or request)

Comments

jqmp (Collaborator) commented Jan 24, 2020

GCS file uploads (and presumably downloads) sometimes time out (stack trace below). For most of these operations we use the GCS Python API rather than gsutil, so they're probably not retried by default. We should add retry logic so that a transient failure is less likely to crash the whole process.

  File "/usr/local/lib/python3.7/site-packages/bionic/cache.py", line 284, in _blob_from_file
    self._cloud.upload(file_path, blob_url)
  File "/usr/local/lib/python3.7/site-packages/bionic/cache.py", line 601, in upload
    self._tool.blob_from_url(url).upload_from_filename(str(path))
  File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1320, in upload_from_filename
    predefined_acl=predefined_acl,
  File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1265, in upload_from_file
    client, file_obj, content_type, size, num_retries, predefined_acl
  File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1175, in _do_upload
    client, stream, content_type, size, num_retries, predefined_acl
  File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1122, in _do_resumable_upload
    response = upload.transmit_next_chunk(transport)
  File "/usr/local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 425, in transmit_next_chunk
    retry_strategy=self._retry_strategy,
  File "/usr/local/lib/python3.7/site-packages/google/resumable_media/requests/_helpers.py", line 136, in http_request
    return _helpers.wait_and_retry(func, RequestsMixin._get_status_code, retry_strategy)
  File "/usr/local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 150, in wait_and_retry
    response = func()
  File "/usr/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 287, in request
    **kwargs
  File "/usr/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 110, in __exit__
    raise self._timeout_error_type()
requests.exceptions.Timeout
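
One possible shape for the retry logic proposed above, as a minimal sketch rather than Bionic's actual implementation. The helper name `call_with_retries`, its parameters, and its defaults are all hypothetical; the only names taken from the trace are `requests.exceptions.Timeout` and the `upload_from_filename` call in `cache.py`. The sketch retries a callable with exponential backoff and jitter, and only for the exception types the caller lists, so unrelated errors still fail fast.

```python
import random
import time


def call_with_retries(func, retryable, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on the given exception types.

    Hypothetical helper: the name, parameters, and defaults are illustrative
    assumptions, not part of Bionic or the GCS client library.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retryable:
            if attempt == max_attempts:
                # Out of attempts: let the last error propagate unchanged.
                raise
            # Exponential backoff (base_delay, 2x, 4x, ...) with +/-50% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay * random.uniform(0.5, 1.5))


# Hypothetical usage at the call site shown in the trace (cache.py, `upload`):
#
#     import requests
#     call_with_retries(
#         lambda: self._tool.blob_from_url(url).upload_from_filename(str(path)),
#         retryable=(requests.exceptions.Timeout,),
#     )
```

Since `upload_from_filename` takes a path, each attempt should restart the upload from the beginning rather than resume it, which keeps the wrapper simple at the cost of re-sending data for large files.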

jqmp added the bug and enhancement labels Feb 6, 2020

jqmp (Collaborator, Author) commented Jun 11, 2020

Just noting that we've also seen a google.resumable_media.common.DataCorruption error in the wild; however, I don't know if this is something that would be fixed with a retry.
