Add retries when polling for the job status in BigQuery #1946
@fabriziodemaria, thanks for your PR! By analyzing the history of the files in this pull request, we identified @mikekap, @ulzha and @mbruggmann to be potential reviewers.
b1fe155 to 4c8cb30
try:
    status = self.client.jobs().get(projectId=project_id, jobId=job_id).execute()
    if status['status']['state'] == 'DONE':
        if status['status'].get('errors'):
The presence of `errors` doesn't imply that the job failed; only the presence of `errorResult` means that the job failed.
See https://cloud.google.com/bigquery/loading-data#loading_avro_files and example here: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/bigquery/cloud-client/load_data_from_file.py#L58
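As a minimal sketch of the distinction described above, the helper below treats a finished job as failed only when `errorResult` is set, and treats `errors` on its own as non-fatal. The name `job_failed` and the sample dicts are hypothetical; they are only shaped like the `status` field of a `jobs.get()` response and are not taken from this PR.

```python
def job_failed(job):
    """Return True only if a DONE job carries a fatal errorResult."""
    status = job.get('status', {})
    if status.get('state') != 'DONE':
        return False  # still pending or running, so not failed (yet)
    return bool(status.get('errorResult'))


# Hypothetical responses, trimmed to the fields used here.
done_with_warnings = {
    'status': {
        'state': 'DONE',
        'errors': [{'reason': 'invalid', 'message': 'bad row skipped'}],
    }
}
done_failed = {
    'status': {
        'state': 'DONE',
        'errorResult': {'reason': 'invalid', 'message': 'too many errors'},
        'errors': [{'reason': 'invalid', 'message': 'too many errors'}],
    }
}
running = {'status': {'state': 'RUNNING'}}

print(job_failed(done_with_warnings))  # False
print(job_failed(done_failed))         # True
print(job_failed(running))             # False
```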
Good catch, will update this to use `errorResult` instead.
if int(httpError.resp.status) == 503 and poll_counter < self.MAX_POLLING_RETRIES:
    poll_counter = poll_counter + 1
    logger.warning('Got error 503 while requesting info for job %s:%s, retrying in %d seconds...', project_id, job_id, poll_counter)
    time.sleep(float(poll_counter))
It doesn't have to be a float, at least according to the docstring of `time.sleep`. You could remove the `.0` in the other call too.
+1
@fabriziodemaria Have you tried
Awesome @mbruggmann, thanks for pointing that out! I am changing the PR to make use of that instead.
4c8cb30 to 5e787a9
👍
-    if status['status'].get('errors'):
-        raise Exception('BigQuery job failed: {}'.format(status['status']['errors']))
+    if status['status'].get('errorResult'):
+        raise Exception('BigQuery job failed: {}'.format(status['status']['errorResult']))
Is `'errors'` no longer in the dict at this point? I would try to make sure that the exception dumps all the information available. Why not the whole `status`?
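One way to act on this suggestion, sketched under assumptions: raise an exception whose message carries the whole `status` dict, so that both `errorResult` and `errors` end up in the output. `BigQueryJobError` and `raise_if_failed` are hypothetical names used for illustration; they are not part of Luigi or of this PR.

```python
import json


class BigQueryJobError(Exception):
    """Hypothetical exception type that keeps the full job status."""

    def __init__(self, status):
        self.status = status
        # sort_keys keeps the message stable from run to run
        message = 'BigQuery job failed: {}'.format(json.dumps(status, sort_keys=True))
        super(BigQueryJobError, self).__init__(message)


def raise_if_failed(job):
    """Raise BigQueryJobError when a DONE job reports an errorResult."""
    status = job['status']
    if status.get('state') == 'DONE' and status.get('errorResult'):
        raise BigQueryJobError(status)


# Hypothetical failed job, trimmed to the fields used here.
failed_job = {
    'status': {
        'state': 'DONE',
        'errorResult': {'reason': 'invalid'},
        'errors': [{'reason': 'invalid', 'message': 'row 7 is malformed'}],
    }
}

try:
    raise_if_failed(failed_job)
except BigQueryJobError as err:
    print(err)  # the message includes errorResult, errors and state
```

Keeping the dict on the exception as `status` also lets callers inspect individual fields instead of parsing the message.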
Great work! If we have tested that the code is capable of handling a success scenario and a failure scenario no worse than the previous code, then
👍
Description
Add retries when polling for the job status in BigQuery if specific HTTP errors are encountered.
Motivation and Context
As suggested by documentation from Google, if `5xx` errors are encountered while polling for the job status via `jobs.get()`, it is advised to try to poll again [1].
Without this fix, Luigi might exit with a failing Task even if the underlying loading job to BigQuery is still running.
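The retry idea can be sketched without the BigQuery client as follows. Everything here is an assumption made for illustration: `TransientError` stands in for an HTTP error carrying a status code, `fetch_status` stands in for the `jobs.get(...).execute()` call, and `max_retries` and the linearly growing wait are illustrative choices rather than the values used in this PR.

```python
import time


class TransientError(Exception):
    """Stand-in for an HTTP error; `status` holds the HTTP status code."""

    def __init__(self, status):
        super(TransientError, self).__init__('HTTP {}'.format(status))
        self.status = status


def poll_with_retries(fetch_status, max_retries=3, sleep=time.sleep):
    """Call fetch_status(), retrying on 5xx errors up to max_retries times."""
    attempt = 0
    while True:
        try:
            return fetch_status()
        except TransientError as err:
            if not (500 <= err.status < 600) or attempt >= max_retries:
                raise  # not retryable, or out of retries
            attempt += 1
            sleep(attempt)  # wait 1s, 2s, 3s, ... between attempts


# Usage: a fake fetcher that fails twice with 503, then succeeds.
calls = {'n': 0}


def flaky_fetch():
    calls['n'] += 1
    if calls['n'] <= 2:
        raise TransientError(503)
    return {'status': {'state': 'DONE'}}


waits = []  # record the waits instead of actually sleeping
result = poll_with_retries(flaky_fetch, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the loop testable without real delays. The `execute()` method documented in [3] also takes a `num_retries` parameter, which is worth checking before hand-rolling a loop like this one.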
Additionally, the way the job success is verified is changed to use `status.errorResult` instead of `status.errors` [2].
Have you tested this? If so, how?
`errorResult`:
`5xx` http error to test the retrying mechanism [3], but I think in the worst case Luigi would just fail instead of retrying, thus handling this scenario no worse than before.
[1] - https://cloud.google.com/bigquery/troubleshooting-errors#backendError
[2] - https://cloud.google.com/bigquery/loading-data#loading_avro_files
[3] - https://google.github.io/google-api-python-client/docs/epy/googleapiclient.http.HttpRequest-class.html#execute