perf: don't retry getQueryResults as often with ambiguous errors in QueryJob.result() #1903

Open
tswast opened this issue Apr 16, 2024 · 2 comments
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. priority: p3 Desirable enhancement or fix. May not be included in next release.

Comments

@tswast
Contributor

tswast commented Apr 16, 2024

Important

Do not change the retries for jobs.getQueryResults REST API calls in RowIterator, only in QueryJob.result(), where the HTTP status codes are ambiguous. Once we get to RowIterator, we know the job has succeeded and these errors cease to be ambiguous.

Is your feature request related to a problem? Please describe.

When a job fails due to quota issues, the jobs.getQueryResults BigQuery REST API translates the job failure status into a failure HTTP code. This means that exceptions like google.api_core.exceptions.TooManyRequests are ambiguous: the error could come from the Google Frontend level and mean we've hit our API request quota, or it could mean the job has failed due to query quota issues.

Because these ambiguous errors are included in the default retry predicate, we end up retrying the jobs.getQueryResults request until our retries expire. Only after that do we call jobs.get and see that the job has failed for a retriable reason.

Describe the solution you'd like

Likely we need the default retry object for our calls to jobs.getQueryResults (the ones where we know we're waiting for a job to complete, not the ones where we're downloading the results) to be different from the default retry for all other API requests.

This may require an additional parameter to QueryJob.result() and the methods that call it.
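
To illustrate the idea, here is a minimal, standard-library-only sketch of using a smaller retry budget while waiting on a job than for other API calls. Everything in it is a stand-in and an assumption: `TooManyRequests`, `FakeFailedJob`, `call_with_retry`, `wait_for_job`, and the attempt counts are hypothetical names and values for illustration, not the google.api_core or python-bigquery API.

```python
class TooManyRequests(Exception):
    """Stand-in for an HTTP 429 error (hypothetical, not the api_core class)."""


def call_with_retry(func, max_attempts):
    """Call func, retrying on TooManyRequests up to max_attempts total calls."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TooManyRequests:
            if attempt == max_attempts:
                raise


class FakeFailedJob:
    """Simulates a job that has already failed for a quota reason."""

    def __init__(self):
        self.get_query_results_calls = 0

    def get_query_results(self):
        # The job failure is reported as an HTTP error, so it looks retriable.
        self.get_query_results_calls += 1
        raise TooManyRequests("quota exceeded")

    def get_job_state(self):
        # Stand-in for jobs.get, which reports the real job status.
        return "FAILED"


def wait_for_job(job, wait_max_attempts):
    """Poll for results; once retries are exhausted, check the job status."""
    try:
        return call_with_retry(job.get_query_results, wait_max_attempts)
    except TooManyRequests:
        return job.get_job_state()


DEFAULT_MAX_ATTEMPTS = 10  # assumed budget for ordinary API calls
WAIT_MAX_ATTEMPTS = 2      # assumed smaller budget while waiting on a job

slow = FakeFailedJob()
wait_for_job(slow, DEFAULT_MAX_ATTEMPTS)
fast = FakeFailedJob()
wait_for_job(fast, WAIT_MAX_ATTEMPTS)
print(slow.get_query_results_calls, fast.get_query_results_calls)
```

With the smaller budget, the failed job is noticed after far fewer wasted jobs.getQueryResults calls, which is the behavior this request is after. A real change would probably bound the wait by a deadline rather than an attempt count.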

Describe alternatives you've considered

Note: This issue has been mitigated by #1734 which ensures that the default job_retry has a deadline that exceeds the default deadline of retry. It means we don't retry the job nearly as quickly as we could, though.

Additional context

See the discussion at https://github.com/googleapis/python-bigquery/pull/1900/files#r1565837480. Internal folks can also see similar discussions on issues 311358887 and 312216177 in the Java client.

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Apr 16, 2024
@tswast
Contributor Author

tswast commented Apr 16, 2024

Looking more closely at #1794, this might not be as bad as I thought, as at least some job quota errors return jobRateLimitExceeded as the error reason, which isn't ambiguous.
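
A rough sketch of why the error reason helps, under stated assumptions: `FakeHttpError` is a hypothetical stand-in exception that carries a list of error dicts with a "reason" key, and `is_job_failure` and `JOB_LEVEL_REASONS` are illustrative names, not library API. The reason "rateLimitExceeded" is used here only as an example of a reason that does not identify the job as the source.

```python
class FakeHttpError(Exception):
    """Hypothetical stand-in for an HTTP error carrying structured reasons."""

    def __init__(self, code, reason):
        super().__init__(f"{code}: {reason}")
        self.code = code
        self.errors = [{"reason": reason}]


# Reasons assumed to point unambiguously at the job itself.
JOB_LEVEL_REASONS = {"jobRateLimitExceeded"}


def is_job_failure(exc):
    """Return True if any error reason identifies a job-level failure."""
    reasons = {err.get("reason") for err in getattr(exc, "errors", [])}
    return bool(reasons & JOB_LEVEL_REASONS)


job_error = FakeHttpError(429, "jobRateLimitExceeded")
other_error = FakeHttpError(429, "rateLimitExceeded")

# Same HTTP status code, but only the first is known to come from the job.
print(is_job_failure(job_error), is_job_failure(other_error))
```

Both errors share a status code, so a predicate that looks only at the code cannot tell them apart, while one that inspects the reason can stop polling early for the first.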

@tswast tswast added the priority: p3 Desirable enhancement or fix. May not be included in next release. label Apr 16, 2024
@tswast
Contributor Author

tswast commented Apr 16, 2024

Another thing I'm noticing: google.api_core.exceptions.RetryError doesn't seem to be retried at all, even if the wrapped exception has a retriable reason, so the mitigation didn't work as expected. I'll address that part in #1900.

@tswast tswast changed the title from "perf: don't retry ambiguous getQueryResults errors" to "perf: don't retry getQueryResults as often with ambiguous getQueryResults errors" Apr 16, 2024
@tswast tswast changed the title from "perf: don't retry getQueryResults as often with ambiguous getQueryResults errors" to "perf: don't retry getQueryResults as often in QueryJob.result()" Apr 16, 2024
@tswast tswast changed the title from "perf: don't retry getQueryResults as often in QueryJob.result()" to "perf: don't retry getQueryResults as often with ambiguous errors in QueryJob.result()" Apr 17, 2024