bigquery: retry idempotent RPCs (#4148)
Add retry logic to every RPC for which it makes sense.

Following the BigQuery team, we ignore the error code and
use the "reason" field of the error to determine whether
to retry.

Outstanding issues:

- Resumable upload consists of an initial call to get a URL,
  followed by posts to that URL. Getting the retry right on that
  initial call requires modifying the ResumableUpload class. At the
  same time, the num_retries argument should be removed.

- Users can't modify the retry behavior of Job.result(), because
  PollingFuture.result() does not accept a retry argument.
jba authored and tswast committed Oct 16, 2017
1 parent a9a0b30 commit 0209991
Showing 5 changed files with 228 additions and 62 deletions.
2 changes: 2 additions & 0 deletions bigquery/google/cloud/bigquery/__init__.py
@@ -27,6 +27,7 @@
__version__ = get_distribution('google-cloud-bigquery').version

from google.cloud.bigquery._helpers import Row
from google.cloud.bigquery._helpers import DEFAULT_RETRY
from google.cloud.bigquery.client import Client
from google.cloud.bigquery.dataset import AccessEntry
from google.cloud.bigquery.dataset import Dataset
@@ -61,4 +62,5 @@
    'Table',
    'TableReference',
    'UDFResource',
    'DEFAULT_RETRY',
]
26 changes: 26 additions & 0 deletions bigquery/google/cloud/bigquery/_helpers.py
@@ -20,6 +20,7 @@

import six

from google.api.core import retry
from google.cloud._helpers import UTC
from google.cloud._helpers import _date_from_iso8601_date
from google.cloud._helpers import _datetime_from_microseconds
@@ -520,3 +521,28 @@ def _rows_page_start(iterator, page, response):
total_rows = int(total_rows)
iterator.total_rows = total_rows
# pylint: enable=unused-argument


def _should_retry(exc):
    """Predicate for determining when to retry.

    We retry if and only if the 'reason' is 'backendError'
    or 'rateLimitExceeded'.
    """
    if not hasattr(exc, 'errors'):
        return False
    if len(exc.errors) == 0:
        return False
    reason = exc.errors[0]['reason']
    return reason == 'backendError' or reason == 'rateLimitExceeded'


DEFAULT_RETRY = retry.Retry(predicate=_should_retry)
"""The default retry object.
Any method with a ``retry`` parameter will be retried automatically,
with reasonable defaults. To disable retry, pass ``retry=None``.
To modify the default retry behavior, call a ``with_XXX`` method
on ``DEFAULT_RETRY``. For example, to change the deadline to 30 seconds,
pass ``retry=bigquery.DEFAULT_RETRY.with_deadline(30)``.
"""
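
The predicate above can be exercised without a BigQuery backend. The following is a minimal sketch, not the library's implementation: `FakeAPIError`, `should_retry`, `call_with_retry`, and `flaky` are hypothetical names introduced for illustration, and the toy loop has no backoff or deadline, unlike `retry.Retry`. It only shows how a "reason"-based predicate decides which failures are retried.

```python
class FakeAPIError(Exception):
    """Hypothetical stand-in for an API exception that carries an ``errors`` list."""

    def __init__(self, message, errors=()):
        super().__init__(message)
        self.errors = list(errors)


def should_retry(exc):
    """Same decision as ``_should_retry`` in the diff: look only at the first 'reason'."""
    if not hasattr(exc, 'errors'):
        return False
    if len(exc.errors) == 0:
        return False
    reason = exc.errors[0]['reason']
    return reason == 'backendError' or reason == 'rateLimitExceeded'


def call_with_retry(func, predicate, max_attempts=3):
    """Toy retry loop (assumption: no backoff, no deadline)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Re-raise when attempts are exhausted or the error is not retryable.
            if attempt == max_attempts or not predicate(exc):
                raise


attempts = []


def flaky():
    """Fails twice with a retryable reason, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeAPIError('transient', [{'reason': 'rateLimitExceeded'}])
    return 'ok'


result = call_with_retry(flaky, should_retry)
```

With the real library, the equivalent customization is the one the docstring describes: pass ``retry=bigquery.DEFAULT_RETRY.with_deadline(30)`` to a method that accepts a ``retry`` parameter, or ``retry=None`` to disable retries.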