Skip to content

Datastore: Deadline exceeded while calling functools.partial object - 504 Deadline Exceeded #10

@speedplane

Description

@speedplane

My requests to the datastore keep on timing out with 504 errors, despite a very long retry handler. Here is the stack trace:

Traceback (most recent call last):
...
  File "/home/vmagent/app/updater/views.py", line 6484, in _get_most_recent_update
    doc = list(doc.order_by("-last_get")[:1])
  File "/home/vmagent/app/django/db/models/query.py", line 96, in __iter__
    self._fetch_all()
  File "/home/vmagent/app/django/db/models/query.py", line 854, in _fetch_all
    self._result_cache = list(self.iterator())
  File "/home/vmagent/app/django/db/models/query.py", line 220, in iterator
    for row in compiler.results_iter():
  File "/home/vmagent/app/djangotoolbox/db/basecompiler.py", line 380, in results_iter
    for entity in results:
  File "/home/vmagent/app/djangoappengine/db_gcl/compiler.py", line 256, in fetch
    for entity in results:
  File "/env/local/lib/python2.7/site-packages/google/api_core/page_iterator.py", line 199, in _items_iter
    for page in self._page_iter(increment=False):
  File "/env/local/lib/python2.7/site-packages/google/api_core/page_iterator.py", line 230, in _page_iter
    page = self._next_page()
  File "/env/local/lib/python2.7/site-packages/google/cloud/datastore/query.py", line 517, in _next_page
    query=query_pb,
  File "/env/local/lib/python2.7/site-packages/google/cloud/datastore_v1/gapic/datastore_client.py", line 294, in run_query
    request, retry=retry, timeout=timeout, metadata=metadata)
  File "/env/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py", line 139, in __call__
    return wrapped_func(*args, **kwargs)
  File "/env/local/lib/python2.7/site-packages/google/api_core/retry.py", line 260, in retry_wrapped_func
    on_error=on_error,
  File "/home/vmagent/app/appengine_config.py", line 250, in new_retry_target
    deadline, target), last_exc), last_exc)
  File "/app/libs/six.py", line 692, in raise_from
    raise value
RetryError: Deadline of 900.0s exceeded while calling <functools.partial object at 0x7fdd6b7e1a48>, last exception: 504 Deadline Exceeded
"  

I am using the following versions of the API:

google-cloud-datastore==1.7.0
google-cloud-logging==1.6.0
google-cloud-storage==1.10.0

Running Python 2.7 on Google Cloud Flexible with the following app.yaml:

runtime: custom
api_version: 1
service: vm

env: flex
entrypoint: gunicorn -b :$PORT main:app
beta_settings:
  enable_app_engine_apis: true

automatic_scaling:
  min_num_instances: 3
  max_num_instances: 50
  cool_down_period_sec: 240
  cpu_utilization:
    target_utilization: 0.6

resources:
  cpu: 8
  memory_gb: 32
  disk_size_gb: 20

It seems that requests to the datastore are timing out with 504 errors even after retrying many times. To better debug the issue, I monkey patched the google.api_core.retry_target function which inserts more logging on retry calls. The monkey patch is as follows (compare to original funciton here):

def improve_retry_logging():
    '''Improve debug logging for our retry handler.'''
    try:
        from google.api_core import retry
    except AllImportErrors as e:
        return
    logging.info("Monkey patching retry.retry_target: %s", retry)
    def new_retry_target(target, predicate, sleep_generator, deadline,
            on_error = None):
        if deadline is not None:
            deadline_datetime = (
                retry.datetime_helpers.utcnow() +
                retry.datetime.timedelta(seconds=deadline))
        else:
            deadline_datetime = None
        last_exc = None
        total_sleep = 0.0
        for sleep_i, sleep in enumerate(sleep_generator):
            try:
                result = target()
            # pylint: disable=broad-except
            # This function explicitly must deal with broad exceptions.
            except Exception as exc:
                if not predicate(exc):
                    raise
                last_exc = exc
                if on_error is not None:
                    on_error(exc)
            else:
                if sleep_i > 0:
                    logging.info('retry_target success. Retries: %d. '
                        'Slept %.2f Secs', sleep_i, total_sleep)
                return result

            now = retry.datetime_helpers.utcnow()
            if deadline_datetime is not None and deadline_datetime < now:
                logging.warning('retry_target deadline reached. Retries: %d.'
                    'Deadline: %.1f S. Slept: %.2f S, Last %.2f S. Err: %s',
                        sleep_i, deadline, total_sleep, last_exc)
                retry.six.raise_from(
                    retry.exceptions.RetryError('Retry ({}) deadline of '
                        '{:.1f}s exceeded while calling {}'.format(
                            sleep_i, deadline, target,), last_exc), last_exc)
            logging.info('retry_target retry due to "%s" %s (%d), sleep %.2fs',
                last_exc, target, sleep_i, sleep)
            retry.time.sleep(sleep)
            total_sleep += sleep
        raise ValueError('Sleep generator stopped yielding sleep values.')
    # Here we do the monkey patch.
    retry.retry_target = new_retry_target
improve_retry_logging()

With the logging instrumentation above, you can clearly see the retry handler being hit multiple times, and failing every single time with these 504 errors.

E  web deferred_free_update ctqueue 'Connecticut State, Supreme Court', '19966': Deadline of 900.0s exceeded while calling <functools.partial object at 0x7fdd6b7e1a48>, last exception: 504 Deadline Exceeded (50.03% mem, 187 threads) ...
I retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (7), sleep 3.21s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (7), sleep 3.21s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (6), sleep 13.02s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (5), sleep 6.76s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (4), sleep 4.10s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (3), sleep 2.17s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (2), sleep 0.08s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (1), sleep 0.50s 140588964886272
I  retry_target retry due to "504 Deadline Exceeded" <functools.partial object at 0x7fdd6b7e1a48> (0), sleep 0.26s 140588964886272

Also, I have already opened support request 16456733, but they do not seem able to help.

Metadata

Metadata

Assignees

No one assigned

    Labels

    api: datastoreIssues related to the googleapis/python-datastore API.type: questionRequest for information or clarification. Not an issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions