feat: add job_retry argument to load_table_from_uri #969

Open
tswast opened this issue Sep 14, 2021 · 2 comments
Labels
api: bigquery · type: feature request

Comments

tswast (Contributor) commented Sep 14, 2021

In internal issue 195911158, a customer is struggling to retry jobs that fail with "403 Exceeded rate limits: too many table update operations for this table". One can encounter this exception by attempting to run hundreds of load jobs in parallel.

Thoughts:

  1. Try to reproduce. Does the exception happen at result() or at load_table_from_uri()? If it happens at result(), continue with job_retry; otherwise, see if we can modify the default retry predicate for load_table_from_uri() to recognize this rate-limiting reason and retry.
  2. Assuming the exception does happen at result(), modify load jobs (or, more likely, the base class) to retry when job_retry is set, similar to what we do for query jobs; a caller-side sketch follows below.
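
Roughly the shape this would give callers, mirroring the existing job_retry on query(). The URI and table ID are placeholders, and the job_retry parameter on load_table_from_uri() is the proposal here, not an existing argument:

```python
from google.cloud import bigquery
from google.cloud.bigquery.retry import DEFAULT_JOB_RETRY

client = bigquery.Client()
load_job = client.load_table_from_uri(
    "gs://my-bucket/data.csv",          # placeholder source URI
    "my-project.my_dataset.my_table",   # placeholder destination table
    job_retry=DEFAULT_JOB_RETRY,        # proposed argument; does not exist yet
)
load_job.result()  # would retry the whole job on retryable failures
```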

Notes:

  • I suspect we'll need a different default job_retry object for load_table_from_uri(), since the retryable reasons will likely differ from those we use for queries (a possible predicate is sketched after this list).
  • I don't think the other load_table_from_* methods are as retryable as load_table_from_uri(), since they would require rewinding file objects, which isn't always possible. We'll probably want to consider adding job_retry to those load methods in the future, but for now load_table_from_uri() is what's needed.
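
A minimal sketch of what such a load-specific default could look like, assuming the rate-limit failure carries a reason code such as "rateLimitExceeded" in the exception's structured errors. The name DEFAULT_LOAD_JOB_RETRY, the exact reason set, and the timing values are all assumptions, not library API:

```python
from google.api_core.retry import Retry


def _load_job_should_retry(exc):
    # Retry only transient, rate-limit-style failures; other 403s
    # (e.g. genuine permission errors) should surface immediately.
    errors = getattr(exc, "errors", None) or []
    reason = errors[0].get("reason", "") if errors else ""
    return reason in {"rateLimitExceeded", "backendError", "internalError"}


# Hypothetical default, analogous to DEFAULT_JOB_RETRY for query jobs.
DEFAULT_LOAD_JOB_RETRY = Retry(
    predicate=_load_job_should_retry,
    initial=1.0,     # seconds before the first retry
    maximum=60.0,    # cap on the exponential backoff
    multiplier=2.0,
    deadline=600.0,  # give up after 10 minutes
)
```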
tswast added the type: feature request label on Sep 14, 2021
product-auto-label bot added the api: bigquery label on Sep 14, 2021
tswast (Contributor, Author) commented Sep 14, 2021

Here's a stack trace from a Googler who tried to reproduce this on their own project.

```
---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "<ipython-input-75-e46c7b68e71a>", line 12, in load_data
    job = load_job.result()  # Waits for the job to complete.
  File "/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/job/base.py", line 679, in result
    return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/google/api_core/future/polling.py", line 134, in result
    raise self._exception
google.api_core.exceptions.Forbidden: 403 Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas
"""

The above exception was the direct cause of the following exception:

Forbidden                                 Traceback (most recent call last)
<ipython-input-77-bef363ce70e2> in <module>
      1 with multiprocessing.Pool() as pool:
----> 2     pool.map(load_data, args)

/opt/conda/lib/python3.7/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    266         in a list that is returned.
    267         '''
--> 268         return self._map_async(func, iterable, mapstar, chunksize).get()
    269 
    270     def starmap(self, func, iterable, chunksize=None):

/opt/conda/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
    655             return self._value
    656         else:
--> 657             raise self._value
    658 
    659     def _set(self, i, obj):

Forbidden: 403 Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas
```

Indeed, the exception is raised from result(). It would be nice to see the structured error data, though, to help build our retry predicate.
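
For what it's worth, the structured error data should already be accessible on the raised exception; a quick way to inspect it (a sketch, assuming the same failing load_job as above):

```python
from google.api_core import exceptions

try:
    load_job.result()
except exceptions.Forbidden as exc:
    # exc.errors carries the structured error payload from the API,
    # e.g. [{'reason': 'rateLimitExceeded', 'message': '...'}]
    print(exc.errors)
```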

urwa commented Jun 6, 2022

I'm having this exact problem in a Cloud Function that is triggered when data is uploaded to a Cloud Storage bucket. Having a job_retry argument on load_table_from_uri would definitely be very useful.

Right now I'm considering the Cloud Functions retry option, but I plan to add monitoring on top of the function and want to keep its logs clean, even when a retry succeeds.

So for now I'm implementing exponential backoff when the exception is raised.
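
A minimal sketch of that workaround, retrying the whole load job with exponential backoff. load_with_backoff is a hypothetical helper, and the "rateLimitExceeded" reason check is an assumption about how this 403 is reported:

```python
import time

from google.api_core import exceptions


def load_with_backoff(client, uri, table_id, job_config=None, max_attempts=5):
    """Retry the whole load job on rate-limit 403s with exponential backoff."""
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
        try:
            return load_job.result()  # waits for the job to complete
        except exceptions.Forbidden as exc:
            rate_limited = any(
                err.get("reason") == "rateLimitExceeded"
                for err in (exc.errors or [])
            )
            if not rate_limited or attempt == max_attempts:
                raise  # not transient, or out of attempts
            time.sleep(delay)
            delay *= 2  # double the wait between attempts
```

Adding jitter to the sleep would also help when many function instances retry against the same table at once.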
