-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance Transient Error handling in Async.start() method #1
Conversation
…o retry_transient_errors False. Renamed parameter in _insert_tasks 'retry_errors' to 'retry_transient_errors' for clarity. Pass retry_transient_errors parameter onto recursive calls in _insert_tasks when we are splitting and retrying errors.
… to re-add a task in Async.start() method. Add option to override retry of TransientErrors in the Async.start() method
+1 |
if not retry_transient: | ||
raise | ||
|
||
import time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I hate inline imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - me too, but that is the pattern in this file unfortunately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why with a stdlib one. I would have no issue with this moving to the top of the file. And I agree in general with inline imports. I understand in libs like this were often things in the same file will not execute in the same context. But meh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - I'll move the time inline imports from this file and put them up top. We've still got some other stdlib imports in this file, but I'll restrict my changes to the imports I've added.
+1 |
2 similar comments
+1 |
+1 |
Adding a 4 second sleep to something called Async seems wrong. |
What would happen with the existing method when a Transient Error popped? |
@johnlockwood-wf - For the existing method, For the context _insert_tasks() method raising a TransientError, I suspect that one would be able to detect the failure of the insert using the |
Since we've had a few suggestions from @macleodbroad-wf , @tylertreat-wf and @johnlockwood-wf about the delay - let me add a new option |
+1 |
@markshaule-wf do you have any empirical evidence that 4 seconds is a good magical number? |
I see recommendations to use a backoff, where it retries after a very short time, then if that fails, retry after a little longer, etc. |
No empirical evidence that 4 seconds is a good magical number. At the time in our discussion, 'around 5 seconds' was recommended by @beaulyddon-wf, which was the Google recommendation. We've got the ability to override the value now, so if callers want to use a different value, they can. We only retry once, if a TransientError (or other error) occurs on the retry, an exception is thrown. |
From Google: "Ideally you would not immediately retry as it can cause more issues" is the general statement. However that's mostly to keep people from killing them. They recognize that's not ideal for critical pass operations. So for critical they say a single immediate retry is ok. However after that they recommend a back off of at least 5 seconds as these types of failures/errors/contention tend to clean themselves up after a couple of seconds. |
+1 |
+1 |
3 similar comments
+1 |
+1 |
+1 |
…ansactional=True has been specified in context, or Async.start()
@beaulyddon-wf @tannermiller-wf @macleodbroad-wf @tylertreat-wf - The latest commit will always re-raise if transactional=True on a TransientError, regardless of the new option. |
+1 |
3 similar comments
+1 |
+1 |
+1 |
+1 |
…ent_error_retry_on_start
@beaulyddon-wf @robertkluin-wf @johnlockwood-wf @tannermiller-wf @rosshendrickson-wf @jasonaguilon-wf @tylertreat-wf
FYI: @erikpetersen-wf @macleodbroad-wf
Async Changes:
Async.start() now sleeps before attempting to re-add task on a TransientError.
Add option retry_transient_errors to override the retry behaviour in Async.start(). False can be specified to just re-raise the TransientError, and not attempt a retry.
Context Changes:
Context _insert_tasks now re-raises TransientError if retry option has been set to False.
Renamed parameter 'retry_errors' to 'retry_transient_errors' in _insert_tasks function.
_insert_tasks - retry_transient_errors parameter is now passed onto recursive calls correctly.
Notes on compatibility:
Since I've renamed the parameter for _insert_tasks, that could potentially break someone's custom implementation of that function.