
Exponential backoff retries for ProvisionedThroughputExceeded errors #222

Merged: 1 commit merged into pynamodb:devel from JohnEmhoff:retry-throughput-errors on Mar 15, 2017

Conversation

@JohnEmhoff (Contributor):

Currently a ProvisionedThroughputExceededException is a hard failure -- this change retries with exponential backoff instead. Handling it client side is hard because batch_write.commit() clears its list of pending operations on failure.

Related issue: #218
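
A minimal sketch of the retry-with-backoff behavior this PR adds, written as a standalone client-side wrapper assuming botocore-style errors; the function name, delay constants, and jitter below are illustrative assumptions, not PynamoDB's actual internals:

```python
import random
import time

import botocore.exceptions

def call_with_backoff(operation, max_attempts=5, base_delay=0.05):
    """Retry `operation` on throttling, sleeping roughly 2**attempt * base_delay."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except botocore.exceptions.ClientError as exc:
            code = exc.response.get('Error', {}).get('Code', '')
            # Only ProvisionedThroughputExceededException is retried here;
            # other 4xx errors are assumed to fail in perpetuity (see the diff below).
            if code != 'ProvisionedThroughputExceededException' or attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter to avoid synchronized retries.
            time.sleep(base_delay * (2 ** attempt) * (1.0 + random.random()))
```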

@coveralls commented Jan 18, 2017

Coverage remained the same at 97.912% when pulling 47eb37f on JohnEmhoff:retry-throughput-errors into 24e14ca on jlafon:devel.

```diff
-elif response.status_code < 500:
+# We don't retry on a ConditionalCheckFailedException or other 4xx because we assume they will
+# fail in perpetuity. Retrying when there is already contention could cause other problems
+elif response.status_code < 500 and code != 'ProvisionedThroughputExceededException':
```
Review comment from a contributor:

let's make it configurable
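
For reference, retry tuning did eventually become configurable: current PynamoDB documents max_retry_attempts and base_backoff_ms as Model Meta settings. A sketch assuming those names (they postdate this PR, so treat them as assumptions here):

```python
from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute

class Thread(Model):
    class Meta:
        table_name = 'Thread'
        max_retry_attempts = 4  # how many times to retry throttled requests
        base_backoff_ms = 25    # base delay for the exponential backoff

    forum_name = UnicodeAttribute(hash_key=True)
```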

@brandond commented Mar 1, 2017

@JohnEmhoff Are you still working on this?

@JohnEmhoff (Contributor, Author) commented Mar 1, 2017 via email

@bedge (Contributor) commented Mar 11, 2017

I"m doing this (some ugly variant anyway) in all my clients already, so definitely a +1 to merge.
I see no reason not to do this for all cases.

@lukedeo commented Mar 11, 2017

+1 here as well

@bedge (Contributor) commented Mar 15, 2017

Any updates on this? Merging this addresses a known issue, which as of right now is a PITA to deal with.
Is there really any downside to retrying?

@danielhochman (Contributor):
At Lyft we prefer backpressure to go to the client. I will merge since we actually don't use the retry within PynamoDB in most cases.

It actually seems like a bigger bug that batch_write clears its pending operations on error -- a retry won't guarantee success, particularly when experiencing ProvisionedThroughputExceededExceptions.

Can someone create an issue to track that in more detail?
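
A sketch of the client-side pattern such an issue would cover: DynamoDB's BatchWriteItem returns UnprocessedItems rather than failing outright, and callers are expected to re-drive those with backoff. This uses boto3 directly rather than PynamoDB's batch_write wrapper; the function name and delay constants are illustrative:

```python
import time

import boto3

def batch_write_with_redrive(table_name, put_items, max_attempts=5):
    """Re-submit UnprocessedItems from BatchWriteItem with exponential backoff.

    put_items are DynamoDB-typed attribute maps, e.g. {'id': {'S': 'abc'}}.
    Note: BatchWriteItem accepts at most 25 items per request; this sketch
    assumes len(put_items) <= 25.
    """
    client = boto3.client('dynamodb')
    pending = [{'PutRequest': {'Item': item}} for item in put_items]
    for attempt in range(max_attempts):
        response = client.batch_write_item(RequestItems={table_name: pending})
        pending = response.get('UnprocessedItems', {}).get(table_name, [])
        if not pending:
            return
        # Back off before re-driving the leftovers; delay doubles each attempt.
        time.sleep(0.05 * (2 ** attempt))
    raise RuntimeError('%d items still unprocessed after %d attempts'
                       % (len(pending), max_attempts))
```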

@danielhochman merged commit e94d496 into pynamodb:devel on Mar 15, 2017
@bedge (Contributor) commented Mar 16, 2017

Agree that the batch_write clear-on-error is a more significant issue. However, this addresses a specific case we see regularly: capacity provisioning lags the DynamoDB I/O load, and retries early in the capacity scale-up phase eliminate a bunch of client-side retries.

thanks!

@danielhochman (Contributor):
@JohnEmhoff @bedge @lukedeo @brandond released as part of 2.1.5
