Exponential backoff #361
Conversation
Seems reasonable. That said, there is some existing retry code for any/all requests that are flagged idempotent via excon, though it does NOT use backoff. This is an improvement, as it adds backoff and will work for otherwise non-idempotent commands. I do worry a little about this potentially making the retry behavior harder to understand, though. I just want us to consider whether we can mirror the existing behavior as much as possible; i.e. we need the backoff for this case, unlike the normal idempotent path, but perhaps they should share the same setting for the number of retries? This may also be much ado about nothing, so to speak, but I wanted to at least have the discussion. Thanks!
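For context, the existing excon retry mentioned above is driven by the connection's idempotent flag and retry limit. A minimal sketch of that usage, assuming an illustrative endpoint (the option names are excon's, but the values here are not taken from fog-aws):

require 'excon'

# excon's idempotent middleware retries requests flagged idempotent up to the
# retry limit, but (as noted above) without any backoff between attempts.
connection = Excon.new('https://ec2.us-east-1.amazonaws.com')
response = connection.get(
  idempotent:  true,  # opt this request into excon's retry middleware
  retry_limit: 4      # cap on attempts before the error is finally raised
)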
    
I hesitate to make this change in the excon project itself, since exponential backoff is not always necessary for a connection library. I suppose it could be an additional option, but since AWS in particular recommends using exponential backoff, I figured this would be the appropriate project to add it to. I do like your suggestion, if I understood correctly, of reusing the max retry count between the idempotent option and the exponential backoff, but I don't see a global retry_limit variable currently being used by fog-aws. Am I just missing it?
    
+1
    
        
          
lib/fog/aws/compute.rb (Outdated)
when 'RequestLimitExceeded'
  if retries < max_retries
    sleep((2.0 ** (1.0 + retries) * 100) / 1000.0)
    _request(body, headers, idempotent, parser, retries + 1)
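For reference, the sleep expression works out to (2 ** (1 + retries) * 100) / 1000 seconds: 0.2 s before the first retry, then 0.4 s, 0.8 s, 1.6 s, doubling on each attempt.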
I would advocate for calling retry instead of making this a recursive function.  This would simplify error backtraces and reduce pressure on the stack.
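A minimal sketch of that non-recursive form, using Ruby's retry to re-run the begin block; the endpoint, max_retries value, and error matching here are illustrative assumptions, not the actual fog-aws implementation:

require 'excon'

connection  = Excon.new('https://ec2.us-east-1.amazonaws.com')  # illustrative endpoint
max_retries = 4
retries     = 0

begin
  response = connection.request(method: :post, body: 'Action=DescribeInstances', expects: 200)
rescue Excon::Errors::HTTPStatusError => error
  if error.response.body =~ /RequestLimitExceeded/ && retries < max_retries
    sleep((2.0 ** (1.0 + retries) * 100) / 1000.0)  # same doubling backoff as in the diff
    retries += 1
    retry  # re-executes the begin block instead of recursing into _request
  else
    raise
  end
end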
Good thinking, I'll work on this now.
@VVMichaelSawyer I wouldn't change the existing excon behavior directly, but adding an option for backoff (instead of just immediate retries) as part of idempotent might make sense. Then for something like this you would just need to set that option. AWS specifically recommends it, as you note, but it is a pretty common pattern (it just differs a bit from the original intention of the idempotent option). One advantage would be that max_retries could be reused more directly and naturally (and I believe you are correct that fog-aws has no direct notion of retry limits at present). Does that make sense? What do you think?
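Hypothetically, the option being discussed might be used from fog-aws along these lines; retry_backoff is an invented name shown only to illustrate the idea, not an option excon actually provides:

# Purely hypothetical sketch: :idempotent and :retry_limit are real excon
# options, while :retry_backoff stands in for the proposed backoff behavior.
connection = Excon.new('https://ec2.us-east-1.amazonaws.com')
connection.get(
  idempotent:    true,  # opt into excon's retry middleware
  retry_limit:   4,     # shared cap on the number of attempts
  retry_backoff: true   # hypothetical: wait exponentially longer between attempts
)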
    
@geemus I took a look at the idempotent module in excon, and it seems that excon handles all HTTP errors equally: currently the retry logic is applied to any HTTP error code. Ideally, I think the exponential backoff logic should only apply to AWS RequestLimitExceeded errors. Even if we just pass an option for exponential backoff to the excon module, simple errors like a 403 or a 301 should not back off exponentially before failing out. I think @VVMichaelSawyer's solution is fine where it is.
    
@geemus, I would agree with @austinkao91. If excon handles all HTTP errors the same, we shouldn't add exponential backoff to things such as forbidden (403) errors; that would add unnecessary delay before failing out. Also, the status code returned by AWS appears to be 503 according to this forum article, and a 503 does not always mean exponential backoff is necessary, so I think we should only retry after reading the error message passed back, which is an AWS-specific message. I think it would be best to merge this in as is and leave the idempotent logic in excon untouched. A future improvement, as the retry logic spreads to other services in fog-aws, could be a global variable that controls the number of exponential retries in one place.
    
Sounds good, thanks for discussing options.
    