
Conversation

@ejholmes
Contributor

Since botocore 1.6.0, it's possible to pass max_attempts to a generated client, like so:

import boto3
from botocore.config import Config

config = Config(
    retries = dict(
        max_attempts = 10
    )
)

ec2 = boto3.client('ec2', config=config)

This removes the internal retry_with_backoff function in favor of botocore's internal exponential backoff with a higher value of max_attempts.

The old behavior combined botocore's internal backoff with a linear backoff in stacker, whose algorithm was:

sleep_time = min(max_delay, min_delay * attempt)

So, in the event of throttling, the request would have been retried 4 times by botocore (as determined by https://github.com/boto/botocore/blob/1.6.1/botocore/data/_retry.json#L77) with exponential backoff, then immediately retried by stacker, which reset botocore's backoff.

With this change, the backoff continues to grow instead of getting reset.
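To make the difference concrete, here is a rough sketch of the two delay schedules. The numbers are illustrative assumptions, not exact: botocore randomizes its actual delays with jitter, and stacker's real min_delay/max_delay defaults may differ.

```python
# Illustrative comparison of the old and new retry delay schedules.
# Assumed values (not exact): botocore sleeps 2**i seconds between attempts,
# and stacker used min_delay=1, max_delay=10 for its linear backoff.

def old_delays(stacker_attempts=3, boto_attempts=4, min_delay=1, max_delay=10):
    """Old behavior: botocore's exponential backoff restarts on every stacker retry."""
    delays = []
    for attempt in range(1, stacker_attempts + 1):
        # botocore retries internally with exponential backoff...
        delays += [2 ** i for i in range(boto_attempts - 1)]
        # ...then stacker sleeps linearly and retries, resetting botocore's backoff
        delays.append(min(max_delay, min_delay * attempt))
    return delays

def new_delays(max_attempts=10):
    """New behavior: a single exponential backoff that keeps growing."""
    return [2 ** i for i in range(max_attempts - 1)]

print(old_delays())  # the 1, 2, 4 run keeps restarting: [1, 2, 4, 1, 1, 2, 4, 2, ...]
print(new_delays())  # grows monotonically: [1, 2, 4, 8, 16, 32, 64, 128, 256]
```

The point of the sketch is the shape, not the exact values: under the old scheme the per-request wait never grows past a few seconds no matter how long throttling persists, while the new scheme backs off further on each consecutive failure.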

Note: I've set max_attempts on the CloudFormation client to 10, which is pretty high (about 8 minutes worst case), but I think a large value makes sense for stacker's use case, especially with parallelization. It's rare that you would actually want stacker to abort on throttling.
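For the record, the ~8 minute figure can be sanity-checked with a quick sum. This assumes a plain 2**i-second delay between attempts, which approximates (but is not exactly) botocore's schedule, since botocore applies randomized jitter that makes real waits shorter.

```python
# Back-of-the-envelope worst case for max_attempts=10, assuming an
# un-jittered exponential backoff of 2**i seconds between attempts
# (botocore randomizes the actual delay, so real waits are shorter).
max_attempts = 10
total = sum(2 ** i for i in range(max_attempts - 1))  # 9 sleeps between 10 attempts
print("%d seconds, roughly %.1f minutes" % (total, total / 60.0))  # 511 s, ~8.5 min
```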

@ejholmes ejholmes requested a review from phobologic February 14, 2018 02:16
@ejholmes
Contributor Author

I can forcefully cause throttling using this stacker config, which creates 60 stacks in parallel:

from stacker.config import Config, Stack, dump

stacks = []
for i in range(60):
    stack = Stack({
        "name": "vpc%d" % i,
        "class_path": "stacker.tests.fixtures.mock_blueprints.Dummy",
    })
    stacks += [stack]

config = Config()
config.namespace = "ejholmes"
config.stacks = stacks

print(dump(config))

Without this change, throttling exceptions are raised pretty frequently. With it, everything executes without throwing any exceptions (though it takes a while because of the throttling).

@ejholmes ejholmes mentioned this pull request Feb 14, 2018
Member

@phobologic phobologic left a comment


This looks good. My only concern is all the other places where we don't already use retry_on_throttling, mostly in hooks/handlers/actions. Do you think those will be affected by the throttling issues?

@ejholmes
Contributor Author

  1. Hooks shouldn't be throttled, since they aren't run in parallel.
  2. For lookups, most of those that use AWS clients directly (kms, dynamodb, ssm, ami) don't do any retrying beyond botocore's defaults, so their behavior remains unchanged for now. Output lookups are pre-cached, so they don't make unnecessary API calls.
  3. For actions, the only AWS API calls that happen outside of the provider are for pushing templates to S3, and those shouldn't get throttled (I don't think AWS has limits on PutObject).

@ejholmes ejholmes merged commit b9fee47 into dag-concurrent Feb 14, 2018
@ejholmes
Contributor Author

I pulled this into #531 and I'll work on any polish there.

Member

@russellballestrini russellballestrini left a comment


Awesome!

@russellballestrini russellballestrini deleted the dag-throttling branch February 14, 2018 15:10
