CloudRetry/AWSRetry backoff decorator with unit tests #17039

Merged
merged 11 commits into from Sep 13, 2016

Conversation

@linuxdynasty
Contributor

@linuxdynasty linuxdynasty commented Aug 10, 2016

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME

CloudRetry Base class inside of module_utils/cloud.py
AWSRetry decorator inside of module_utils/ec2.py

ANSIBLE VERSION
ansible 2.1.1.0
SUMMARY

The CloudRetry class can be subclassed by any other cloud provider that wants to implement a basic backoff algorithm in a decorator.

The AWSRetry class overrides two methods:

  • status_code_from_exception (parses the exception for the exact string)
  • found (iterates over a list of exceptions to match against)

The AWSRetry.backoff() decorator can be applied to any function that makes an AWS boto/boto3 call.
It retries only on the following exceptions; any other exception is raised immediately.
This list of failures is based on this API Reference.

  • RequestLimitExceeded
  • Unavailable
  • ServiceUnavailable
  • InternalFailure
  • InternalError
  • "\w+.NotFound" (Eventual Consistency)
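
As an illustration of the matching described above, here is a minimal sketch of how an error code could be checked against this list. It is not the code from this PR: the names `RETRYABLE_CODES`, `NOT_FOUND_PATTERN` and `is_retryable` are hypothetical, and the pattern is copied as written in the list (with an unescaped `.`, which matches any character).

```python
import re

# Codes taken from the list above; the container name is hypothetical.
RETRYABLE_CODES = frozenset([
    "RequestLimitExceeded",
    "Unavailable",
    "ServiceUnavailable",
    "InternalFailure",
    "InternalError",
])

# Pattern as written in the PR description (eventual consistency errors).
NOT_FOUND_PATTERN = re.compile(r"^\w+.NotFound")


def is_retryable(code):
    """Return True if the error code is one we would retry on."""
    if not code:
        return False
    if code in RETRYABLE_CODES:
        return True
    return NOT_FOUND_PATTERN.search(code) is not None


print(is_retryable("InvalidInstanceID.NotFound"))
print(is_retryable("AccessDenied"))
```

Anything that does not match, such as an authorization failure, would be re-raised rather than retried.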

The CloudRetry.backoff decorator takes on the following kwargs.

  • tries (number of times to try) default=10
  • delay (initial delay between retries in seconds) default=3
  • backoff (the multiplier applied to the delay after each retry; a value of 2 doubles the delay each time) default=2
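
For readers who want to see the shape of the idea, the following is a rough sketch of a base class with a `backoff` decorator and the two overridable hooks. It is written for illustration only and is not the merged implementation: `DemoError` and `DemoRetry` are hypothetical, and the extra `sleep` argument is an assumption added here so the waits can be observed without actually sleeping.

```python
import time
from functools import wraps


class CloudRetry(object):
    """Sketch of a base class; providers override the two hooks below."""

    @staticmethod
    def status_code_from_exception(error):
        raise NotImplementedError

    @staticmethod
    def found(response_code):
        raise NotImplementedError

    @classmethod
    def backoff(cls, tries=10, delay=3, backoff=2, sleep=time.sleep):
        # `sleep` is a hypothetical extra argument, not one of the PR's kwargs.
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                wait = delay
                for attempt in range(1, tries + 1):
                    try:
                        return func(*args, **kwargs)
                    except Exception as error:  # modern syntax, not py2.4-safe
                        code = cls.status_code_from_exception(error)
                        # Give up on the last attempt or on unknown codes.
                        if attempt == tries or not cls.found(code):
                            raise
                        sleep(wait)
                        wait *= backoff
            return wrapper
        return decorator


class DemoError(Exception):
    """Hypothetical provider error carrying a string code."""

    def __init__(self, code):
        Exception.__init__(self, code)
        self.code = code


class DemoRetry(CloudRetry):
    """Hypothetical provider subclass overriding the two hooks."""

    @staticmethod
    def status_code_from_exception(error):
        return getattr(error, "code", None)

    @staticmethod
    def found(response_code):
        return response_code in ("RequestLimitExceeded", "ServiceUnavailable")
```

A function decorated with `@DemoRetry.backoff(tries=4, delay=3, backoff=2)` would wait 3, 6 and then 12 seconds between its four attempts before re-raising.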


@bcoca
Member

@bcoca bcoca commented Aug 11, 2016

so instead of reimplementing from scratch, would it not be better to expand the existing retry functions to allow for the custom conditions?

https://github.com/ansible/ansible/blob/devel/lib/ansible/module_utils/api.py

@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 11, 2016

@bcoca I read api.py before writing my own. This one is purely focused on AWS and not just on the rate-limiting issue. It also gets around the "NotFound" (eventual consistency) issue. Could we add an argument to the decorator in api.py to take in exception strings? Sure, but I think this way module writers will not have to think about the multiple use cases beyond RequestLimitExceeded. Also, if an exception is not in the list of retryable exceptions, it is simply raised.

@bcoca
Member

@bcoca bcoca commented Aug 11, 2016

@linuxdynasty i'm not saying modules should wrap the existing retries, i was suggesting this function could be re-implemented as a simple wrapper to an expanded 'generic retry'.

This would make 'cloud specific' retry functions much easier in general.

@linuxdynasty linuxdynasty force-pushed the linuxdynasty:aws_retry_decorator branch Aug 11, 2016
@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 11, 2016

I refactored it @bcoca. There is now a base class called CloudRetry in module_utils/cloud.py, and the AWSRetry class in module_utils/ec2.py inherits from it. Only two staticmethods need to be overridden when another cloud provider wants to use this decorator. The tests are also updated.

@rodrickbrown

@rodrickbrown rodrickbrown commented Aug 11, 2016

+1, this looks good. A much-needed feature, thanks @linuxdynasty

@linuxdynasty linuxdynasty force-pushed the linuxdynasty:aws_retry_decorator branch Aug 11, 2016
@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 11, 2016

Can we add botocore and boto3 to the Ansible core testing framework? One of the failures is that botocore does not exist.

@mattclay
Member

@mattclay mattclay commented Aug 11, 2016

@linuxdynasty You can add botocore and boto3 to test/utils/shippable/sanity-requirements.txt.

@mattclay
mattclay reviewed Aug 11, 2016
lib/ansible/module_utils/cloud.py Outdated
while max_tries > 1:
    try:
        return f(*args, **kwargs)
    except Exception as e:

@mattclay

mattclay Aug 11, 2016
Member

This isn't compatible with Python 2.4, which is causing it to fail the py24 tests:

2016-08-11 20:25:51 + python2.4 -m compileall -fq -x 'module_utils/(a10|rax|openstack|ec2|gce|docker_common|azure_rm_common|vca|vmware|gcp|gcdns).py' lib/ansible/module_utils
2016-08-11 20:25:51 Compiling lib/ansible/module_utils/cloud.py ...
2016-08-11 20:25:51   File "lib/ansible/module_utils/cloud.py", line 86
2016-08-11 20:25:51     except Exception as e:
2016-08-11 20:25:51                       ^
2016-08-11 20:25:51 SyntaxError: invalid syntax

You can use this instead:

except Exception:
    e = get_exception()
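
To make the suggestion concrete, here is a small self-contained sketch of the pattern. The real helper is imported from ansible.module_utils.pycompat24; the local `get_exception` below is only a stand-in, written on the assumption that the helper returns the exception currently being handled via `sys.exc_info()`. The `safe_int` function is a hypothetical example.

```python
import sys


def get_exception():
    """Local stand-in: return the exception currently being handled."""
    return sys.exc_info()[1]


def safe_int(value):
    """Hypothetical example: convert to int, reporting failures as text."""
    try:
        return int(value), None
    except Exception:
        # No "as e" clause, so this also parses on Python 2.4.
        e = get_exception()
        return None, str(e)


print(safe_int("42"))
print(safe_int("not a number"))
```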
@mattclay
Member

@mattclay mattclay commented Aug 12, 2016

@linuxdynasty My apologies, I had you put the dependencies in the wrong requirements file. You should add them to the two requirements files in the test/utils/tox directory instead.

@mattclay
Member

@mattclay mattclay commented Aug 12, 2016

@linuxdynasty If you want to run the unit tests locally with an environment similar to Shippable, use the following command: TOXENV=py27 test/utils/shippable/sanity.sh

That will run the unit tests on Python 2.7. You can use other Python versions as well, if you have them installed.

@linuxdynasty linuxdynasty force-pushed the linuxdynasty:aws_retry_decorator branch Aug 12, 2016
@mattclay
mattclay reviewed Aug 12, 2016
test/utils/shippable/sanity-requirements.txt Outdated
@@ -2,3 +2,5 @@ tox
pyyaml
jinja2
setuptools
botocore
boto3

@mattclay

mattclay Aug 12, 2016
Member

Changes to this file aren't needed since the dependencies are in the test/utils/tox requirements files.

from ansible.module_utils.pycompat24 import get_exception


class CloudRetry(object):

@bcoca

bcoca Aug 12, 2016
Member

not exactly what i was asking, but good enough

@senorsmile

senorsmile Aug 16, 2016

I do think this could be useful for any module (not just cloud modules).

@linuxdynasty linuxdynasty changed the title aws_retry decorator function with unit tests CloudRetry/AWSRetry decorator with unit tests Aug 12, 2016
@linuxdynasty linuxdynasty changed the title CloudRetry/AWSRetry decorator with unit tests CloudRetry/AWSRetry backoff decorator with unit tests Aug 12, 2016
@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 12, 2016

@mattclay thank you for all your help. Tests are finally passing :D

@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 14, 2016

Tests are failing, but not due to this PR.

@mattclay
Member

@mattclay mattclay commented Aug 14, 2016

I've restarted the build. The failure was due to a temporary timeout.

@AWSRetry.backoff(tries=2, delay=0.1)
def retry_once():
    self.counter += 1
    if self.counter < 2:

@fearphage

fearphage Aug 16, 2016

These tests would be a lot cleaner using Mock.
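
As a sketch of what a Mock-based version might look like: the example below replaces the hand-rolled counter with `Mock(side_effect=...)`. The `retry` decorator here is a hypothetical stand-in rather than AWSRetry itself (it takes an injectable `sleep` so nothing really waits), and the import assumes Python 3's `unittest.mock`; on Python 2 the separate `mock` package would be needed.

```python
import unittest
from unittest import mock  # Python 3; on Python 2 use the `mock` package


def retry(tries, delay, sleep):
    """Hypothetical stand-in for the decorator under test."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except RuntimeError:
                    if attempt == tries:
                        raise
                    sleep(delay)
        return wrapper
    return decorator


class RetryTest(unittest.TestCase):
    def test_retry_once(self):
        # First call raises, second call returns a value.
        api_call = mock.Mock(side_effect=[RuntimeError("throttled"), "ok"])
        sleep = mock.Mock()
        wrapped = retry(tries=2, delay=0.1, sleep=sleep)(api_call)
        self.assertEqual(wrapped(), "ok")
        self.assertEqual(api_call.call_count, 2)
        sleep.assert_called_once_with(0.1)

    def test_gives_up_after_all_tries(self):
        # Every call raises, so the last error should propagate.
        api_call = mock.Mock(side_effect=RuntimeError("throttled"))
        sleep = mock.Mock()
        wrapped = retry(tries=3, delay=0.1, sleep=sleep)(api_call)
        self.assertRaises(RuntimeError, wrapped)
        self.assertEqual(api_call.call_count, 3)
        self.assertEqual(sleep.call_count, 2)
```

The call counts come from the mock itself, so the test no longer needs a `self.counter` attribute.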

@bcoca

bcoca Aug 16, 2016
Member

Which is why the original 'retry' is in api.py, I might migrate this after merge

@linuxdynasty linuxdynasty force-pushed the linuxdynasty:aws_retry_decorator branch Aug 18, 2016
@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 18, 2016

@bcoca any updates on if you want me to move CloudRetry from cloud.py to api.py and still leave AWSRetry in ec2.py?

@ryansb
ryansb reviewed Aug 24, 2016
lib/ansible/module_utils/cloud.py Outdated
pass

@classmethod
def backoff(cls, tries=10, delay=3, backoff=2):

@ryansb

ryansb Aug 24, 2016
Contributor

I'm not sure about these defaults - a delay of 3 and backoff of 2 for 10 tries would mean that, to fail, this retry decorator would wait for 3069 seconds (3 + 3*2 + 3*2*2 ...., or sum([3 * 2**i for i in range(10)])) or about 50 minutes. That seems like a really long time, especially since most modules make several calls.

A better default might be 4 tries, for a total default wait time of 45 seconds and having a max of, say, a minute between tries. That way, if someone wanted 10 tries it would only take about 7.5 minutes to fail.
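
The arithmetic above can be checked with a short helper. This is only a sketch: `total_wait` and its `max_delay` cap are hypothetical names, and it counts one wait per try, the same way the `sum(...)` expression above does.

```python
def total_wait(tries, delay, backoff, max_delay=None):
    """Sum the waits for `tries` attempts, optionally capping each wait."""
    total = 0
    wait = delay
    for _ in range(tries):
        total += wait if max_delay is None else min(wait, max_delay)
        wait *= backoff
    return total


print(total_wait(10, 3, 2))                # 3069 seconds, about 51 minutes
print(total_wait(4, 3, 2))                 # 45 seconds
print(total_wait(10, 3, 2, max_delay=60))  # 393 seconds
```

Under these assumptions the capped 10-try case comes to 393 seconds, roughly 6.5 minutes, which is a little under the rough figure quoted above.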

@linuxdynasty

linuxdynasty Aug 24, 2016
Author Contributor

Makes sense and I will update it shortly.

@linuxdynasty

linuxdynasty Aug 24, 2016
Author Contributor

I'm thinking of changing the default backoff multiplier from 2 to 1.1. I think this is more reasonable: sum([3 * 1.1**i for i in range(10)])
About a total of 48 seconds in 10 tries. @ryansb thoughts?
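
For reference, the 48-second figure follows from the closed form of a geometric series, assuming one wait per try as in the expression above:

```latex
\sum_{i=0}^{9} 3 \cdot 1.1^{i}
  = 3 \cdot \frac{1.1^{10} - 1}{1.1 - 1}
  \approx 3 \cdot \frac{2.5937 - 1}{0.1}
  \approx 47.8 \text{ seconds}
```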

@jjshoe

jjshoe Aug 24, 2016
Contributor

This should be configurable, because this isn't enough for the work I do with auto scale groups.

@ryansb

ryansb Aug 24, 2016
Contributor

@jjshoe It is configurable, these values are just the defaults that will be used if the developer calls the decorator without arguments.

@linuxdynasty That seems reasonable to me - go for it. Should be plenty to wait out most rate limits.

@linuxdynasty linuxdynasty force-pushed the linuxdynasty:aws_retry_decorator branch Aug 24, 2016
@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 24, 2016

@ryansb I pushed the changes. The build failed because it could not spin up a container.

This will be about a total of 48 seconds in 10 tries. This is
configurable.
@linuxdynasty linuxdynasty force-pushed the linuxdynasty:aws_retry_decorator branch to 1acdd4e Aug 24, 2016
@linuxdynasty
Contributor Author

@linuxdynasty linuxdynasty commented Aug 30, 2016

@ryansb does everything else look good to you?

@ryansb ryansb merged commit b510abc into ansible:devel Sep 13, 2016
1 check passed
Shippable Run 2487 status is SUCCESS.
sereinity added a commit to sereinity-forks/ansible that referenced this pull request Jan 25, 2017
* Added aws_retry decorator function with unit tests

* Restructured the code to be used with a base class.

This base class CloudRetry can be reused by any other cloud provider.
This decorator should be used in situations, where you need to implement
a backoff algorithm and want to retry based on the status code from the
exception.

* updated documentation

* fixed tabs

* added botocore and boto3 to requirements.txt

* removed cloud.py from py24 tests, as it depends on boto3

* fix relative imports

* updated test to be 2.6 compat

* updated method name from retry to backoff

* readded lxd

* Updated default backoff from 2 seconds to 1.1s.

This will be about a total of 48 seconds in 10 tries. This is
configurable.
@ansibot ansibot added feature and removed feature_pull_request labels Mar 4, 2018
@ansible ansible locked and limited conversation to collaborators Apr 26, 2019
10 participants