-
Notifications
You must be signed in to change notification settings - Fork 23.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CloudRetry/AWSRetry backoff decorator with unit tests #17039
Conversation
so instead of re implementing from scratch, would it not be better to expand the existing retry functions to allow for the custom conditions? https://github.com/ansible/ansible/blob/devel/lib/ansible/module_utils/api.py |
@bcoca I read api.py before writing my own. This one is purely focused on AWS and not just on the RateLimiting issue. This one also gets around the NotFound" (Eventual Consistency) issue as well. Could we add an argument to the decorator in API to take in Exception Strings. Sure, but I think this way the writer of the modules will not have to think of the multiple use cases beyond RequestLimitExceeded. Also if an exception is not part of the list of retries, it will just raise it. |
@linuxdynasty i'm not saying modules should wrap the existing retries, i was suggesting this function could be re-implemented as a simple wrapper to an expanded 'generic retry'. This would make 'cloud specific' retry functions much easier in general. |
efba8c6
to
6c0e46c
Compare
I refactored it @bcoca. Now there is a base class called CloudRetry in module_utils/cloud.py. AWSRetry class in module_utils/ec2.py inherits from CloudRetry. Now only two staticmethods need to be overridden when another cloud provider wants to use this decorator. Test also updated. |
+1 this looks good well needed feature request thanks @linuxdynasty |
c82e70e
to
7807dcc
Compare
Can we add botocore, boto3 to the ansible core testing framework. As one of the failures, is that botocore does not exist. |
@linuxdynasty You can add |
while max_tries > 1: | ||
try: | ||
return f(*args, **kwargs) | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't compatible with Python 2.4, which is causing it to fail the py24 tests:
2016-08-11 20:25:51 + python2.4 -m compileall -fq -x 'module_utils/(a10|rax|openstack|ec2|gce|docker_common|azure_rm_common|vca|vmware|gcp|gcdns).py' lib/ansible/module_utils
2016-08-11 20:25:51 Compiling lib/ansible/module_utils/cloud.py ...
2016-08-11 20:25:51 File "lib/ansible/module_utils/cloud.py", line 86
2016-08-11 20:25:51 except Exception as e:
2016-08-11 20:25:51 ^
2016-08-11 20:25:51 SyntaxError: invalid syntax
You can use this instead:
except Exception:
e = get_exception()
@linuxdynasty My apologies, I had you put the dependencies in the wrong requirements file. You should add them to the two requirements files in the |
@linuxdynasty If you want to run the unit tests locally with an environment similar to Shippable, use the following command: That will run the unit tests on Python 2.7. You can use other Python versions as well, if you have them installed. |
a9de44e
to
1350fe2
Compare
@@ -2,3 +2,5 @@ tox | |||
pyyaml | |||
jinja2 | |||
setuptools | |||
botocore | |||
boto3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to this file aren't needed since the dependencies are in the test/utils/tox
requirements files.
@mattclay thank you for all your help. Test are finally passing :D |
ba62d5b
to
519a504
Compare
tests are failing, but not do to this PR. |
I've restarted the build. The failure was due to a temporary timeout. |
@AWSRetry.backoff(tries=2, delay=0.1) | ||
def retry_once(): | ||
self.counter += 1 | ||
if self.counter < 2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests would be a lot cleaner using Mock
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which is why the original 'retry' is in api.py
, I might migrate this after merge
519a504
to
5b5c415
Compare
@bcoca any updates on if you want me to move CloudRetry from cloud.py to api.py and still leave AWSRetry in ec2.py? |
pass | ||
|
||
@classmethod | ||
def backoff(cls, tries=10, delay=3, backoff=2): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about these defaults - a delay of 3 and backoff of 2 for 10 tries would mean that, to fail, this retry decorator would wait for 3069 seconds (3 + 3*2 + 3*2*2 ....
, or sum([3 * 2**i for i in range(10)])
) or about 50 minutes. That seems like a really long time, especially since most modules make several calls.
A better default might be 4 tries, for a total default wait time of 45 seconds and having a max of, say, a minute between tries. That way, if someone wanted 10 tries it would only take about 7.5 minutes to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense and I will update it shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of changing the default backoff of 2 seconds to 1.1 seconds. I think this is more reasonable. sum([3 * 1.1**i for i in range(10)])
About a total of 48 seconds in 10 tries. @ryansb thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be configurable, because this isn't enough for the work I do with auto scale groups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jjshoe It is configurable, these values are just the defaults that will be used if the developer calls the decorator without arguments.
@linuxdynasty That seems reasonable to me - go for it. Should be plenty to wait out most rate limits.
This base class CloudRetry can be reused by any other cloud provider. This decorator should be used in situations, where you need to implement a backoff algorithm and want to retry based on the status code from the exception.
d048b06
to
f807b6c
Compare
@ryansb I pushed the changes. The build failed cause it could not spin a container. |
This will be about a total of 48 seconds in 10 tries. This is configurable.
f807b6c
to
1acdd4e
Compare
@ryansb does everything else look good to you? |
* Added aws_retry decorator function with unit tests * Restructured the code to be used with a base class. This base class CloudRetry can be reused by any other cloud provider. This decorator should be used in situations, where you need to implement a backoff algorithm and want to retry based on the status code from the exception. * updated documentation * fixed tabs * added botocore and boto3 to requirements.txt * removed cloud.py from py24 tests, as it depends on boto3 * fix relative imports * updated test to be 2.6 compat * updated method name from retry to backoff * readded lxd * Updated default backoff from 2 seconds to 1.1s. This will be about a total of 48 seconds in 10 tries. This is configurable.
ISSUE TYPE
COMPONENT NAME
CloudRetry Base class inside of module_utils/cloud.py
AWSRetry decorator inside of module_utils/ec2.py
ANSIBLE VERSION
SUMMARY
The CloudRetry class can be implemented by any other cloud provider, that wants to implement a basic backoff algorithm in a decorator.
AWSRetry class overwrites 2 methods.
AWSRetry.backoff() decorator can be applied to any function that is making an aws boto/boto3 call.
It will only retry on the following Exceptions.
This list of failures is based on this API Reference.
The CloudRetry.backoff decorator takes on the following kwargs.
This change is