
Conversation

allmightyspiff
Member

Allows wait_for_ready to gracefully handle exceptions and raises the delay after each exception. Raised the default delay from 1s to 10s.

def wait_for_ready(self, instance_id, limit, delay=10, pending=False):
When an exception is encountered, delay will be multiplied by 2.
Will still return False if a ready state was not encountered before limit is reached.
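
The description above maps to a loop of roughly this shape. This is only a minimal sketch of the described behavior, with check_ready standing in for the real status query; it is not the actual SoftLayer.managers.vs implementation:

import logging
import time

LOGGER = logging.getLogger(__name__)

def wait_for_ready_sketch(check_ready, limit, delay=10):
    """Illustrative sketch only: poll check_ready() until it returns True,
    doubling the delay whenever an exception is caught, and give up once
    the total wait time exceeds limit seconds."""
    until = time.time() + limit
    while time.time() < until:
        try:
            if check_ready():
                return True
        except Exception as ex:
            LOGGER.info("Exception: %s", ex)
            delay = delay * 2  # back off after each exception
        LOGGER.info("Auto retry in %s seconds", delay)
        time.sleep(delay)
    return False  # ready state never reached before limit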

@coveralls

Coverage Status

Coverage increased (+0.01%) to 85.159% when pulling a58c26b on allmightyspiff:odyhunter-delay_exponential into fc0a44a on softlayer:master.

@allmightyspiff
Member Author

allmightyspiff commented Aug 31, 2017

@gltdeveloper / @odyhunter

  • In my implementation I've gone with simply increasing the delay by a factor of 2. I'm not really convinced we need random staggering here, or a higher exponential factor than 2.

  • I've removed the attempts variable in favor of just increasing the delay per error. Since wait_for_ready already has a time limit I think this is a bit nicer.

  • I've changed the default delay from 1s to 10s. This is still configurable of course.

Comments are of course welcome.

@allmightyspiff allmightyspiff self-assigned this Aug 31, 2017
@coveralls

Coverage Status

Coverage increased (+0.01%) to 85.159% when pulling 1b70b2b on allmightyspiff:odyhunter-delay_exponential into fc0a44a on softlayer:master.

@odyhunter

@allmightyspiff This solution will solve our problem, but I think we still need to add attempts :)

Consider this case: if a customer uses a 15 sec delay and a 3600 sec total wait time, then on exceptions the retry will be triggered 8 times, at 15, 30, 60, 120, 240, 480, 960... seconds.

If this is a man-made error, e.g. an incorrect VSI ID was given, or even a network outage on the customer's side, this code will still be running...
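
Working that example through (15 sec initial delay, 3600 sec limit, delay doubled on every exception) with a quick throwaway calculation:

delay, waited, retries = 15, 0, 0
while waited < 3600:
    retries += 1
    waited += delay
    print("retry %d: sleep %d sec (total %d sec)" % (retries, delay, waited))
    delay *= 2
# Prints 8 retries, with sleeps of 15, 30, 60, 120, 240, 480, 960 and 1920
# seconds, at which point the running total passes the 3600 sec limit and
# the loop stops.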

@gltdeveloper

  • On the removal of the randomization: as a datapoint, the customer that drove this request is driving 100 concurrent provisions (all individually), all at the same time. If they all fall into the same wait_for_ready cadence of 15 seconds each, and there is an API limit on things, it's possible they stay on that same cadence. That's what drove some aspects of the randomization.
  • Note, other cloud providers use this algorithm for exponential backoff, and there's no reason to be that different IMHO for the sake of thinking we are doing it better. To some extent, people using both clouds would desire similar behavior.
  • Changing the default delay is fine (and I like that it's configurable).
  • You could also cap the retry delay at the max retry wait time (to avoid going over what the caller intended).
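
For reference, the jittered, capped backoff being described usually looks something like the "full jitter" style used by other SDKs. This is a generic sketch, not SoftLayer code, and base and cap are illustrative values:

import random

def jittered_delay(attempt, base=10, cap=300):
    # Exponential growth, capped, with a random component so that many
    # concurrent clients do not fall into the same cadence.
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Example: delays for the first five failed attempts
for attempt in range(5):
    print(round(jittered_delay(attempt), 1))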

@allmightyspiff
Member Author

Thanks for the input.

specifying attempts

For both of those situations I would expect the user to want to ctrl-c out of the script before it hits the attempts limit. Since the exception catcher logs exceptions, I don't think it's too much to ask that users monitor the logs to see if their script is working correctly.
I think I will add a log entry for each successful loop as well, though.

randomization

This is a good point. I think I'll add a random 0-9s delay on each exception in addition to multiplying by 2.

exponential backoff

delay = delay * 2
represents f(x) = 2^x, so we should be good there. With the random number added, the function will be
delay = (delay * 2) + random.randint(0, 9)

capped retry

wait_for_ready will only run at most the time specified by limit, no matter how many exceptions are thrown.
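
As a quick sanity check of that formula, simulating a few consecutive exceptions (illustrative only, not the library code):

import random

delay = 10
for failure in range(4):
    delay = (delay * 2) + random.randint(0, 9)
    print("Auto retry in %s seconds" % delay)
# One possible run: 24, 50, 103, 211 -- growth is roughly 2^x plus a small
# stagger, while the loop as a whole remains bounded by the limit argument.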

@gltdeveloper

I agree with all the comments. On the random side, I keep thinking maybe you could tie it to the current delay value, but what you have is simple, and I do like the K.I.S.S. principle. This is a good place to start.

@coveralls

Coverage Status

Coverage increased (+0.02%) to 85.163% when pulling 0bac762 on allmightyspiff:odyhunter-delay_exponential into fc0a44a on softlayer:master.

@allmightyspiff
Member Author

Real testing

import SoftLayer
import logging

# Standard API client; credentials come from the environment or config file
client = SoftLayer.Client()
vs_manager = SoftLayer.VSManager(client)

# Send INFO-level (20) log output to stderr so the retry messages are visible
LOGGER = logging.getLogger()
LOGGER.addHandler(logging.StreamHandler())
LOGGER.setLevel(20)

# Wait up to 5000 seconds for guest 38614529 to be ready, starting with a 10s delay
ready = vs_manager.wait_for_ready(38614529, 5000, 10)
print(ready)

Testing exception

$ python wait_for_ready_test.py
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
Exception: SoftLayerAPIError(SoftLayer_Exception_ObjectNotFound): Unable to find object with id of '309835167'.
Auto retry in 4 seconds
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
Exception: SoftLayerAPIError(SoftLayer_Exception_ObjectNotFound): Unable to find object with id of '309835167'.
Auto retry in 17 seconds
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
Exception: SoftLayerAPIError(SoftLayer_Exception_ObjectNotFound): Unable to find object with id of '309835167'.
Auto retry in 41 seconds
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
Exception: SoftLayerAPIError(SoftLayer_Exception_ObjectNotFound): Unable to find object with id of '309835167'.
Auto retry in 36.471503496170044 seconds
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
Exception: SoftLayerAPIError(SoftLayer_Exception_ObjectNotFound): Unable to find object with id of '309835167'.
False

Real VM being provisioned

$ python wait_for_ready_test.py
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
38614529 not ready.
Auto retry in 10 seconds
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
38614529 not ready.
Auto retry in 10 seconds
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
38614529 not ready.
Auto retry in 10 seconds
POST https://api.softlayer.com/xmlrpc/v3.1/SoftLayer_Virtual_Guest
True

@odyhunter

Looking good :)

@allmightyspiff allmightyspiff merged commit e249c70 into softlayer:master Sep 5, 2017
@allmightyspiff allmightyspiff deleted the odyhunter-delay_exponential branch August 31, 2020 22:26