-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Object retrier #514
Object retrier #514
Conversation
Codecov Report
@@ Coverage Diff @@
## master #514 +/- ##
==========================================
+ Coverage 18.49% 18.82% +0.33%
==========================================
Files 160 160
Lines 9525 9565 +40
==========================================
+ Hits 1762 1801 +39
- Misses 7763 7764 +1
Continue to review full report at Codecov.
|
bff1a7b
to
cbce35c
Compare
* Update charm-octavia from branch 'master' to 0c07a1f7959dfca82e2e2f12070dd4c8f5168988 - Merge "Ensure octavia-driver-agents gets installed" - Ensure octavia-driver-agents gets installed Due to a 'quirk' being fixed, the octavia-driver-agents don't get installed if the ovsdb-subordinate relation is the last to be made. This is due to the update-status hook gating the reactive handler from running which 'fixed' the issue. Looking deeper, if the handler isn't the first to run, then the charm's packages won't get initialised properly. This patch fixes that by using the dynamic properties 'all_packages' and 'full_service_list' available in charms.openstack core for the Charm class. This means that it doesn't matter when the flag is set, as long as it is before the property is accessed. Looking at the handler, this will be in the right place for the install. func-test-pr: openstack-charmers/zaza-openstack-tests#514 Change-Id: I5fd75c9d371390bca402d6a3a264421a44fd092a Closes-Bug: #1916764
Due to a 'quirk' being fixed, the octavia-driver-agents don't get installed if the ovsdb-subordinate relation is the last to be made. This is due to the update-status hook gating the reactive handler from running which 'fixed' the issue. Looking deeper, if the handler isn't the first to run, then the charm's packages won't get initialised properly. This patch fixes that by using the dynamic properties 'all_packages' and 'full_service_list' available in charms.openstack core for the Charm class. This means that it doesn't matter when the flag is set, as long as it is before the property is accessed. Looking at the handler, this will be in the right place for the install. func-test-pr: openstack-charmers/zaza-openstack-tests#514 Change-Id: I5fd75c9d371390bca402d6a3a264421a44fd092a Closes-Bug: #1916764
Tested with https://review.opendev.org/c/openstack/charm-octavia/+/778003 (which has now merged) |
This adds a wrapper class that detects if a callable object in any of the descendent objects raises an Exception. If so, then it retries that exception. This is to attempt to make the zaza tests a little more robust in the face of small network failures or strange restarts. This is a test, and robust logging a reporting should be used to determine whether it is covering up actual bugs rather than CI system issues. Related Bug: (zot repo)openstack-charmers#348
The main failure that seems to occur with clients is the ConnectFailure, which according to the docs, is retry-able. Thus provide a function that adds that exception condition automatically. Also fix the import problem in octavia test for the object retrier that is being used to validate the ObjectRetrier feature.
cbce35c
to
c041042
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not blocking but a few thoughts. The ObjectRetrier call really feels like a re-implementation of tenacity. Is there a reason you are not using tenacity here?
I was a bit horrified to see the first usage was at the setUpClass level. Are we truly seeing failures at this stage? That would seem to indicate the overcloud is not really ready yet. Does it not?
It feels like we will want this kind of thing everywhere. Should we consider wrapping the openstack clients and importing the wrapped versions rather than the direct clients?
zaza/openstack/utilities/__init__.py
Outdated
try: | ||
return obj(*args, **kwargs) | ||
except Exception as e: | ||
# if retry_exceptions is None, or the type of the exception is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is not None correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, typo; thanks for picking this up.
cls.keystone_client = ObjectRetrier( | ||
openstack_utils.get_keystone_session_client(cls.keystone_session)) | ||
cls.neutron_client = ObjectRetrier( | ||
openstack_utils.get_neutron_session_client(cls.keystone_session)) | ||
cls.octavia_client = ObjectRetrier( | ||
openstack_utils.get_octavia_session_client(cls.keystone_session)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit surprised to see these at the setUpClass level. Are we seeing failures here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I may not have explained how it works properly. The __init__(...)
method does the wrapping of the object, but it's when then client is used that the retry pattern is activated. e.g.
# wrap a client that is used to make calls.
client = ObjectRetrier(some_object_instance)
...
# later
...
# This is where the retries are done if needed
client.do_call()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes complete sense. I am not sure why I didn't see it at review time.
Why didn't I use tenacity? It's a fair question. It was mostly about having more obvious (in the code) control over how the exceptions from the wrapped code are handled and determine whether a retry will happen and then either eating or propagating that exception; I felt that tenacity was hiding it too much and also it was quite awkward to set it up.
No, that's just the setup code to wrap the object; no retries happening at that point - please see the inline comment; I can see how the naming of the class might make people think that that is when the retries happen. Maybe the class should be renamed to
Valid point. My purpose here (in this PR) was to introduce the code for the |
I am torn on this since we have the pattern of using tenacity everywhere else. But as I as, I am not blocking on this.
Makes complete sense now.
I appreciate this is the first round and is targeted. It makes sense to land a lighter weight change first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear we are just waiting on the typo fix to land this.
Fixes a typo (None -> not None) and renames the class from ObjectRetrier -> ObjectRetrierWraps to make it clearer that the class instantiation is to wrap and object with the retrier code rather than do retries at that moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Landing. Thanks for the work.
This adds a wrapper class that detects if a callable object in any of
the descendent objects raises an Exception. If so, then it retries that
exception.
This is to attempt to make the zaza tests a little more robust in the
face of small network failures or strange restarts. This is a test, and
robust logging a reporting should be used to determine whether it is
covering up actual bugs rather than CI system issues.
Related Bug: (zot repo)#348