Regression? juju 2.9 / get_unit_address() causes disconnect? - breaks libjuju #615
Due to bug [1], there is an issue with libjuju communicating with the Juju controller that causes a permanent model disconnect which libjuju doesn't resolve. Thus, for zaza, this patch wraps `unit.get_public_address()` with a wrapper that can choose, based on the environment variable `ZAZA_FEATURE_BUG472`, to use a subprocess shell call to the `juju status` command to get the public address. This should always succeed.

The feature environment variable `ZAZA_FEATURE_BUG472` was added so that the library can switch between the native libjuju function and the fallback wrapper, to enable testing of the issue as libjuju continues to evolve. By default, the wrapper function is used, to enable zaza to interoperate with libjuju and Juju 2.9 on OpenStack providers.

The implementation is slightly complicated because an async version of the wrapper `get_unit_public_address()` is needed, as it is called from async code.

[1]: juju/python-libjuju#615
This is not about a regression. When you're running the reproducer, the model is disconnected before the work completes. In other words, when you create the new Model object, it holds the connection that the facade calls depend on. So if you want to make facade calls from the units of a model at any time in the future (or call methods that use the connection in any way, such as calling `get_public_address()`), then the model needs to stay connected until you're done. (p.s. The units that you're able to get the public addresses of are the ones that already have addresses cached locally.)
Hi Caner,

Thanks for the clarification. To ensure my understanding, please could you confirm: does this mean that I need to assume that the model is disconnected, make a connection, get all the data I need, and then disconnect? Is that the only way to be sure it might work (assuming the model doesn't disconnect between connecting and performing other actions on it)? Do all interactions with a model need to be wrapped in a try/except in case the `websockets.ConnectionClosed` exception is raised?

Also, I've noticed that the issue is closed. Is that because it's not fixable, or because it is fixed? Thanks very much!
Hey Alex,

You are correct. The Model object is to be kept connected for the entirety of the computation; it is not designed for a plug-and-play sort of use. Further, contingencies are in place for the Model to reconnect gracefully if it's not disconnected on purpose, so you do not need to handle potential connection issues along the way. Just don't disconnect it until you're done for sure, and everything should work well.

I closed the issue because to fix your problem all you need to do is remove the lines that disconnect the model prematurely, and it should work fine. We can re-open the issue no problem if further work is needed 👍

As a side note: we also thought that the entity data should not be available after the associated Model is disconnected, as it's not possible to validate that it's up to date with what's going on on Juju's side (we're not getting any deltas anymore after disconnect). In your example some of your Units are able to get their addresses because currently that data is available even though the model is disconnected (again, despite there being no way of knowing if the unit's actual address has changed behind the scenes after disconnect). You can see the details on #629 if you're interested, though it's currently held for further discussion.

Hope this clears it up for you. Let us know if you need further info or action on this, cheers!
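The usage pattern being described (connect once, make all facade calls while connected, disconnect only at the end) can be sketched as below. `fetch_addresses` is a hypothetical helper: `model` is assumed to behave like a python-libjuju `Model` (with `connect`, `disconnect`, and `units`), and `get_public_address()` like its async Unit method.

```python
# Sketch of the keep-connected pattern: all facade calls happen inside the
# connect/disconnect window.  Names and signatures are assumptions based on
# the discussion, not a definitive libjuju example.
async def fetch_addresses(model, model_name=None):
    await model.connect(model_name=model_name)
    try:
        # Every facade call happens while the websocket is still open.
        return {name: await unit.get_public_address()
                for name, unit in model.units.items()}
    finally:
        # Disconnect only once nothing else will touch the model.
        await model.disconnect()
```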
Hi Caner,

Thanks for the confirmation. It does make it a bit awkward for zaza, as it's a sync application trying to use an async library. Essentially, I need to arrange for it to have the async loop running (so that model updates occur and the objects attached to those models are updated).

My solution will be to put the libjuju loop into a background thread and then inject async calls into the loop using `run_coroutine_threadsafe` to access libjuju functions. The GIL will make the objects thread-safe for read access, which means they can be used on the sync side more conveniently. With luck I can keep all the 'interfaces' in zaza the same, and thus the existing codebase won't need to change. At least that's the plan. I'll let you know how I get on!
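The background-thread plan above can be sketched as follows. `LoopThread` is a hypothetical name (not zaza's actual API); the mechanism is the standard-library `asyncio.run_coroutine_threadsafe`.

```python
# Minimal sketch of running an asyncio event loop in a daemon thread and
# injecting coroutines from sync code.  Hypothetical helper, not zaza's API.
import asyncio
import threading


class LoopThread:
    """Owns an event loop running in a background thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    def call(self, coro, timeout=60):
        """Run a coroutine on the background loop and block for its result."""
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result(timeout)

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
```

Sync code can then do e.g. `addresses = loop_thread.call(fetch_addresses(model))` without ever touching `await` itself.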
So this is possibly connected to #458, but I found this bug integrating `get_unit_address()` to fetch the unit address unconditionally on an OpenStack cloud. Essentially, like #458, libjuju gets a disconnect:
The smallest reproducer that will definitely trigger this behaviour is two scripts: a bash driver and `fetch1.py`; i.e. bash -> python, where the python code may cause a disconnect:
For the sake of clarity, this means that repeatedly calling `get_unit_address()` between invocations of the python interpreter, loading libjuju fresh each time, can trigger the disconnect. There's obviously something strange going on in terms of the juju controller and idles or disconnects?
Testing on juju 2.8.x does not show these problems.