feat: unpeering force #3054
base: master
Conversation
Hi @giuliafalaschi. Thanks for your PR! I am @adamjensenbot.
Make sure this PR appears in the liqo changelog, adding one of the following labels:
Hi @giuliafalaschi, thanks for this PR. Is this ready for review?
Yes, I tested everything on a local cluster.
Hi Giulia 😊 Thank you so much for your contribution and the effort you put into this PR, it's really appreciated! 🙏 That said, I noticed some issues related to how the forced unpeering is handled. Specifically, there are some challenges that need to be handled when the remote cluster becomes unreachable, which are not addressed in your implementation. The two main components that cause problems in such scenarios are:
Because of this, if you try to force unpeer two kind clusters and then delete one of them, with your version of the code you’ll notice that the forced unpeer hangs on the ResourceSlice deletion. To properly handle these situations, we need to introduce a way to signal that the remote cluster is no longer expected to respond. That way, components like the CRDReplicator or Virtual Kubelet can stop trying to reach it endlessly. Here are a couple of ideas I could come up with:
Once we handle this properly in the Liqo core, we can refine the behavior of the liqoctl unpeer command. I saw you modified the flow a bit, but I would suggest keeping the original flow and falling back to forced unpeer only if the standard one fails. For example, if we can’t get the cluster ID of the provider during unpeering, we can attempt with a short timeout. If it fails and
Lastly, regarding the CLI flags: instead of having both
Thanks again for your work! Let me know your thoughts on this and feel free to ask if you have any questions 😊
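To make the suggested flow concrete, here is a minimal Go sketch of "try the standard unpeer under a short timeout, fall back to the forced one only if it fails". It is not taken from the Liqo code base: unpeerWithFallback, simulateUnreachable, localCleanup and runDemo are hypothetical names, the 50 ms timeout is arbitrary, and the allowForce switch is an assumption, since the exact condition for falling back is cut off in the comment above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// unpeerFunc is a hypothetical signature for one unpeering strategy.
type unpeerFunc func(ctx context.Context) error

// unpeerWithFallback runs the standard flow under a short timeout.
// If that fails and allowForce is set (an assumed opt-in), it runs the
// forced flow with the caller's context. It reports whether the forced
// flow was used.
func unpeerWithFallback(ctx context.Context, timeout time.Duration, allowForce bool,
	standard, forced unpeerFunc) (bool, error) {
	stdCtx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	stdErr := standard(stdCtx)
	if stdErr == nil {
		return false, nil
	}
	if !allowForce {
		return false, fmt.Errorf("standard unpeer failed: %w", stdErr)
	}
	if err := forced(ctx); err != nil {
		// Report both failures so neither cause is lost.
		return true, errors.Join(stdErr, err)
	}
	return true, nil
}

// simulateUnreachable stands in for a provider that never answers:
// it blocks until the context expires.
func simulateUnreachable(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// localCleanup stands in for a forced unpeer that only touches the
// consumer cluster.
func localCleanup(ctx context.Context) error {
	return nil
}

// runDemo wires the two stand-ins together.
func runDemo(allowForce bool) (bool, error) {
	return unpeerWithFallback(context.Background(), 50*time.Millisecond,
		allowForce, simulateUnreachable, localCleanup)
}

func main() {
	usedForce, err := runDemo(true)
	fmt.Println("forced:", usedForce, "err:", err)
}
```

Keeping the standard flow first means a reachable provider is still cleaned up on both sides; the forced path only runs when the first attempt has already failed.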
Description
Two flags have been added to perform a forced unpeering:
--force, a Boolean flag that forces the unpeering even if the provider cluster is unreachable, and
--remote-cluster-id, a flag that must be used in conjunction with --force and must contain the cluster ID of the provider cluster.
Fixes #3051
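The dependency between the two flags can be sketched with Go's standard flag package. This is only an illustration, not the PR's actual code: liqoctl's real flag wiring is not shown here, and unpeerOptions and parseUnpeerFlags are hypothetical names. The check that --force also needs --remote-cluster-id is an assumption drawn from the test command, where both are passed together.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

// unpeerOptions is a hypothetical holder for the two flags.
type unpeerOptions struct {
	force           bool
	remoteClusterID string
}

// parseUnpeerFlags parses the arguments and enforces the dependency
// between --force and --remote-cluster-id.
func parseUnpeerFlags(args []string) (*unpeerOptions, error) {
	opts := &unpeerOptions{}
	fs := flag.NewFlagSet("unpeer", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr; the caller reports them
	fs.BoolVar(&opts.force, "force", false,
		"force the unpeering even if the provider cluster is unreachable")
	fs.StringVar(&opts.remoteClusterID, "remote-cluster-id", "",
		"cluster ID of the provider cluster (requires --force)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	// Stated in the description: the ID flag is only valid with --force.
	if opts.remoteClusterID != "" && !opts.force {
		return nil, errors.New("--remote-cluster-id requires --force")
	}
	// Assumption: a forced unpeer cannot look up the ID remotely,
	// so it must be supplied explicitly.
	if opts.force && opts.remoteClusterID == "" {
		return nil, errors.New("--force requires --remote-cluster-id")
	}
	return opts, nil
}

func main() {
	opts, err := parseUnpeerFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Printf("force=%v remote-cluster-id=%q\n", opts.force, opts.remoteClusterID)
}
```

Rejecting an invalid combination at parse time gives the user an immediate error instead of a failure partway through the unpeering.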
How Has This Been Tested?
I created two local clusters using kind, installed Liqo on them, and established a peering relationship between the two. To simulate a catastrophic scenario, I modified the IP address of the provider cluster to make it unreachable. Finally, I executed the unpeer command with the options I implemented, specifically:
liqoctl unpeer --force --remote-cluster-id <cluster-id>
In this way, the unpeering was successfully performed on the consumer cluster, even though the provider was no longer reachable.