clustermesh: Introduce ClusterID reservation mechanism #26124
Looks mostly ready to me, just a couple of comments inline.
9f40c4e to a97d1c0
/test
91e9ebf to 03641a9
/test
/ci-multicluster Similar flake to #25816
Currently, the ClusterID of each remote cluster is managed by that cluster's own controller through rc.config. This makes it very hard to control access to the ClusterIDs. For example, when a new remote cluster connects and we receive a new cluster config, we need to ensure that the new ClusterID is not used by any other remote cluster controller. To ensure that, we need to iterate over all remoteCluster objects and also access rc.config, which may change over time depending on each remote cluster's connection state.

With this commit, every time a remoteCluster controller starts to use a new ClusterID, it "reserves" the ClusterID from a central registry. By guarding this reservation with a mutex, we can guarantee that no one else uses the reserved ClusterID. After the reservation, each remoteCluster controller therefore has exclusive access to the corresponding CT/SNAT per-cluster map slots.

This can also replace the complicated canConnect() validation with ClusterID reservation. Instead of iterating over all clusters and checking ClusterID uniqueness, we can simply try to reserve the ID and, if that fails, reject the new connection.

Once the remote cluster controller finishes using the ClusterID, it cleans up any resources bound to the ClusterID (e.g. per-cluster maps) and "releases" the ClusterID.

Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
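The reserve/release flow described in the commit message can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: the type `clusterIDRegistry`, its methods `reserve` and `release`, the `cleanup` callback, and the choice of `uint32` for the ID are all hypothetical names and assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// clusterIDRegistry is a hypothetical central registry of ClusterIDs.
// It is an illustrative stand-in, not the type introduced by this PR.
type clusterIDRegistry struct {
	mu    sync.Mutex
	owner map[uint32]string // ClusterID -> name of the remote cluster holding it
}

func newClusterIDRegistry() *clusterIDRegistry {
	return &clusterIDRegistry{owner: make(map[uint32]string)}
}

// reserve claims id for the named cluster. It fails if a different
// cluster already holds the ID, which is the point at which a new
// connection would be rejected. Reserving an ID the same cluster
// already holds is treated as a no-op (an assumption of this sketch).
func (r *clusterIDRegistry) reserve(id uint32, cluster string) error {
	r.mu.Lock()
	defer r.mu.Unlock()

	if holder, ok := r.owner[id]; ok && holder != cluster {
		return fmt.Errorf("ClusterID %d is already reserved by %q", id, holder)
	}
	r.owner[id] = cluster
	return nil
}

// release frees id if it is held by the named cluster. The optional
// cleanup callback stands in for removing resources bound to the ID
// (e.g. per-cluster maps) and runs before the ID becomes available.
func (r *clusterIDRegistry) release(id uint32, cluster string, cleanup func(uint32)) {
	r.mu.Lock()
	defer r.mu.Unlock()

	holder, ok := r.owner[id]
	if !ok || holder != cluster {
		return // not held by this cluster; nothing to release
	}
	if cleanup != nil {
		cleanup(id)
	}
	delete(r.owner, id)
}
```

In this sketch, a controller would call `reserve` when it receives a cluster config and refuse the connection on error, then call `release` once it has stopped using the ID. Holding the mutex across the check and the map update is what makes the uniqueness check and the claim a single atomic step.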
03641a9 to 743b9d0
/lgtm
/test
I'm going to merge this, but I will note that "reservation system" to me implies that there should be full lifecycle tests that allocate reservations and also ensure that reservations are properly garbage collected, and so on. At a glance, I didn't see that sort of testing, so if we don't have any, I would encourage you to think about building additional tests for those cases.
This was mostly a refactoring of existing code, not really introducing any new functionality.
Improve the lifecycle of cluster config by introducing the ClusterID reservation mechanism. More details are in the commit message.