Change leadership mechanism #182
Conversation
Looks good, these are big changes!
Would you please fix the typos and wrap the commit titles and bodies? They get cut off on GitHub.
I see that we keep using retryOnConflict when interacting with kube-api; can you please elaborate on why we need it?
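For context, retryOnConflict here presumably refers to client-go's retry.RetryOnConflict helper, which re-runs a read-modify-write whenever the apiserver rejects the update with a conflict because another writer changed the object in between. A minimal sketch of the pattern, with a placeholder label key and function name rather than this PR's actual code:

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

// addLeaderLabel re-fetches the pod and retries the update whenever the
// apiserver returns a conflict (409), so concurrent writers do not make the
// update fail permanently.
func addLeaderLabel(ctx context.Context, cs kubernetes.Interface, namespace, name string) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		pod, err := cs.CoreV1().Pods(namespace).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if pod.Labels == nil {
			pod.Labels = map[string]string{}
		}
		pod.Labels["kubemacpool-leader"] = "true" // placeholder label key/value
		_, err = cs.CoreV1().Pods(namespace).Update(ctx, pod, metav1.UpdateOptions{})
		return err
	})
}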
pkg/manager/leaderelection.go
Outdated
@@ -13,13 +14,18 @@ func (k *KubeMacPoolManager) waitToStartLeading() error {
}

func (k *KubeMacPoolManager) markPodAsLeader() error {
Can you change this function name to something like labelLeaderElectedPod?
Changed to AddLeaderLabelToElectedPod. Is that ok?
perfect
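As a rough illustration only, not the PR's actual implementation, such a helper could attach the leader label through a controller-runtime client; the label key, argument names, and error wording below are assumptions:

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// addLeaderLabelToElectedPod labels the pod that won the election so that the
// webhook service selector only targets it; c is a controller-runtime client.
func addLeaderLabelToElectedPod(ctx context.Context, c client.Client, namespace, podName string) error {
	pod := &corev1.Pod{}
	if err := c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: podName}, pod); err != nil {
		return fmt.Errorf("failed to get elected pod: %w", err)
	}
	if pod.Labels == nil {
		pod.Labels = map[string]string{}
	}
	pod.Labels["kubemacpool-leader"] = "true" // assumed label key/value
	return c.Update(ctx, pod)
}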
/hold the newly elected leader does not acquire the old cache
Some error wrapping is missing. Also, we may want to do the Elected check outside the pool manager, so that the pool manager is agnostic of leader election; for that we have to make the poolManager Start method public.
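A minimal sketch of that idea, assuming a controller-runtime version that exposes the manager's Elected() channel; the starter interface and exported Start method are stand-ins for the pool manager, not this PR's actual code:

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// starter is a stand-in for a pool manager with an exported Start method.
type starter interface {
	Start(ctx context.Context) error
}

// runWhenElected blocks until this replica wins the election and only then
// starts the pool manager, keeping the pool manager agnostic of leader election.
func runWhenElected(ctx context.Context, mgr manager.Manager, pm starter) error {
	select {
	case <-mgr.Elected(): // closed once this replica becomes the leader
		return pm.Start(ctx)
	case <-ctx.Done():
		return ctx.Err()
	}
}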
tests/tests.go
Outdated
numberOfReadyPods := int32(0)
for _, podObject := range podsList.Items {
	if podObject.Status.Phase != corev1.PodRunning {
		return false
	}
	for _, condition := range podObject.Status.Conditions {
		if condition.Type == corev1.PodReady && condition.Status == corev1.ConditionTrue {
			numberOfReadyPods += 1
		}
	}
}
if numberOfReadyPods < numOfReplica {
	return false
}
As @ormergi pointed out, just checking managerDeployment.Status.ReadyReplicas != numOfReplica
is enough, since it's more or less the same as this check.
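In other words, the loop above could collapse to something like the following, assuming managerDeployment is the appsv1.Deployment already fetched by the test:

// Rely on the deployment's own bookkeeping instead of counting Ready
// conditions on each pod.
if managerDeployment.Status.ReadyReplicas != numOfReplica {
	return false
}
return true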
@RamLavi this is not resolved
Sorry, missed it.
Force-pushed from 44aca12 to 537cb74
Currently we use an external leader election where only the elected pod actually runs the controller-runtime manager. We want to change this behavior so that all pods run the controller-runtime manager and thus are shown as ready. To do that we remove the old code of the leader election. Signed-off-by: Ram Lavi <ralavi@redhat.com>
…elected leader pod. This also means we can remove the use of leaderElectionChannel. Signed-off-by: Ram Lavi <ralavi@redhat.com>
…er() Signed-off-by: Ram Lavi <ralavi@redhat.com>
In this commit we move waitToStartLeading() to a separate goroutine, so that both pods will start the controller-runtime manager. We then add the leader election inside the controller-runtime manager. Signed-off-by: Ram Lavi <ralavi@redhat.com>
Since the webhook service operates only on pods with the kubemacpool leader label, we can't start the webhooks until the election is complete and a leader has been chosen. In order to do that, we add a new status to the pod, to prevent the pods from becoming ready and the webhooks from starting until the label is set. Signed-off-by: Ram Lavi <ralavi@redhat.com>
…appropriate RBACs. Signed-off-by: Ram Lavi <ralavi@redhat.com>
Signed-off-by: Ram Lavi <ralavi@redhat.com>
…y will initiate after the election is complete. Since we don't want both pods to perform pool-manager related operations, we start the pool-manager routines only after the kubemacpool leader is selected. Signed-off-by: Ram Lavi <ralavi@redhat.com>
…oyment replicas, as well as making the needed changes in the tests to wait for both kubemacpool pods instead of only the leader. Signed-off-by: Ram Lavi <ralavi@redhat.com>
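The commits above keep non-leader pods not-ready until the leader label is applied. One way this can be wired with controller-runtime, shown only as a hedged sketch since the exact mechanism in this PR may differ, is a readiness check that fails until the label is observed; leaderLabelSet is a hypothetical helper and the check name is arbitrary:

// After creating the controller-runtime manager (illustrative wiring):
// keep the pod reporting not-ready until it carries the leader label, so the
// webhook service (which selects on that label) never routes to a follower.
if err := mgr.AddReadyzCheck("leader-label", func(_ *http.Request) error {
	if !leaderLabelSet() { // hypothetical helper checking this pod's labels
		return fmt.Errorf("leader label not set yet")
	}
	return nil
}); err != nil {
	return err
}

For the check to matter, the manager's HealthProbeBindAddress has to be set and the deployment's readinessProbe pointed at the /readyz endpoint.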
Force-pushed from 537cb74 to a445b80
/unhold Fixed the issues and also added a test for it in another PR, #183.
Force-pushed from a445b80 to e4b432a
tests/tests.go
Outdated
numberOfReadyPods := int32(0)
for _, podObject := range podsList.Items {
	if podObject.Status.Phase != corev1.PodRunning {
		return false
	}
	for _, condition := range podObject.Status.Conditions {
		if condition.Type == corev1.PodReady && condition.Status == corev1.ConditionTrue {
			numberOfReadyPods += 1
		}
	}
}
if numberOfReadyPods < numOfReplica {
	return false
}
@RamLavi this is not resolved
Force-pushed from e4b432a to 0325252
Last nit.
Signed-off-by: Ram Lavi <ralavi@redhat.com>
Force-pushed from 0325252 to a3708fe
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: qinqon. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/cherry-pick release-v0.14.0
@RamLavi: #182 failed to apply on top of branch "release-v0.14.0":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Currently our backup-active implementation makes it so that only one kubemacpool pod is in the ready state, which causes the deployment status to report as not ready.
In order to fix this, we need to change the leadership election to occur using the controller-runtime API, and not externally as implemented now.
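As a hedged sketch of what using the controller-runtime API for leader election can look like, rather than this PR's exact code, the election is enabled on the manager itself; the election ID and the cfg and podNamespace variables are illustrative:

mgr, err := manager.New(cfg, manager.Options{
	LeaderElection:          true,
	LeaderElectionID:        "kubemacpool-election", // illustrative ID
	LeaderElectionNamespace: podNamespace,
})
if err != nil {
	return fmt.Errorf("unable to set up the manager: %w", err)
}

With something like this in place, every replica can start the manager and report ready, while controller-runtime only runs the controllers on the replica currently holding the election lock.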
What this PR does / why we need it:
Special notes for your reviewer:
Release note: