add leader election options #133

Merged

Conversation

hanqiuzh
Contributor

@hanqiuzh hanqiuzh commented Sep 30, 2021

Updates:

  • added leader election options, same as CCCMO

Tests:

  • With leader election enabled, non-leader pods do not start their controllers (including the status controller).
  • With leader election enabled, deleting the leader pod causes a new leader to be elected, which then starts its controllers.
  • With leader election disabled, all pods work.

@hanqiuzh
Contributor Author

hanqiuzh commented Oct 4, 2021

/retest-required

@hanqiuzh hanqiuzh force-pushed the add-leader-election branch 3 times, most recently from 4ad3723 to 4039544 Compare October 5, 2021 01:20
@enxebre
Member

enxebre commented Oct 5, 2021

/retest

1 similar comment
@hanqiuzh
Contributor Author

hanqiuzh commented Oct 5, 2021

/retest

main.go Outdated
MetricsBindAddress: metricsPort,
LeaderElectionNamespace: leaderElectionConfig.ResourceNamespace,
LeaderElection: leaderElectionConfig.LeaderElect,
LeaseDuration: &leaderElectionConfig.LeaseDuration.Duration,
Member

Let's set LeaderElectionResourceLock: resourcelock.LeasesResourceLock; otherwise controller-runtime defaults to configmapsleases.

- ""
- "coordination.k8s.io"
resources:
- configmaps
Member

There's no need for additional RBAC for configmaps, only for leases: https://github.com/openshift/cluster-machine-approver/pull/133/files#r726013249

main.go Outdated
@@ -56,7 +73,11 @@ func main() {
flagSet.StringVar(&machineNamespace, "machine-namespace", "", "restrict machine operations to a specific namespace, if not set, all machines will be observed in approval decisions")
flagSet.StringVar(&workloadKubeConfigPath, "workload-cluster-kubeconfig", "", "workload kubeconfig path")

flagSet.Parse(os.Args[1:])
// Once all the flags are registered, switch to pflag
Member

not sure I follow this comment, mind elaborating?

Contributor Author

@hanqiuzh hanqiuzh Oct 12, 2021

Sure. BindLeaderElectionFlags expects a pflag FlagSet as its input, so I converted the original flags to pflag (same as CCCMO).

Member

Isn't this exposing more than we need as flags, e.g. ResourceLock?
I would just expose the ones we need manually; then there is no need for the library.

@enxebre
Member

enxebre commented Oct 11, 2021

what is CCCMO?

main.go Outdated
// Default leader election configuration.
leaderElectionConfig = config.LeaderElectionConfiguration{
LeaderElect: true,
LeaseDuration: metav1.Duration{Duration: 137 * time.Second},
Member

why did you choose these values?

Contributor Author

@elmiko
Contributor

elmiko commented Oct 11, 2021

what is CCCMO?

https://github.com/openshift/cluster-cloud-controller-manager-operator/ the new operator for managing CCMs on openshift

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 11, 2021
@hanqiuzh
Contributor Author

hanqiuzh commented Oct 12, 2021

+1, CCCMO is cluster-cloud-controller-manager-operator. Please see the PR comment, where I added a link to the reference commit I used 👍

@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 12, 2021
@enxebre
Member

enxebre commented Oct 13, 2021

Thanks! Just one comment #133 (comment)

@enxebre
Member

enxebre commented Oct 13, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 13, 2021
@hanqiuzh
Contributor Author

/test e2e-aws

@hanqiuzh
Contributor Author

/test e2e-upgrade

@hanqiuzh
Contributor Author

/retest

@@ -58,6 +66,12 @@ func main() {
flagSet.StringVar(&workloadKubeConfigPath, "workload-cluster-kubeconfig", "", "workload kubeconfig path")
flagSet.BoolVar(&disableStatusController, "disable-status-controller", false, "disable status controller that will update the machine-approver clusteroperator status")

flagSet.BoolVar(&leaderElect, "leader-elect", true, "use leader election when starting the manager.")
Member

do we want it true by default?
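
For context on the default question, here is a minimal sketch using only the standard library `flag` package. The helper `parseLeaderElect` and the flag set name are hypothetical and exist only for illustration; the flag name, default, and help text are copied from the diff above. It shows that with a default of true, leader election stays on unless a deployment passes `--leader-elect=false` explicitly.

```go
package main

import (
	"flag"
	"fmt"
)

// parseLeaderElect is a hypothetical helper (not part of the PR). It registers
// a --leader-elect flag with a default of true, as in the diff above, and
// returns the parsed value for the given arguments.
func parseLeaderElect(args []string) (bool, error) {
	var leaderElect bool
	flagSet := flag.NewFlagSet("machine-approver", flag.ContinueOnError)
	flagSet.BoolVar(&leaderElect, "leader-elect", true, "use leader election when starting the manager.")
	if err := flagSet.Parse(args); err != nil {
		return false, err
	}
	return leaderElect, nil
}

func main() {
	// First run uses the default; second run opts out explicitly.
	for _, args := range [][]string{{}, {"--leader-elect=false"}} {
		enabled, err := parseLeaderElect(args)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("args=%v leaderElect=%v\n", args, enabled)
	}
}
```

With a true default, single-replica deployments still work (the only pod wins the election), and multi-replica deployments are protected unless someone opts out.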

Contributor

@JoelSpeed JoelSpeed left a comment

One quick change, otherwise LGTM

main.go Outdated
@@ -90,7 +104,14 @@ func main() {
// Create a new Cmd to provide shared dependencies and start components
klog.Info("setting up manager")
mgr, err := manager.New(workloadConfig, manager.Options{
MetricsBindAddress: metricsPort,
MetricsBindAddress: metricsPort,
Contributor

Please add ReleaseOnCancel: true; otherwise this will cause delays of approximately 3 minutes during upgrades.

Contributor Author

updated with LeaderElectionReleaseOnCancel: true
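
To illustrate why releasing the lease matters, here is a toy in-memory model. It is not the client-go or controller-runtime implementation: the `lease` type, `takeoverDelay`, the pod names, and the one-second retry period are all hypothetical. Only the 137-second LeaseDuration comes from the diff above.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// defaultLeaseDuration matches the LeaseDuration default in the diff above.
	defaultLeaseDuration = 137 * time.Second
	// retryPeriod is an assumed polling interval for this toy model only.
	retryPeriod = time.Second
)

// lease is a toy stand-in for a leader election lock record.
type lease struct {
	holder   string
	renewed  time.Time
	duration time.Duration
}

// tryAcquire grants the lease to candidate if it is unheld, already held by
// candidate, or has not been renewed within the lease duration.
func (l *lease) tryAcquire(candidate string, now time.Time) bool {
	expired := now.Sub(l.renewed) >= l.duration
	if l.holder == "" || l.holder == candidate || expired {
		l.holder = candidate
		l.renewed = now
		return true
	}
	return false
}

// release clears the holder; this is how release-on-cancel is modeled here.
func (l *lease) release(holder string) {
	if l.holder == holder {
		l.holder = ""
	}
}

// takeoverDelay returns how long a new pod waits to become leader after the
// old leader stops immediately following its last renewal.
func takeoverDelay(releaseOnCancel bool, leaseDuration, retry time.Duration) time.Duration {
	if retry <= 0 {
		retry = time.Second // guard against a non-advancing loop
	}
	start := time.Unix(0, 0)
	l := &lease{duration: leaseDuration}
	l.tryAcquire("pod-a", start) // old leader holds the lease
	if releaseOnCancel {
		l.release("pod-a") // graceful shutdown gives the lease up
	}
	for elapsed := time.Duration(0); ; elapsed += retry {
		if l.tryAcquire("pod-b", start.Add(elapsed)) {
			return elapsed
		}
	}
}

func main() {
	fmt.Println("without release:", takeoverDelay(false, defaultLeaseDuration, retryPeriod))
	fmt.Println("with release:   ", takeoverDelay(true, defaultLeaseDuration, retryPeriod))
}
```

In this model the successor waits a full lease duration when the old leader exits without releasing, and acquires immediately when it does release. Real delays also depend on renew deadlines and retry periods, which this sketch ignores, so it will not reproduce the roughly 3 minute figure exactly.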

Comment on lines +167 to +171
go func() {
<-mgr.Elected()
statusController.Run(1, stop)
}()
Contributor

Weird that we run this separately, I'll make a note to look into this
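
The gating pattern in the snippet above can be sketched with the standard library alone. `fakeManager` is a hypothetical stand-in for the controller-runtime manager and models only the behavior that the `Elected()` channel is closed once this instance becomes leader. Unlike the PR code, `runWhenElected` also watches a stop channel so a non-leader goroutine can exit; that is a variation for illustration, not what the PR does.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeManager is a hypothetical stand-in for the controller-runtime manager.
type fakeManager struct {
	elected chan struct{}
	once    sync.Once
}

func newFakeManager() *fakeManager {
	return &fakeManager{elected: make(chan struct{})}
}

// Elected returns a channel that is closed once this instance is the leader.
func (m *fakeManager) Elected() <-chan struct{} { return m.elected }

// becomeLeader simulates winning the election; safe to call more than once.
func (m *fakeManager) becomeLeader() { m.once.Do(func() { close(m.elected) }) }

// runWhenElected blocks until leadership is acquired or stop is closed.
// It calls run only in the first case and reports whether run was invoked.
func runWhenElected(m *fakeManager, stop <-chan struct{}, run func()) bool {
	select {
	case <-m.Elected():
		run()
		return true
	case <-stop:
		return false
	}
}

func main() {
	mgr := newFakeManager()
	stop := make(chan struct{})
	done := make(chan bool)

	// Mirrors the goroutine in the diff: wait for election, then start.
	go func() {
		done <- runWhenElected(mgr, stop, func() {
			fmt.Println("status controller started")
		})
	}()

	mgr.becomeLeader()
	fmt.Println("started:", <-done)
}
```

Because the status controller is not registered with the manager in this design, it has to wait on the election signal itself, which is why the separate goroutine is needed.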

Signed-off-by: Hanqiu Zhang <hanzhang@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 15, 2021
@JoelSpeed
Contributor

/approve

@openshift-ci
Contributor

openshift-ci bot commented Oct 15, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 15, 2021
@hanqiuzh
Contributor Author

/retest

1 similar comment
@hanqiuzh
Contributor Author

/retest

@openshift-ci
Contributor

openshift-ci bot commented Oct 15, 2021

@hanqiuzh: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-disruptive a135d50 link false /test e2e-aws-disruptive

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Contributor

@elmiko elmiko left a comment

thanks @hanqiuzh !
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 18, 2021
@openshift-merge-robot openshift-merge-robot merged commit 7b7289a into openshift:master Oct 18, 2021
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

5 participants