Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HA support to endpoints controller #479

Merged
merged 2 commits into from
Apr 8, 2021

Conversation

thisisnotashwin
Copy link
Contributor

@thisisnotashwin thisisnotashwin commented Apr 8, 2021

Changes proposed in this PR:

  • Allow for leader election which will allow us to run multiple instances of the endpoints controller. When one of the instances gets deleted, another running instance grabs the lease and becomes the new leader.
  • All instances will serve the webhook server but the requests will be round robin-ed between them.

How I've tested this PR:

  • Created an image from it and ran the acceptance tests against it. I had to update some of the RBAC to ensure the controller has CRUD access on configmaps and also can read from leases (this was a change introduced by the update to the latest controller runtime that was made to the config-entry controller and was now made to connect-inject controller. The acceptance tests passed with the number of replicas of the connect-inject deployment set to 3.
  • Manually deployed the controller with 3 instances and deleted the leader to observe another pod grab the lease and become the new leader while reconciling the endpoints.
  • ashwinvenkatesh/consul-k8s@sha256:2dc7eb5523645ecc7198212af8fa5056286ac01d9a015a5afcc5b4e95d50d695

How I expect reviewers to test this PR:

  • Hit approve 😛

@thisisnotashwin thisisnotashwin requested review from kschoche, a team and lkysow and removed request for a team April 8, 2021 20:48
@thisisnotashwin thisisnotashwin added type/enhancement New feature or request theme/tproxy Items related to transparent proxy labels Apr 8, 2021
Copy link
Contributor

@kschoche kschoche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love that your comments are 10x your LOC changes 🗡️

Okay so silly question: I don't think we need to include unit testing for the manager's leader election itself but should we make an acceptance test of sorts to validate it end to end?
I'm ok with "no".

Copy link
Member

@lkysow lkysow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only all tickets were this easy!

@thisisnotashwin
Copy link
Contributor Author

we make an acceptance test of sorts to validate it end to end?

That is a great idea. I did run the acceptance test with multiple instances running to ensure that there isnt a "breaking" change and manually checked that leaders got swapped when a pod got deleted. I guess i am on the fence about writing an acceptance test for this specifically only stems from the fact that we did not build in the leader election logic, and if someone makes a mistakes the existing tests should fail 😄

@lkysow
Copy link
Member

lkysow commented Apr 8, 2021

Okay so silly question: I don't think we need to include unit testing for the manager's leader election itself but should we make an acceptance test of sorts to validate it end to end?

We should bump our replicas to 2 at some point. Then I think our acceptance tests as-is would be a good test.

@lkysow
Copy link
Member

lkysow commented Apr 8, 2021

If we added upgrade testing this would also be a good test since during the deploy they'll need to hand-off elections.

@thisisnotashwin
Copy link
Contributor Author

If we added upgrade testing this would also be a good test since during the deploy they'll need to hand-off elections.

That sounds like a good idea.

@lkysow lkysow mentioned this pull request Apr 8, 2021
Base automatically changed from tproxy-remove-cleanupcontroller to feature-tproxy April 8, 2021 22:48
- The controller manager allows for multiple instances of the controller
  to run using leader election. By configuring leader election, one of
the instances of the controller (the leader) will run the
reconciler. All instances can serve the webhook though, but only on of
the instances will inject the containers per request.

This requires additional priveleges from an RBAC perspective that will
be a part of a helm commit.
@thisisnotashwin thisisnotashwin merged commit 2d8fb8c into feature-tproxy Apr 8, 2021
@thisisnotashwin thisisnotashwin deleted the add-leader-election branch April 8, 2021 22:52
ishustava pushed a commit that referenced this pull request Apr 14, 2021
* Support HA for the endpoints controller

- The controller manager allows for multiple instances of the controller to run using leader election. By configuring leader election, one of the instances of the controller (the leader) will run the reconciler. All instances can serve the webhook though, but only on the instances will inject the containers per request.

This requires additional priveleges from an RBAC perspective that will be a part of a helm commit.
ndhanushkodi pushed a commit to ndhanushkodi/consul-k8s that referenced this pull request Jul 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/tproxy Items related to transparent proxy type/enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants