✨ Add support for kube-vip #320

Merged on May 17, 2022 (27 commits)

Conversation

davidspek
Contributor

Replacement of: detiber#65.
What this PR does / why we need it:
This replaces the binding of an elastic IP to one of the control plane nodes with deploying kube-vip and having it load balance the Kubernetes API between the various control plane nodes.

@linux-foundation-easycla

linux-foundation-easycla bot commented Apr 8, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot
Contributor

Welcome @davidspek!

It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-packet 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-provider-packet has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Apr 8, 2022
@k8s-ci-robot k8s-ci-robot requested a review from deitch April 8, 2022 13:24
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 8, 2022
@davidspek
Contributor Author

@cprivitere I still need to validate that nothing went wrong during the rebase, but here is the PR in any case.

@davidspek
Contributor Author

@cprivitere Looks like nothing strange happened during the rebasing. I will do a test deployment though to be sure, but it should be good to continue on from here.

@WillsonHG

/easycla

@cprivitere
Member

cprivitere commented Apr 8, 2022

@davidspek I recently changed my GitHub username from @cprivite to @cprivitere. As a result, the email address associated with my GitHub account is now 23177737+cprivitere@users.noreply.github.com. EasyCLA can no longer validate that I've signed the CLA, which blocks this PR from going through. It sounds like you need to either squash all the commits so they are authored only by you, or update the author of each commit to my new username. Do you have a preference?

cprivitere and others added 10 commits April 11, 2022 17:43
Signed-off-by: Chris Privitere <cprivite@users.noreply.github.com>
Signed-off-by: Chris Privitere <cprivite@users.noreply.github.com>
Signed-off-by: Chris Privitere <cprivite@users.noreply.github.com>
Signed-off-by: Chris Privitere <cprivite@users.noreply.github.com>
Signed-off-by: Chris Privitere <cprivite@users.noreply.github.com>
Signed-off-by: Chris Privitere <cprivite@users.noreply.github.com>
@cprivitere
Member

It looks like prow isn't managing the CI jobs. Not sure what this does but it is waiting for approval. https://github.com/kubernetes-sigs/cluster-api-provider-packet/actions/runs/2150416972

Yeah, I've already messaged @detiber for help.

Member

@displague displague left a comment

It looks like my review comments are a little late and some discussion has already happened about making kube-vip non-default. I have some concerns about BGP being enabled by default.

// - bgpConfig struct does not have Status=="disabled"
if err == nil && bgpConfig != nil && bgpConfig.ID != "" && strings.ToLower(bgpConfig.Status) != "disabled" {
	return nil
}
Member

What will we do if the BGPConfig Get request fails? Invalid project or token, timeout, API availability issues?
Let's handle err != nil here. We should log at the very least.

If BGP cannot be enabled, we can return that error. It will cascade through the reconciliation loop, and the next reconciliation will attempt to enable BGP again. That sounds good. The log messages will be helpful if the resource cannot resolve because of this.
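
A minimal sketch of the error handling requested above, not the code that was merged. The `bgpClient` interface, the `BGPConfig` struct, and the `fakeClient` helper are hypothetical stand-ins for the real Equinix Metal client types; only the `ID` and `Status` fields come from the snippet under review. A failed lookup is logged and returned so the next reconciliation can retry, instead of being treated the same as "BGP is not enabled yet".

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"strings"
)

// BGPConfig is a hypothetical stand-in for the provider's BGP config type.
// Only the two fields used by the check under review are modeled.
type BGPConfig struct {
	ID     string
	Status string
}

// bgpClient is a hypothetical interface, not the real client API.
type bgpClient interface {
	GetBGPConfig(projectID string) (*BGPConfig, error)
	CreateBGPConfig(projectID string) error
}

// EnableProjectBGP returns nil when BGP is already enabled, surfaces lookup
// failures (invalid token, timeout, API outage) instead of ignoring them,
// and otherwise asks the API to enable BGP for the project.
func EnableProjectBGP(c bgpClient, projectID string) error {
	cfg, err := c.GetBGPConfig(projectID)
	if err != nil {
		log.Printf("failed to get BGP config for project %s: %v", projectID, err)
		return fmt.Errorf("getting BGP config for project %s: %w", projectID, err)
	}
	if cfg != nil && cfg.ID != "" && strings.ToLower(cfg.Status) != "disabled" {
		return nil // already enabled, nothing to do
	}
	if err := c.CreateBGPConfig(projectID); err != nil {
		log.Printf("failed to enable BGP for project %s: %v", projectID, err)
		return fmt.Errorf("enabling BGP for project %s: %w", projectID, err)
	}
	return nil
}

// errFakeTimeout and fakeClient exist only to exercise the sketch.
var errFakeTimeout = errors.New("fake: request timed out")

type fakeClient struct {
	cfg     *BGPConfig
	getErr  error
	created bool
}

func (f *fakeClient) GetBGPConfig(string) (*BGPConfig, error) { return f.cfg, f.getErr }
func (f *fakeClient) CreateBGPConfig(string) error            { f.created = true; return nil }
```

With `fakeClient{getErr: errFakeTimeout}`, `EnableProjectBGP` returns a wrapped error and never calls `CreateBGPConfig`, which is the behavior the review comment asks for.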

// when the node is a control plan we should check if the elastic ip
// for this cluster is not assigned. If it is free we can prepare the
// current node to use it.
// when the node is a control plan we need the elastic IP
Member

Suggested change
- // when the node is a control plan we need the elastic IP
+ // when a node is a control plane node we need the elastic IP

@@ -137,6 +137,11 @@ func (r *PacketClusterReconciler) reconcileNormal(ctx context.Context, clusterSc
}
}

if err := r.PacketClient.EnableProjectBGP(packetCluster.Spec.ProjectID); err != nil {
Member

BGP + KubeVIP mode should be optional

Member

Once enabled in a project, BGP cannot be easily disabled, and some of the initial settings cannot be modified. We shouldn't enable this by default for everyone.

@@ -362,6 +353,11 @@ func (r *PacketMachineReconciler) reconcile(ctx context.Context, machineScope *s
machineScope.SetProviderID(dev.ID)
machineScope.SetInstanceStatus(infrav1.PacketResourceStatus(dev.State))

if err := r.PacketClient.EnsureNodeBGPEnabled(dev.ID); err != nil {
Member

This should be optional, based on a CAPP setting.

Member

You'll want to look at the latest version.
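
One way to make the BGP calls opt-in, as the review comments above request, is to gate them on a cluster-level setting. The sketch below is an illustration under assumptions rather than the merged implementation: the `VIPManagerType` values, the `bgpEnabler` interface, and the `recordingClient` helper are hypothetical, while the method names `EnableProjectBGP` and `EnsureNodeBGPEnabled` are taken from the diff hunks in this thread.

```go
package main

// VIPManagerType selects how the control plane VIP is managed.
// The constant values are assumptions made for this illustration.
type VIPManagerType string

const (
	VIPManagerCPEM    VIPManagerType = "CPEM"
	VIPManagerKubeVIP VIPManagerType = "KUBE_VIP"
)

// bgpEnabler is a hypothetical interface exposing the two calls
// that appear in the diff under review.
type bgpEnabler interface {
	EnableProjectBGP(projectID string) error
	EnsureNodeBGPEnabled(deviceID string) error
}

// reconcileClusterBGP enables project-level BGP only when kube-vip
// was explicitly selected, so other users never have BGP turned on.
func reconcileClusterBGP(c bgpEnabler, mgr VIPManagerType, projectID string) error {
	if mgr != VIPManagerKubeVIP {
		return nil
	}
	return c.EnableProjectBGP(projectID)
}

// reconcileMachineBGP applies the same gate to the per-device call.
func reconcileMachineBGP(c bgpEnabler, mgr VIPManagerType, deviceID string) error {
	if mgr != VIPManagerKubeVIP {
		return nil
	}
	return c.EnsureNodeBGPEnabled(deviceID)
}

// recordingClient records calls; it exists only to exercise the sketch.
type recordingClient struct {
	calls []string
}

func (r *recordingClient) EnableProjectBGP(projectID string) error {
	r.calls = append(r.calls, "project:"+projectID)
	return nil
}

func (r *recordingClient) EnsureNodeBGPEnabled(deviceID string) error {
	r.calls = append(r.calls, "node:"+deviceID)
	return nil
}
```

Keeping the check in one small function per call site means the irreversible project-level change only happens after an explicit opt-in.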

@@ -222,7 +223,7 @@ func (r *PacketMachineReconciler) PacketClusterToPacketMachines(ctx context.Cont
}
}

- func (r *PacketMachineReconciler) reconcile(ctx context.Context, machineScope *scope.MachineScope) (ctrl.Result, error) { //nolint:gocyclo
+ func (r *PacketMachineReconciler) reconcile(ctx context.Context, machineScope *scope.MachineScope) (ctrl.Result, error) {
Member

Suggested change
- func (r *PacketMachineReconciler) reconcile(ctx context.Context, machineScope *scope.MachineScope) (ctrl.Result, error) {
+ func (r *PacketMachineReconciler) reconcile(ctx context.Context, machineScope *scope.MachineScope) (ctrl.Result, error) { //nolint:gocyclo

Member

I think we need to add this nolint comment back in.

@k8s-ci-robot k8s-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 12, 2022
Signed-off-by: DavidSpek <vanderspek.david@gmail.com>
@davidspek
Contributor Author

@cprivitere @displague Sorry for the slow response here. I've just pushed the suggested changes.

@cprivitere
Member

@davidspek can you run make generate and re-push?

@cprivitere cprivitere changed the title ✨ Add support for kube-vip [WIP] ✨ Add support for kube-vip Apr 29, 2022
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 29, 2022
Signed-off-by: DavidSpek <vanderspek.david@gmail.com>
Signed-off-by: DavidSpek <vanderspek.david@gmail.com>
@davidspek
Contributor Author

@cprivitere The lint error should be resolved now

@davidspek
Contributor Author

davidspek commented May 3, 2022

I've just come across this error during the creation of a control plane node:

E0503 13:30:41.782725       1 controller.go:317] controller/packetmachine "msg"="Reconciler error" "error"="failed to enable bpg on machine eqm-cluster-small-on-demand-4wlm8: POST https://api.equinix.com/metal/v1/devices/6e646e83-08bf-468c-8395-6398ee17128c/bgp/sessions: 422 Device has no assigned IP addresses " "name"="eqm-cluster-small-on-demand-4wlm8" "namespace"="bootstrap" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="PacketMachine" 

This error should probably trigger a reconciliation so it will retry enabling BGP on the node.

EDIT:
I just noticed this error occurs when the machine is still provisioning, so it shouldn't be an issue actually.
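
Because the reconciler returns the error, the request is retried anyway, which is why the 422 is harmless. An alternative that avoids the noisy log line would be to skip the BGP call until the device is ready and requeue quietly. The sketch below assumes a hypothetical `Device` type, a local `Result` type that only mirrors the `RequeueAfter` field of controller-runtime's `ctrl.Result`, an `"active"` device state, and an arbitrary 30 second delay; none of these are confirmed by this thread.

```go
package main

import (
	"fmt"
	"time"
)

// Result is a local stand-in mirroring the RequeueAfter field of
// controller-runtime's ctrl.Result, to keep the sketch self-contained.
type Result struct {
	RequeueAfter time.Duration
}

// Device is a hypothetical, trimmed-down view of a provisioned machine.
type Device struct {
	ID    string
	State string   // assumed to be "active" once provisioning finishes
	IPs   []string // empty while the device is still provisioning
}

// ensureBGPWhenReady skips the BGP call while the device has no addresses
// and asks to be requeued; once the device is ready it calls enable.
func ensureBGPWhenReady(dev Device, enable func(deviceID string) error) (Result, error) {
	if dev.State != "active" || len(dev.IPs) == 0 {
		// Not an error: the device is still provisioning, check again later.
		return Result{RequeueAfter: 30 * time.Second}, nil
	}
	if err := enable(dev.ID); err != nil {
		return Result{}, fmt.Errorf("failed to enable bgp on machine %s: %w", dev.ID, err)
	}
	return Result{}, nil
}
```

The trade-off is an extra readiness check in the reconciler in exchange for fewer spurious "Reconciler error" entries during provisioning.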

davidspek and others added 2 commits May 4, 2022 17:23
Signed-off-by: DavidSpek <vanderspek.david@gmail.com>
…uster Type (#5)

* Convert to having the EIP_MANAGEMENT variable as part of the packetcluster type

* Make vipmanager field immutable.

Signed-off-by: Chris Privitere <23177737+cprivitere@users.noreply.github.com>

* Rename field to VIPManager

Signed-off-by: Chris Privitere <23177737+cprivitere@users.noreply.github.com>

* rename to vipmanager, fix defaults, rm services

Signed-off-by: Chris Privitere <23177737+cprivitere@users.noreply.github.com>

* Fix typo and make VIPManager an enum

Signed-off-by: Chris Privitere <23177737+cprivitere@users.noreply.github.com>
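
The commit notes above describe an enum-valued, immutable VIPManager field on the cluster type. A rough sketch of what such validation could look like follows. The enum values, the `PacketClusterSpec` shape, and the `ValidateCreate` and `ValidateUpdate` helpers are assumptions for illustration and are not copied from the merged code.

```go
package main

import "fmt"

// VIPManagerType is modeled as a string enum. The allowed values
// below are assumptions made for this illustration.
type VIPManagerType string

const (
	VIPManagerCPEM    VIPManagerType = "CPEM"
	VIPManagerKubeVIP VIPManagerType = "KUBE_VIP"
)

// PacketClusterSpec is a hypothetical, trimmed-down spec holding
// only the field discussed in the commit notes.
type PacketClusterSpec struct {
	VIPManager VIPManagerType
}

// ValidateCreate rejects values outside the enum.
func ValidateCreate(spec PacketClusterSpec) error {
	switch spec.VIPManager {
	case VIPManagerCPEM, VIPManagerKubeVIP:
		return nil
	default:
		return fmt.Errorf("vipManager: unsupported value %q", spec.VIPManager)
	}
}

// ValidateUpdate enforces the enum and treats the field as immutable:
// once a cluster is created, its VIP manager cannot be switched.
func ValidateUpdate(oldSpec, newSpec PacketClusterSpec) error {
	if err := ValidateCreate(newSpec); err != nil {
		return err
	}
	if oldSpec.VIPManager != newSpec.VIPManager {
		return fmt.Errorf("vipManager: field is immutable (was %q, got %q)",
			oldSpec.VIPManager, newSpec.VIPManager)
	}
	return nil
}
```

Immutability matters here because switching VIP managers on a live cluster would change how the API endpoint is served.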
@cprivitere
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 17, 2022
@cprivitere cprivitere changed the title [WIP] ✨ Add support for kube-vip ✨ Add support for kube-vip May 17, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 17, 2022
@cprivitere
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cprivitere, DavidSpek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 17, 2022
@cprivitere cprivitere added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels May 17, 2022
@cprivitere
Member

/retest

@k8s-ci-robot k8s-ci-robot merged commit a6d3608 into kubernetes-sigs:main May 17, 2022
@displague displague added this to the 0.6.0 milestone Jul 15, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

5 participants