kube-controller-manager doesn't respawn with 1 controller #1097
Labels added by johananl on Oct 19, 2020: bug (Something isn't working), area/kubernetes (Core Kubernetes stuff).
johananl added four commits referencing this issue (Oct 19 and Oct 20, 2020).
@johananl isn't the fix to this problem to fix this?
invidian added a commit that referenced this issue on Nov 18, 2020:
This commit fixes the 'control_plane_replicas' value passed to the Kubernetes Helm chart, which caused kube-scheduler and kube-controller-manager to run as a DaemonSet on single-control-plane-node clusters, breaking the ability to update them gracefully. It also adds tests verifying that the control plane uses the right resource type for different control plane sizes, and that both components can be gracefully updated without breaking cluster functionality. Closes #1097. Closes #90. Signed-off-by: Mateusz Gozdek <mateusz@kinvolk.io>
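For illustration, a hypothetical rendering of the Helm values after this fix might look as follows. The controllerManager.controlPlaneReplicas path comes from the chart conditional quoted in the issue body below; the exact structure and name of the real values file are not shown in this issue and may differ.

```yaml
# Hypothetical Helm values for a single-controller cluster after the fix.
# The value path follows .Values.controllerManager.controlPlaneReplicas from
# the chart template; before the fix this value was effectively clamped to 2.
controllerManager:
  controlPlaneReplicas: 1  # matches the real number of controller nodes
```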
invidian added further commits referencing this issue between Nov 18 and Dec 3, 2020, each carrying the same commit message as above.
In 3c1fb4a (#1030) we changed kube-controller-manager from a Deployment to a DaemonSet when the number of controller nodes is greater than one. However, it looks like we also create a DaemonSet when running with one controller. The reason is the following:
lokomotive/assets/charts/control-plane/kubernetes/templates/kube-controller-manager-ds.yaml (line 1 in c730066)
lokomotive/assets/terraform-modules/bootkube/assets.tf (line 72 in c730066)
As can be seen, when deploying a single controller node, max(2, length(var.etcd_servers)) evaluates to 2, which makes the conditional in the chart always true. To fix this we could change the conditional to {{- if gt (int .Values.controllerManager.controlPlaneReplicas) 2 }}.
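A rough sketch of that change is shown below. The current threshold of 1 is an assumption inferred from the behaviour described above, not copied from the chart, and the DaemonSet body is elided.

```yaml
# Sketch of the conditional at the top of kube-controller-manager-ds.yaml.
#
# Current (assumed) -- always true, because Terraform passes
# max(2, length(var.etcd_servers)), which is never less than 2:
#   {{- if gt (int .Values.controllerManager.controlPlaneReplicas) 1 }}
#
# Proposed -- render the DaemonSet only for more than two replicas:
{{- if gt (int .Values.controllerManager.controlPlaneReplicas) 2 }}
apiVersion: apps/v1
kind: DaemonSet
# ... DaemonSet spec unchanged ...
{{- end }}
```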
However, I'm not sure we ever want to use a DaemonSet for kube-controller-manager. A Deployment is better suited for this component because our goal is to have a consistent number of replicas in the cluster, not to have a replica on each node. Furthermore, I'm not sure there is a point in running more than two replicas when there are multiple controller nodes, unless we aim at surviving a dual failure (in which case there are likely many other things we would have to change). AFAICT a total of two kube-controller-manager pods should be the minimum and should suffice as the maximum as well.
Since kube-controller-manager is the element that's responsible for re-creating pods when a pod gets deleted, at least one kube-controller-manager pod must be operational at all times. Right now, when deploying a cluster with a single controller node, deleting the kube-controller-manager pod puts the cluster in an unrecoverable state because there is no kube-controller-manager to re-create the deleted kube-controller-manager pod.
I understand the concern about having multiple replicas land on a single node. Still, I think we should figure out other ways to solve that problem (one possible direction is sketched below), and in any case the situation is currently worse than it was before #1030 when running with a single controller node: before that PR we would have a broken control plane when the controller node died, but we would recover from pod deletions. Now the cluster still breaks when the controller node dies, but we no longer recover from deleting the kube-controller-manager pod (and likely from evictions of that pod, too).
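One possible alternative, offered here only as an illustration and not as what the chart currently does, is a Deployment with two replicas plus preferred pod anti-affinity, so replicas spread across controller nodes when more than one exists but can still co-locate on a single-controller cluster. All names, labels, and the image tag below are illustrative.

```yaml
# Sketch of a possible Deployment-based layout for kube-controller-manager.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: kube-controller-manager
  template:
    metadata:
      labels:
        k8s-app: kube-controller-manager
    spec:
      affinity:
        podAntiAffinity:
          # Prefer spreading replicas across controller nodes, but still allow
          # both replicas on one node in single-controller clusters.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  k8s-app: kube-controller-manager
      containers:
      - name: kube-controller-manager
        image: k8s.gcr.io/kube-controller-manager:v1.19.0  # illustrative tag
```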
Related issues:
@surajssd - FYI.