New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v1.10] Backport #18486 #18737
[v1.10] Backport #18486 #18737
Conversation
The changes that we have been doing to /etc/defaults/kubelet are reset on node reboots, as is apparently the whole /etc directory --- which also means that /etc/cni/net.d/05-cilium.conf is removed. This would not be a problem if the assumption we made that the node taint we recommend placing on the nodes would come back upon reboots held true, but in practice it doesn't. Besides this, it seems that containerd will re-instante its CNI configuration file, and it will do so way before Cilium has had the chance to re-run on the node and re-create its CNI configuration, causing pods to be assigned IPs by the default CNI rather than by Cilium in the meantime. This commit attempts at preventing that from happening by observing that /home/kubernetes/bin/kubelet (i.e. the actual kubelet binary) is kept between reboots and executed concurrently with containerd by systemd. We leverage on this empirical observation to replace this file kubelet with a wrapper script that, under the required conditions, disables containerd, patches its configuration, removes undesired CNI configuration files, re-enables containerd and becomes the kubelet. [ upstream commit 36585e4 ] Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> Co-authored-by: Alexandre Perrin <alex@kaworu.ch> Co-authored-by: Chris Tarazi <chris@isovalent.com>
To prevent situations in which the GKE node is forcibly stopped and re-created from causing unmanaged pods, and building on the observation that the node comes back with the same name and pods are already scheduled there, we change the recommended taint effect from NoSchedule to NoExecute, to cause any previously scheduled pods to be evicted, preventing them from getting IPs assigned by the default CNI. This should not impact other environments due to the nature of 'NoExecute', so we recommend it everywhere. [ upstream commit b049574 ] Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> Co-authored-by: Tam Mach <sayboras@yahoo.com>
/test-backport-1.10 Job 'Cilium-PR-K8s-GKE' failed and has not been observed before, so may be related to your PR: Click to show.Test Name
Failure Output
If it is a flake, comment Job 'Cilium-PR-K8s-1.20-kernel-4.19' failed and has not been observed before, so may be related to your PR: Click to show.Test Name
Failure Output
If it is a flake, comment Job 'Cilium-PR-K8s-1.16-net-next' failed and has not been observed before, so may be related to your PR: Click to show.Test Name
Failure Output
If it is a flake, comment |
@bmcustodio What's the status on this please? Was it expected to drop the backport? Can we please update the |
Apologies, I see this was backported in #18835 instead, sorry for the noise. |
Backport of #18486 to
v1.10
(cc @joamaki).