Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent unmanaged pods in GKE's containerd flavors #18486

Merged
merged 2 commits into from Feb 7, 2022
Merged

Prevent unmanaged pods in GKE's containerd flavors #18486

merged 2 commits into from Feb 7, 2022

Conversation

bmcustodio
Copy link
Contributor

@bmcustodio bmcustodio commented Jan 14, 2022

The changes that we have been doing to /etc/defaults/kubelet are reset on node reboots, as is apparently the whole /etc directory — which also means that /etc/cni/net.d/05-cilium.conf is removed.

This would not be a problem if the assumption we made that the node taint we recommend placing on the nodes would come back upon reboots held true, but in practice it doesn't.

Besides this, it seems that containerd will re-instante its CNI configuration file, and it will do so way before Cilium has had the chance to re-run on the node and re-create its CNI configuration, causing pods to be assigned IPs by the default CNI rather than by Cilium in the meantime.

This commit attempts at preventing that from happening by observing that /home/kubernetes/bin/kubelet (i.e. the actual kubelet binary) is kept between reboots and executed concurrently with containerd by systemd. We leverage on this empirical observation to replace this file kubelet with a wrapper script that, under the required conditions, disables containerd, patches its configuration, removes undesired CNI configuration files, re-enables containerd and becomes the kubelet.

Additionally, to prevent situations in which the GKE node is forcibly stopped and re-created from causing unmanaged pods, and building on the observation that the node comes back with the same name and pods are already scheduled there, we change the recommended taint effect from NoSchedule to NoExecute, to cause any previously scheduled pods to be evicted, preventing them from getting IPs assigned by the default CNI.This should not impact other environments due to the nature of NoExecute, so we recommend it everywhere.

Prevent unmanaged pods in GKE's containerd flavors.
*Important:* Users should update their node taints from `node.cilium.io/agent-not-ready=true:NoSchedule` to `node.cilium.io/agent-not-ready=true:NoExecute`.
*Important:* During the first node reboot after the fix is applied pods may still get IPs from the default CNI as cilium-node-init is only run later in the node startup process. The fix will then be in place for all subsequent reboots.

@bmcustodio bmcustodio requested review from a team as code owners January 14, 2022 15:00
@bmcustodio bmcustodio requested review from a team and qmonnet January 14, 2022 15:00
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 14, 2022
@bmcustodio bmcustodio added backport/1.10 backport/1.11 This PR represents a backport for Cilium 1.11.x of a PR that was merged to main. labels Jan 14, 2022
@aanm aanm added needs-backport/1.10 release-note/bug This PR fixes an issue in a previous release of Cilium. and removed backport/1.10 backport/1.11 This PR represents a backport for Cilium 1.11.x of a PR that was merged to main. labels Jan 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.11.1 Jan 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.7 Jan 14, 2022
@bmcustodio

This comment has been minimized.

Copy link
Member

@kaworu kaworu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of question, but overall LGTM.

install/kubernetes/cilium/files/nodeinit/prestop.bash Outdated Show resolved Hide resolved
install/kubernetes/cilium/files/nodeinit/startup.bash Outdated Show resolved Hide resolved
@maintainer-s-little-helper

This comment has been minimized.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jan 17, 2022
@maintainer-s-little-helper

This comment has been minimized.

@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jan 17, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.11.2 Feb 8, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.11.2 Feb 8, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.8 Feb 8, 2022
@nebril nebril added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Feb 17, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.11 in 1.11.2 Feb 17, 2022
@joestringer
Copy link
Member

joestringer commented Feb 21, 2022

@nebril can you point to which PR did the backports of this PR for v1.11? Or switch the labels back if this was not backported to 1.11 yet?

EDIT: Found it, see the link below.

@joestringer joestringer added this to Backport pending to v1.10 in 1.10.9 Feb 23, 2022
@joestringer joestringer removed this from Backport pending to v1.10 in 1.10.8 Feb 23, 2022
@joestringer
Copy link
Member

joestringer commented Feb 24, 2022

@bmcustodio Please consider updating the official upgrade guide in future when communicating to users about necessary steps for them to take during upgrade; users are likely to miss notes marked "important" in the release notes unless both we and they are paying close attention. For v1.11.2 I will call it out explicitly in the announcements.

@joestringer
Copy link
Member

Hmm, actually looking back again at this PR I think that in the announcement I miscommunicated this because I assumed that the advice was specific to GKE users. It looks like all users must take action regarding the taints. Can you confirm?

So far this is only part of v1.11.2 so shouldn't have a huge impact but I am concerned about communicating this clearly so we don't end up having to field a large number of questions from confused users about why the taints are different than they were before.

@bmcustodio
Copy link
Contributor Author

It looks like all users must take action regarding the taints. Can you confirm?

Yes, this is correct ;)

@aanm aanm added this to Backport pending to v1.10 in 1.10.10 Mar 26, 2022
@aanm aanm removed this from Backport pending to v1.10 in 1.10.9 Mar 26, 2022
@joestringer joestringer moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.10 Apr 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. integration/cloud Related to integration with cloud environments such as AKS, EKS, GKE, etc. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.10.10
Backport done to v1.10
1.11.2
Backport done to v1.11
Development

Successfully merging this pull request may close these issues.

None yet

10 participants