Fix cilium installation in GCloud beta "rapid" channel #9959
Release note label not set, please set the appropriate release note.
If the nodeinit script runs somewhere where the BPFFS is already mounted, then it is highly likely that existing infrastructure handles the auto-mounting of the BPF filesystem to /sys/fs/bpf, for instance because they are running systemd 239 or later.

In this case, don't always create the BPFFS mount unit configuration for systemd, otherwise systemd may reject it with:

    Failed to start sys-fs-bpf.mount: Unit sys-fs-bpf.mount has a bad unit file setting.

This is because systemd has deemed the BPFFS mounting to be within its own domain of control, and that others should not configure it.

This caused issues in beta gcloud environments where the nodeinit would fail because the systemd mount unit would not mount. This prevented configuration of the kubelet on each node, meaning that Cilium would not be configured as the CNI in the environment. However, the Cilium DS itself would run, leading to situations where pods would attempt to connect to remote services (eg kube-dns to the APIserver) and fail to connect in an environment where the Cilium agents themselves otherwise appear to be healthy. A cursory investigation of `cilium endpoint list` in such environments would clearly show that no pods are being managed by Cilium.

Fixes: cilium#9556
Signed-off-by: Joe Stringer <joe@cilium.io>
60f8497 to 79d2ef2
test-me-please
EDIT: Only these failed:
So, not consistently (1 per k8s version), and not related to BPFFS (restore etc). I'm not even sure where nodeinit is used in the CI. The k8s tests with other versions all passed, though, so I think it's good from a CI perspective.
If the nodeinit script runs somewhere where the BPFFS is already
mounted, then it is highly likely that existing infrastructure handles
the auto-mounting of the BPF filesystem to /sys/fs/bpf, for instance
because they are running systemd 239 or later.
In this case, don't always create the BPFFS mount unit configuration for
systemd, otherwise systemd may reject it with:

    Failed to start sys-fs-bpf.mount: Unit sys-fs-bpf.mount has a bad unit file setting.

This is because systemd has deemed the BPFFS mounting to be within its
own domain of control, and that others should not configure it.
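
The check described above can be sketched roughly as follows. This is an illustration only, not the actual change to the nodeinit script: the function names `bpffs_mounted` and `bpffs_action` are hypothetical, and the sketch assumes the mount table is readable in the usual /proc/mounts format. It only decides whether the mount unit should be installed; writing the unit file itself is left out.

```shell
#!/bin/sh
# Hypothetical helper (not from the actual patch): succeeds if a filesystem of
# type "bpf" is mounted at the given mount point. The mounts file defaults to
# /proc/mounts and is a parameter only so the check can be exercised offline.
bpffs_mounted() {
    mount_point="${1:-/sys/fs/bpf}"
    mounts_file="${2:-/proc/mounts}"
    # /proc/mounts fields: device, mount point, fs type, options, dump, pass
    awk -v mp="$mount_point" \
        '$2 == mp && $3 == "bpf" { found = 1 } END { exit !found }' \
        "$mounts_file"
}

# Hypothetical decision step: print "skip" when BPFFS is already mounted
# (something else, such as systemd, is assumed to manage it), otherwise print
# "install-unit" to signal that the mount unit should be created.
bpffs_action() {
    if bpffs_mounted "$@"; then
        echo "skip"
    else
        echo "install-unit"
    fi
}
```

On a node, `bpffs_action` with no arguments inspects /sys/fs/bpf in /proc/mounts; the script would write and start sys-fs-bpf.mount only when the result is `install-unit`.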
This caused issues in beta gcloud environments where the nodeinit would
fail because the systemd mount unit would not mount. This prevented
configuration of the kubelet on each node, meaning that Cilium would not
be configured as the CNI in the environment. However, the Cilium DS
itself would run, leading to situations where pods would attempt to
connect to remote services (e.g. kube-dns to the API server) and fail to
connect in an environment where the Cilium agents themselves otherwise
appear to be healthy. A cursory investigation of `cilium endpoint list`
in such environments would clearly show that no pods are being managed
by Cilium.
Fixes: #9556