Cilium v1.13.1 install CNI plugin failed #1911

Closed
2 tasks done
DesireWithin opened this issue Mar 20, 2023 · 13 comments
Labels
kind/bug Something isn't working kind/community-report This was reported by a user in the Cilium community, eg via Slack. sig/agent

Comments

@DesireWithin

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

I set up a generic k8s v1.26.2 cluster with kubeadm on KVM guests running Ubuntu 22.04; the kernel version is 5.15.0-60-generic.
I want to install cilium v1.13.1 as the pod network.
Here are my steps:

  1. Download cilium-cli from https://github.com/cilium/cilium-cli/releases/download/v0.13.1/cilium-linux-amd64.tar.gz
  2. Extract it to /usr/local/bin
  3. Run cilium install:
$> cilium install  --wait-duration 5m --helm-values helm.yaml

Here is my helm.yaml:

image:
  repository: harbor.mycompany.net/k8s/cilium
  tag: v1.13.1
  useDigest: false
  pullPolicy: "IfNotPresent"
operator:
  image:
    repository: harbor.mycompany.net/k8s/cilium-operator
    tag: v1.13.1
    useDigest: false
    pullPolicy: "IfNotPresent"
  4. cilium reports that the installation succeeded:
# notice this line
ℹ️  Using Cilium version 1.13.0    
...
ℹ️  helm template --namespace kube-system cilium cilium/cilium --version 1.13.0 --set cluster.id=0,cluster.name=kubernetes,encryption.nodeEncryption=false,image.pullPolicy=IfNotPresent,image.repository=harbor.mycompany.net/k8s/cilium,image.tag=v1.13.1,image.useDigest=false,kubeProxyReplacement=disabled,operator.image.pullPolicy=IfNotPresent,operator.image.repository=harbor.mycompany.net/k8s/cilium-operator,operator.image.tag=v1.13.1,operator.image.useDigest=false,operator.replicas=1,serviceAccounts.cilium.name=cilium,serviceAccounts.operator.name=cilium-operator,tunnel=vxlan
...
  5. coredns and other pods can't be created because:
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "36b29...": plugin type="cilium-cni" name ="cilium" failed (add): failed to find plugin "cilium-cni" in path [/opt/cni/bin]  

I checked the hostPath /opt/cni/bin on the node, and there is no cilium-cni there. Inside the cilium agent pod, cilium-cni exists in /opt/cni/bin but not in /host/opt/cni/bin/.

After I downgraded both the cilium agent and the operator to v1.13.0, the issue went away, so I think this is a bug specific to v1.13.1.

I can reproduce this bug easily by changing the image tag from v1.13.0 to v1.13.1 or by reinstalling cilium.
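
For anyone who wants to repeat the check described above, here is a minimal sketch. The function name `check_cni_plugin` is hypothetical (not part of Cilium or its CLI); it only tests whether an executable `cilium-cni` file exists in a given directory, defaulting to the /opt/cni/bin path from the error message.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not a Cilium tool): report whether
# an executable cilium-cni binary exists in a CNI plugin directory.
# Default directory: /opt/cni/bin, the path named in the sandbox error.
check_cni_plugin() {
    dir="${1:-/opt/cni/bin}"
    if [ -x "$dir/cilium-cni" ]; then
        echo "present: $dir/cilium-cni"
    else
        echo "missing: $dir/cilium-cni"
        return 1
    fi
}
```

Run `check_cni_plugin` on the node itself to inspect /opt/cni/bin. To look at the host mount from inside the agent, something like `kubectl -n kube-system exec ds/cilium -- ls /host/opt/cni/bin` should work, assuming the DaemonSet is named `cilium` as in a default install.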

Cilium Version

cilium-cli: v0.13.1 compiled with go1.20 on linux/amd64
cilium image (default): v1.13.0
cilium image (stable): unknown #my servers are offline
cilium image (running): v1.13.1

Kernel Version

Linux app-infra-k8s-master01 5.15.0-60-generic #66-Ubuntu SMP Fri Jan 20 14:29:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-02-22T13:39:03Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-02-22T13:32:22Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}

Sysdump

None

Relevant log output

None

Anything else?

I noticed that cilium-cli only supports up to v1.13.0 as a stable release for now (even though its own version is v0.13.1):

$> cilium install --list-versions
v1.14.0-snapshot.0
v1.13.0 (default)
v1.13.0-rc5
v1.13.0-rc4
v1.13.0-rc3
v1.13.0-rc2
v1.13.0-rc1
v1.13.0-rc0
...

I checked the permissions of /opt/cni/bin on the host:

drwxrwxr-x 2 root root 4096 Mar 20 09:20 /opt/cni/bin/

Code of Conduct

  • I agree to follow this project's Code of Conduct
@DesireWithin DesireWithin added kind/bug Something isn't working kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish the root cause. labels Mar 20, 2023
@sayboras sayboras changed the title from "Cilimi v1.13.1 install CNI plugin failed" to "Cilium v1.13.1 install CNI plugin failed" Mar 20, 2023
@youngnick youngnick added sig/agent and removed needs/triage This issue requires triaging to establish the root cause. labels Mar 21, 2023
@wbh1

wbh1 commented Mar 21, 2023

Probably related to the change implemented here: cilium/cilium#24075

In my case, the manifests generated by Kubespray did not include the new init container, so the Cilium installation never added it (PR to fix)

dghubble referenced this issue in poseidon/terraform-render-bootstrap Mar 29, 2023
* Starting in Cilium v1.13.1, the cilium-cni plugin is installed
via an init container rather than by the Cilium agent container

Rel: https://github.com/cilium/cilium/issues/24457
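
Since the plugin is now copied by an init container, one way to see whether a deployed manifest predates that change is to list the init containers of the agent DaemonSet. The sketch below is illustrative only: `has_container` is a hypothetical helper, and the container name `install-cni-binaries` used in the usage note is an assumption that should be checked against the Helm chart for your Cilium version.

```shell
#!/bin/sh
# Hypothetical helper: read a space-separated list of container names on
# stdin and succeed only if "$1" is one of them (exact match).
has_container() {
    tr ' ' '\n' | grep -Fxq -- "$1"
}
```

Usage, assuming the DaemonSet is named `cilium` in `kube-system`: `kubectl -n kube-system get daemonset cilium -o jsonpath='{.spec.template.spec.initContainers[*].name}' | has_container install-cni-binaries`. A non-zero exit status would suggest the manifest was rendered from templates that do not include the init container.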
@christarazi
Member

@wbh1 Is your issue fixed with kubernetes-sigs/kubespray#9914 or is this issue separate from kubespray?

@jqueuniet

I am also hitting this issue with Cilium 1.13.1 on Fedora CoreOS hosts when using the Cilium CLI directly. After rolling back to 1.13.0, everything works again.

@ccieliu

ccieliu commented Apr 20, 2023

Same issue on v1.12.8: there is no "cilium-cni" under /opt/cni/bin on the host.
After rolling back to v1.12.7, the "cilium-cni" file is back in /opt/cni/bin on the host.
Could anyone let me know how to fix this issue?

@RobM83

RobM83 commented Apr 23, 2023

I got the same issue after upgrading from v1.12.2 with the cilium CLI (cilium upgrade --version v1.13.2).
The upgrade introduced the same problem.

@yangvipguang

Same issue with cilium v1.13.2 when installing with kubespray.

@Nerkho

Nerkho commented Jun 29, 2023

Same issue on bare metal nodes after upgrading to 1.13.3 with cilium cli.

Downgraded back to 1.13.0 for now as a workaround.

@joestringer
Member

Do you observe this issue with the Cilium-CLI v0.14.8? https://github.com/cilium/cilium-cli/releases/tag/v0.14.8 (cf. current releases of CLI: https://github.com/cilium/cilium-cli#releases)

@mrlhansen

Yesterday I got the same error upgrading from v1.12.2 to v1.12.12 using the cilium cli. After downgrading to v1.12.7 the /opt/cni/bin/cilium-cni file was back again and everything is working. I updated cilium-cli to v0.14.8 as suggested above, and tried again, but upgrading to v1.12.12 still doesn't work.

@Nerkho

Nerkho commented Aug 8, 2023

> Do you observe this issue with the Cilium-CLI v0.14.8? https://github.com/cilium/cilium-cli/releases/tag/v0.14.8 (cf. current releases of CLI: https://github.com/cilium/cilium-cli#releases)

I gave it another go today. Upgrading to 1.14.0 with cilium-cli v0.15.5 has worked fine so far, with no more errors.

@aanm aanm transferred this issue from cilium/cilium Aug 15, 2023
@aanm aanm unassigned squeed Aug 15, 2023
@jqueuniet

No more error either after upgrading to Cilium 1.14.0 and cilium-cli 0.15.5

@shapirus

shapirus commented Sep 8, 2023

Same issue here with a fresh install of k8s 1.26.8 via kops 1.26.3 with Cilium 1.12.13.

Any idea of a workaround?

Using a newer version of Cilium doesn't work, because kops is trying to be smarter than the operator:

│ Error: spec.networking.cilium.version: Invalid value: "v1.11.14": Only version 1.12 is supported

(v1.11.14 is what I tested to work in my case on a different cluster.)

update: Cilium v1.12.7 works in this case. kops 1.26.3 installs Cilium v1.12.5 by default, so it should work too.

@joestringer
Member

Overall I think that this problem is caused by mixing and matching Helm charts and Cilium versions where one is older and the other is newer. I don't think it reproduces when using the latest cilium-cli or when installing directly via Helm with any recent version of Cilium for supported branches. If you're hitting this issue with an external installer such as kOps, I suggest reaching out to your upstream installer to verify that they are pulling in a new version of the Cilium Helm charts as well as the Cilium image version.
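
As an illustration of that kind of mismatch, the original report rendered the chart with `--version 1.13.0` while setting `image.tag=v1.13.1`. The sketch below compares the two values; `normalize_version` and `check_versions` are hypothetical helper names, and exact equality is a deliberately strict assumption rather than a rule stated by the Cilium project.

```shell
#!/bin/sh
# Hypothetical helpers for comparing a Helm chart version with an image tag.

# Strip an optional leading "v" so "v1.13.1" and "1.13.1" compare equal.
normalize_version() {
    printf '%s\n' "${1#v}"
}

# Succeed when the chart version and image tag are the same release.
# Assumption: exact equality is the safe case; other pairings may also work.
check_versions() {
    chart=$(normalize_version "$1")
    image=$(normalize_version "$2")
    if [ "$chart" = "$image" ]; then
        echo "ok: chart and image are both $chart"
    else
        echo "mismatch: chart $chart, image $image"
        return 1
    fi
}
```

With the values from the original report, `check_versions 1.13.0 v1.13.1` reports a mismatch and returns a non-zero status.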
