Coredns fails connecting to kube-api via kubernetes service #27900

Open · 2 tasks done
azzid opened this issue Sep 3, 2023 · 38 comments
Labels
info-completed: The GH issue has received a reply from the author
kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
sig/k8s: Impacts the kubernetes API, or kubernetes -> cilium internals translation layers.


azzid commented Sep 3, 2023

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

As initially reported here, I'm unable to get DNS working because CoreDNS fails to connect to the Kubernetes API - I think it might be a regression since upgrading to $latest.

The whole post follows as a copy-paste:

Cluster information:

Kubernetes version: v1.28.1
Cloud being used: bare-metal
Installation method: kubeadm
Host OS: Ubuntu 22.04.3 LTS
CNI and version: cilium 1.14.1
CRI and version: containerd 1.6.22

Today, after upgrading to 1.28.1, I realized that my test cluster is unable to get coredns ready:

$ k get po -A | grep core
kube-system   coredns-5dd5756b68-hchqq            0/1     Running   0             57m
kube-system   coredns-5dd5756b68-r768b            0/1     Running   0             57m

Upon inspecting the logs, there seems to be a connectivity issue between coredns and the kube-api:

$ k -n kube-system logs coredns-5dd5756b68-hchqq | tail -5 | tail -2
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.96.0.1:443/version": dial tcp 10.96.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"

cilium connectivity test seems to run into the same issue:

$ cilium connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
⌛ [kubernetes] Waiting for deployment cilium-test/client to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/client2 to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-same-node to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-other-node to become ready...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/client-78f9dffc84-g5z5l to appear...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/client2-59b578d4bb-jttvw to appear...
⌛ [kubernetes] Waiting for pod cilium-test/client-78f9dffc84-g5z5l to reach DNS server on cilium-test/echo-same-node-54cc4f75b8-xt4cf pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-59b578d4bb-jttvw to reach DNS server on cilium-test/echo-same-node-54cc4f75b8-xt4cf pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-78f9dffc84-g5z5l to reach DNS server on cilium-test/echo-other-node-5b87f6f4f4-cdmtl pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-59b578d4bb-jttvw to reach DNS server on cilium-test/echo-other-node-5b87f6f4f4-cdmtl pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-78f9dffc84-g5z5l to reach default/kubernetes service...
connectivity test failed: timeout reached waiting for lookup for kubernetes.default from pod cilium-test/client-78f9dffc84-g5z5l to succeed (last error: context deadline exceeded)

Accessing the kube-api from outside the cluster works fine - as demonstrated by kubectl working. 😉
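
For example, one quick check from outside the cluster (not part of the original post; assumes the same kubeconfig kubectl is already using):

$ kubectl get --raw /version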

Cilium status seems OK.

$ cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 \__/¯¯\__/    Hubble Relay:       OK
    \__/       ClusterMesh:        disabled

Deployment             hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1
Deployment             cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
Containers:            cilium             Running: 4
                       hubble-relay       Running: 1
                       hubble-ui          Running: 1
                       cilium-operator    Running: 2
Cluster Pods:          8/8 managed by Cilium
Helm chart version:    1.14.1
Image versions         cilium             quay.io/cilium/cilium:v1.14.1@sha256:edc1d05ea1365c4a8f6ac6982247d5c145181704894bb698619c3827b6963a72: 4
                       hubble-relay       quay.io/cilium/hubble-relay:v1.13.2: 1
                       hubble-ui          quay.io/cilium/hubble-ui:v0.11.0@sha256:bcb369c47cada2d4257d63d3749f7f87c91dde32e010b223597306de95d1ecc8: 1
                       hubble-ui          quay.io/cilium/hubble-ui-backend:v0.11.0@sha256:14c04d11f78da5c363f88592abae8d2ecee3cbe009f443ef11df6ac5f692d839: 1
                       cilium-operator    quay.io/cilium/operator-generic:v1.14.1@sha256:e061de0a930534c7e3f8feda8330976367971238ccafff42659f104effd4b5f7: 2

There are no network policies I can find to blame.

$ k get ciliumnetworkpolicies.cilium.io -A
No resources found
$ k get networkpolicies.networking.k8s.io -A
No resources found

There are endpoints which I believe should be implicitly targeted by the service:

$ k get endpointslices.discovery.k8s.io 
NAME         ADDRESSTYPE   PORTS   ENDPOINTS                                      AGE
kubernetes   IPv4          6443    192.168.100.10,192.168.100.11,192.168.100.12   140d
$ k get svc -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: Service
  metadata:
    creationTimestamp: "2023-09-01T08:00:11Z"
    labels:
      component: apiserver
      provider: kubernetes
    name: kubernetes
    namespace: default
    resourceVersion: "2726902"
    uid: 5e7c32c9-ab89-47e4-8940-db010c2ffc4d
  spec:
    clusterIP: 10.96.0.1
    clusterIPs:
    - 10.96.0.1
    internalTrafficPolicy: Cluster
    ipFamilies:
    - IPv4
    ipFamilyPolicy: SingleStack
    ports:
    - name: https
      port: 443
      protocol: TCP
      targetPort: 6443
    sessionAffinity: None
    type: ClusterIP
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
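
One additional check (not in the original post) would be to ask the Cilium agent what it has programmed for that ClusterIP - a sketch, assuming the agent container name from the chart:

$ kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium service list
(the 10.96.0.1:443 frontend should list the 192.168.100.x:6443 apiserver backends)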

I don’t believe I have any funny business in the coredns config:

$ k get all -A -l k8s-app=kube-dns
NAMESPACE     NAME                           READY   STATUS    RESTARTS        AGE
kube-system   pod/coredns-5dd5756b68-hchqq   0/1     Running   1 (6m39s ago)   4h46m
kube-system   pod/coredns-5dd5756b68-r768b   0/1     Running   1 (6m38s ago)   4h46m

NAMESPACE     NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-system   service/kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   4h46m

NAMESPACE     NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/coredns   0/2     2            0           4h46m

NAMESPACE     NAME                                 DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/coredns-5dd5756b68   2         2         0       4h46m

$ k -n kube-system describe cm coredns 
Name:         coredns
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
Corefile:
----
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
       max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}


BinaryData
====

Events:  <none>

There is a service running in the container, but it does not seem to hold any data - probably because it cannot connect to the API:

$ kubectl -n kube-system debug -it pod/coredns-5dd5756b68-hchqq --image=nicolaka/netshoot --target=coredns
coredns-5dd5756b68-hchqq  ~  ss -lnp | grep :53
udp   UNCONN 0      0                  *:53               *:*    users:(("coredns",pid=1,fd=12))
tcp   LISTEN 0      4096               *:53               *:*    users:(("coredns",pid=1,fd=11))
coredns-5dd5756b68-hchqq  ~  dig @localhost kubernetes.default

; <<>> DiG 9.18.13 <<>> @localhost kubernetes.default
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 29162
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 7fdf8d625c0b48eb (echoed)
;; QUESTION SECTION:
;kubernetes.default.            IN      A

;; Query time: 0 msec
;; SERVER: ::1#53(localhost) (UDP)
;; WHEN: Fri Sep 01 13:23:08 UTC 2023
;; MSG SIZE  rcvd: 59

I can access the API from the pod via the external IP, but not via the service IP:

 coredns-5dd5756b68-hchqq  ~  ping kubernetes              
ping: kubernetes: Try again

 coredns-5dd5756b68-hchqq  ~  ping k8s       
PING k8s.kubenet (192.168.100.5) 56(84) bytes of data.
64 bytes from k8s.kubenet (192.168.100.5): icmp_seq=1 ttl=62 time=0.139 ms
64 bytes from k8s.kubenet (192.168.100.5): icmp_seq=2 ttl=62 time=0.147 ms
^C
--- k8s.kubenet ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1023ms
rtt min/avg/max/mdev = 0.139/0.143/0.147/0.004 ms

 coredns-5dd5756b68-hchqq  ~  curl -k https://k8s:6443
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}#                                                                                                                                                                                             

 coredns-5dd5756b68-hchqq  ~  curl -k https://10.96.0.1:443 
curl: (28) Failed to connect to 10.96.0.1 port 443 after 130812 ms: Couldn't connect to server

What am I missing?

Cilium Version

cilium-cli: v0.15.7 compiled with go1.21.0 on linux/amd64
cilium image (default): v1.14.1
cilium image (stable): v1.14.1
cilium image (running): 1.14.1

Kernel Version

6.4.11-200.fc38.x86_64

Kubernetes Version

Client Version: v1.28.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.1

Sysdump

cilium-sysdump-20230903-172137.zip

Relevant log output

$ k get po -A | grep core
kube-system   coredns-5dd5756b68-hchqq            0/1     Running   0             57m
kube-system   coredns-5dd5756b68-r768b            0/1     Running   0             57m

$ k -n kube-system logs coredns-5dd5756b68-hchqq | tail -5 | tail -2
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.96.0.1:443/version": dial tcp 10.96.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"

Anything else?

https://discuss.kubernetes.io/t/coredns-fails-connecting-to-kube-api-via-kubernetes-service/

Code of Conduct

  • I agree to follow this project's Code of Conduct
@azzid azzid added kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps. labels Sep 3, 2023
@youngnick youngnick added need-more-info More information is required to further debug or fix the issue. sig/k8s Impacts the kubernetes API, or kubernetes -> cilium internals translation layers. labels Sep 4, 2023
youngnick (Contributor) commented Sep 4, 2023

Thanks for this detailed issue @azzid.

Unfortunately Cilium 1.14 only supports Kubernetes 1.27 - it looks like there might be something in the 1.28 upgrade that's not working properly.

Could you try either doing Cilium 1.14 on Kubernetes 1.27, or Cilium main (you could also use the 1.15.0-pre.0 tag) and Kubernetes 1.28, and seeing if you get the same issue?

azzid (Author) commented Sep 4, 2023

Could you try either doing Cilium 1.14 on Kubernetes 1.28

Is that a typo? 1.14 on 1.28 is what I'm already failing at, right? ;-)

@github-actions github-actions bot added info-completed The GH issue has received a reply from the author and removed need-more-info More information is required to further debug or fix the issue. labels Sep 4, 2023
youngnick (Contributor):

🤦 Yes, thanks, Cilium 1.14 on Kubernetes 1.27 is what I meant. Thanks!

azzid (Author) commented Sep 4, 2023

Forced a downgrade (basically apt install -y kubeadm=1.27.5-00 kubelet=1.27.5-00 kubectl=1.27.5-00 && ( kubeadm upgrade apply v1.27.5 || kubeadm upgrade node ) - might contain traces of 1.28) and rebooted the nodes:

$ k get no
NAME        STATUS   ROLES           AGE    VERSION
k8scp       Ready    control-plane   143d   v1.27.5
k8scp2      Ready    control-plane   143d   v1.27.5
k8scp3      Ready    control-plane   143d   v1.27.5
k8sworker   Ready    <none>          143d   v1.27.5

Thereafter coredns is running!

$ k get po -A | grep core
kube-system   coredns-5dd5756b68-k6bjt           1/1     Running   0                3m57s
kube-system   coredns-5dd5756b68-lsqcd           1/1     Running   1 (4m43s ago)    8m33s

youngnick (Contributor):

Okay, so it's probably something about coredns with Cilium 1.14 and Kubernetes 1.28, thanks!

youngnick (Contributor):

Oh, to be clear: were you using Kubernetes 1.28 before Cilium 1.14, or does this problem only occur with Cilium 1.14 and Kubernetes 1.28? Do you know what happens with Cilium 1.13 and Kube 1.28? (Trying to work out whether this is only the Kubernetes version, or whether we changed something in 1.14 that causes it.)

@youngnick youngnick added the sig/agent Cilium agent related. label Sep 4, 2023
azzid (Author) commented Sep 4, 2023

were you using Kubernetes 1.28 before version 1.14

That's unfortunately a bit unclear. I was using something older before doing the Kubernetes upgrade - so after the nodes were on 1.28 I upgraded Cilium to 1.14 - but I don't know:
A. if coredns worked before the cilium upgrade
B. which cilium version I upgraded from

azzid (Author) commented Sep 4, 2023

Tried upgrading to 1.15 on 1.27:

$ helm upgrade cilium cilium/cilium --version 1.15.0-pre.0 --namespace=kube-system --set k8sServiceHost=k8s --set k8sServicePort=6443
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Mon Sep  4 10:46:09 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 4
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.

Your release version is 1.15.0-pre.0.

For any further help, visit https://docs.cilium.io/en/v1.15/gettinghelp

$ kubectl get pods --all-namespaces -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork --no-headers=true | grep '<none>' | awk '{print "-n "$1" "$2}' | xargs -L 1 -r kubectl delete pod
pod "client-6b4b857d98-5xmlw" deleted                                                                                                                                                          
pod "client2-646b88fb9b-tmc7w" deleted                                                                                                                                                         
pod "echo-other-node-8b4df78df-vhxbw" deleted                                                                                                                                                  
pod "echo-same-node-775456cfcf-9f6ll" deleted                                                                                                                                                  
pod "cowboy" deleted                                                                                                                                                                           
pod "coredns-5dd5756b68-k6bjt" deleted                                                                                                                                                         
pod "coredns-5dd5756b68-lsqcd" deleted                                                                                                                                                         
pod "hubble-relay-585bb7f68-ggqfg" deleted                                                                                                                                                     
pod "hubble-ui-694cf76f4c-t8kx6" deleted

$ k delete ns cilium-test
namespace "cilium-test" deleted

$ cilium connectivity test                                              
ℹ️  Monitor aggregation detected, will skip some flow validation steps
✨ [kubernetes] Creating namespace cilium-test for connectivity check...
✨ [kubernetes] Deploying echo-same-node service...
✨ [kubernetes] Deploying DNS test server configmap...
✨ [kubernetes] Deploying same-node deployment...
✨ [kubernetes] Deploying client deployment...
✨ [kubernetes] Deploying client2 deployment...
✨ [kubernetes] Deploying echo-other-node service...
✨ [kubernetes] Deploying other-node deployment...
✨ [host-netns] Deploying kubernetes daemonset...
✨ [host-netns-non-cilium] Deploying kubernetes daemonset...
✨ [kubernetes] Deploying echo-external-node deployment...
⌛ [kubernetes] Waiting for deployment cilium-test/client to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/client2 to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-same-node to become ready...
⌛ [kubernetes] Waiting for deployment cilium-test/echo-other-node to become ready...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/client-6b4b857d98-8wfxx to appear...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/client2-646b88fb9b-8v579 to appear...
⌛ [kubernetes] Waiting for pod cilium-test/client-6b4b857d98-8wfxx to reach DNS server on cilium-test/echo-same-node-695c975475-9ls6m pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-646b88fb9b-8v579 to reach DNS server on cilium-test/echo-same-node-695c975475-9ls6m pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-6b4b857d98-8wfxx to reach DNS server on cilium-test/echo-other-node-6f988f8697-p8hlz pod...
⌛ [kubernetes] Waiting for pod cilium-test/client2-646b88fb9b-8v579 to reach DNS server on cilium-test/echo-other-node-6f988f8697-p8hlz pod...
⌛ [kubernetes] Waiting for pod cilium-test/client-6b4b857d98-8wfxx to reach default/kubernetes service...
⌛ [kubernetes] Waiting for pod cilium-test/client2-646b88fb9b-8v579 to reach default/kubernetes service...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/echo-other-node-6f988f8697-p8hlz to appear...
⌛ [kubernetes] Waiting for CiliumEndpoint for pod cilium-test/echo-same-node-695c975475-9ls6m to appear...
⌛ [kubernetes] Waiting for Service cilium-test/echo-other-node to become ready...
⌛ [kubernetes] Waiting for Service cilium-test/echo-other-node to be synchronized by Cilium pod kube-system/cilium-qtqch
⌛ [kubernetes] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [kubernetes] Waiting for Service cilium-test/echo-same-node to be synchronized by Cilium pod kube-system/cilium-qtqch
⌛ [kubernetes] Waiting for NodePort 192.168.100.10:30370 (cilium-test/echo-other-node) to become ready...
connectivity test failed: timeout reached waiting for NodePort 192.168.100.10:30370 (cilium-test/echo-other-node) (last error: command terminated with exit code 1)

Not only was I unsuccessful, it also seems to have broken even the older versions:

$ helm delete cilium --namespace=kube-system
release "cilium" uninstalled

$ helm install cilium cilium/cilium --version 1.14.1 --namespace=kube-system --set global.kubeProxyReplacement=strict --set k8sServiceHost=k8s --set k8sServicePort=6443
NAME: cilium
LAST DEPLOYED: Mon Sep  4 11:32:56 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.

Your release version is 1.14.1.

For any further help, visit https://docs.cilium.io/en/v1.14/gettinghelp

$ cilium connectivity test | tail
connectivity test failed: timeout reached waiting for NodePort 192.168.100.20:30370 (cilium-test/echo-other-node) (last error: command terminated with exit code 1)

Probably due to me doing something wrong that caused the kube-proxy replacement not to be enabled:

$ kubectl -n kube-system exec cilium-tqjnx -- cilium status | grep ProxyReplace
KubeProxyReplacement:     False

Manually setting kube-proxy-replacement: true in the configmap seems to fix that particular detail:

$ k describe -n kube-system cm cilium-config | grep -A2 kube-proxy-replacement:
kube-proxy-replacement:
----
true
$ cilium connectivity test
[redacted]
✅ All 42 tests (306 actions) successful, 13 tests skipped, 0 scenarios skipped.

After which another upgrade was performed:

$ helm upgrade cilium cilium/cilium --version 1.15.0-pre.0 --namespace=kube-system --set upgradeCompatibility=1.14
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Mon Sep  4 15:36:24 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.

Your release version is 1.15.0-pre.0.

For any further help, visit https://docs.cilium.io/en/v1.15/gettinghelp

Status looks good:

$ cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 \__/¯¯\__/    Hubble Relay:       disabled
    \__/       ClusterMesh:        disabled

Deployment             cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
Containers:            cilium             Running: 4
                       cilium-operator    Running: 2
Cluster Pods:          6/6 managed by Cilium
Helm chart version:    1.15.0-pre.0
Image versions         cilium             quay.io/cilium/cilium:v1.15.0-pre.0: 4
                       cilium-operator    quay.io/cilium/operator-generic:v1.15.0-pre.0: 2

but kube-proxy replacement is disabled again:

$ k describe -n kube-system cm cilium-config | grep -A2 kube-proxy-replacement:
kube-proxy-replacement:
----
false

I re-enabled it:

$ k edit -n kube-system cm cilium-config
configmap/cilium-config edited
$ k describe -n kube-system cm cilium-config | grep -A2 kube-proxy-replacement:
kube-proxy-replacement:
----
true

but 1.15 still seems to work poorly:

$ cilium connectivity test
[redacted]
connectivity test failed: timeout reached waiting for NodePort 192.168.100.20:30615 (cilium-test/echo-same-node) (last error: command terminated with exit code 1)

aanm (Member) commented Sep 5, 2023

I've installed minikube with k8s 1.28.1 and Cilium 1.14.1 and the connectivity tests have passed:

$ minikube version
minikube version: v1.31.2
commit: fd7ecd9c4599bef9f04c0986c4a0187f98a4396e
$ minikube start --kubernetes-version=v1.28.1
$ cilium version
cilium-cli: v0.15.7 compiled with go1.21.0 on linux/amd64
cilium image (default): v1.14.1
cilium image (stable): v1.14.1
cilium image (running): 1.14.1
$ cilium install
🔮 Auto-detected Kubernetes kind: minikube
✨ Running "minikube" validation checks
✅ Detected minikube version "1.31.2"
ℹ  Using Cilium version 1.14.1
🔮 Auto-detected cluster name: minikube
🔮 Auto-detected kube-proxy has been installed
$ kubectl version
Client Version: v1.28.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.1
$ cilium connectivity test
ℹ  Single-node environment detected, enabling single-node connectivity test
ℹ  Monitor aggregation detected, will skip some flow validation steps
✨ [minikube] Creating namespace cilium-test for connectivity check...
✨ [minikube] Deploying echo-same-node service...
✨ [minikube] Deploying DNS test server configmap...
✨ [minikube] Deploying same-node deployment...
✨ [minikube] Deploying client deployment...
✨ [minikube] Deploying client2 deployment...
⌛ [minikube] Waiting for deployment cilium-test/client to become ready...
⌛ [minikube] Waiting for deployment cilium-test/client2 to become ready...
⌛ [minikube] Waiting for deployment cilium-test/echo-same-node to become ready...
⌛ [minikube] Waiting for CiliumEndpoint for pod cilium-test/client-78f9dffc84-887bg to appear...
⌛ [minikube] Waiting for CiliumEndpoint for pod cilium-test/client2-59b578d4bb-8kld9 to appear...
⌛ [minikube] Waiting for pod cilium-test/client-78f9dffc84-887bg to reach DNS server on cilium-test/echo-same-node-79d996cb79-k7p6p pod...
⌛ [minikube] Waiting for pod cilium-test/client2-59b578d4bb-8kld9 to reach DNS server on cilium-test/echo-same-node-79d996cb79-k7p6p pod...
⌛ [minikube] Waiting for pod cilium-test/client-78f9dffc84-887bg to reach default/kubernetes service...
⌛ [minikube] Waiting for pod cilium-test/client2-59b578d4bb-8kld9 to reach default/kubernetes service...
⌛ [minikube] Waiting for CiliumEndpoint for pod cilium-test/echo-same-node-79d996cb79-k7p6p to appear...
⌛ [minikube] Waiting for Service cilium-test/echo-same-node to become ready...
⌛ [minikube] Waiting for Service cilium-test/echo-same-node to be synchronized by Cilium pod kube-system/cilium-sgtdt
⌛ [minikube] Waiting for NodePort 192.168.49.2:30363 (cilium-test/echo-same-node) to become ready...
ℹ  Skipping IPCache check
🔭 Enabling Hubble telescope...
⚠  Unable to contact Hubble Relay, disabling Hubble telescope and flow validation: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:4245: connect: connection refused"
ℹ  Expose Relay locally with:
   cilium hubble enable
   cilium hubble port-forward&
ℹ  Cilium version: 1.14.1
🏃 Running tests...
[=] Test [no-policies]
..................
[=] Test [no-policies-extra]
..
[=] Test [allow-all-except-world]
........
[=] Test [client-ingress]
..
[=] Test [client-ingress-knp]
..
[=] Test [allow-all-with-metrics-check]
..
[=] Test [all-ingress-deny]
......
[=] Test [all-ingress-deny-knp]
......
[=] Test [all-egress-deny]
........
[=] Test [all-egress-deny-knp]
........
[=] Test [all-entities-deny]
......
[=] Test [cluster-entity]
..
[=] Test [host-entity]
..
[=] Test [echo-ingress]
..
[=] Test [echo-ingress-knp]
..
[=] Test [client-ingress-icmp]
..
[=] Test [client-egress]
..
[=] Test [client-egress-knp]
..
[=] Test [client-egress-expression]
..
[=] Test [client-egress-expression-knp]
..
[=] Test [client-with-service-account-egress-to-echo]
..
[=] Test [client-egress-to-echo-service-account]
..
[=] Test [to-entities-world]
......
[=] Test [to-cidr-external]
....
[=] Test [to-cidr-external-knp]
....
[=] Test [echo-ingress-from-other-client-deny]
....
[=] Test [client-ingress-from-other-client-icmp-deny]
....
[=] Test [client-egress-to-echo-deny]
....
[=] Test [client-ingress-to-echo-named-port-deny]
..
[=] Test [client-egress-to-echo-expression-deny]
..
[=] Test [client-with-service-account-egress-to-echo-deny]
..
[=] Test [client-egress-to-echo-service-account-deny]
.
[=] Test [client-egress-to-cidr-deny]
....
[=] Test [client-egress-to-cidr-deny-default]
....
[=] Test [health]
.
[=] Skipping Test [north-south-loadbalancing] (feature node-without-cilium is disabled)
[=] Skipping Test [pod-to-pod-encryption] (feature encryption-pod is disabled)
[=] Skipping Test [node-to-node-encryption] (feature encryption-pod is disabled)
[=] Skipping Test [egress-gateway-excluded-cidrs] (feature enable-ipv4-egress-gateway is disabled)
[=] Skipping Test [north-south-loadbalancing-with-l7-policy] (feature node-without-cilium is disabled)
[=] Test [echo-ingress-l7]
......
[=] Test [echo-ingress-l7-named-port]
......
[=] Test [client-egress-l7-method]
......
[=] Test [client-egress-l7]
........
[=] Test [client-egress-l7-named-port]
........
[=] Skipping Test [client-egress-l7-tls-deny-without-headers] (feature secret-backend-k8s is disabled)
[=] Skipping Test [client-egress-l7-tls-headers] (feature secret-backend-k8s is disabled)
[=] Skipping Test [client-egress-l7-set-header] (feature secret-backend-k8s is disabled)
[=] Skipping Test [echo-ingress-auth-always-fail] (feature mutual-auth-spiffe is disabled)
[=] Skipping Test [echo-ingress-mutual-auth-spiffe] (feature mutual-auth-spiffe is disabled)
[=] Skipping Test [pod-to-ingress-service] (feature ingress-controller is disabled)
[=] Skipping Test [pod-to-ingress-service-deny-all] (feature ingress-controller is disabled)
[=] Skipping Test [pod-to-ingress-service-allow-ingress-identity] (feature ingress-controller is disabled)
[=] Test [dns-only]
........
[=] Test [to-fqdns]
........

✅ All 42 tests (182 actions) successful, 13 tests skipped, 1 scenarios skipped.

@aanm aanm added need-more-info More information is required to further debug or fix the issue. and removed kind/bug This is a bug in the Cilium logic. needs/triage This issue requires triaging to establish severity and next steps. info-completed The GH issue has received a reply from the author labels Sep 5, 2023
azzid (Author) commented Sep 5, 2023

Since yesterday my version soup looks like:

$ cilium version 
cilium-cli: v0.15.7 compiled with go1.21.0 on linux/amd64
cilium image (default): v1.14.1
cilium image (stable): v1.14.1
cilium image (running): 1.15.0-pre.0
$ ssh k8scp kubectl version --short
Client Version: v1.27.5
Kustomize Version: v5.0.1
Server Version: v1.27.5

Everything looks a bit broken

$ k get po -A | grep -e cilium -e core
kube-system   cilium-4kb4q                       0/1     Init:Error              57 (6m13s ago)   22h
kube-system   cilium-operator-6f99cbd84f-56l9s   0/1     CrashLoopBackOff        69 (2m4s ago)    22h
kube-system   cilium-operator-6f99cbd84f-nclll   0/1     CrashLoopBackOff        72 (55s ago)     22h
kube-system   cilium-t2b8z                       0/1     Init:0/6                56 (5m52s ago)   22h
kube-system   cilium-tcjbc                       0/1     Init:Error              57 (6m21s ago)   22h
kube-system   cilium-wtw2k                       0/1     Init:CrashLoopBackOff   57 (19s ago)     22h
kube-system   coredns-5dd5756b68-52czp           0/1     Unknown                 0                26h
kube-system   coredns-5dd5756b68-pl5bd           0/1     Unknown                 1                26h

Not only coredns, but Cilium as well is having trouble with the kube-api service:

$ k -n kube-system logs cilium-operator-6f99cbd84f-56l9s | grep -e error -e fatal
level=error msg="Unable to contact k8s api-server" error="Get \"https://10.96.0.1:443/api/v1/namespaces/kube-system\": dial tcp 10.96.0.1:443: i/o timeout" ipAddr="https://10.96.0.1:443" subsys=k8s-client
level=error msg="Start hook failed" error="Get \"https://10.96.0.1:443/api/v1/namespaces/kube-system\": dial tcp 10.96.0.1:443: i/o timeout" function="client.(*compositeClientset).onStart" subsys=hive
level=fatal msg="failed to start: Get \"https://10.96.0.1:443/api/v1/namespaces/kube-system\": dial tcp 10.96.0.1:443: i/o timeout" subsys=cilium-operator-generic

Since it works in minikube I guess something has gone awry in my configuration during up/downgrades.

How does one go about flushing whatever configuration might be left behind from the earlier attempts? I'd prefer to get a grasp on what's going on rather than restarting with kubeadm reset.

I've found cilium cleanup in the docs, but I stumble when trying to use it.

$ cilium cleanup
unknown command "cleanup" for "cilium"

I tried running it in a cilium-agent pod yesterday, but while the cilium command there was aware of it, it refused to do anything because the cilium service was running.

@github-actions github-actions bot added info-completed The GH issue has received a reply from the author and removed need-more-info More information is required to further debug or fix the issue. labels Sep 5, 2023
kvaster (Contributor) commented Sep 5, 2023

I have exactly the same problem. Pods are failing to connect to the apiserver - the 10.255.100.1 service is not working. Direct connections to the apiserver IPs are working. The problem occurred after upgrading from Kubernetes 1.27.4 to 1.28.1. Using the latest Cilium git code (1.15 branch) doesn't help either.

kvaster (Contributor) commented Sep 5, 2023

Starting to investigate the problem :) So I have a cluster with 4 nodes; the first 3 are control-plane nodes. I've created a test daemonset and am trying to run wget https://10.225.100.1 - and it works with no problem on node 4. It seems that only the control-plane nodes are affected...
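
A minimal sketch of such a test daemonset (hypothetical manifest and probe commands, not the exact ones used here):

$ kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: apiserver-probe
spec:
  selector:
    matchLabels: {app: apiserver-probe}
  template:
    metadata:
      labels: {app: apiserver-probe}
    spec:
      tolerations:
      - operator: Exists            # also schedule on the control-plane nodes
      containers:
      - name: probe
        image: busybox:1.36
        command: ["sleep", "infinity"]
EOF
$ kubectl get po -l app=apiserver-probe -o wide                # one pod per node
$ kubectl exec <pod-on-a-given-node> -- wget -qO- --no-check-certificate https://10.225.100.1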

azzid (Author) commented Sep 6, 2023

I managed to get v1.14.1 working again. It seems the --set global.kubeProxyReplacement=true option to helm works poorly.

I did a helm pull and tweaked the values.yaml like so:

$ diff -u cilium/values.yaml{.distOrig,}
--- cilium/values.yaml.distOrig 2023-08-15 13:00:00.000000000 +0200
+++ cilium/values.yaml  2023-09-06 09:37:47.091867691 +0200
@@ -36,9 +36,9 @@
 # @default -- `"~/.kube/config"`
 kubeConfigPath: ""
 # -- (string) Kubernetes service host
-k8sServiceHost: ""
+k8sServiceHost: "k8s"
 # -- (string) Kubernetes service port
-k8sServicePort: ""
+k8sServicePort: "6443"
 
 # -- Configure the client side rate limit for the agent and operator
 #
@@ -137,7 +137,7 @@
 name: cilium
 
 # -- Roll out cilium agent pods automatically when configmap is updated.
-rollOutCiliumPods: false
+rollOutCiliumPods: true
 
 # -- Agent container image.
 image:
@@ -526,7 +526,7 @@
   # if you're removing Cilium from the cluster. Disable this to prevent the CNI
   # configuration file from being removed during agent upgrade, which can cause
   # nodes to go unmanageable.
-  uninstall: false
+  uninstall: true
 
   # -- Configure chaining on top of other CNI plugins. Possible values:
   #  - none
@@ -1575,7 +1575,7 @@
 # -- Configure the kube-proxy replacement in Cilium BPF datapath
 # Valid options are "true", "false", "disabled" (deprecated), "partial" (deprecated), "strict" (deprecated).
 # ref: https://docs.cilium.io/en/stable/network/kubernetes/kubeproxy-free/
-#kubeProxyReplacement: "false"
+kubeProxyReplacement: "true"
 
 # -- healthz server bind address for the kube-proxy replacement.
 # To enable set the value to '0.0.0.0:10256' for all ipv4

which got the configmap right from the start:

$ k -n kube-system describe cm cilium-config | grep -A2 kube-proxy-replacement:
kube-proxy-replacement:
----
true

It also got all the pods into working shape:

$ k get po -A | grep -e cilium -e core
kube-system   cilium-dtxqg                       1/1     Running   0             3m47s
kube-system   cilium-lh9t4                       1/1     Running   0             3m47s
kube-system   cilium-operator-8487fbb8d7-dxz9p   1/1     Running   0             3m47s
kube-system   cilium-operator-8487fbb8d7-wvwt5   1/1     Running   0             3m47s
kube-system   cilium-wxvzm                       1/1     Running   0             3m47s
kube-system   cilium-zkn9k                       1/1     Running   0             3m47s
kube-system   coredns-5dd5756b68-mt8wr           1/1     Running   0             3m9s
kube-system   coredns-5dd5756b68-rlxvg           1/1     Running   0             3m9s
$ cilium connectivity test                   
ℹ️  Monitor aggregation detected, will skip some flow validation steps                          
✨ [kubernetes] Creating namespace cilium-test for connectivity check...     
(...)
✅ All 42 tests (306 actions) successful, 13 tests skipped, 0 scenarios skipped.
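
For reference, the same settings could presumably also be passed as --set flags against the top-level chart keys (a sketch, untested here - the point being kubeProxyReplacement rather than the global.* path used earlier):

$ helm upgrade cilium cilium/cilium --version 1.14.1 --namespace kube-system \
    --reuse-values \
    --set kubeProxyReplacement=true \
    --set k8sServiceHost=k8s \
    --set k8sServicePort=6443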

aanm (Member) commented Sep 6, 2023

@azzid can you clarify which steps you used for the working scenario and which steps you used for the non-working scenario? Thank you

@aanm aanm added the need-more-info More information is required to further debug or fix the issue. label Sep 6, 2023
azzid (Author) commented Sep 6, 2023

@azzid can you clarify which steps did you use for the working scenario and steps did you use for the non-working scenario?

I'll try to reproduce again. What should trigger the problem is upgrading to 1.28.

tl;dr - upgrade k8s from 1.27 to 1.28 and coredns won't be able to start after cluster reboot

reproduction

I have come to realize that I'm using the now deprecated Google repos - after switching to pkgs.k8s.io I see that there are new versions not only for kubeadm, kubectl and kubelet, but also for kubernetes-cni and cri-tools.

cri-tools/unknown 1.28.0-1.1 amd64 [upgradable from: 1.26.0-00]                                
kubeadm/unknown 1.28.1-1.1 amd64 [upgradable from: 1.27.5-00]                                  
kubectl/unknown 1.28.1-1.1 amd64 [upgradable from: 1.27.5-00]                                  
kubelet/unknown 1.28.1-1.1 amd64 [upgradable from: 1.27.5-00]                                  
kubernetes-cni/unknown 1.2.0-2.1 amd64 [upgradable from: 1.2.0-00]                    

But even with the upgraded kubernetes-cni the problem is reproducible, as shown below.

I upgraded all cluster nodes to 1.28
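
(Roughly the inverse of the downgrade quoted earlier; a sketch of the per-node commands assuming the pkgs.k8s.io packages listed above, with drain/uncordon steps omitted:)

$ apt-get install -y kubeadm=1.28.1-1.1 kubelet=1.28.1-1.1 kubectl=1.28.1-1.1
$ kubeadm upgrade apply v1.28.1    # on the first control-plane node
$ kubeadm upgrade node             # on the remaining nodes
$ systemctl restart kubelet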

$ k get no
NAME        STATUS   ROLES           AGE    VERSION
k8scp       Ready    control-plane   146d   v1.28.1
k8scp2      Ready    control-plane   146d   v1.28.1
k8scp3      Ready    control-plane   146d   v1.28.1
k8sworker   Ready    <none>          146d   v1.28.1

everything seemed fine

$ k get po -A | grep -e cilium -e core                                                                                                                                 
kube-system   cilium-dtxqg                       1/1     Running   1 (26m ago)     6h12m                                                                                                       
kube-system   cilium-lh9t4                       1/1     Running   1 (26m ago)     6h12m                                                                                                       
kube-system   cilium-operator-8487fbb8d7-dxz9p   1/1     Running   2 (5m17s ago)   6h12m                                                                                                       
kube-system   cilium-operator-8487fbb8d7-wvwt5   1/1     Running   1 (26m ago)     6h12m                                                                                                       
kube-system   cilium-wxvzm                       1/1     Running   1 (30m ago)     6h12m                                                                                                       
kube-system   cilium-zkn9k                       1/1     Running   1 (27m ago)     6h12m                                                                                                       
kube-system   coredns-5dd5756b68-6r2bh           1/1     Running   0               33s                                                                                                         
kube-system   coredns-5dd5756b68-ggcrs           1/1     Running   0               33s

so I did a reboot ($ clush -w @k8s reboot), after which the problem seems to have resurfaced:

$ k get po -A | grep -e cilium -e core
kube-system   cilium-dtxqg                       1/1     Running                           2 (4m5s ago)    6h19m
kube-system   cilium-lh9t4                       1/1     Running                           2 (5m31s ago)   6h19m
kube-system   cilium-operator-8487fbb8d7-dxz9p   1/1     Running                           3 (4m5s ago)    6h19m
kube-system   cilium-operator-8487fbb8d7-wvwt5   1/1     Running                           2 (5m31s ago)   6h19m
kube-system   cilium-hn2kx                       1/1     Running                           0               19s
kube-system   cilium-zkn9k                       1/1     Running                           2 (4m5s ago)    6h19m
kube-system   coredns-5dd5756b68-6r2bh           0/1     Running                           1 (4m5s ago)    7m29s
kube-system   coredns-5dd5756b68-ggcrs           0/1     Running                           1 (5m28s ago)   7m29s

versions after upgrade

$ ssh k8scp kubectl version
Client Version: v1.28.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.1
[matfra@pc100164241 k8s]$ cilium version
cilium-cli: v0.15.7 compiled with go1.21.0 on linux/amd64
cilium image (default): v1.14.1
cilium image (stable): v1.14.1
cilium image (running): 1.14.1

Note that only Kubernetes has been upgraded - Cilium has remained untouched.

coredns logs show that it is unable to communicate with the kube-api:

[INFO] plugin/ready: Still waiting on: "kubernetes"
[INFO] plugin/kubernetes: waiting for Kubernetes API before starting server
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration SHA512 = 591cf328cccc12bc490481273e738df59329c62c0b729d94e8b61db9961c2fa5f046dd37f1cf888b953814040d180f52594972691cd6ff41be96639138a43908
CoreDNS-1.10.1
linux/amd64, go1.20, 055b2c3
[INFO] plugin/ready: Still waiting on: "kubernetes"
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.96.0.1:443/version": dial tcp 10.96.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"

Sorry for being a bit verbose - just trying to be transparent so any user errors on my part will be apparent.

brb (Member) commented Sep 7, 2023

@azzid @kvaster Could you re-run my experiment to see whether you are hitting the same issue?

borkmann (Member) commented Sep 7, 2023

Cc @ti-mo wrt cgroup links

@aanm aanm added need-more-info More information is required to further debug or fix the issue. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. and removed sig/agent Cilium agent related. info-completed The GH issue has received a reply from the author labels Sep 7, 2023
kvaster (Contributor) commented Sep 7, 2023

@brb, will re-run experiment in approx 4 hours! Thanks!

jspaleta (Contributor) commented Sep 7, 2023

Trying to confirm my reproducer is related to #27848 by installing Cilium 1.13.6 and then restarting all the Kind nodes... and it still looks like that fails for me.

So that's a bit of a head-scratcher, as commentary in the other issue seems to correlate with the cgroup attachment findings from @brb here.

Reproducer procedure (a hypothetical sketch follows the list):

  1. spin up a multi-node kind cluster with kube-proxy and CNI disabled, using whichever k8s version you want (I've tried 1.27 and 1.28)
  2. install Cilium (versions I've tried: 1.13.6, 1.14.1, 1.15.0-pre.0)
  3. confirm normal operation with coredns running
  4. docker restart all the kind node containers
  5. confirm the coredns pods never come back up into a ready state and that their logs indicate a Kubernetes API connection failure
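
A hypothetical sketch of that procedure (cluster config, image tag and Cilium version are illustrative, not jspaleta's exact setup):

$ cat > kind-config.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true
  kubeProxyMode: none
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
$ kind create cluster --config kind-config.yaml --image kindest/node:v1.28.0
$ cilium install --version 1.14.1
$ kubectl -n kube-system get po -l k8s-app=kube-dns           # confirm coredns becomes Ready
$ docker restart $(kind get nodes)
$ kubectl -n kube-system get po -l k8s-app=kube-dns           # coredns stays 0/1 after the restart
$ kubectl -n kube-system logs -l k8s-app=kube-dns --tail=5    # "Kubernetes API connection failure"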

azzid (Author) commented Sep 8, 2023

@azzid @kvaster Could you re-run my experiment to see whether you are hitting the same issue?

root@k8scp:~# kubectl get po -A -o wide| grep '0/'
kube-system   coredns-5dd5756b68-6r2bh           0/1     Running   1 (42h ago)   42h   10.0.3.34        k8scp3      <none>           <none>
kube-system   coredns-5dd5756b68-ggcrs           0/1     Running   1 (42h ago)   42h   10.0.2.187       k8scp       <none>           <none>
root@k8scp:~# cat /proc/$(ps aux | grep coredns | head -n1 | awk '{print $2}')/cgroup
0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poddf6bc9ed_fe53_410f_bd20_f54e59f0c048.slice/cri-containerd-43d307d27e57ee95b7b05eaec4ff4aba66f49df09ba9db4c6bcbfb71adc6e066.scope
root@k8scp:~# bpftool cgroup tree | tail -2
/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poddf6bc9ed_fe53_410f_bd20_f54e59f0c048.slice/cri-containerd-43d307d27e57ee95b7b05eaec4ff4aba66f49df09ba9db4c6bcbfb71adc6e066.scope
    453      device          multi                          
root@k8scp:~#

@github-actions github-actions bot added info-completed The GH issue has received a reply from the author and removed need-more-info More information is required to further debug or fix the issue. labels Sep 8, 2023
kvaster (Contributor) commented Sep 8, 2023

From my setup:

k8s-1 /home/kva # sudo cat /proc/$(ps aux | grep core | head -n1 | awk '{print $2}')/cgroup
0::/kubepods/besteffort/pod79e1b178-95b5-4dc6-9d5b-1ad4ee64d190/e7d1027117911c63fef8582d28e1c544b729c470d378b7e7d9fbc9bf9fbf6c00

and

k8s-1 /home/kva # bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name           
/sys/fs/cgroup/kubepods/burstable/podf21937e2-50b5-452c-8a62-90d313eee5b4/19619a967ee0e5b1f2da717839139e4f925576c28645ba8b29872419000f1961
    52       cgroup_device   multi                                          
    368      cgroup_inet4_connect multi           cil_sock4_connect                
    374      cgroup_inet6_connect multi           cil_sock6_connect                
    372      cgroup_inet4_post_bind multi           cil_sock4_post_bind                
    370      cgroup_inet6_post_bind multi           cil_sock6_post_bind                
    373      cgroup_udp4_sendmsg multi           cil_sock4_sendmsg                
    371      cgroup_udp6_sendmsg multi           cil_sock6_sendmsg                
    377      cgroup_udp4_recvmsg multi           cil_sock4_recvmsg                
    375      cgroup_udp6_recvmsg multi           cil_sock6_recvmsg                
    369      cgroup_inet4_getpeername multi           cil_sock4_getpeername                
    376      cgroup_inet6_getpeername multi           cil_sock6_getpeername                
/sys/fs/cgroup/kubepods/burstable/podf21937e2-50b5-452c-8a62-90d313eee5b4/2dc2d678b10f7fbcb549b6206ec8f08960d92d77189ca97a2643ed3589257736
    43       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/podfeef4e54045161bcbf29022bb161242c/090af00fc865ed08fe561c0f7d3ead7e2e350169e0d8b40c4aac801949d86458
    7        cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/podfeef4e54045161bcbf29022bb161242c/5519ee689f1cf05e78c01bde889d4242595ef1bbe8af13f9d02c21ea7d6b10df
    25       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pod43dbcbd1f898c59a29d5e21b81bc9c30/1b0f2a8a84d0526f33f6f92f40f4a466ab2338f93c80eb5bbabdd9e0bebdb042
    19       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pod43dbcbd1f898c59a29d5e21b81bc9c30/ba173fe87a20af3b50a7a4f0ea0668b1cd4be232ad837429c854169d4677af5d
    4        cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pod49149134978c7c5a6df38d3a5cdde5b3/e6a4414d90fce7b9b5547cbb1bec8a846fd9f086b7b3f8ec3e48fd56814d1cce
    10       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pod49149134978c7c5a6df38d3a5cdde5b3/6e626dadafe9ad9c6a350b724fcc647a069bb438af83df373458d9e2e1130303
    22       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pode26b2f696e1f0489bc7d852a98b7f4df/28a51e57f88eed3e96f919ff9333c51a0ef944f9a812d49556bd325ed08ff3da
    13       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pode26b2f696e1f0489bc7d852a98b7f4df/1a501e704db7c21814541abf82372b07d7c26d02fd6f240642a6506bb2b4fcd5
    16       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pod8dd29228-6a55-4d65-b725-de7a4f19bac9/28d6fa00a07c9b8114d6ad51ca550db04afa1fe0e84d913e7b3a9005bcc47652
    680      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/burstable/pod8dd29228-6a55-4d65-b725-de7a4f19bac9/6cd2afab895631095ba78fe2349f5e7c9288893a0103ec6a75c661d0880d7f7e
    677      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podac594ac5-234d-42dd-ab99-2198ad3094fd/6e35b002c58080684385aa36fa36747f7b784ef7a2c8c3edcc71b333e1a143fd
    46       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podac594ac5-234d-42dd-ab99-2198ad3094fd/3c4d14c9d97ebea48df6b336efe629edbaf3a488c8c200987768479b10576985
    34       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod37f448da-134d-499a-9c7b-616c8d130e2b/0a9fde75026dbdebac006dfe53144d3b4234eb414a803dd9e0e48254cee075b2
    636      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod37f448da-134d-499a-9c7b-616c8d130e2b/7f7de1b279e1b9137c8a378a818413f4175bc1ddcfbc9cae8ca492e53237df5b
    615      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod37f448da-134d-499a-9c7b-616c8d130e2b/ce20fe2bdd48e66e488a5113a5cf34c1200614501f3c0359a95bbbc5ae4b856f
    591      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod1380c6e9-a9cd-43f6-9f97-16133da790d4/eeadfe119ec76c957338bde13f63603fbc4ae96e658ef480b625ec7c2375c22c
    612      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod1380c6e9-a9cd-43f6-9f97-16133da790d4/15799589c788138ee842ee824910c3163a69ffe0aa3369432e770458f2d6e96e
    563      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod4afde13c-558e-4847-9a49-a10ac97302da/23bdc088884367a8377526f733503baff4d7de6a07f1f4f84a57b21571bdc557
    560      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod4afde13c-558e-4847-9a49-a10ac97302da/ba475cc88303d98145fa4d150cc7e08c2e8c5c258879902e002fbea15e9ce2f1
    630      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod54eb5888-5033-456b-97f6-5b48e3f9422a/ef07d24abf0799a31c04df4f0ba3f8379eb0a6100c3a2e84341ffa8440b78cb6
    40       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod54eb5888-5033-456b-97f6-5b48e3f9422a/65e025ebc593d256d5fb778f30cc614f126e6f7dab6acaf010bc5b26c04e3be1
    37       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podee38f93a-7450-4607-8dab-aabe14e05216/d66e83ba864716cf8c5bf0be00df84297da37e7f571613a77cdaaf9482ea8540
    683      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podee38f93a-7450-4607-8dab-aabe14e05216/c0c5df035a9a91aeffbbe9fbd3061d134b968cad7844f0e12d3dcd8af6e379c4
    633      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podee38f93a-7450-4607-8dab-aabe14e05216/f849eb40bce769c7022e842d35052c40f280d213698ca224c628b1f249e2f79a
    535      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podee38f93a-7450-4607-8dab-aabe14e05216/ecde1b0ec37b2f00f28ca87e7f8aea340e16b696d26e114e742b1ca51294ce0d
    621      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podb264fcda-0685-46b9-bd7f-c9f41aed28dc/d03800378d3a43ed80b59d0f10c76dcff9b0b28cca3144c59f74b7f17f2de37a
    600      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podb264fcda-0685-46b9-bd7f-c9f41aed28dc/c3b68367701731c6c15d8fae1d37d9cb661a6d58d3090c78c0c4e90200bbfdbe
    518      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod79e1b178-95b5-4dc6-9d5b-1ad4ee64d190/b30ee92e68fcc0334f26dda81a0668232162d4ab677d417e50af75bb8a9180d6
    645      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod79e1b178-95b5-4dc6-9d5b-1ad4ee64d190/e7d1027117911c63fef8582d28e1c544b729c470d378b7e7d9fbc9bf9fbf6c00
    603      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod79e1b178-95b5-4dc6-9d5b-1ad4ee64d190/d3d5bdb50a044da963f8b65e366580dd1a2af8bf6497b76012ee8ac5bf407c83
    663      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod79e1b178-95b5-4dc6-9d5b-1ad4ee64d190/6a21ab1101a53ceb8d9b6b6d5fcbefa486795361897e6c2e3ea72a58ef150b64
    532      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod385f667b-7c87-440d-8d23-22304a35094b/b790c6df37f14722ce4d06b741636643824e60004ff131fb7fd638759c327c59
    515      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod385f667b-7c87-440d-8d23-22304a35094b/22a4400832d1eb7b9711e0020e2163311501ac5855db9ecfbaafb74089dabc98
    597      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod434bbb56-8a33-47c7-9ac2-71be0bb8f2c8/e6c91edb92a86a27c5905d6bce2078b1f2aa4675c8d4ad2718acaa6e42a227f1
    512      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod434bbb56-8a33-47c7-9ac2-71be0bb8f2c8/ccea10d9940dcb43fc2e0bd72389fa2a36b4acf52ded0181666b690d6af9ef75
    594      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod434bbb56-8a33-47c7-9ac2-71be0bb8f2c8/b2770b637dd712a9764b9b53d74fedd0900b4e0879a01514778f127a02b5d7f2
    624      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod434bbb56-8a33-47c7-9ac2-71be0bb8f2c8/3a4c3e9c19dd31cfa59b3b4fce37a4abe83f0b9f18a66068cb0c67e720c00dcb
    639      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podd72fabe0-f4bf-49e6-bd5d-709c0b5cbaa8/d22acdb1caf7579e610da7d76394e135af0fdd6340336774adc2dfe835231198
    588      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podd72fabe0-f4bf-49e6-bd5d-709c0b5cbaa8/d13d7d17bcdcafca2e705cdbf18b0e65c53c246dab5a354c9366b8290cfb7bf1
    648      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/podd72fabe0-f4bf-49e6-bd5d-709c0b5cbaa8/292ad4b76ead8ec0f6411a09b2794255c19d69bc44515366058a72447e8e94ab
    642      cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod2fe53cea-1e60-459b-be5a-420790364cdc/06688e8343e5f7b6877932700177f0b41c3236c61d6d143b7c997815d62757d9
    28       cgroup_device   multi                                          
/sys/fs/cgroup/kubepods/besteffort/pod2fe53cea-1e60-459b-be5a-420790364cdc/0f4c4ae05f01ad6c9fe496afcd8da173fc00ee366f3e70435e08f16c5055a35f
    31       cgroup_device   multi                                          

aditighag (Member):

@azzid There seems to be a regression introduced on k8s v1.28 that involves app containers getting started before init containers - kubernetes/kubernetes#120247.

If you are running with cgroup v2 and cilium socket-lb config enabled, the mount-cgroup init container must run prior to the agent. This ensures that the BPF cgroup programs are attached at the right hierarchy. See the kube-proxy replacement guide 1 and 2 for more details.
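
One way to check that ordering on an affected node (a sketch, not from the original comment) is to look at the init containers on the Cilium agent pod and whether they completed before the agent started:

$ kubectl -n kube-system get po -l k8s-app=cilium \
    -o jsonpath='{.items[0].spec.initContainers[*].name}'
(mount-cgroup should be listed, and it must have run before the cilium-agent container started)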

As an alternative/workaround, you can mount the cgroup2 fs on the host, see this note in the guide -

Cilium will automatically mount cgroup v2 filesystem required to attach BPF cgroup programs by default at the path /run/cilium/cgroupv2. To do that, it needs to mount the host /proc inside an init container launched by the DaemonSet temporarily. If you need to disable the auto-mount, specify --set cgroup.autoMount.enabled=false, and set the host mount point where cgroup v2 filesystem is already mounted by using --set cgroup.hostRoot. For example, if not already mounted, you can mount cgroup v2 filesystem by running the below command on the host, and specify --set cgroup.hostRoot=/sys/fs/cgroup.
...
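
Put together as a Helm invocation, the workaround would look roughly like this (a sketch, assuming cgroup v2 is already mounted at /sys/fs/cgroup on the hosts, as shown further down in the thread):

$ helm upgrade cilium cilium/cilium --namespace kube-system \
    --reuse-values \
    --set cgroup.autoMount.enabled=false \
    --set cgroup.hostRoot=/sys/fs/cgroup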

/cc @aojea Which k8s versions will the regression fix be available in?

jspaleta (Contributor) commented Sep 8, 2023

My reproducer is probably an entirely different issue than the one initially reported - same symptoms, but most likely a different problem... maybe even a Kind-specific problem.
Should I file a new issue to track that?

I'm going to retest now using my local k3s home lab and avoid the complication of cluster nodes running as containers sharing a common kernel.

@aojea
Copy link
Contributor

aojea commented Sep 8, 2023

/cc @aojea Which k8s versions will the regression fix be available in?

1.28.2 that is supposed to be released next week

@azzid
Copy link
Author

azzid commented Sep 11, 2023

/cc @aojea Which k8s versions will the regression fix be available in?

1.28.2 that is supposed to be released next week

2023-09-13 to be even more specific.

@azzid
Copy link
Author

azzid commented Sep 11, 2023

@azzid There seems to be a regression introduced on k8s v1.28 that involves app containers getting started before init containers - kubernetes/kubernetes#120247.

If you are running with cgroup v2 and cilium socket-lb config enabled, the mount-cgroup init container must run prior to the agent. This ensures that the BPF cgroup programs are attached at the right hierarchy. See the kube-proxy replacement guide 1 and 2 for more details.

As an alternative/workaround, you can mount the cgroup2 fs on the host, see this note in the guide -

Cilium will automatically mount cgroup v2 filesystem required to attach BPF cgroup programs by default at the path /run/cilium/cgroupv2. To do that, it needs to mount the host /proc inside an init container launched by the DaemonSet temporarily. If you need to disable the auto-mount, specify --set cgroup.autoMount.enabled=false, and set the host mount point where cgroup v2 filesystem is already mounted by using --set cgroup.hostRoot. For example, if not already mounted, you can mount cgroup v2 filesystem by running the below command on the host, and specify --set cgroup.hostRoot=/sys/fs/cgroup.
...

/cc @aojea Which k8s versions will the regression fix be available in?

I'm not entirely sure I understand what workaround you're proposing.

My nodes are running cgroup v2:

$ clush -w @k8s stat -fc %T /sys/fs/cgroup/
k8scp: cgroup2fs
k8scp3: cgroup2fs
k8sworker: cgroup2fs
k8scp2: cgroup2fs

And they all have a cgroup2 mountpoint:

$ clush -w @k8s 'mount | grep /sys/fs/cgroup'
k8scp: cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
k8scp2: cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
k8scp3: cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
k8sworker: cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)

I can't find any cgroup.hostRoot in the cilium-config.

There is, however, a cgroup-root, which seems to point to something valid on one of the control planes, but not the others.

$ k -n kube-system get cm cilium-config -o yaml | grep -i cgroup
  cgroup-root: /run/cilium/cgroupv2
$ clush -w @k8s ls -l /run/cilium/cgroupv2
k8scp: total 0
k8scp: -r--r--r--  1 root root 0 Sep  6 15:51 cgroup.controllers
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 cgroup.max.depth
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 cgroup.max.descendants
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:51 cgroup.procs
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 cgroup.stat
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:51 cgroup.subtree_control
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 cgroup.threads
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 cpu.pressure
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 cpuset.cpus.effective
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 cpuset.mems.effective
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 cpu.stat
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 dev-hugepages.mount
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 dev-mqueue.mount
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:51 init.scope
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 io.cost.model
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 io.cost.qos
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 io.pressure
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 io.prio.class
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 io.stat
k8scp: drwxr-xr-x  4 root root 0 Sep  6 15:52 kubepods.slice
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 memory.numa_stat
k8scp: -rw-r--r--  1 root root 0 Sep  6 15:52 memory.pressure
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 memory.stat
k8scp: -r--r--r--  1 root root 0 Sep  6 15:52 misc.capacity
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 proc-fs-nfsd.mount
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 proc-sys-fs-binfmt_misc.mount
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 sys-fs-fuse-connections.mount
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 sys-kernel-config.mount
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 sys-kernel-debug.mount
k8scp: drwxr-xr-x  2 root root 0 Sep  6 15:52 sys-kernel-tracing.mount
k8scp: drwxr-xr-x 51 root root 0 Sep 11 07:44 system.slice
k8scp: drwxr-xr-x  3 root root 0 Sep  6 15:58 user.slice
k8scp2: total 0
k8scp3: total 0
k8sworker: total 0

Manually mounting seems like a no-go:

root@k8scp2:~# mount -t cgroup2 none /run/cilium/cgroupv2
mount: /run/cilium/cgroupv2: none already mounted on /run/credentials/systemd-sysusers.service.
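That error looks more like a util-linux quirk than a cgroup problem: mount appears to match on the literal source string none, which is already in use by another mount. A hedged guess (not verified on this cluster) is that giving the cgroup2 mount a distinct source name gets past that check:

root@k8scp2:~# mount -t cgroup2 cgroup2 /run/cilium/cgroupv2
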

Pointing the configmap to the existing mount point and restarting the pods does not seem to improve the situation:

$ k -n kube-system get cm cilium-config -o yaml | grep -i cgroup
  cgroup-root: /sys/fs/cgroup
$ k delete po -A -l name=cilium-operator
pod "cilium-operator-8487fbb8d7-g7xq2" deleted
pod "cilium-operator-8487fbb8d7-mszm5" deleted
$ k delete po -A -l k8s-app=cilium
pod "cilium-hqbzl" deleted
pod "cilium-hslcb" deleted
pod "cilium-npcn6" deleted
pod "cilium-t7kwg" deleted
$ k -n kube-system delete po -l k8s-app=kube-dns
pod "coredns-5dd5756b68-5rwl6" deleted
pod "coredns-5dd5756b68-fwwvf" deleted
$ k get po -A | grep -e cilium -e core
kube-system   cilium-d96ts                       1/1     Running   0               26s
kube-system   cilium-g9zgf                       1/1     Running   0               25s
kube-system   cilium-mdg89                       1/1     Running   0               26s
kube-system   cilium-operator-8487fbb8d7-mtppw   1/1     Running   0               33s
kube-system   cilium-operator-8487fbb8d7-wdtv8   1/1     Running   0               33s
kube-system   cilium-vvx54                       1/1     Running   0               25s
kube-system   coredns-5dd5756b68-cnv4s           0/1     Running   0               21s
kube-system   coredns-5dd5756b68-jgxw6           0/1     Running   0               20s

The change to the configmap seems to have been picked up by the pods:

$ k -n kube-system exec -it ds/cilium -- cilium config --all 2>&1| grep CGroupRoot 
CGroupRoot                        : /sys/fs/cgroup
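
One extra check that might confirm whether the socket-lb programs actually got re-attached under the new root (a sketch; the attach types below are what I would expect to see, and program names vary between Cilium versions):

$ sudo bpftool cgroup tree /sys/fs/cgroup | grep -B1 -E 'connect4|connect6|sendmsg4|recvmsg4'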

@azzid
Copy link
Author

azzid commented Sep 14, 2023

Tried upgrading to 1.28.2 today:

$ k get no
NAME        STATUS   ROLES           AGE    VERSION
k8scp       Ready    control-plane   153d   v1.28.2
k8scp2      Ready    control-plane   153d   v1.28.2
k8scp3      Ready    control-plane   153d   v1.28.2
k8sworker   Ready    <none>          153d   v1.28.2

Upgrading alone does not seem to fix the dns:

$ k get po -A -l k8s-app=kube-dns -o wide
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE        NOMINATED NODE   READINESS GATES
kube-system   coredns-5dd5756b68-wxfwm   0/1     Running   0          5m45s   10.0.1.124   k8scp2      <none>           <none>
kube-system   coredns-5dd5756b68-zbt2c   0/1     Running   0          5m45s   10.0.0.39    k8sworker   <none>           <none>

Neither does deleting the old pods:

$ k delete po -A -l k8s-app=kube-dns
pod "coredns-5dd5756b68-wxfwm" deleted
pod "coredns-5dd5756b68-zbt2c" deleted
$ k get po -A -l k8s-app=kube-dns -o wide
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
kube-system   coredns-5dd5756b68-fkcv5   0/1     Running   0          66s   10.0.3.145   k8scp3   <none>           <none>
kube-system   coredns-5dd5756b68-z5skz   0/1     Running   0          66s   10.0.2.127   k8scp    <none>           <none>

But a cluster reboot after the upgrade seems to get everything in shape:

$ clush -w @k8s,lb reboot
$ sleep 120
$ k get po -A -l k8s-app=kube-dns -o wide
NAMESPACE     NAME                       READY   STATUS    RESTARTS       AGE     IP           NODE        NOMINATED NODE   READINESS GATES
kube-system   coredns-5dd5756b68-fkcv5   1/1     Running   1 (104s ago)   5m28s   10.0.3.170   k8scp3      <none>           <none>
kube-system   coredns-5dd5756b68-hjws8   1/1     Running   0              24s     10.0.0.68    k8sworker   <none>           <none>
$ cilium connectivity test
(...)
✅ All 42 tests (306 actions) successful, 13 tests skipped, 0 scenarios skipped.

For completeness, the BPF-related state post-upgrade:

root@k8scp3:~# cat /proc/$(ps aux | grep coredns | head -n1 | awk '{print $2}')/cgroup
0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod0fbef6a2_439a_4e3d_b861_c07dc2f2e567.slice/cri-containerd-8c0b79f8288484ac45ea2c35fe5c66eee446fe6bd2ef24086c90ab632994f9c1.scope

root@k8scp3:~# bpftool cgroup tree | grep -A1 8c0b79f8288484ac45ea2c35fe5c66eee446fe6bd2ef240
/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod0fbef6a2_439a_4e3d_b861_c07dc2f2e567.slice/cri-containerd-8c0b79f8288484ac45ea2c35fe5c66eee446fe6bd2ef24086c90ab632994f9c1.scope
    466      device          multi
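
Side note, in case it helps others reading along: the per-container entry above only shows the device cgroup program. Cilium's socket-lb programs are attached at the configured cgroup root rather than per pod, so that is where I would look for them (a sketch; the path depends on what cgroup-root is set to):

root@k8scp3:~# bpftool cgroup tree /run/cilium/cgroupv2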

@khalidsafwat
Copy link

Thanks @azzid for testing it. I had the same problem discussed here on a bare-metal cluster of two nodes, and everything stopped working after rebooting the control plane. I updated Kubernetes to 1.28.2, rebooted, and everything was back to normal.

@unixbird
Copy link

So what is the solution here? I'm seeing the same issue on 1.27.6 and my pod CIDR is 192.168.0.0/16. coredns gives me the same output as well. Is the solution to just update kubeadm?

@danpe
Copy link

danpe commented Apr 17, 2024

I'm still experiencing the same...

RHEL 9
Kubernetes 1.28.8
Cilium 1.15.3
CoreDNS 1.10.0

@kvaster
Copy link
Contributor

kvaster commented Apr 18, 2024

We observed this behaviour on Kubernetes 1.28.8 as well. Upgrading to Kubernetes 1.29.x fixed it.

@danpe
Copy link

danpe commented Apr 18, 2024

@kvaster I ended up moving to Calico after a couple of days of frustration ... 😢

@brb
Copy link
Member

brb commented Apr 18, 2024

We observed this behaviour on Kubernetes 1.28.8 as well. Upgrading to Kubernetes 1.29.x fixed it.

@kvaster Do you mean a wrong cgroup root for the bpf_sock attachment?

@brb brb reopened this Apr 19, 2024
@T0MASD
Copy link

T0MASD commented May 2, 2024

Might be unrelated, but I had the exact same i/o timeout error to the kube-api endpoint in the coredns logs, and no pods were able to communicate outside the cluster. I suspected a NAT issue but could not see anything wrong. I then suspected something with IPv6, so I disabled it completely via the kernel cmdline, after which the coredns error changed to say it could not reach an IPv6 address, i.e. the cluster was dual stack (though I didn't set that up). I solved my network error with:

net.ipv6.conf.all.forwarding=1
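
In case anyone wants to apply the same fix persistently, a sketch using standard sysctl mechanics (nothing Cilium-specific; the drop-in file name is arbitrary):

$ sudo sysctl -w net.ipv6.conf.all.forwarding=1
$ echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee /etc/sysctl.d/99-ipv6-forwarding.conf
$ sudo sysctl --system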

environment:
kube v1.29.4
fedora40
kubeadm init with pod-network-cidr=172.20.0.0/16, service-cidr=172.30.0.0/16 using crio socket
crio cni

👋 @brb

@kolovo
Copy link

kolovo commented May 10, 2024

Is there a resolution for the above issue yet?
I have exactly the same issue when upgrading from k8s 1.27.5 to 1.28.3 with kubespray and the cilium chart v1.14, with kube-proxy replacement enabled.
Upgrading cilium to 1.15 doesn't change anything either.
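
For anyone still hitting this, a couple of checks that might narrow it down, reusing commands from earlier in the thread (a sketch; output fields differ between Cilium versions):

$ kubectl -n kube-system exec ds/cilium -- cilium status | grep -i kubeproxy
$ kubectl -n kube-system exec ds/cilium -- cilium config --all | grep -i cgroup
$ stat -fc %T /sys/fs/cgroup/   # should report cgroup2fs on cgroup v2 hosts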
