manual upgrade of rke2 on arm setup gets several pods restarting and alternate between running/crashloopbackoff state #6235

aganesh-suse · 2024-06-20T20:35:18Z

Issue found on master branch with version v1.30.2-rc5+rke2r1

Environment Details

Infrastructure

Cloud
Hosted

Node(s) CPU architecture, OS, and Version:

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"

$ uname -m
aarch64

Cluster Configuration:

HA : 3 server / 1 agent

Config.yaml:

token: xxxx
write-kubeconfig-mode: "0644"
node-external-ip: 1.1.1.1

Additional files

Steps to reproduce:

Copy config.yaml

$ sudo mkdir -p /etc/rancher/rke2 && sudo cp config.yaml /etc/rancher/rke2

Install RKE2 version v1.30.1+rke2r1. After confirming that all services, nodes, and pods are running,
manually upgrade every node:

curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_VERSION='v1.30.2-rc5+rke2r1' INSTALL_RKE2_TYPE='server' INSTALL_RKE2_METHOD=tar sh -

Start the RKE2 service

$ sudo systemctl enable --now rke2-server
or 
$ sudo systemctl enable --now rke2-agent

Verify Cluster Status:

kubectl get nodes -o wide
kubectl get pods -A

Reproducing Results/Observations:

rke2 version used for replication:

rke2 -v
rke2 version v1.30.2-rc5+rke2r1 (3f678f964ad849e24449e49f0c2c44e75d944c9f)
go version go1.22.4 X:boringcrypto

Several pods, including the helm-install pods for ingress-nginx, metrics-server and the snapshot controller, plus one rke2-ingress-nginx-controller pod, alternate between Running and CrashLoopBackOff and restart continuously.

kubectl get pods -A
NAMESPACE        NAME                                                    READY   STATUS             RESTARTS         AGE
dnsutils         dnsutils                                                1/1     Running            1 (30m ago)      90m
kube-system      cloud-controller-manager-ip-172-31-12-208               1/1     Running            0                86m
kube-system      cloud-controller-manager-ip-172-31-14-17                1/1     Running            0                85m
kube-system      cloud-controller-manager-ip-172-31-8-183                1/1     Running            0                88m
kube-system      etcd-ip-172-31-12-208                                   1/1     Running            0                86m
kube-system      etcd-ip-172-31-14-17                                    1/1     Running            0                85m
kube-system      etcd-ip-172-31-8-183                                    1/1     Running            0                88m
kube-system      helm-install-rke2-canal-bcshx                           0/1     Completed          0                85m
kube-system      helm-install-rke2-coredns-jdxch                         0/1     Completed          0                85m
kube-system      helm-install-rke2-ingress-nginx-mcdnb                   0/1     CrashLoopBackOff   16 (2m48s ago)   85m
kube-system      helm-install-rke2-metrics-server-9kxgw                  0/1     CrashLoopBackOff   16 (2m50s ago)   85m
kube-system      helm-install-rke2-snapshot-controller-crd-t282f         0/1     CrashLoopBackOff   16 (2m49s ago)   85m
kube-system      helm-install-rke2-snapshot-controller-gp5v5             0/1     CrashLoopBackOff   16 (2m45s ago)   85m
kube-system      helm-install-rke2-snapshot-validation-webhook-8ck5h     0/1     CrashLoopBackOff   16 (2m48s ago)   85m
kube-system      kube-apiserver-ip-172-31-12-208                         1/1     Running            0                86m
kube-system      kube-apiserver-ip-172-31-14-17                          1/1     Running            0                85m
kube-system      kube-apiserver-ip-172-31-8-183                          1/1     Running            0                88m
kube-system      kube-controller-manager-ip-172-31-12-208                1/1     Running            1 (86m ago)      86m
kube-system      kube-controller-manager-ip-172-31-14-17                 1/1     Running            1 (85m ago)      85m
kube-system      kube-controller-manager-ip-172-31-8-183                 1/1     Running            1 (88m ago)      88m
kube-system      kube-proxy-ip-172-31-1-10                               1/1     Running            0                84m
kube-system      kube-proxy-ip-172-31-12-208                             1/1     Running            0                86m
kube-system      kube-proxy-ip-172-31-14-17                              1/1     Running            0                83m
kube-system      kube-proxy-ip-172-31-8-183                              1/1     Running            0                86m
kube-system      kube-scheduler-ip-172-31-12-208                         1/1     Running            0                86m
kube-system      kube-scheduler-ip-172-31-14-17                          1/1     Running            0                85m
kube-system      kube-scheduler-ip-172-31-8-183                          1/1     Running            0                88m
kube-system      rke2-canal-9ljxk                                        1/2     Running            0                86m
kube-system      rke2-canal-gg26w                                        2/2     Running            0                99m
kube-system      rke2-canal-kfvl2                                        2/2     Running            0                96m
kube-system      rke2-canal-zjdfm                                        1/2     Running            0                84m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-c4l42              1/1     Running            0                98m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-skwjd              1/1     Running            0                99m
kube-system      rke2-coredns-rke2-coredns-autoscaler-5749cd7b8b-n5z7d   1/1     Running            0                99m
kube-system      rke2-ingress-nginx-controller-5bz9k                     0/1     CrashLoopBackOff   27 (83s ago)     84m
kube-system      rke2-ingress-nginx-controller-6zbj2                     1/1     Running            1 (85m ago)      86m
kube-system      rke2-ingress-nginx-controller-bv29r                     1/1     Running            0                86m
kube-system      rke2-ingress-nginx-controller-qw8wl                     1/1     Running            0                85m
kube-system      rke2-metrics-server-868fc8795f-52vcg                    1/1     Running            0                98m
kube-system      rke2-snapshot-controller-7dcf5d5b46-754bx               1/1     Running            1 (86m ago)      98m
kube-system      rke2-snapshot-validation-webhook-bf7bbd6fc-z7ggb        1/1     Running            0                98m
test-clusterip   test-clusterip-74b55cccff-qdc9m                         1/1     Running            0                90m
test-clusterip   test-clusterip-74b55cccff-tmn7h                         1/1     Running            0                90m
test-daemonset   test-daemonset-2b4j5                                    1/1     Running            0                90m
test-daemonset   test-daemonset-2k59p                                    1/1     Running            0                90m
test-daemonset   test-daemonset-qphnm                                    1/1     Running            0                90m
test-daemonset   test-daemonset-sv4p6                                    1/1     Running            0                90m
test-ingress     test-ingress-l74mn                                      1/1     Running            0                90m
test-ingress     test-ingress-tjtdq                                      1/1     Running            0                90m

NAMESPACE        NAME                                                    READY   STATUS             RESTARTS         AGE
dnsutils         dnsutils                                                1/1     Running            1 (33m ago)      93m
kube-system      cloud-controller-manager-ip-172-31-12-208               1/1     Running            0                90m
kube-system      cloud-controller-manager-ip-172-31-14-17                1/1     Running            0                89m
kube-system      cloud-controller-manager-ip-172-31-8-183                1/1     Running            0                91m
kube-system      etcd-ip-172-31-12-208                                   1/1     Running            0                90m
kube-system      etcd-ip-172-31-14-17                                    1/1     Running            0                89m
kube-system      etcd-ip-172-31-8-183                                    1/1     Running            0                91m
kube-system      helm-install-rke2-canal-bcshx                           0/1     Completed          0                89m
kube-system      helm-install-rke2-coredns-jdxch                         0/1     Completed          0                89m
kube-system      helm-install-rke2-ingress-nginx-mcdnb                   1/1     Running            17 (6m5s ago)    89m
kube-system      helm-install-rke2-metrics-server-9kxgw                  1/1     Running            17 (6m7s ago)    89m
kube-system      helm-install-rke2-snapshot-controller-crd-t282f         1/1     Running            17 (6m6s ago)    89m
kube-system      helm-install-rke2-snapshot-controller-gp5v5             1/1     Running            17 (6m2s ago)    89m
kube-system      helm-install-rke2-snapshot-validation-webhook-8ck5h     1/1     Running            17 (6m5s ago)    89m
kube-system      kube-apiserver-ip-172-31-12-208                         1/1     Running            0                90m
kube-system      kube-apiserver-ip-172-31-14-17                          1/1     Running            0                89m
kube-system      kube-apiserver-ip-172-31-8-183                          1/1     Running            0                91m
kube-system      kube-controller-manager-ip-172-31-12-208                1/1     Running            1 (90m ago)      90m
kube-system      kube-controller-manager-ip-172-31-14-17                 1/1     Running            1 (89m ago)      89m
kube-system      kube-controller-manager-ip-172-31-8-183                 1/1     Running            1 (91m ago)      91m
kube-system      kube-proxy-ip-172-31-1-10                               1/1     Running            0                87m
kube-system      kube-proxy-ip-172-31-12-208                             1/1     Running            0                90m
kube-system      kube-proxy-ip-172-31-14-17                              1/1     Running            0                87m
kube-system      kube-proxy-ip-172-31-8-183                              1/1     Running            0                89m
kube-system      kube-scheduler-ip-172-31-12-208                         1/1     Running            0                90m
kube-system      kube-scheduler-ip-172-31-14-17                          1/1     Running            0                89m
kube-system      kube-scheduler-ip-172-31-8-183                          1/1     Running            0                91m
kube-system      rke2-canal-9ljxk                                        1/2     Running            0                89m
kube-system      rke2-canal-gg26w                                        2/2     Running            0                102m
kube-system      rke2-canal-kfvl2                                        2/2     Running            0                99m
kube-system      rke2-canal-zjdfm                                        1/2     Running            0                88m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-c4l42              1/1     Running            0                101m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-skwjd              1/1     Running            0                103m
kube-system      rke2-coredns-rke2-coredns-autoscaler-5749cd7b8b-n5z7d   1/1     Running            0                103m
kube-system      rke2-ingress-nginx-controller-5bz9k                     0/1     CrashLoopBackOff   27 (4m40s ago)   87m
kube-system      rke2-ingress-nginx-controller-6zbj2                     1/1     Running            1 (88m ago)      90m
kube-system      rke2-ingress-nginx-controller-bv29r                     1/1     Running            0                89m
kube-system      rke2-ingress-nginx-controller-qw8wl                     1/1     Running            0                88m
kube-system      rke2-metrics-server-868fc8795f-52vcg                    1/1     Running            0                102m
kube-system      rke2-snapshot-controller-7dcf5d5b46-754bx               1/1     Running            1 (89m ago)      102m
kube-system      rke2-snapshot-validation-webhook-bf7bbd6fc-z7ggb        1/1     Running            0                102m
test-clusterip   test-clusterip-74b55cccff-qdc9m                         1/1     Running            0                93m
test-clusterip   test-clusterip-74b55cccff-tmn7h                         1/1     Running            0                93m
test-daemonset   test-daemonset-2b4j5                                    1/1     Running            0                93m
test-daemonset   test-daemonset-2k59p                                    1/1     Running            0                93m
test-daemonset   test-daemonset-qphnm                                    1/1     Running            0                93m
test-daemonset   test-daemonset-sv4p6                                    1/1     Running            0                93m
test-ingress     test-ingress-l74mn                                      1/1     Running            0                93m
test-ingress     test-ingress-tjtdq                                      1/1     Running            0                93m
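
Listings like the two above can be narrowed to the pods that need attention with a small filter. This is only a minimal sketch, assuming POSIX sh and awk; the function name flag_unhealthy_pods and the default restart threshold of 5 are illustrative and not part of kubectl or RKE2:

```shell
# Reads `kubectl get pods -A` output on stdin and prints
# "namespace/name status restarts" for every pod whose status is neither
# Running nor Completed, or whose restart count exceeds the threshold.
# The threshold argument (default 5) is an arbitrary, hypothetical value.
flag_unhealthy_pods() {
  threshold="${1:-5}"
  awk -v max="$threshold" '
    $1 == "NAMESPACE" { next }   # skip header rows
    NF < 5            { next }   # skip blank or truncated lines
    {
      status   = $4
      restarts = $5 + 0          # "16 (2m48s ago)" splits so that $5 is 16
      if ((status != "Running" && status != "Completed") || restarts > max)
        printf "%s/%s %s %s\n", $1, $2, status, restarts
    }'
}
```

Possible usage: kubectl get pods -A | flag_unhealthy_pods 5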

Some of the pod logs:

alias kspl='kubectl -n kube-system logs'

kspl helm-install-rke2-ingress-nginx-mcdnb
if [[ ${KUBERNETES_SERVICE_HOST} =~ .*:.* ]]; then
	echo "KUBERNETES_SERVICE_HOST is using IPv6"
	CHART="${CHART//%\{KUBERNETES_API\}%/[${KUBERNETES_SERVICE_HOST}]:${KUBERNETES_SERVICE_PORT}}"
else
	CHART="${CHART//%\{KUBERNETES_API\}%/${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}}"
fi

set +v -x
+ [[ '' != \t\r\u\e ]]
+ export HELM_HOST=127.0.0.1:44134
+ HELM_HOST=127.0.0.1:44134
+ helm_v2 init --skip-refresh --client-only --stable-repo-url https://charts.helm.sh/stable/
+ tiller --listen=127.0.0.1:44134 --storage=secret
[main] 2024/06/20 19:55:56 Starting Tiller v2.17.0 (tls=false)
[main] 2024/06/20 19:55:56 GRPC listening on 127.0.0.1:44134
[main] 2024/06/20 19:55:56 Probes listening on :44135
[main] 2024/06/20 19:55:56 Storage driver is Secret
[main] 2024/06/20 19:55:56 Max history per release is 0
$HELM_HOME has been configured at /home/klipper-helm/.helm.
Not installing Tiller due to 'client-only' flag having been set
++ timeout -s KILL ++ jq 30 -r '.Releases | length'
helm_v2 ls --all '^rke2-ingress-nginx$' --output json
[storage] 2024/06/20 19:55:57 listing all releases with filter
+ V2_CHART_EXISTS=
+ [[ '' == \1 ]]
+ [[ '' == \v\2 ]]
+ shopt -s nullglob
+ [[ -f /config/ca-file.pem ]]
+ [[ -f /tmp/ca-file.pem ]]
+ [[ -n '' ]]
+ helm_content_decode
+ set -e
+ ENC_CHART_PATH=/chart/rke2-ingress-nginx.tgz.base64
+ CHART_PATH=/tmp/rke2-ingress-nginx.tgz
+ [[ ! -f /chart/rke2-ingress-nginx.tgz.base64 ]]
+ base64 -d /chart/rke2-ingress-nginx.tgz.base64
+ CHART=/tmp/rke2-ingress-nginx.tgz
+ set +e
+ [[ install != \d\e\l\e\t\e ]]
+ helm_repo_init
+ grep -q -e 'https\?://'
+ [[ helm_v3 == \h\e\l\m\_\v\3 ]]
+ [[ /tmp/rke2-ingress-nginx.tgz == stable/* ]]
+ [[ -n '' ]]
+ helm_update install --set-string global.clusterCIDR=10.42.0.0/16 --set-string global.clusterCIDRv4=10.42.0.0/16 --set-string global.clusterDNS=10.43.0.10 --set-string global.clusterDomain=cluster.local --set-string global.rke2DataDir=/var/lib/rancher/rke2 --set-string global.serviceCIDR=10.43.0.0/16
+ [[ helm_v3 == \h\e\l\m\_\v\3 ]]
++ ++ jq -r '"\(.[0].chart),\(.[0].status)"'
tr '[:upper:]' '[:lower:]'
++ helm_v3 ls --all -f '^rke2-ingress-nginx$' --namespace kube-system --output json
[storage/driver] 2024/06/20 19:56:27 list: failed to list: Get "https://10.43.0.1:443/api/v1/namespaces/kube-system/secrets?labelSelector=OWNER%3DTILLER": dial tcp 10.43.0.1:443: i/o timeout
Error: Kubernetes cluster unreachable: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
+ LINE=
+ IFS=,
+ read -r INSTALLED_VERSION STATUS _
+ VALUES=
+ [[ install = \d\e\l\e\t\e ]]
+ [[ '' =~ ^(|null)$ ]]
+ [[ '' =~ ^(|null)$ ]]
+ echo 'Installing helm_v3 chart'
+ helm_v3 install --set-string global.clusterCIDR=10.42.0.0/16 --set-string global.clusterCIDRv4=10.42.0.0/16 --set-string global.clusterDNS=10.43.0.10 --set-string global.clusterDomain=cluster.local --set-string global.rke2DataDir=/var/lib/rancher/rke2 --set-string global.serviceCIDR=10.43.0.0/16 rke2-ingress-nginx /tmp/rke2-ingress-nginx.tgz
Error: INSTALLATION FAILED: Kubernetes cluster unreachable: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
+ exit
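
Every failure in the log above is a timeout against the same in-cluster API service address. A rough sketch for pulling the unreachable endpoints out of a pod log, assuming POSIX sh, awk and a grep that supports -o and -E; timeout_endpoints is a hypothetical helper name:

```shell
# Reads log text on stdin and prints each distinct host:port that appears
# in a Go-style "dial tcp HOST:PORT: i/o timeout" error.
timeout_endpoints() {
  grep -oE 'dial tcp [^ ]+: i/o timeout' \
    | awk '{ sub(/:$/, "", $3); print $3 }' \
    | sort -u
}
```

Possible usage: kubectl -n kube-system logs helm-install-rke2-ingress-nginx-mcdnb | timeout_endpoints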

kspl rke2-ingress-nginx-controller-5bz9k
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.10.1-hardened1
  Build:         git-b48c66a2f
  Repository:    https://github.com/rancher/ingress-nginx
  nginx version: nginx/1.25.3

-------------------------------------------------------------------------------

W0620 20:05:03.881215       7 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0620 20:05:03.881997       7 main.go:205] "Creating API client" host="https://10.43.0.1:443"

Expected behavior:

After the upgrade, all pods should be up and running, without restarting repeatedly or going into CrashLoopBackOff.


aganesh-suse · 2024-06-24T23:14:21Z

I am getting inconsistent test results with rc6. In 2 of 5 manual upgrades, the pods reached Running after 7 to 9 restarts and stayed running for at least 15 minutes or so. The other 3 times, I see results like the following:

kubectl get pods -A 
NAMESPACE        NAME                                                    READY   STATUS             RESTARTS         AGE
dnsutils         dnsutils                                                1/1     Running            2 (51s ago)      121m
kube-system      cloud-controller-manager-ip-172-31-12-154               1/1     Running            1 (116m ago)     116m
kube-system      cloud-controller-manager-ip-172-31-12-169               1/1     Running            0                119m
kube-system      cloud-controller-manager-ip-172-31-14-61                1/1     Running            0                117m
kube-system      etcd-ip-172-31-12-154                                   1/1     Running            0                116m
kube-system      etcd-ip-172-31-12-169                                   1/1     Running            0                119m
kube-system      etcd-ip-172-31-14-61                                    1/1     Running            0                117m
kube-system      helm-install-rke2-canal-l22p4                           0/1     Completed          0                117m
kube-system      helm-install-rke2-coredns-jcvrf                         0/1     Completed          0                117m
kube-system      helm-install-rke2-ingress-nginx-w6zkw                   0/1     CrashLoopBackOff   21 (40s ago)     117m
kube-system      helm-install-rke2-metrics-server-9s48g                  0/1     Error              21 (6m50s ago)   117m
kube-system      helm-install-rke2-snapshot-controller-crd-tlbvh         1/1     Running            21 (6m28s ago)   117m
kube-system      helm-install-rke2-snapshot-controller-zj6cf             1/1     Running            21 (6m16s ago)   117m
kube-system      helm-install-rke2-snapshot-validation-webhook-wjzmc     0/1     Error              21 (6m33s ago)   117m
kube-system      kube-apiserver-ip-172-31-12-154                         1/1     Running            0                116m
kube-system      kube-apiserver-ip-172-31-12-169                         1/1     Running            0                119m
kube-system      kube-apiserver-ip-172-31-14-61                          1/1     Running            0                117m
kube-system      kube-controller-manager-ip-172-31-12-154                1/1     Running            1 (116m ago)     116m
kube-system      kube-controller-manager-ip-172-31-12-169                1/1     Running            1 (119m ago)     119m
kube-system      kube-controller-manager-ip-172-31-14-61                 1/1     Running            1 (117m ago)     117m
kube-system      kube-proxy-ip-172-31-1-11                               1/1     Running            0                115m
kube-system      kube-proxy-ip-172-31-12-154                             1/1     Running            0                116m
kube-system      kube-proxy-ip-172-31-12-169                             1/1     Running            0                118m
kube-system      kube-proxy-ip-172-31-14-61                              1/1     Running            0                116m
kube-system      kube-scheduler-ip-172-31-12-154                         1/1     Running            0                116m
kube-system      kube-scheduler-ip-172-31-12-169                         1/1     Running            0                119m
kube-system      kube-scheduler-ip-172-31-14-61                          1/1     Running            0                117m
kube-system      rke2-canal-9g9x9                                        1/2     Running            8 (2s ago)       117m
kube-system      rke2-canal-kf2vv                                        2/2     Running            0                128m
kube-system      rke2-canal-rbm8n                                        2/2     Running            0                129m
kube-system      rke2-canal-t8s6h                                        2/2     Running            0                126m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-gpc7q              1/1     Running            0                128m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-t44tq              1/1     Running            0                129m
kube-system      rke2-coredns-rke2-coredns-autoscaler-5749cd7b8b-rnw9k   1/1     Running            0                129m
kube-system      rke2-ingress-nginx-controller-cfggs                     1/1     Running            0                125m
kube-system      rke2-ingress-nginx-controller-kl99t                     1/1     Running            0                128m
kube-system      rke2-ingress-nginx-controller-sfknn                     1/1     Running            0                125m
kube-system      rke2-ingress-nginx-controller-z9z5s                     1/1     Running            0                128m
kube-system      rke2-metrics-server-868fc8795f-dnr56                    1/1     Running            0                129m
kube-system      rke2-snapshot-controller-7dcf5d5b46-54fvl               1/1     Running            1 (117m ago)     129m
kube-system      rke2-snapshot-validation-webhook-bf7bbd6fc-gms6n        1/1     Running            0                129m
test-clusterip   test-clusterip-74b55cccff-nz5s5                         1/1     Running            0                121m
test-clusterip   test-clusterip-74b55cccff-pmbj2                         1/1     Running            0                121m
test-daemonset   test-daemonset-57shk                                    1/1     Running            0                121m
test-daemonset   test-daemonset-7nrsc                                    1/1     Running            0                121m
test-daemonset   test-daemonset-hjqtg                                    1/1     Running            0                121m
test-daemonset   test-daemonset-qgtzg                                    1/1     Running            0                121m
test-ingress     test-ingress-9r4df                                      1/1     Running            0                120m
test-ingress     test-ingress-jwkjs                                      1/1     Running            0                120m

$ kspl helm-install-rke2-metrics-server-9s48g -c helm
.
.
+ echo 'Installing helm_v3 chart'
+ helm_v3 install --set-string global.clusterCIDR=10.42.0.0/16 --set-string global.clusterCIDRv4=10.42.0.0/16 --set-string global.clusterDNS=10.43.0.10 --set-string global.clusterDomain=cluster.local --set-string global.rke2DataDir=/var/lib/rancher/rke2 --set-string global.serviceCIDR=10.43.0.0/16 rke2-metrics-server /tmp/rke2-metrics-server.tgz
Error: INSTALLATION FAILED: Kubernetes cluster unreachable: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
+ exit

kspl helm-install-rke2-snapshot-controller-crd-tlbvh -c helm
.
.
+ helm_v3 install --set-string global.clusterCIDR=10.42.0.0/16 --set-string global.clusterCIDRv4=10.42.0.0/16 --set-string global.clusterDNS=10.43.0.10 --set-string global.clusterDomain=cluster.local --set-string global.rke2DataDir=/var/lib/rancher/rke2 --set-string global.serviceCIDR=10.43.0.0/16 rke2-snapshot-controller-crd /tmp/rke2-snapshot-controller-crd.tgz
Error: INSTALLATION FAILED: Kubernetes cluster unreachable: Get "https://10.43.0.1:443/version": dial tcp 10.43.0.1:443: i/o timeout
+ exit

aganesh-suse · 2024-06-24T23:17:02Z

canal logs:

$ kspl rke2-canal-9g9x9
2024-06-24 23:16:18.102 [WARNING][14159] felix/table.go 809: iptables-legacy-save command failed error=fork/exec /usr/sbin/iptables-legacy-save: exec format error ipVersion=0x4 stderr="" table="raw"
2024-06-24 23:16:18.102 [PANIC][14159] felix/table.go 815: iptables-legacy-save command failed after retries ipVersion=0x4 table="raw"
panic: (*logrus.Entry) 0x40003436c0

goroutine 218 [running]:
github.com/sirupsen/logrus.(*Entry).log(0x400018b810, 0x0, {0x400074e880, 0x31})
	/go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/entry.go:260 +0x494
github.com/sirupsen/logrus.(*Entry).Log(0x400018b810, 0x0, {0x40000829d8?, 0x1?, 0x1?})
	/go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/entry.go:304 +0x60
github.com/sirupsen/logrus.(*Entry).Logf(0x400018b810, 0x0, {0x344f0b9?, 0x6?}, {0x4000082aa8?, 0x0?, 0x4000000014?})
	/go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/entry.go:349 +0x88
github.com/sirupsen/logrus.(*Entry).Panicf(...)
	/go/pkg/mod/github.com/sirupsen/logrus@v1.9.3/entry.go:387
github.com/projectcalico/calico/felix/iptables.(*Table).getHashesAndRulesFromDataplane(0x400060e488)
	/go/src/github.com/projectcalico/calico/felix/iptables/table.go:815 +0x310
github.com/projectcalico/calico/felix/iptables.(*Table).loadDataplaneState(0x400060e488)
	/go/src/github.com/projectcalico/calico/felix/iptables/table.go:652 +0x14c
github.com/projectcalico/calico/felix/iptables.(*Table).Apply(0x400060e488)
	/go/src/github.com/projectcalico/calico/felix/iptables/table.go:1056 +0x30c
github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply.func4(0x400086a690?)
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2196 +0x48
created by github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply in goroutine 87
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2195 +0xfcc

brandond · 2024-06-25T00:58:18Z

2024-06-24 23:16:18.102 [WARNING][14159] felix/table.go 809: iptables-legacy-save command failed error=fork/exec /usr/sbin/iptables-legacy-save: exec format error ipVersion=0x4 stderr="" table="raw"

This indicates that there are additional amd64 binaries being used on an arm64 system - so more GH Actions ARCH/TARGETARCH problems. Which of the canal containers is that log from?
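
One way to check for this kind of mismatch is to read the machine field of a binary's ELF header and compare it with the output of uname -m. This is a minimal sketch, assuming a little-endian ELF file and POSIX od and awk; elf_machine is a hypothetical helper, and the binary would have to be inspected inside, or copied out of, the affected container image:

```shell
# Prints the target architecture recorded in an ELF binary.
# e_machine is a 2-byte field at offset 18 of the ELF header;
# 62 is EM_X86_64 and 183 is EM_AARCH64. Assumes little-endian encoding.
elf_machine() {
  code="$(od -An -tu1 -j18 -N2 "$1" | awk '{ print $1 + 256 * $2 }')"
  case "$code" in
    62)  echo x86_64 ;;
    183) echo aarch64 ;;
    *)   echo "unknown($code)" ;;
  esac
}
```

Possible usage: [ "$(elf_machine /usr/sbin/iptables-legacy-save)" = "$(uname -m)" ] || echo "architecture mismatch"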

aganesh-suse · 2024-06-25T17:55:47Z

kspl rke2-canal-9g9x9 -c calico-node | grep panic
panic: (*logrus.Entry) 0x4000672d20
panic: (*logrus.Entry) 0x40002d7730
panic: (*logrus.Entry) 0x4000679b90
panic: (*logrus.Entry) 0x400062f8f0
panic: (*logrus.Entry) 0x40006087e0
.
.

aganesh-suse · 2024-06-29T00:41:57Z

Verified on master branch with v1.30.2-rc7+rke2r1

rke2 -v
rke2 version v1.30.2-rc7+rke2r1 (f01072ab2b9cf1a529ce188e4a8d8645813d4620)
go version go1.22.4 X:boringcrypto

kubectl get nodes
NAME              STATUS   ROLES                       AGE   VERSION
ip-172-31-11-57   Ready    <none>                      77m   v1.30.2+rke2r1
ip-172-31-12-61   Ready    control-plane,etcd,master   79m   v1.30.2+rke2r1
ip-172-31-15-94   Ready    control-plane,etcd,master   77m   v1.30.2+rke2r1
ip-172-31-5-28    Ready    control-plane,etcd,master   80m   v1.30.2+rke2r1
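
A node check like the one above could also be scripted. This is a minimal sketch, assuming the default kubectl get nodes column order (NAME, STATUS, ROLES, AGE, VERSION); nodes_all_ready_at is a hypothetical helper name:

```shell
# Reads `kubectl get nodes` output on stdin. Succeeds (exit 0) only if at
# least one node is listed and every node is Ready at the wanted version.
nodes_all_ready_at() {
  want="$1"
  awk -v want="$want" '
    $1 == "NAME" { next }   # skip header row
    NF == 0      { next }   # skip blank lines
    { seen++; if ($2 != "Ready" || $5 != want) bad++ }
    END { exit ((seen > 0 && bad == 0) ? 0 : 1) }'
}
```

Possible usage: kubectl get nodes | nodes_all_ready_at 'v1.30.2+rke2r1' && echo "all nodes upgraded"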

kubectl get pods -A
NAMESPACE        NAME                                                    READY   STATUS      RESTARTS      AGE
dnsutils         dnsutils                                                1/1     Running     0             40m
kube-system      cloud-controller-manager-ip-172-31-12-61                1/1     Running     0             39m
kube-system      cloud-controller-manager-ip-172-31-15-94                1/1     Running     0             38m
kube-system      cloud-controller-manager-ip-172-31-5-28                 1/1     Running     0             37m
kube-system      etcd-ip-172-31-12-61                                    1/1     Running     0             39m
kube-system      etcd-ip-172-31-15-94                                    1/1     Running     0             38m
kube-system      etcd-ip-172-31-5-28                                     1/1     Running     0             37m
kube-system      helm-install-rke2-canal-6bxpm                           0/1     Completed   0             38m
kube-system      helm-install-rke2-coredns-ltdhd                         0/1     Completed   0             38m
kube-system      helm-install-rke2-ingress-nginx-dvcbk                   0/1     Completed   1             38m
kube-system      helm-install-rke2-metrics-server-tsbpm                  0/1     Completed   0             38m
kube-system      helm-install-rke2-snapshot-controller-crd-9x5fw         0/1     Completed   1             38m
kube-system      helm-install-rke2-snapshot-controller-rq8rw             0/1     Completed   0             38m
kube-system      helm-install-rke2-snapshot-validation-webhook-fzsqh     0/1     Completed   0             38m
kube-system      kube-apiserver-ip-172-31-12-61                          1/1     Running     0             39m
kube-system      kube-apiserver-ip-172-31-15-94                          1/1     Running     0             38m
kube-system      kube-apiserver-ip-172-31-5-28                           1/1     Running     0             37m
kube-system      kube-controller-manager-ip-172-31-12-61                 1/1     Running     1 (39m ago)   39m
kube-system      kube-controller-manager-ip-172-31-15-94                 1/1     Running     1 (38m ago)   38m
kube-system      kube-controller-manager-ip-172-31-5-28                  1/1     Running     1 (37m ago)   37m
kube-system      kube-proxy-ip-172-31-11-57                              1/1     Running     0             37m
kube-system      kube-proxy-ip-172-31-12-61                              1/1     Running     0             37m
kube-system      kube-proxy-ip-172-31-15-94                              1/1     Running     0             38m
kube-system      kube-proxy-ip-172-31-5-28                               1/1     Running     0             35m
kube-system      kube-scheduler-ip-172-31-12-61                          1/1     Running     0             39m
kube-system      kube-scheduler-ip-172-31-15-94                          1/1     Running     0             38m
kube-system      kube-scheduler-ip-172-31-5-28                           1/1     Running     0             37m
kube-system      rke2-canal-54xbf                                        2/2     Running     0             37m
kube-system      rke2-canal-5hgch                                        2/2     Running     0             36m
kube-system      rke2-canal-gtlms                                        2/2     Running     0             35m
kube-system      rke2-canal-rkjmh                                        2/2     Running     0             36m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-csb4l              1/1     Running     0             46m
kube-system      rke2-coredns-rke2-coredns-64dcf4f58b-l7h8s              1/1     Running     0             47m
kube-system      rke2-coredns-rke2-coredns-autoscaler-5749cd7b8b-qhbfb   1/1     Running     0             47m
kube-system      rke2-ingress-nginx-controller-46dzd                     1/1     Running     0             37m
kube-system      rke2-ingress-nginx-controller-8drbb                     1/1     Running     0             34m
kube-system      rke2-ingress-nginx-controller-bsf4s                     1/1     Running     0             36m
kube-system      rke2-ingress-nginx-controller-ln7tj                     1/1     Running     0             34m
kube-system      rke2-metrics-server-868fc8795f-ctdm7                    1/1     Running     1 (45m ago)   46m
kube-system      rke2-snapshot-controller-7dcf5d5b46-f756n               1/1     Running     1 (39m ago)   46m
kube-system      rke2-snapshot-validation-webhook-bf7bbd6fc-d2nwk        1/1     Running     1 (45m ago)   46m
test-clusterip   test-clusterip-74b55cccff-6xsh2                         1/1     Running     0             40m
test-clusterip   test-clusterip-74b55cccff-jjmrn                         1/1     Running     0             40m
test-daemonset   test-daemonset-47hcm                                    1/1     Running     0             40m
test-daemonset   test-daemonset-62psg                                    1/1     Running     0             40m
test-daemonset   test-daemonset-f5m7s                                    1/1     Running     0             40m
test-daemonset   test-daemonset-pkzzx                                    1/1     Running     0             40m
test-ingress     test-ingress-9sg4t                                      1/1     Running     0             40m
test-ingress     test-ingress-r8bx6                                      1/1     Running     0             40m

rancher-max added kind/bug-qa status/release-blocker labels Jun 20, 2024

aganesh-suse self-assigned this Jun 20, 2024

dereknola mentioned this issue Jun 25, 2024

Reinstate newest rke2-ingress-nginx #6253

Merged

caroline-suse-rancher added this to the v1.30.2+rke2r1 milestone Jun 26, 2024

aganesh-suse closed this as completed Jun 29, 2024
