
kube-proxy generates the wrong iptables dnat rule #114537

Closed
cyclinder opened this issue Dec 16, 2022 · 27 comments
Labels
kind/bug — Categorizes issue or PR as related to a bug.
needs-triage — Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/network — Categorizes an issue or PR as relevant to SIG Network.


@cyclinder
Contributor

cyclinder commented Dec 16, 2022

What happened?

kube-proxy generates the wrong iptables DNAT rule, as shown below:

[root@controller-node-1 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1      <none>        443/TCP   3h18m
my-dep       ClusterIP   10.233.60.220   <none>        80/TCP    26m
my-svc       ClusterIP   10.233.48.3     <none>        80/TCP    26m

[root@worker1 ~]# iptables-save -t nat | grep '10.233.0.1'
-A KUBE-SERVICES -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s 10.233.64.0/18 -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
[root@worker1 ~]# iptables-save -t nat | grep KUBE-SVC-NPX46M4PTMTKRN6Y
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
-A KUBE-SERVICES -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s 10.233.64.0/18 -d 10.233.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB
[root@worker1 ~]# iptables-save -t nat | grep KUBE-SEP-CC3HXZSKU6BR4DDB
:KUBE-SEP-CC3HXZSKU6BR4DDB - [0:0]
-A KUBE-SEP-CC3HXZSKU6BR4DDB -s 10.6.214.21/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB

Because of this incorrect iptables rule, the worker node cannot reach the apiserver.
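
A rough sanity check from the worker, using the backend endpoint shown in the KUBE-SEP rule above, can separate a broken Service DNAT path from an unreachable apiserver:

# Via the Service cluster IP (traverses the kube-proxy DNAT rule):
curl -k https://10.233.0.1/version
# Directly against the endpoint from the rule above, bypassing the DNAT:
curl -k https://10.6.214.21:6443/version
# If the direct call answers while the cluster-IP call fails, the DNAT path is the suspect.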

What did you expect to happen?

kube-proxy should generate correct iptables rules.

How can we reproduce it (as minimally and precisely as possible)?

  • Create a cluster via kubespray; everything works.
[root@controller-node-1 ~]# kubectl get nodes -o wide
NAME                STATUS     ROLES           AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
controller-node-1   Ready      control-plane   129m   v1.25.3   10.6.214.12   <none>        CentOS Linux 7 (Core)   5.19.10-1.el7.elrepo.x86_64   containerd://1.6.8
  • Join a node
[root@controller-node-1 ~]# kubectl get nodes -o wide
NAME                STATUS     ROLES           AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
controller-node-1   Ready      control-plane   129m   v1.25.3   10.6.214.12   <none>        CentOS Linux 7 (Core)   5.19.10-1.el7.elrepo.x86_64   containerd://1.6.8
worker1             NotReady   <none>          85m    v1.25.3   10.6.214.13   <none>        CentOS Linux 7 (Core)   5.4.197-1.el7.elrepo.x86_64   containerd://1.6.8

I also found that some hostNetwork Pods on worker1 failed to start because they could not reach the apiserver; I verified that the firewall is disabled.

I found that kube-proxy generates an incorrect iptables rule:

[root@worker1 ~]# iptables-save -t nat | grep KUBE-SEP-CC3HXZSKU6BR4DDB
:KUBE-SEP-CC3HXZSKU6BR4DDB - [0:0]
-A KUBE-SEP-CC3HXZSKU6BR4DDB -s 10.6.214.21/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB

--to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent ?

On controller-node-1, it works fine:

[root@controller-node-1 ~]# iptables-save -t nat | grep KUBE-SEP-CC3HXZSKU6BR4DDB
:KUBE-SEP-CC3HXZSKU6BR4DDB - [0:0]
-A KUBE-SEP-CC3HXZSKU6BR4DDB -s 10.6.214.21/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 10.6.214.21:6443
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.6.214.21:6443" -j KUBE-SEP-CC3HXZSKU6BR4DDB

I tried to create a new service, but the iptables rules generated by kube-proxy have the same problem.

[root@controller-node-1 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.233.0.1      <none>        443/TCP   3h18m
my-dep       ClusterIP   10.233.60.220   <none>        80/TCP    26m
my-svc       ClusterIP   10.233.48.3     <none>        80/TCP    26m
[root@worker1 ~]# iptables-save -t nat | grep  10.233.60.220
-A KUBE-SERVICES -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-SVC-YIDRKHK4K7YFNT5I
-A KUBE-SVC-YIDRKHK4K7YFNT5I ! -s 10.233.64.0/18 -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
[root@worker1 ~]# iptables-save -t nat | grep   KUBE-SVC-YIDRKHK4K7YFNT5I
:KUBE-SVC-YIDRKHK4K7YFNT5I - [0:0]
-A KUBE-SERVICES -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-SVC-YIDRKHK4K7YFNT5I
-A KUBE-SVC-YIDRKHK4K7YFNT5I ! -s 10.233.64.0/18 -d 10.233.60.220/32 -p tcp -m comment --comment "default/my-dep cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
-A KUBE-SVC-YIDRKHK4K7YFNT5I -m comment --comment "default/my-dep -> 10.233.74.77:80" -j KUBE-SEP-CKROCXU3WMRQYCUN
[root@worker1 ~]# iptables-save -t nat | grep   KUBE-SEP-CKROCXU3WMRQYCUN
:KUBE-SEP-CKROCXU3WMRQYCUN - [0:0]
-A KUBE-SEP-CKROCXU3WMRQYCUN -s 10.233.74.77/32 -m comment --comment "default/my-dep" -j KUBE-MARK-MASQ
-A KUBE-SEP-CKROCXU3WMRQYCUN -p tcp -m comment --comment "default/my-dep" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination
-A KUBE-SVC-YIDRKHK4K7YFNT5I -m comment --comment "default/my-dep -> 10.233.74.77:80" -j KUBE-SEP-CKROCXU3WMRQYCUN
[root@worker1 ~]# curl 10.233.60.220

Anything else we need to know?

No response

Kubernetes version

[root@controller-node-1 ~]# kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:57:26Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:49:09Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

OS version

[root@worker1 ~]# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

[root@controller-node-1 ~]# uname -a
Linux controller-node-1 5.19.10-1.el7.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep 17 11:34:40 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

[root@worker1 ~]# uname -a
Linux worker1 5.4.197-1.el7.elrepo.x86_64 #1 SMP Sat Jun 4 08:43:19 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux


Install tools

kubespray

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@cyclinder cyclinder added the kind/bug Categorizes issue or PR as related to a bug. label Dec 16, 2022
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 16, 2022
@k8s-ci-robot
Contributor

@cyclinder: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cyclinder
Contributor Author

/sig network

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 16, 2022
@aojea
Member

aojea commented Dec 16, 2022

-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent

how is this generated?

/priority important-soon?

@k8s-ci-robot
Contributor

@aojea: The label(s) priority/important-soon? cannot be applied, because the repository doesn't have them.

In response to this:

-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent

how is this generated?

/priority important-soon?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aojea
Member

aojea commented Dec 16, 2022

/cc @danwinship

@aojea
Member

aojea commented Dec 16, 2022

are we setting the --persistent flag somewhere?

src/kubernetes$ grep -r '\-\-persistent' .
./third_party/multiarch/qemu-user-static/register/qemu-binfmt-conf.sh:                           [--persistent yes|no][--qemu-suffix SUFFIX]
./third_party/multiarch/qemu-user-static/register/qemu-binfmt-conf.sh:       --persistent:  if yes, the interpreter is loaded when binfmt is
./third_party/multiarch/qemu-user-static/register/qemu-binfmt-conf.sh:    -p|--persistent)

@cyclinder
Contributor Author

-A KUBE-SEP-CC3HXZSKU6BR4DDB -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0 --persistent

how is this generated?

/priority important-soon?

I'm not sure; this rule looks rather strange, and it only appears on the worker nodes. The controller node works fine.

@cyclinder
Contributor Author

are we setting the --persistent flag somewhere?

src/kubernetes$ grep -r '\-\-persistent' .
./third_party/multiarch/qemu-user-static/register/qemu-binfmt-conf.sh:                           [--persistent yes|no][--qemu-suffix SUFFIX]
./third_party/multiarch/qemu-user-static/register/qemu-binfmt-conf.sh:       --persistent:  if yes, the interpreter is loaded when binfmt is
./third_party/multiarch/qemu-user-static/register/qemu-binfmt-conf.sh:    -p|--persistent)

I didn't find any trace of the '--persistent' flag being set in the code, and it seems iptables doesn't support this flag (per man iptables).

@danwinship
Contributor

Most of the iptables docs are in the iptables-extensions man page, not iptables. --persistent is there, but... we definitely do not set that.

What other software is running on the worker nodes in this cluster? This looks like the bug where kube-router was accidentally messing up kube-proxy's iptables rules due to iptables binary version mismatches.
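
A minimal way to check that hypothesis, assuming kubectl access and using a kube-proxy pod name from this cluster as an example, is to compare the iptables binary on the host with the one inside the container:

# On the worker node (host userspace):
iptables --version
rpm -q iptables
# Inside the kube-proxy pod scheduled on that node:
kubectl -n kube-system exec kube-proxy-5wxxk -- iptables --version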

@cyclinder
Contributor Author

Most of the iptables docs are in the iptables-extensions man page, not iptables. --persistent is there, but... we definitely do not set that.

Oh thanks.

What other software is running on the worker nodes in this cluster? This looks like the bug where kube-router was accidentally messing up kube-proxy's iptables rules due to iptables binary version mismatches.

It looks like that could be the reason, but I haven't found any other such software on the worker nodes.

[root@controller-node-1 ~]# kubectl get po -A -o wide
NAMESPACE          NAME                                                         READY   STATUS                  RESTARTS         AGE   IP             NODE                NOMINATED NODE   READINESS GATES
calico-apiserver   calico-apiserver-795c84d468-5md88                            1/1     Running                 0                24h   10.233.74.66   controller-node-1   <none>           <none>
insight-system     helm-operation-install-insight-agent-bqzlm-544xd             0/1     Completed               0                23h   10.233.74.68   controller-node-1   <none>           <none>
insight-system     insight-agent-fluent-bit-k2q4n                               0/1     CrashLoopBackOff        409 (5m4s ago)   23h   10.233.74.69   controller-node-1   <none>           <none>
insight-system     insight-agent-fluent-bit-lw7ht                               0/1     ContainerCreating       0                23h   <none>         worker1             <none>           <none>
insight-system     insight-agent-kube-prometh-operator-76cb6f476-c2vlr          1/1     Running                 0                23h   10.233.74.73   controller-node-1   <none>           <none>
insight-system     insight-agent-kube-state-metrics-79d9bbc59f-9h92q            1/1     Running                 0                23h   10.233.74.71   controller-node-1   <none>           <none>
insight-system     insight-agent-kubernetes-event-exporter-566c5cb9f-6dwwm      1/1     Running                 0                23h   10.233.74.70   controller-node-1   <none>           <none>
insight-system     insight-agent-opentelemetry-collector-699544f5dc-59dqt       1/1     Running                 0                23h   10.233.74.74   controller-node-1   <none>           <none>
insight-system     insight-agent-prometheus-node-exporter-2xhbx                 1/1     Running                 0                23h   10.6.214.12    controller-node-1   <none>           <none>
insight-system     insight-agent-prometheus-node-exporter-d472z                 1/1     Running                 0                23h   10.6.214.13    worker1             <none>           <none>
insight-system     insight-agent-tailing-sidecar-operator-675bdcb85b-dgdkn      2/2     Running                 0                23h   10.233.74.72   controller-node-1   <none>           <none>
insight-system     opentelemetry-operator-controller-manager-7c689fd544-pm4ts   2/2     Running                 0                23h   10.233.74.75   controller-node-1   <none>           <none>
insight-system     prometheus-insight-agent-kube-prometh-prometheus-0           2/2     Running                 0                23h   10.233.74.76   controller-node-1   <none>           <none>
kube-system        calico-kube-controllers-56758dc98b-hns7j                     1/1     Running                 0                24h   10.6.214.12    controller-node-1   <none>           <none>
kube-system        calico-node-cbv95                                            1/1     Running                 0                23h   10.6.214.12    controller-node-1   <none>           <none>
kube-system        calico-node-gbqhk                                            0/1     Init:CrashLoopBackOff   275 (5m6s ago)   23h   10.6.214.13    worker1             <none>           <none>
kube-system        coredns-7dc66c6d4-p562b                                      1/1     Running                 0                24h   10.233.74.67   controller-node-1   <none>           <none>
kube-system        dns-autoscaler-77db77644c-s5nxx                              1/1     Running                 0                24h   10.233.74.64   controller-node-1   <none>           <none>
kube-system        etcd-controller-node-1                                       1/1     Running                 0                24h   10.6.214.12    controller-node-1   <none>           <none>
kube-system        kube-apiserver-controller-node-1                             1/1     Running                 0                24h   10.6.214.12    controller-node-1   <none>           <none>
kube-system        kube-controller-manager-controller-node-1                    1/1     Running                 2 (24h ago)      24h   10.6.214.12    controller-node-1   <none>           <none>
kube-system        kube-proxy-5wxxk                                             1/1     Running                 0                22h   10.6.214.13    worker1             <none>           <none>
kube-system        kube-proxy-tk4cl                                             1/1     Running                 0                22h   10.6.214.12    controller-node-1   <none>           <none>
kube-system        kube-scheduler-controller-node-1                             1/1     Running                 1                24h   10.6.214.12    controller-node-1   <none>           <none>
kube-system        nginx-proxy-worker1                                          1/1     Running                 0                23h   10.6.214.13    worker1             <none>           <none>

ps -ef:

[root@worker1 ~]# ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 12月08 ?      00:01:25 /usr/lib/systemd/systemd --switched-root --system --deserialize 22
root           2       0  0 12月08 ?      00:00:00 [kthreadd]
root           3       2  0 12月08 ?      00:00:00 [rcu_gp]
root           4       2  0 12月08 ?      00:00:00 [rcu_par_gp]
root           6       2  0 12月08 ?      00:00:00 [kworker/0:0H-kb]
root           8       2  0 12月08 ?      00:00:00 [mm_percpu_wq]
root           9       2  0 12月08 ?      00:03:23 [ksoftirqd/0]
root          10       2  0 12月08 ?      00:24:48 [rcu_sched]
root          11       2  0 12月08 ?      00:00:03 [migration/0]
root          13       2  0 12月08 ?      00:00:00 [cpuhp/0]
root          14       2  0 12月08 ?      00:00:00 [cpuhp/1]
root          15       2  0 12月08 ?      00:00:03 [migration/1]
root          16       2  0 12月08 ?      00:02:00 [ksoftirqd/1]
root          18       2  0 12月08 ?      00:00:00 [kworker/1:0H-kb]
root          19       2  0 12月08 ?      00:00:00 [cpuhp/2]
root          20       2  0 12月08 ?      00:00:03 [migration/2]
root          21       2  0 12月08 ?      00:02:00 [ksoftirqd/2]
root          23       2  0 12月08 ?      00:00:00 [kworker/2:0H-kb]
root          24       2  0 12月08 ?      00:00:00 [cpuhp/3]
root          25       2  0 12月08 ?      00:00:03 [migration/3]
root          26       2  0 12月08 ?      00:01:56 [ksoftirqd/3]
root          28       2  0 12月08 ?      00:00:00 [kworker/3:0H-kb]
root          29       2  0 12月08 ?      00:00:00 [kdevtmpfs]
root          30       2  0 12月08 ?      00:00:00 [netns]
root          31       2  0 12月08 ?      00:00:01 [kauditd]
root          32       2  0 12月08 ?      00:00:01 [khungtaskd]
root          33       2  0 12月08 ?      00:00:00 [oom_reaper]
root          34       2  0 12月08 ?      00:00:00 [writeback]
root          35       2  0 12月08 ?      00:00:00 [kcompactd0]
root          36       2  0 12月08 ?      00:00:00 [ksmd]
root          37       2  0 12月08 ?      00:00:06 [khugepaged]
root          89       2  0 12月08 ?      00:00:00 [kintegrityd]
root          90       2  0 12月08 ?      00:00:00 [kblockd]
root          91       2  0 12月08 ?      00:00:00 [blkcg_punt_bio]
root          93       2  0 12月08 ?      00:00:00 [tpm_dev_wq]
root          94       2  0 12月08 ?      00:00:00 [md]
root          95       2  0 12月08 ?      00:00:00 [edac-poller]
root          96       2  0 12月08 ?      00:00:00 [devfreq_wq]
root          97       2  0 12月08 ?      00:00:00 [watchdogd]
root          98       2  0 12月08 ?      00:00:01 [kswapd0]
root         100       2  0 12月08 ?      00:00:00 [kthrotld]
root         101       2  0 12月08 ?      00:00:00 [irq/24-pciehp]
root         102       2  0 12月08 ?      00:00:00 [irq/25-pciehp]
root         103       2  0 12月08 ?      00:00:00 [irq/26-pciehp]
root         104       2  0 12月08 ?      00:00:00 [irq/27-pciehp]
root         105       2  0 12月08 ?      00:00:00 [irq/28-pciehp]
root         106       2  0 12月08 ?      00:00:00 [irq/29-pciehp]
root         107       2  0 12月08 ?      00:00:00 [irq/30-pciehp]
root         108       2  0 12月08 ?      00:00:00 [irq/31-pciehp]
root         109       2  0 12月08 ?      00:00:00 [irq/32-pciehp]
root         110       2  0 12月08 ?      00:00:00 [irq/33-pciehp]
root         111       2  0 12月08 ?      00:00:00 [irq/34-pciehp]
root         112       2  0 12月08 ?      00:00:00 [irq/35-pciehp]
root         113       2  0 12月08 ?      00:00:00 [irq/36-pciehp]
root         114       2  0 12月08 ?      00:00:00 [irq/37-pciehp]
root         115       2  0 12月08 ?      00:00:00 [irq/38-pciehp]
root         116       2  0 12月08 ?      00:00:00 [irq/39-pciehp]
root         117       2  0 12月08 ?      00:00:00 [irq/40-pciehp]
root         118       2  0 12月08 ?      00:00:00 [irq/41-pciehp]
root         119       2  0 12月08 ?      00:00:00 [irq/42-pciehp]
root         120       2  0 12月08 ?      00:00:00 [irq/43-pciehp]
root         121       2  0 12月08 ?      00:00:00 [irq/44-pciehp]
root         122       2  0 12月08 ?      00:00:00 [irq/45-pciehp]
root         123       2  0 12月08 ?      00:00:00 [irq/46-pciehp]
root         124       2  0 12月08 ?      00:00:00 [irq/47-pciehp]
root         125       2  0 12月08 ?      00:00:00 [irq/48-pciehp]
root         126       2  0 12月08 ?      00:00:00 [irq/49-pciehp]
root         127       2  0 12月08 ?      00:00:00 [irq/50-pciehp]
root         128       2  0 12月08 ?      00:00:00 [irq/51-pciehp]
root         129       2  0 12月08 ?      00:00:00 [irq/52-pciehp]
root         130       2  0 12月08 ?      00:00:00 [irq/53-pciehp]
root         131       2  0 12月08 ?      00:00:00 [irq/54-pciehp]
root         132       2  0 12月08 ?      00:00:00 [irq/55-pciehp]
root         134       2  0 12月08 ?      00:00:00 [acpi_thermal_pm]
root         135       2  0 12月08 ?      00:00:00 [kmpath_rdacd]
root         136       2  0 12月08 ?      00:00:00 [kaluad]
root         138       2  0 12月08 ?      00:00:00 [ipv6_addrconf]
root         139       2  0 12月08 ?      00:00:00 [kstrp]
root         150       2  0 12月08 ?      00:00:00 [charger_manager]
root         821       2  0 12月08 ?      00:00:00 [ata_sff]
root         828       2  0 12月08 ?      00:00:00 [scsi_eh_0]
root         837       2  0 12月08 ?      00:00:00 [scsi_tmf_0]
root         843       2  0 12月08 ?      00:00:00 [scsi_eh_1]
root         848       2  0 12月08 ?      00:00:00 [scsi_tmf_1]
root         854       2  0 12月08 ?      00:01:24 [irq/16-vmwgfx]
root         858       2  0 12月08 ?      00:00:00 [ttm_swap]
root        1768       2  0 12月08 ?      00:00:00 [scsi_eh_2]
root        1772       2  0 12月08 ?      00:00:00 [scsi_tmf_2]
root        1777       2  0 12月08 ?      00:00:00 [vmw_pvscsi_wq_2]
root        2597       2  0 12月08 ?      00:02:43 [kworker/1:1H-kb]
root        2630       2  0 12月08 ?      00:02:42 [kworker/2:1H-kb]
root        2638       2  0 12月08 ?      00:02:36 [kworker/0:1H-kb]
root        2639       2  0 12月08 ?      00:02:38 [kworker/3:1H-kb]
root        2715       2  0 12月08 ?      00:00:00 [kdmflush]
root        2728       2  0 12月08 ?      00:00:00 [kdmflush]
root        2752       2  0 12月08 ?      00:00:00 [xfsalloc]
root        2754       2  0 12月08 ?      00:00:00 [xfs_mru_cache]
root        2761       2  0 12月08 ?      00:00:00 [xfs-buf/dm-0]
root        2764       2  0 12月08 ?      00:00:00 [xfs-conv/dm-0]
root        2774       2  0 12月08 ?      00:00:00 [xfs-cil/dm-0]
root        2776       2  0 12月08 ?      00:00:00 [xfs-reclaim/dm-]
root        2777       2  0 12月08 ?      00:00:00 [xfs-eofblocks/d]
root        2779       2  0 12月08 ?      00:00:00 [xfs-log/dm-0]
root        2780       2  0 12月08 ?      00:06:38 [xfsaild/dm-0]
root        2850       1  0 12月08 ?      00:31:43 /usr/lib/systemd/systemd-journald
root        2879       1  0 12月08 ?      00:00:00 /usr/sbin/lvmetad -f
root        2891       1  0 12月08 ?      00:00:01 /usr/lib/systemd/systemd-udevd
root        5208       2  0 12月08 ?      00:00:00 [cryptd]
root        5215       2  0 12月08 ?      00:00:00 [xfs-buf/sda1]
root        5216       2  0 12月08 ?      00:00:00 [xfs-conv/sda1]
root        5217       2  0 12月08 ?      00:00:00 [xfs-cil/sda1]
root        5218       2  0 12月08 ?      00:00:00 [xfs-reclaim/sda]
root        5219       2  0 12月08 ?      00:00:00 [xfs-eofblocks/s]
root        5220       2  0 12月08 ?      00:00:00 [xfs-log/sda1]
root        5221       2  0 12月08 ?      00:00:00 [xfsaild/sda1]
root        5280       2  0 12月08 ?      00:00:00 [nfit]
root        5307       1  0 12月08 ?      00:00:03 /sbin/auditd
root        5330       1  0 12月08 ?      00:00:37 /usr/sbin/irqbalance --foreground
root        5331       1  0 12月08 ?      00:00:03 /usr/lib/systemd/systemd-logind
polkitd     5333       1  0 12月08 ?      00:00:01 /usr/lib/polkit-1/polkitd --no-debug
dbus        5335       1  0 12月08 ?      00:00:47 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root        5396       1  0 12月08 ?      00:00:01 /usr/sbin/crond -n
root        5397       1  0 12月08 tty1   00:00:00 /sbin/agetty --noclear tty1 linux
chrony      5400       1  0 12月08 ?      00:00:00 /usr/sbin/chronyd
root        5647       1  0 12月08 ?      00:01:51 /usr/bin/python2 -Es /usr/sbin/tuned -l -P
root        5648       1  0 12月08 ?      00:00:00 /usr/sbin/sshd -D
root        5651       1  0 12月08 ?      00:08:54 /usr/sbin/rsyslogd -n
root        5744       1  0 12月08 ?      00:00:03 /usr/libexec/postfix/master -w
postfix     5746    5744  0 12月08 ?      00:00:00 qmgr -l -t unix -u
root      134171       1  0 12月16 ?      00:00:04 /usr/sbin/NetworkManager --no-daemon
root      136263       1  0 12月16 ?      00:07:31 /usr/local/bin/containerd
root      138633       1  3 12月16 ?      00:52:35 /usr/local/bin/kubelet --logtostderr=true --v=2 --node-ip=10.6.214.13 --hostname-override=worker1 --bootstrap-kubeconfig=/etc/
root      139554       1  0 12月16 ?      00:00:56 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id 381a54655df0ada752668f900273b9c0f2421906b20aa523d21484f07be63c57
root      139566       1  0 12月16 ?      00:00:52 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id 26e1744719d65b0bee1fc35b4b79648ca30f50c72d04f82fca41fb4a3e95b01c
65535     139626  139554  0 12月16 ?      00:00:00 /pause
65534     139707  139566  0 12月16 ?      00:00:00 /pause
root      140981  139554  0 12月16 ?      00:00:00 nginx: master process nginx -g daemon off;
101       141010  140981  0 12月16 ?      00:00:02 nginx: worker process
101       141011  140981  0 12月16 ?      00:00:00 nginx: worker process
65534     148596  139566  0 12月16 ?      00:03:10 /bin/node_exporter --path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/host/root --web.listen-address=[0.0.0.0]:910
root      163586       1  0 12月16 ?      00:00:22 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id 4aca6906ca399dadee9b30f908af340baed46426fffd81717f35fb32fbba9348
65535     163609  163586  0 12月16 ?      00:00:00 /pause
root      302305       1  0 12月16 ?      00:00:49 /usr/local/bin/containerd-shim-runc-v2 -namespace k8s.io -id b74a8dc4fd6a848dc361831e58a06907d40d5552b8bdcec570960965dcc2c1de
65535     302330  302305  0 12月16 ?      00:00:00 /pause
root      302364  302305  0 12月16 ?      00:00:22 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=worker1
root     3013276       2  0 03:38 ?        00:00:02 [kworker/0:2-eve]
root     3034959       2  0 03:48 ?        00:00:00 [kworker/2:6-eve]
postfix  3042894    5744  0 03:52 ?        00:00:00 pickup -l -t unix -u
root     3045515       2  0 03:53 ?        00:00:00 [kworker/3:0-eve]
root     3045524       2  0 03:53 ?        00:00:00 [kworker/0:3-cgr]
root     3059390       2  0 03:59 ?        00:00:00 [kworker/u8:1-ev]
root     3061713       2  0 04:01 ?        00:00:00 [kworker/3:2-eve]
root     3067363       2  0 04:03 ?        00:00:00 [kworker/1:4-rcu]
root     3067364       2  0 04:03 ?        00:00:00 [kworker/1:5-eve]
root     3078161       2  0 04:08 ?        00:00:00 [kworker/u8:2-ev]
root     3079406       2  0 04:09 ?        00:00:00 [kworker/2:1-ata]
root     3088971       2  0 04:13 ?        00:00:00 [kworker/3:1-cgr]
root     3090974       2  0 04:14 ?        00:00:00 [kworker/2:0-ata]
root     3096203    5648  0 04:17 ?        00:00:00 sshd: root@pts/0
root     3096458 3096203  0 04:17 pts/0    00:00:00 -bash
root     3099017       2  0 04:18 ?        00:00:00 [kworker/u8:0-ev]
root     3099886       2  0 04:19 ?        00:00:00 [kworker/1:0-eve]
root     3099887       2  0 04:19 ?        00:00:00 [kworker/u8:3-fl]
root     3099895       2  0 04:19 ?        00:00:00 [kworker/2:2-eve]
root     3099901       2  0 04:19 ?        00:00:00 [kworker/3:3-eve]
root     3099902       2  0 04:19 ?        00:00:00 [kworker/3:4-eve]
root     3099903       2  0 04:19 ?        00:00:00 [kworker/3:5-eve]
root     3099904       2  0 04:19 ?        00:00:00 [kworker/3:6]
root     3105885 3096458  0 04:21 pts/0    00:00:00 ps -ef

No special software here 🤔

@aojea
Member

aojea commented Dec 18, 2022

you should check with Calico folks

@cyclinder
Contributor Author

@aojea In fact, it was because Calico failed to start on the worker nodes that I discovered this problem. The calico-node logs show a failed connection to "10.233.0.1".

[root@controller-node-1 ~]# kubectl logs -f -n kube-system calico-node-gbqhk -c install-cni
time="2022-12-19T05:27:40Z" level=info msg="Running as a Kubernetes pod" source="install.go:140"
2022-12-19 05:27:41.820 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/bandwidth"
2022-12-19 05:27:41.820 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/bandwidth
2022-12-19 05:27:41.901 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/calico"
2022-12-19 05:27:41.901 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/calico
2022-12-19 05:27:41.994 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/calico-ipam"
2022-12-19 05:27:41.994 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/calico-ipam
2022-12-19 05:27:41.998 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/flannel"
2022-12-19 05:27:41.998 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/flannel
2022-12-19 05:27:42.003 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/host-local"
2022-12-19 05:27:42.004 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/host-local
2022-12-19 05:27:42.073 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/install"
2022-12-19 05:27:42.073 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/install
2022-12-19 05:27:42.078 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/loopback"
2022-12-19 05:27:42.078 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/loopback
2022-12-19 05:27:42.083 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/portmap"
2022-12-19 05:27:42.083 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/portmap
2022-12-19 05:27:42.088 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/tuning"
2022-12-19 05:27:42.088 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/tuning
2022-12-19 05:27:42.088 [INFO][1] cni-installer/<nil> <nil>: Wrote Calico CNI binaries to /host/opt/cni/bin

2022-12-19 05:27:42.124 [INFO][1] cni-installer/<nil> <nil>: CNI plugin version: v3.23.3

2022-12-19 05:27:42.124 [INFO][1] cni-installer/<nil> <nil>: /host/secondary-bin-dir is not writeable, skipping
W1219 05:27:42.124423       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2022-12-19 05:27:42.126 [ERROR][1] cni-installer/<nil> <nil>: Unable to create token for CNI kubeconfig error=Post "https://10.233.0.1:443/api/v1/namespaces/kube-system/serviceaccounts/calico-node/token": dial tcp 10.233.0.1:443: connect: connection refused
2022-12-19 05:27:42.126 [FATAL][1] cni-installer/<nil> <nil>: Unable to create token for CNI kubeconfig error=Post "https://10.233.0.1:443/api/v1/namespaces/kube-system/serviceaccounts/calico-node/token": dial tcp 10.233.0.1:443: connect: connection refused

@aojea
Member

aojea commented Dec 19, 2022

From the paste there seem to be only two components that fiddle with iptables: kube-proxy and Calico. We have practically ruled out kube-proxy, so now we have to rule out Calico... maybe @caseydavenport can help us here

@cyclinder
Contributor Author

cyclinder commented Dec 19, 2022

I think it has nothing to do with Calico. I stopped Calico and restarted kube-proxy, with the same result. I suspect the problem is caused by an incompatibility between the iptables binary and the kernel version.

@aojea
Member

aojea commented Dec 19, 2022

well, there seem to be two players touching the iptables rules, plus this strange --persistent flag

I can't see how kube-proxy could append the flag, and kube-proxy uses its own iptables binary when it runs containerized

The last report of binary incompatibility mentioned by DanWinship was because of kube-router (#112477 (comment))

I will try to find where this --persistent flag is coming from and start from there

@cyclinder
Contributor Author

cyclinder commented Dec 19, 2022

I seem to have found the cause of the problem. I set kube-proxy's log level to -v=9, extracted the iptables rules from its logs, and then manually ran iptables-restore on the worker node:

[root@worker1 ~]# iptables-save -t nat | grep KUBE-SEP-DASZGVVO7T2FMT4N
:KUBE-SEP-DASZGVVO7T2FMT4N - [0:0]
-A KUBE-SEP-DASZGVVO7T2FMT4N -s 10.6.214.12/32 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-DASZGVVO7T2FMT4N -p tcp -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0:0
-A KUBE-SVC-OEYFLNX2UKKSAKI7 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics -> 10.6.214.12:9100" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-DASZGVVO7T2FMT4N
[root@worker1 ~]# iptables-restore < data
[root@worker1 ~]# iptables-save -t nat | grep KUBE-SEP-DASZGVVO7T2FMT4N
:KUBE-SEP-DASZGVVO7T2FMT4N - [0:0]
-A KUBE-SEP-DASZGVVO7T2FMT4N -s 10.6.214.12/32 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-DASZGVVO7T2FMT4N -p tcp -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -m tcp -j DNAT --to-destination 10.6.214.12:9100
-A KUBE-SVC-OEYFLNX2UKKSAKI7 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics -> 10.6.214.12:9100" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-DASZGVVO7T2FMT4N

I found that the iptables versions on the node and in the kube-proxy container are not the same, and that the iptables rules work fine in the kube-proxy container.

[root@controller-node-1 ~]# kubectl get po -n kube-system  -o wide
NAME                                        READY   STATUS    RESTARTS       AGE     IP             NODE                NOMINATED NODE   READINESS GATES
calico-kube-controllers-56758dc98b-hns7j    1/1     Running   0              3d4h    10.6.214.12    controller-node-1   <none>           <none>
coredns-7dc66c6d4-p562b                     1/1     Running   0              3d4h    10.233.74.67   controller-node-1   <none>           <none>
dns-autoscaler-77db77644c-s5nxx             1/1     Running   0              3d4h    10.233.74.64   controller-node-1   <none>           <none>
etcd-controller-node-1                      1/1     Running   0              3d4h    10.6.214.12    controller-node-1   <none>           <none>
kube-apiserver-controller-node-1            1/1     Running   0              3d4h    10.6.214.12    controller-node-1   <none>           <none>
kube-controller-manager-controller-node-1   1/1     Running   2 (3d4h ago)   3d4h    10.6.214.12    controller-node-1   <none>           <none>
kube-proxy-dnmnk                            1/1     Running   0              3m4s    10.6.214.12    controller-node-1   <none>           <none>
kube-proxy-z85q9                            1/1     Running   0              3m4s    10.6.214.13    worker1             <none>           <none>
kube-scheduler-controller-node-1            1/1     Running   1              3d4h    10.6.214.12    controller-node-1   <none>           <none>
nginx-proxy-worker1                         1/1     Running   0              4h29m   10.6.214.13    worker1             <none>           <none>
[root@controller-node-1 ~]# kubectl exec -it kube-proxy-dnmnk -n kube-system sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
# iptables --version
iptables v1.8.7 (legacy)
# iptables-save -t nat | grep KUBE-SEP-DASZGVVO7T2FMT4N
:KUBE-SEP-DASZGVVO7T2FMT4N - [0:0]
-A KUBE-SEP-DASZGVVO7T2FMT4N -s 10.6.214.12/32 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-DASZGVVO7T2FMT4N -p tcp -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -m tcp -j DNAT --to-destination 10.6.214.12:9100
-A KUBE-SVC-OEYFLNX2UKKSAKI7 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics -> 10.6.214.12:9100" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-DASZGVVO7T2FMT4N
# exit
[root@controller-node-1 ~]# iptables --version
iptables v1.4.21
[root@controller-node-1 ~]# iptables-save -t nat | grep KUBE-SEP-DASZGVVO7T2FMT4N
:KUBE-SEP-DASZGVVO7T2FMT4N - [0:0]
-A KUBE-SEP-DASZGVVO7T2FMT4N -s 10.6.214.12/32 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-DASZGVVO7T2FMT4N -p tcp -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics" -m tcp -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0:0
-A KUBE-SVC-OEYFLNX2UKKSAKI7 -m comment --comment "insight-system/insight-agent-prometheus-node-exporter:http-metrics -> 10.6.214.12:9100" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-DASZGVVO7T2FMT4N

So this looks like a mismatch between iptables binary versions, which is causing the iptables-save output on the host to be abnormal.

@aojea
Member

aojea commented Dec 19, 2022

iptables v1.8.7 (legacy)

iptables v1.4.21

:/

@danwinship
Contributor

danwinship commented Dec 20, 2022

Yay RHEL 7. Oof.
But the iptables-legacy code really isn't changing that much, so this is pretty surprising.

It looks like the RHEL 7 iptables package hasn't changed since May 2020, so it's presumably not a change on that side. The kernel is still getting security fixes, but it doesn't look like there's anything iptables-related there in a long time either.

Can you check the iptables and kernel RPM versions on the host just to confirm?

And can you confirm that if you add a new rule from the container, it shows up mangled in the host? eg

kubectl exec -it kube-proxy-dnmnk -n kube-system -- iptables -A PREROUTING -d 1.2.3.4 -m comment --comment "foo" -j DNAT --to-destination 10.6.214.12:9100
iptables-save|grep 1.2.3.4

@cyclinder
Contributor Author

Can you check the iptables and kernel RPM versions on the host just to confirm?

[root@controller-node-1 ~]# rpm -qa | grep iptables
iptables-1.4.21-35.el7.x86_64
[root@controller-node-1 ~]# rpm -qa | grep kernel
kernel-tools-3.10.0-957.el7.x86_64
kernel-tools-libs-3.10.0-957.el7.x86_64
kernel-ml-5.19.10-1.el7.elrepo.x86_64
kernel-3.10.0-957.el7.x86_64

And can you confirm that if you add a new rule from the container, it shows up mangled in the host? eg

[root@controller-node-1 ~]# kubectl exec -it kube-proxy-dnmnk -n kube-system sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
# iptables -A PREROUTING -t nat -d 1.2.3.4 -p tcp --dport 12345 -m comment --comment "foo" -j DNAT --to-destination 10.6.214.12:9100
# iptables-save -t nat | grep 1.2.3.4
-A PREROUTING -d 1.2.3.4/32 -p tcp -m tcp --dport 12345 -m comment --comment foo -j DNAT --to-destination 10.6.214.12:9100
[root@controller-node-1 ~]# iptables-save -t nat | grep 1.2.3.4
-A PREROUTING -d 1.2.3.4/32 -p tcp -m tcp --dport 12345 -m comment --comment foo -j DNAT --to-destination :0 --persistent --to-destination :0 --persistent --to-destination 0.0.0.0:0

It looks like the rule added from the container still shows up mangled on the node.

@danwinship
Contributor

So did this used to work? Did you previously have a cluster with modern-ish kubernetes running on a CentOS 7 base, and it worked, and then you changed something and it broke? (If so, what?) Or is this a brand new install and it failed right out of the box?

@cyclinder
Contributor Author

I'm not sure. The kernel was upgraded from the CentOS 7 base version, and then a new cluster was installed and I found that iptables misbehaved. Is this related to upgrading the kernel version?

@danwinship
Contributor

any idea what the previous kernel version was? (I can't remember... did yum have yum history? Or is that only with dnf?)

@cyclinder
Contributor Author

@danwinship Sorry for the delay, I've been busy with work these days...

The previous kernel version should be 3.10.

@danwinship
Contributor

I mean the exact version, so we can try to bisect whether some kernel patch introduced a regression. (You might be able to use yum history to see what you upgraded it from/to.)

@cyclinder
Contributor Author

cyclinder commented Dec 30, 2022

@danwinship It seems there is nothing special here; I'm not sure how the kernel was upgraded.

[root@controller-node-1 ~]# yum history
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
ID     | Command line             | Date and time    | Action         | Altered
-------------------------------------------------------------------------------
    97 | install tcpdump          | 2022-12-19 09:10 | Install        |    2
    96 | upgrade iptables         | 2022-12-19 04:56 | Update         |    1
    95 | -d 2 -y remove runc      | 2022-11-01 05:27 | Erase          |    3 EE
    94 | install -y -q docker-ce  | 2022-10-27 23:03 | Install        |    6 EE
    93 | -d 2 -y install containe | 2022-10-07 23:16 | Install        |    1
    92 | -d 2 -y remove container | 2022-10-07 22:47 | Erase          |    2
    91 | -d 2 -y remove docker-ce | 2022-10-07 22:47 | Erase          |    4
    90 | -d 2 -y --enablerepo=doc | 2022-09-30 01:49 | Install        |    5
    89 | -d 2 -y install containe | 2022-09-30 01:47 | Install        |    1
    88 | -d 2 -y remove container | 2022-09-30 01:36 | Erase          |    2
    87 | -d 2 -y remove docker-ce | 2022-09-30 01:36 | Erase          |    4
    86 | -d 2 -y --enablerepo=doc | 2022-09-29 22:21 | Install        |    5
    85 | -d 2 -y install containe | 2022-09-29 22:19 | Install        |    1
    84 | -d 2 -y remove container | 2022-09-28 08:05 | Erase          |    2 EE
    83 | -d 2 -y remove docker-ce | 2022-09-28 08:05 | Erase          |    4
    82 | -d 2 -y --enablerepo=doc | 2022-09-28 06:53 | Install        |    5 EE
    81 | -d 2 -y install containe | 2022-09-28 05:30 | Install        |    1
    80 | -d 2 -y remove container | 2022-09-28 04:39 | Erase          |    2 EE
    79 | -d 2 -y remove docker-ce | 2022-09-28 04:39 | Erase          |    4
    78 | -d 2 -y --enablerepo=doc | 2022-09-28 03:58 | Install        |    5 EE
history list
[root@controller-node-1 ~]# uname -a
Linux controller-node-1 5.19.10-1.el7.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep 17 11:34:40 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
[root@controller-node-1 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)

@danwinship
Contributor

The kernel folks suggest that this is purely an iptables-1.4 display problem; they believe the kernel has the correct representation of the rule (as seen by the fact that iptables 1.8 consistently shows it correctly), and it's just that iptables 1.4 isn't displaying it correctly.

So in that case, whatever bug you're hitting is somewhere else, and is unrelated to this particular iptables rule...
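
One way to sanity-check that, reusing the pod and chain names from earlier in this thread, is to watch the rule's packet counters with the v1.8 binary inside the kube-proxy pod while sending traffic to the cluster IP from the worker:

# Generate traffic through the Service cluster IP:
curl -k https://10.233.0.1/version
# Read the counters with the container's iptables v1.8 (kube-proxy shares the host network namespace):
kubectl -n kube-system exec kube-proxy-z85q9 -- iptables -t nat -L KUBE-SEP-CC3HXZSKU6BR4DDB -n -v
# If the pkts/bytes counters increase, the kernel is applying the DNAT correctly and only the
# host's iptables v1.4 rendering of the rule is misleading.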

@cyclinder
Contributor Author

cyclinder commented Jan 16, 2023

Thanks @danwinship for getting to the bottom of this.

I agree that the problem I'm hitting probably isn't related to this iptables rule; I just found it by accident. I'll close this issue now.

/close
