
kube proxy (1.21.2): unknown option --random-fully #1229

Closed
bovy89 opened this issue Jun 30, 2021 · 21 comments
@bovy89

bovy89 commented Jun 30, 2021

Environmental Info:

RKE2 Version:
rke2 version v1.21.2+rke2r1 (d58ad61)
go version go1.16.4b7

Node(s) CPU architecture, OS, and Version:

  • centos 8 (4.18.0-305.7.1.el8_4.x86_64)

Cluster Configuration:

  • CNI: cilium
  • kube-proxy ipvs mode

Describe the bug:

Failed to execute iptables-restore: exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option "--random-fully"

see full log below

Steps To Reproduce:

I was able to reproduce this error on a simple RKE2 setup (rpm setup with canal and kube-proxy iptables mode)

Additional context / logs:

kube-proxy ipvs mode logs:

W0630 13:25:34.806553       1 proxier.go:663] Failed to load kernel module nf_conntrack_ipv4 with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
I0630 13:25:34.872314       1 node.go:172] Successfully retrieved node IP: 192.168.20.24
I0630 13:25:34.872350       1 server_others.go:140] Detected node IP 192.168.20.24
I0630 13:25:34.963115       1 server_others.go:206] kube-proxy running in dual-stack mode, IPv4-primary
I0630 13:25:34.963157       1 server_others.go:274] Using ipvs Proxier.
I0630 13:25:34.963168       1 server_others.go:276] creating dualStackProxier for ipvs.
W0630 13:25:34.963178       1 server_others.go:512] detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6
W0630 13:25:34.963751       1 proxier.go:445] IPVS scheduler not specified, use rr by default
W0630 13:25:34.964571       1 proxier.go:445] IPVS scheduler not specified, use rr by default
W0630 13:25:34.964596       1 ipset.go:113] ipset name truncated; [KUBE-6-LOAD-BALANCER-SOURCE-CIDR] -> [KUBE-6-LOAD-BALANCER-SOURCE-CID]
W0630 13:25:34.964606       1 ipset.go:113] ipset name truncated; [KUBE-6-NODE-PORT-LOCAL-SCTP-HASH] -> [KUBE-6-NODE-PORT-LOCAL-SCTP-HAS]
I0630 13:25:34.965170       1 server.go:643] Version: v1.21.1
I0630 13:25:34.978475       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0630 13:25:34.984963       1 conntrack.go:83] Setting conntrack hashsize to 32768
I0630 13:25:34.995400       1 config.go:315] Starting service config controller
I0630 13:25:34.995424       1 shared_informer.go:240] Waiting for caches to sync for service config
I0630 13:25:34.995542       1 config.go:224] Starting endpoint slice config controller
I0630 13:25:34.995554       1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
W0630 13:25:35.007884       1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
W0630 13:25:35.012864       1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
I0630 13:25:35.096275       1 shared_informer.go:247] Caches are synced for endpoint slice config
I0630 13:25:35.096466       1 shared_informer.go:247] Caches are synced for service config
E0630 13:25:35.348408       1 proxier.go:1629] Failed to execute iptables-restore: exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option "--random-fully"
Error occurred at line: 13
Try `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.
)
Rules:
*nat
:KUBE-SERVICES - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-NODE-PORT - [0:0]
:KUBE-LOAD-BALANCER - [0:0]
:KUBE-MARK-MASQ - [0:0]
-A KUBE-SERVICES -m addrtype --dst-type LOCAL -j KUBE-NODE-PORT
-A KUBE-LOAD-BALANCER -j KUBE-MARK-MASQ
-A KUBE-FIREWALL -j KUBE-MARK-DROP
-A KUBE-POSTROUTING -m mark ! --mark 0x00004000/0x00004000 -j RETURN
-A KUBE-POSTROUTING -j MARK --xor-mark 0x00004000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully
-A KUBE-MARK-MASQ -j MARK --or-mark 0x00004000
COMMIT
*filter
:KUBE-FORWARD - [0:0]
:KUBE-NODE-PORT - [0:0]
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x00004000/0x00004000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-NODE-PORT -m comment --comment "Kubernetes health check node port" -m set --match-set KUBE-6-HEALTH-CHECK-NODE-PORT dst -j ACCEPT
COMMIT

kube-proxy iptables mode logs:

W0630 19:35:36.546774       1 proxier.go:663] Failed to load kernel module nf_conntrack_ipv4 with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
I0630 19:35:36.565364       1 node.go:172] Successfully retrieved node IP: 192.168.18.19
I0630 19:35:36.565378       1 server_others.go:140] Detected node IP 192.168.18.19
I0630 19:35:36.593287       1 server_others.go:206] kube-proxy running in dual-stack mode, IPv4-primary
I0630 19:35:36.593305       1 server_others.go:212] Using iptables Proxier.
I0630 19:35:36.593313       1 server_others.go:219] creating dualStackProxier for iptables.
W0630 19:35:36.593320       1 server_others.go:512] detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6
I0630 19:35:36.593569       1 server.go:643] Version: v1.21.1
I0630 19:35:36.594742       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0630 19:35:36.595157       1 config.go:315] Starting service config controller
I0630 19:35:36.595164       1 shared_informer.go:240] Waiting for caches to sync for service config
I0630 19:35:36.595174       1 config.go:224] Starting endpoint slice config controller
I0630 19:35:36.595177       1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
W0630 19:35:36.598661       1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
W0630 19:35:36.602192       1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
I0630 19:35:36.695809       1 shared_informer.go:247] Caches are synced for endpoint slice config
I0630 19:35:36.695860       1 shared_informer.go:247] Caches are synced for service config
E0630 19:35:36.762616       1 proxier.go:1612] "Failed to execute iptables-restore" err="exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option \"--random-fully\"\nError occurred at line: 18\nTry `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.\n)"
I0630 19:35:36.762682       1 proxier.go:859] "Sync failed" retryingTime="30s"
E0630 19:36:06.785222       1 proxier.go:1612] "Failed to execute iptables-restore" err="exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option \"--random-fully\"\nError occurred at line: 18\nTry `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.\n)"
I0630 19:36:06.785242       1 proxier.go:859] "Sync failed" retryingTime="30s"
@brandond
Contributor

brandond commented Jun 30, 2021

I'm confused, does this only happen with cilium + kube-proxy ipvs mode? You said

I was able to reproduce this error on a simple RKE2 setup (rpm setup with canal and kube-proxy iptables mode)

but we have tested canal and iptables mode on RHEL8 and not seen any issues.

@brandond
Contributor

I'm also at a loss as to why kube-proxy would be attempting to use --random-fully; this is supposed to be auto-detected and only enabled if the detected iptables version is >= 1.6.2:

https://github.com/kubernetes/kubernetes/blob/master/pkg/util/iptables/iptables.go#L164-L166

Either way, I believe this is going to be an upstream issue and not RKE2 specific, as we're just packaging the upstream kube-proxy.
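For reference, that auto-detection is just a version comparison against the flag's minimum version. A minimal shell sketch of the same gate (the helper name supports_random_fully is hypothetical; kube-proxy itself implements this in Go, and sort -V is a GNU extension):

```shell
#!/bin/sh
# Sketch of kube-proxy's version gate for --random-fully: the flag is
# assumed available when the detected iptables version is >= 1.6.2.
# Takes a version banner such as "ip6tables v1.8.5 (nf_tables)".
supports_random_fully() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*v\([0-9][0-9.]*\).*/\1/p')
  min=1.6.2
  # GNU sort -V orders version strings; if min sorts first (or is
  # equal), then ver >= min and the flag is assumed supported.
  [ "$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n1)" = "$min" ]
}

supports_random_fully "ip6tables v1.8.5 (nf_tables)" && echo "1.8.5: yes"
supports_random_fully "iptables v1.4.21" || echo "1.4.21: no"
```

As this issue shows, a pure version gate can be wrong: the binary's version can claim support while a particular build still lacks the flag.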

@bovy89
Author

bovy89 commented Jun 30, 2021

I'm confused, does this only happen with cilium + kube-proxy ipvs mode? You said

I was able to reproduce this error on a simple RKE2 setup (rpm setup with canal and kube-proxy iptables mode)

but we have tested canal and iptables mode on RHEL8 and not seen any issues.

I ran some tests, and it does not seem to be related to the CNI in use or to the kube-proxy mode: I hit the same error with both canal + iptables mode (the standard RKE2 setup) and cilium + ipvs mode (my setup).

@brandond
Contributor

Do you have something else on your host that's adding rules with --random-fully? As I mentioned earlier, we first qualified RHEL8 in #16 and it has been part of our QA matrix ever since, and we haven't seen this issue.

@bovy89
Author

bovy89 commented Jun 30, 2021

Maybe it's something related to ip6tables:

sh-4.2# iptables -V
iptables v1.8.5 (nf_tables)
sh-4.2#
sh-4.2# ip6tables -V
ip6tables v1.8.5 (nf_tables)
sh-4.2#
sh-4.2# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
iptables: No chain/target/match by that name.
sh-4.2#
sh-4.2# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
ip6tables v1.8.5 (nf_tables): unknown option "--random-fully"
Try `ip6tables -h' or 'ip6tables --help' for more information.

Testbed - single node (server) setup (rke2 standard: canal + kube-proxy iptables mode)

Vagrant box: bento/centos-8 (virtualbox, 202105.25.0)

4GB ram, 2 vcpu

yum update -y && reboot

setenforce 0
getenforce

cat > /etc/selinux/config <<EOF
SELINUX=permissive
SELINUXTYPE=targeted
EOF

systemctl stop firewalld
systemctl disable firewalld

cat > /etc/NetworkManager/conf.d/rke2-canal.conf <<'EOF'
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:flannel*;interface-name:kube*;
EOF

systemctl reload NetworkManager

useradd -r -c "etcd user" -s /sbin/nologin -M etcd

mkdir -p /var/lib/rancher/rke2/server/manifests/

cat > /var/lib/rancher/rke2/server/manifests/rke2-canal-config.yml <<'EOF'
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-canal
  namespace: kube-system
spec:
  valuesContent: |-
    flannel:
      iface: "eth1"

EOF

mkdir -p /etc/rancher/rke2
cat > /etc/rancher/rke2/config.yaml <<'EOF'
cni: canal
write-kubeconfig-mode: 420
protect-kernel-defaults: true
kube-apiserver-arg:
  - "enable-admission-plugins=NodeRestriction,PodSecurityPolicy,AlwaysPullImages"
  - "tls-min-version=VersionTLS12"
  - "tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
tls-san:
  - test9.example.com
  - 192.168.18.19
node-ip: 192.168.18.19
disable:
  - rke2-ingress-nginx

EOF


cat > /etc/yum.repos.d/rancher-rke2-latest.repo <<EOF
[rancher-rke2-common-latest]
name=Rancher RKE2 Common Latest
baseurl=https://rpm.rancher.io/rke2/latest/common/centos/8/noarch
enabled=1
gpgcheck=1
gpgkey=https://rpm.rancher.io/public.key

[rancher-rke2-1-21-latest]
name=Rancher RKE2 1.21 Latest
baseurl=https://rpm.rancher.io/rke2/latest/1.21/centos/8/x86_64
enabled=1
gpgcheck=1
gpgkey=https://rpm.rancher.io/public.key

EOF

yum -y install rke2-server
cp -f /usr/share/rke2/rke2-cis-sysctl.conf /etc/sysctl.d/60-rke2-cis.conf
systemctl restart systemd-sysctl

cat > /etc/profile.d/k8s.sh <<'EOF'

export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
export PATH=$PATH:/var/lib/rancher/rke2/bin

EOF

source /etc/profile.d/k8s.sh


systemctl enable rke2-server.service
systemctl start rke2-server.service
journalctl -u rke2-server -f


[root@test9 ~]# kubectl logs -n kube-system kube-proxy-hl894
W0630 22:04:26.025053       1 proxier.go:663] Failed to load kernel module nf_conntrack_ipv4 with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
I0630 22:04:26.071655       1 node.go:172] Successfully retrieved node IP: 192.168.18.19
I0630 22:04:26.071855       1 server_others.go:140] Detected node IP 192.168.18.19
W0630 22:04:26.071894       1 server_others.go:598] Unknown proxy mode "", assuming iptables proxy
I0630 22:04:26.226507       1 server_others.go:206] kube-proxy running in dual-stack mode, IPv4-primary
I0630 22:04:26.226575       1 server_others.go:212] Using iptables Proxier.
I0630 22:04:26.226584       1 server_others.go:219] creating dualStackProxier for iptables.
W0630 22:04:26.226591       1 server_others.go:512] detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6
I0630 22:04:26.226895       1 server.go:643] Version: v1.21.1
I0630 22:04:26.229313       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0630 22:04:26.236970       1 conntrack.go:83] Setting conntrack hashsize to 32768
I0630 22:04:26.244661       1 config.go:315] Starting service config controller
I0630 22:04:26.244676       1 shared_informer.go:240] Waiting for caches to sync for service config
I0630 22:04:26.244686       1 config.go:224] Starting endpoint slice config controller
I0630 22:04:26.244688       1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
W0630 22:04:26.248622       1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
W0630 22:04:26.253047       1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
I0630 22:04:26.345588       1 shared_informer.go:247] Caches are synced for endpoint slice config
I0630 22:04:26.345686       1 shared_informer.go:247] Caches are synced for service config
E0630 22:04:26.475912       1 proxier.go:1612] "Failed to execute iptables-restore" err="exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option \"--random-fully\"\nError occurred at line: 18\nTry `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.\n)"
I0630 22:04:26.476172       1 proxier.go:859] "Sync failed" retryingTime="30s"
E0630 22:04:56.501395       1 proxier.go:1612] "Failed to execute iptables-restore" err="exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option \"--random-fully\"\nError occurred at line: 18\nTry `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.\n)"
I0630 22:04:56.501411       1 proxier.go:859] "Sync failed" retryingTime="30s"
E0630 22:05:26.522208       1 proxier.go:1612] "Failed to execute iptables-restore" err="exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option \"--random-fully\"\nError occurred at line: 18\nTry `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.\n)"
I0630 22:05:26.522227       1 proxier.go:859] "Sync failed" retryingTime="30s"
E0630 22:05:56.541066       1 proxier.go:1612] "Failed to execute iptables-restore" err="exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option \"--random-fully\"\nError occurred at line: 18\nTry `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.\n)"
I0630 22:05:56.541085       1 proxier.go:859] "Sync failed" retryingTime="30s"
E0630 22:06:26.561963       1 proxier.go:1612] "Failed to execute iptables-restore" err="exit status 2 (ip6tables-restore v1.8.5 (nf_tables): unknown option \"--random-fully\"\nError occurred at line: 18\nTry `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.\n)"
I0630 22:06:26.561982       1 proxier.go:859] "Sync failed" retryingTime="30s"

@brandond
Contributor

brandond commented Jun 30, 2021

sh-4.2# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
iptables: No chain/target/match by that name.
sh-4.2#
sh-4.2# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
ip6tables v1.8.5 (nf_tables): unknown option "--random-fully"

This seems like the root cause - this version of iptables should support --random-fully. What do you get from iptables --version and ip6tables --version - it almost sounds like your host is somehow using a mix of legacy and nftables iptables CLI tools, or has a broken version of ip6tables.
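A non-destructive way to probe for the flag is to grep the MASQUERADE help text rather than appending a live rule, as later comments in this thread do. A minimal sketch (the has_random_fully helper is hypothetical; the printf lines simulate excerpts of real `iptables -j MASQUERADE --help` output so the sketch runs without a netfilter stack):

```shell
#!/bin/sh
# Probe whether an iptables help text advertises --random-fully.
# Reads help output on stdin; in a live check you would pipe in
# `iptables -j MASQUERADE --help` or the ip6tables equivalent.
has_random_fully() {
  grep -q -- '--random-fully'
}

# Simulated help excerpts (real output would come from the binaries):
printf ' --random\n --random-fully\n' | has_random_fully && echo "iptables: ok"
printf ' --random\n' | has_random_fully || echo "ip6tables: flag missing"
```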

@bovy89
Author

bovy89 commented Jun 30, 2021

What do you get from iptables --version and ip6tables --version - it almost sounds like your host is somehow using a mix of legacy and nftables iptables CLI tools, or has a broken version of ip6tables.

That test was run inside the kube-proxy container.

@brandond
Contributor

What do you get if you run those two commands on your host?

@brandond brandond added this to To Triage in Development [DEPRECATED] via automation Jun 30, 2021
@brandond brandond added this to the v1.22 - Backlog milestone Jun 30, 2021
@brandond brandond moved this from To Triage to To Verify in Development [DEPRECATED] Jun 30, 2021
@bovy89
Author

bovy89 commented Jul 1, 2021

What do you get if you run those two commands on your host?

[root@test9 ~]# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
iptables: No chain/target/match by that name.
[root@test9 ~]#
[root@test9 ~]# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
ip6tables v1.8.4 (nf_tables): host/network `10.244.0.0' not found
Try `ip6tables -h' or 'ip6tables --help' for more information

@manuelbuil
Copy link
Contributor

According to my investigation, when iptables runs in nft mode, v1.8.6 or greater is required for the --random-fully flag in ip6tables. The kube-proxy code assumes the flag is available as long as the iptables version is v1.6.2 or greater here, but from what I have learned that only applies to iptables in legacy mode.
When using RHEL8 or CentOS 8, we use iptables in nft mode => https://github.com/k3s-io/k3s-root/blob/master/iptables-detect/iptables-detect.sh#L199-L201
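The detection the linked script performs boils down to inspecting the -V banner: nft-backed builds print "(nf_tables)" and legacy builds print "(legacy)". A simplified sketch (the iptables_mode helper is hypothetical; the real script's heuristics are more involved):

```shell
#!/bin/sh
# Classify an iptables build from its -V banner, in the spirit of the
# k3s-root iptables-detect script: nft backend prints "(nf_tables)",
# legacy backend prints "(legacy)".
iptables_mode() {
  case "$1" in
    *nf_tables*) echo nft ;;
    *legacy*)    echo legacy ;;
    *)           echo unknown ;;
  esac
}

iptables_mode "iptables v1.8.5 (nf_tables)"   # prints: nft
iptables_mode "iptables v1.8.6 (legacy)"      # prints: legacy
```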

@rancher-max
Contributor

The fix for this will need backports to the 1.19, 1.20, and 1.21 release branches as well @manuelbuil

@manuelbuil
Contributor

After updating the version, the result is the same, so my earlier theory was wrong. ip6tables-nft does not include the --random-fully flag, for unknown reasons, when built with buildroot. If I build iptables from source it works, and when I compare with how buildroot builds it, the build looks exactly the same.

@bovy89
Author

bovy89 commented Jul 11, 2021

iptables 1.8.5 seems to support --random-fully with the ip6tables nft backend. I also tried compiling iptables from source, and it works as expected.

centos 7 (3.10.0-1160.31.1.el7.x86_64)

# iptables -V
iptables v1.4.21

# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random

# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random

---> ok

centos 8 (4.18.0-305.7.1.el8_4.x86_64)

# iptables -V
iptables v1.8.4 (nf_tables)

# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random
 --random-fully

# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random
 --random-fully

---> ok

simple run of docker.io/rancher/hardened-kube-proxy:v1.21.1-build20210520 using containerd/docker

bash-4.2# iptables -V
iptables v1.8.5 (legacy)

bash-4.2# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random
 --random-fully

bash-4.2# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random
 --random-fully

----> ok, same result on centos 7 and 8 (NOTE: kube-proxy and iptables-detect not executed)

kube-proxy on rke2 (kubectl exec -it kube-proxy-ppwmr -n kube-system -- /bin/bash)

bash-4.2# iptables -V
iptables v1.8.5 (nf_tables)

bash-4.2# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random
 --random-fully

bash-4.2# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random

---> random-fully missing from ip6tables

iptables compiled from source code

# ./xtables-nft-multi iptables -V
iptables v1.8.5 (nf_tables)

# ./xtables-nft-multi iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help  | grep random
 --random
 --random-fully

# ./xtables-nft-multi ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help  | grep random
 --random
 --random-fully

@manuelbuil
Contributor

Thanks for checking @bovy89! According to our latest investigation, the issue is in iptables itself when it is built statically: init_extensions6() is not called when NO_SHARED_LIBS is defined. We will create a patch today to verify this.

@manuelbuil
Contributor

Bug report in netfilter: https://bugzilla.netfilter.org/show_bug.cgi?id=1550

@vadorovsky
Contributor

vadorovsky commented Jul 14, 2021

The patch will probably get more attention if you send it to the netfilter-devel mailing list:

http://vger.kernel.org/vger-lists.html#netfilter-devel
https://lore.kernel.org/netfilter-devel/

They probably don't accept patches attached to Bugzilla, and they will expect a proper commit message. git send-email is what they usually use.

But if you're not comfortable using git send-email, I could send the patch for you with @erikwilson as the author and with his Signed-off-by? Up to you, whichever you prefer.

/cc @erikwilson

@erikwilson
Contributor

An iptables patch has been submitted and accepted, thanks for the info @vadorovsky!
https://patchwork.ozlabs.org/project/netfilter-devel/patch/20210713234823.36131-1-Erik.E.Wilson@gmail.com/

manuelbuil pushed a commit to manuelbuil/rke2 that referenced this issue Jul 14, 2021
This new image fixes the bug described in rancher#1229

Signed-off-by: manuel <manuel@suse.com>
manuelbuil pushed a commit to manuelbuil/rke2 that referenced this issue Jul 14, 2021
Both on:
* static-pod (kubernetes image)
* helm chart (kube-proxy image)

This new image fixes the bug described in rancher#1229

Signed-off-by: manuel <manuel@suse.com>
@manuelbuil manuelbuil moved this from Working to To Test in Development [DEPRECATED] Jul 20, 2021
@bovy89
Author

bovy89 commented Jul 26, 2021

Hi, is there any chance of getting that fixed in 1.21?

@manuelbuil
Contributor

Hi, is there any chance of getting that fixed in 1.21?

It should be fixed in 1.21.3. Can you please check? Thanks!

@bovy89
Author

bovy89 commented Jul 26, 2021

Seems to be fixed, thanks.

@rancher-max
Contributor

Validated on master branch commit e889d2e273bd88b8ed273207334f394fac0a6a11

  • Please note this was also fixed in v1.21.3 and validated there as part of the backport issues
$ k exec -it pod/kube-proxy-ip-172-31-30-224 -n kube-system -- /bin/sh
sh-4.2# iptables -V
iptables v1.8.6 (legacy)
sh-4.2# ip6tables -V
ip6tables v1.8.6 (legacy)
sh-4.2# iptables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random
 --random-fully
				Fully randomize source port.
sh-4.2# ip6tables -A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --help | grep random
 --random
 --random-fully
				Fully randomize source port.

Development [DEPRECATED] automation moved this from To Test to Done Issue / Merged PR Jul 29, 2021