kube-proxy cannot set iptables rules for services #36652

Closed
seppeljordan opened this issue Nov 11, 2016 · 12 comments · Fixed by #36833

@seppeljordan

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.0+a16c0a7", GitCommit:"a16c0a7f71a6f93c7e0f222d961f4675cd97a46b", GitTreeState:"not a git tree", BuildDate:"2016-11-08T01:00:01Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.2", GitCommit:"cfdaf18277e1ebaa28fcdaed1160a0243eb81be1", GitTreeState:"clean", BuildDate:"2016-10-27T22:10:26Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration:
    x86_64
  • OS (e.g. from /etc/os-release):
    Ubuntu 16.04
  • Kernel (e.g. uname -a):
    4.4.0-38-generic
  • Install tools:
    Downloaded the tarball from GitHub and installed it manually

What happened:
Installed Kubernetes on a fresh Ubuntu machine. Everything was fine; all services started up. I wanted to install a service of type "NodePort" (see below), but the service was not accessible. I checked the kube-proxy logs and they gave me errors (see also below).

service:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/external-traffic: OnlyLocal
  creationTimestamp: 2016-11-11T14:51:01Z
  name: py3-tornado-sample
  namespace: default
  resourceVersion: "4216"
  selfLink: /api/v1/namespaces/default/services/py3-tornado-sample
  uid: 432109bc-a81e-11e6-8450-6c626da071ed
spec:
  clusterIP: 192.168.0.55
  ports:
  - nodePort: 35166
    port: 80
    protocol: TCP
    targetPort: 8000
  selector:
    app: py3-tornado-sample
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

errors from kube-proxy:

Nov 11 16:11:01 XXXXXXXX kube-proxy[26731]: E1111 16:11:01.985452   26731 proxier.go:1282] Failed to execute iptables-restore: exit status 2 (Bad argument `KUBE-SVC-574IYFDEANSYIEBF'
Nov 11 16:11:01 XXXXXXXX kube-proxy[26731]: Error occurred at line: 33
Nov 11 16:11:01 XXXXXXXX kube-proxy[26731]: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Nov 11 16:11:01 XXXXXXXX kube-proxy[26731]: )

How to reproduce it (as minimally and precisely as possible):
Install Kubernetes 1.5.0-alpha.2 on an Ubuntu machine with kube-proxy set to iptables mode, then launch a service of type NodePort.
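
As a rough sketch, the setup can presumably be reproduced like this (assuming a deployment named py3-tornado-sample already exists; the annotation is the one shown in the service spec above):

kubectl expose deployment py3-tornado-sample --port=80 --target-port=8000 --type=NodePort
kubectl annotate service py3-tornado-sample service.beta.kubernetes.io/external-traffic=OnlyLocal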

I don't know what else might help you; please ask if you need further information on the issue, and I'll be happy to provide it.

@mandarjog
Contributor

mandarjog commented Nov 12, 2016

Turn up the log level of kube-proxy to --v=3:
SSH into the node and run

sudo vi /etc/kubernetes/manifests/kube-proxy.manifest

Look for a section that looks like this:

  command:
    - /bin/sh
    - -c
    - kube-proxy --master=https://104.154.18.4 --kubeconfig=/var/lib/kube-proxy/kubeconfig  --cluster-cidr=10.0.0.0/14 --resource-container="" --v=2   1>>/var/log/kube-proxy.log 2>&1

and change --v=2 to --v=3.
kube-proxy will restart automatically after saving.

The logs on the next failure will contain more info.

@seppeljordan
Author

Sorry for the long log output. I did not know exactly what you need, so I posted the last 100 lines.

-- Logs begin at Sun 2016-11-13 23:51:54 CET, end at Mon 2016-11-14 11:06:12 CET. --
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-I6V54ZEKLM63UYYR -m comment --comment default/py3-tornado-sample: -m tcp -p tcp -j DNAT --to-destination 172.17.41.2:8000
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "Redirect pods trying to reach external loadbalancer VIP to clusterIP" -s  -j KUBE-SVC-574IYFDEANSYIEBF
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "default/py3-tornado-sample: has no local endpoints" -j KUBE-MARK-DROP
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -X KUBE-SEP-GAZ6D67M54LH5EY5
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: COMMIT
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: E1114 11:06:11.955523   12085 proxier.go:1282] Failed to execute iptables-restore: exit status 2 (Bad argument `KUBE-SVC-574IYFDEANSYIEBF'
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: Error occurred at line: 33
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: )
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: I1114 11:06:11.955556   12085 proxier.go:1363] Closing local port "nodePort for default/py3-tornado-sample:" (:35166/tcp) after iptables-restore failure
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: I1114 11:06:11.955598   12085 config.go:99] Calling handler.OnEndpointsUpdate()
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: I1114 11:06:11.955639   12085 proxier.go:777] Syncing iptables rules
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: I1114 11:06:11.974660   12085 proxier.go:1353] Opened local port "nodePort for default/py3-tornado-sample:" (:35166/tcp)
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: I1114 11:06:11.974743   12085 proxier.go:1279] Restoring iptables rules: *filter
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SERVICES - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp -p udp -d 192.168.0.2/32 --dport 53 -j REJECT
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp -p tcp -d 192.168.0.2/32 --dport 53 -j REJECT
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: COMMIT
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: *nat
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SERVICES - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-NODEPORTS - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-POSTROUTING - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-MARK-MASQ - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SVC-574IYFDEANSYIEBF - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-XLB-574IYFDEANSYIEBF - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SEP-I6V54ZEKLM63UYYR - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SEP-SSNSCRUIRMZDLXKS - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: :KUBE-SEP-GAZ6D67M54LH5EY5 - [0:0]
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x00004000/0x00004000 -j MASQUERADE
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-MARK-MASQ -j MARK --set-xmark 0x00004000/0x00004000
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp -p udp -d 192.168.0.2/32 --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp -p tcp -d 192.168.0.2/32 --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "default/py3-tornado-sample: cluster IP" -m tcp -p tcp -d 192.168.0.55/32 --dport 80 -j KUBE-SVC-574IYFDEANSYIEBF
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-NODEPORTS -m comment --comment default/py3-tornado-sample: -m tcp -p tcp --dport 35166 -j KUBE-XLB-574IYFDEANSYIEBF
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SVC-574IYFDEANSYIEBF -m comment --comment default/py3-tornado-sample: -j KUBE-SEP-I6V54ZEKLM63UYYR
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-I6V54ZEKLM63UYYR -m comment --comment default/py3-tornado-sample: -s 172.17.41.2/32 -j KUBE-MARK-MASQ
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-I6V54ZEKLM63UYYR -m comment --comment default/py3-tornado-sample: -m tcp -p tcp -j DNAT --to-destination 172.17.41.2:8000
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "Redirect pods trying to reach external loadbalancer VIP to clusterIP" -s  -j KUBE-SVC-574IYFDEANSYIEBF
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "default/py3-tornado-sample: has no local endpoints" -j KUBE-MARK-DROP
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "default/kubernetes:https cluster IP" -m tcp -p tcp -d 192.168.0.1/32 --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment default/kubernetes:https -m recent --name KUBE-SEP-SSNSCRUIRMZDLXKS --rcheck --seconds 180 --reap -j KUBE-SEP-SSNSCRUIRMZDLXKS
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment default/kubernetes:https -j KUBE-SEP-SSNSCRUIRMZDLXKS
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-SSNSCRUIRMZDLXKS -m comment --comment default/kubernetes:https -s 1.2.3.5/32 -j KUBE-MARK-MASQ
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-SSNSCRUIRMZDLXKS -m comment --comment default/kubernetes:https -m recent --name KUBE-SEP-SSNSCRUIRMZDLXKS --set -m tcp -p tcp -j DNAT --to-destination 1.2.3.5:8080
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -X KUBE-SEP-GAZ6D67M54LH5EY5
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: COMMIT
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: E1114 11:06:11.978073   12085 proxier.go:1282] Failed to execute iptables-restore: exit status 2 (Bad argument `KUBE-SVC-574IYFDEANSYIEBF'
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: Error occurred at line: 28
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: )
Nov 14 11:06:11 HOSTNAME kube-proxy[12085]: I1114 11:06:11.978109   12085 proxier.go:1363] Closing local port "nodePort for default/py3-tornado-sample:" (:35166/tcp) after iptables-restore failure
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: I1114 11:06:12.468455   12085 config.go:99] Calling handler.OnEndpointsUpdate()
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: I1114 11:06:12.468543   12085 proxier.go:592] Setting endpoints for "default/kubernetes:https" to [1.2.3.4:8080]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: I1114 11:06:12.468607   12085 proxier.go:777] Syncing iptables rules
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: I1114 11:06:12.488196   12085 proxier.go:1353] Opened local port "nodePort for default/py3-tornado-sample:" (:35166/tcp)
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: I1114 11:06:12.488273   12085 proxier.go:1279] Restoring iptables rules: *filter
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SERVICES - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns has no endpoints" -m udp -p udp -d 192.168.0.2/32 --dport 53 -j REJECT
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns-tcp has no endpoints" -m tcp -p tcp -d 192.168.0.2/32 --dport 53 -j REJECT
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: COMMIT
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: *nat
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SERVICES - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-NODEPORTS - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-POSTROUTING - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-MARK-MASQ - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SEP-GAZ6D67M54LH5EY5 - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SVC-574IYFDEANSYIEBF - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-XLB-574IYFDEANSYIEBF - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: :KUBE-SEP-I6V54ZEKLM63UYYR - [0:0]
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x00004000/0x00004000 -j MASQUERADE
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-MARK-MASQ -j MARK --set-xmark 0x00004000/0x00004000
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "default/kubernetes:https cluster IP" -m tcp -p tcp -d 192.168.0.1/32 --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment default/kubernetes:https -m recent --name KUBE-SEP-GAZ6D67M54LH5EY5 --rcheck --seconds 180 --reap -j KUBE-SEP-GAZ6D67M54LH5EY5
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment default/kubernetes:https -j KUBE-SEP-GAZ6D67M54LH5EY5
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-GAZ6D67M54LH5EY5 -m comment --comment default/kubernetes:https -s 1.2.3.4/32 -j KUBE-MARK-MASQ
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-GAZ6D67M54LH5EY5 -m comment --comment default/kubernetes:https -m recent --name KUBE-SEP-GAZ6D67M54LH5EY5 --set -m tcp -p tcp -j DNAT --to-destination 1.2.3.4:8080
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp -p udp -d 192.168.0.2/32 --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp -p tcp -d 192.168.0.2/32 --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "default/py3-tornado-sample: cluster IP" -m tcp -p tcp -d 192.168.0.55/32 --dport 80 -j KUBE-SVC-574IYFDEANSYIEBF
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-NODEPORTS -m comment --comment default/py3-tornado-sample: -m tcp -p tcp --dport 35166 -j KUBE-XLB-574IYFDEANSYIEBF
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SVC-574IYFDEANSYIEBF -m comment --comment default/py3-tornado-sample: -j KUBE-SEP-I6V54ZEKLM63UYYR
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-I6V54ZEKLM63UYYR -m comment --comment default/py3-tornado-sample: -s 172.17.41.2/32 -j KUBE-MARK-MASQ
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SEP-I6V54ZEKLM63UYYR -m comment --comment default/py3-tornado-sample: -m tcp -p tcp -j DNAT --to-destination 172.17.41.2:8000
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "Redirect pods trying to reach external loadbalancer VIP to clusterIP" -s  -j KUBE-SVC-574IYFDEANSYIEBF
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "default/py3-tornado-sample: has no local endpoints" -j KUBE-MARK-DROP
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: -A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: COMMIT
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: E1114 11:06:12.491680   12085 proxier.go:1282] Failed to execute iptables-restore: exit status 2 (Bad argument `KUBE-SVC-574IYFDEANSYIEBF'
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: Error occurred at line: 32
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: Try `iptables-restore -h' or 'iptables-restore --help' for more information.
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: )
Nov 14 11:06:12 HOSTNAME kube-proxy[12085]: I1114 11:06:12.491725   12085 proxier.go:1363] Closing local port "nodePort for default/py3-tornado-sample:" (:35166/tcp) after iptables-restore failure

Also: My kube-proxy systemd service file looks like this:

[Unit]
Description=Kubernetes node proxy service
After=flanneld.service
PartOf=flanneld.service


[Service]
ExecStart={{ k8s_bin_dir }}/kube-proxy \
    --master=https://{{ groups['kube-masters'][0] }}:{{ k8s_api_secure_port }} \
    --kubeconfig={{ k8s_kubeconfig_file }} \
    --hostname-override={{ ansible_ssh_host }} \
    --proxy-mode=iptables \
    --feature-gates AllowExtTrafficLocalEndpoints=true \
    --v=3
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

@mandarjog
Contributor

Can you add the following argument

--cluster-cidr="your cluster cidr"

to ExecStart?

kube-proxy is potentially generating incorrect rules because the cluster CIDR is missing.
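
For example, a minimal sketch of the adjusted unit from the earlier comment (172.17.0.0/16 is only a placeholder; substitute the actual pod network CIDR of your cluster):

ExecStart={{ k8s_bin_dir }}/kube-proxy \
    --master=https://{{ groups['kube-masters'][0] }}:{{ k8s_api_secure_port }} \
    --kubeconfig={{ k8s_kubeconfig_file }} \
    --hostname-override={{ ansible_ssh_host }} \
    --proxy-mode=iptables \
    --cluster-cidr=172.17.0.0/16 \
    --feature-gates AllowExtTrafficLocalEndpoints=true \
    --v=3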

@mandarjog
Contributor

mandarjog commented Nov 14, 2016

@bprashanth
93f9b54, line 1188, introduces this issue.

@bprashanth bprashanth added this to the v1.5 milestone Nov 14, 2016
@bprashanth bprashanth self-assigned this Nov 14, 2016
@bprashanth
Contributor

Thanks for the report

@bprashanth
Contributor

We want the VM to accept the LB VIP as "local" for traffic coming in from outside the cluster, but we don't want it to treat the LB VIP as local for traffic going out from the VM. In both cases the destination is the VIP, so to distinguish them we check whether the source is within the clusterCIDR.

If we don't have the CIDR, we have 2 options:

  1. Fall back to blackholing the LB VIP on nodes that don't have endpoints.
  2. Find a better way to detect local pod traffic (we can't assume cbr0 because users might be using a different network plugin; maybe we can plug the rule into a different chain?).

In an ideal situation, we'd find a way to hit the literal public VIP. Given that both 1 and 2 are hopefully short term, we're doing the easier one (1) and documenting that, without podCIDR, traffic gets blackholed in this specific scenario.

The intersection of platforms that use packet forwarding (and hence need the local rule for the public LB) but don't provide clusterCIDR should be small enough, if it even exists (i.e. you can set up a kube cluster by hand on GCE and not provide clusterCIDR; then accessing a public VIP of a Type=LoadBalancer service from within a pod might not work if that node doesn't have any endpoints).
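
Concretely, the intent is a KUBE-XLB rule like the first line below; with an empty clusterCIDR the -s value comes out blank (second line, as in the logs above), which is what iptables-restore rejects. The 10.244.0.0/16 value is purely illustrative:

# intended form, with a known clusterCIDR (CIDR illustrative)
-A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "Redirect pods trying to reach external loadbalancer VIP to clusterIP" -s 10.244.0.0/16 -j KUBE-SVC-574IYFDEANSYIEBF
# actually generated with an empty clusterCIDR, rejected with "Bad argument"
-A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "Redirect pods trying to reach external loadbalancer VIP to clusterIP" -s  -j KUBE-SVC-574IYFDEANSYIEBF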

@bprashanth
Contributor

FTR @mandarjog volunteered to send a PR.

@mandarjog
Contributor

The specific issue here arises when

service.alpha.kubernetes.io/external-traffic: OnlyLocal

is set. In this case we populate the XLB chain as follows:

  1. Add a rule to jump to the SVC chain if source == cluster_cidr.
  2. Add load-balancing rules to route to local endpoints, or add the "has no local endpoints" rule.

If cluster_cidr is not available, there are 2 options (sketched after this list):

  1. Rule (1) can be removed, which means routing will succeed if there are local endpoints and fail otherwise.
  2. Rule (1) can be relaxed to route to SVC if no local endpoints are available.

I think with adequate error feedback option (1) is better.
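
To make the two options concrete, here is a hedged sketch of what the XLB chain for this service would contain under each one (chain names taken from the logs above; illustrative only, not the actual patch):

# Option 1: remove rule (1); with no local endpoints only the drop rule remains
-A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "default/py3-tornado-sample: has no local endpoints" -j KUBE-MARK-DROP

# Option 2: relax rule (1) into an unconditional jump to the SVC chain
-A KUBE-XLB-574IYFDEANSYIEBF -j KUBE-SVC-574IYFDEANSYIEBF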

@bprashanth
Contributor

Is 2 valid? You don't need podCIDR to actually implement onlyLocal. If I start up without podCIDR and you add a blanket rule that just jumps straight to SVC because we don't have podCIDR, that means traffic coming in from the internet will get sprayed across service endpoints. Traffic from the internet going to a Service with onlyLocal should always be either kept local or blackholed. Traffic from pods in the cluster can be sprayed (because source IP is preserved in either case). If we can't differentiate the two, we have no choice but to handle them the same way (i.e. 1).

@mandarjog
Contributor

While writing tests, is iptables-restore available? I don't see a way to invoke iptables-restore from the tests with custom args.

Specifically, how do I verify that the produced filter chains are syntactically correct?

iptables-restore --test

does this verification.
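
For manual checking outside the test code, a minimal sketch of that verification (chain names and CIDR are placeholders; this likely needs root):

cat <<'EOF' | sudo iptables-restore --test
*nat
:KUBE-SVC-TEST - [0:0]
:KUBE-XLB-TEST - [0:0]
-A KUBE-XLB-TEST -s 10.244.0.0/16 -j KUBE-SVC-TEST
COMMIT
EOF
# exit status 0 means the ruleset parsed; a malformed rule (e.g. an empty -s) fails, as in the logs above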

mandarjog added a commit to mandarjog/kubernetes that referenced this issue Nov 15, 2016
Empty clusterCIDR causes invalid rules generation.
Fixes issue kubernetes#36652
@peterochodo

While you are in there fixing this issue, please also take a look at issue #36835.

@mandarjog
Contributor

I think we can do better when clusterCIDR is not specified.
We can use

-m addrtype --src-type LOCAL

We know with certainty that a source matching --src-type LOCAL is in the cluster.

Similar logic can be applied when dealing with masquerade rules.
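
A hedged sketch of what that alternative rule could look like (chain names taken from the logs above; this illustrates the suggestion, not the merged fix):

-A KUBE-XLB-574IYFDEANSYIEBF -m comment --comment "Redirect node-local traffic to clusterIP" -m addrtype --src-type LOCAL -j KUBE-SVC-574IYFDEANSYIEBF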

k8s-github-robot pushed a commit that referenced this issue Nov 16, 2016
Automatic merge from submit-queue

Handle Empty clusterCIDR

**What this PR does / why we need it**:
Handles empty clusterCIDR by skipping the corresponding rule.

**Which issue this PR fixes** 
fixes #36652

**Special notes for your reviewer**:
1. Added a test to check for the presence/absence of the XLB-to-SVC rule.
2. Changed an error statement to log the rules along with the error string in case of a failure; this ensures that full debug info is available in case of iptables-restore errors.


Empty clusterCIDR causes invalid rules generation.
Fixes issue #36652