iptables proxier: route local traffic to LB IPs to service chain #77523

Open: andrewsykim wants to merge 2 commits into master from andrewsykim:fix-xlb-from-local

7 participants
@andrewsykim
Member

commented May 6, 2019

Signed-off-by: Andrew Sy Kim <kiman@vmware.com>

What type of PR is this?
/kind bug

What this PR does / why we need it:
For any traffic to an LB IP that originates from the local node, re-route that traffic to the Kubernetes service chain. This makes an external LB reachable from inside the cluster. The implication is that internal traffic to an LB IP will need to go through SNAT. This is likely okay since source IP preservation with externalTrafficPolicy=Local only applies to external traffic anyway. The fix was spelled out in more detail by Tim in #65387.

I think the correct behavior is to actually route the traffic to the LB instead of intercepting it with iptables, but with the current set of rules I'm not sure this is possible. We also already have rules that route traffic from pods in the cluster CIDR that want to reach LB IPs to the service chain:

-A KUBE-XLB-ECF5TUORC5E2ZCRD -s 10.8.0.0/14 -m comment --comment "Redirect pods trying to reach external loadbalancer VIP to clusterIP" -j KUBE-SVC-ECF5TUORC5E2ZCRD

Allowing traffic with --src-type LOCAL to do the same makes sense to me.
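
To make the proposal concrete, here is a rough sketch of what the extra rules in the XLB chain might look like, printed in iptables-save format. It only prints text and does not touch the host. The comment strings and the use of KUBE-MARK-MASQ to trigger SNAT are assumptions for illustration, not necessarily what this PR generates; the chain suffix is taken from the example rule above.

```shell
#!/bin/sh
# Sketch only: prints candidate nat-table rules; nothing is applied to the host.
# The comment text and the KUBE-MARK-MASQ jump are illustrative assumptions.

# xlb_local_rules SUFFIX
# Emits two rules for the KUBE-XLB-<SUFFIX> chain:
#   1. mark node-local traffic for masquerade (SNAT),
#   2. send node-local traffic to the matching KUBE-SVC-<SUFFIX> chain.
xlb_local_rules() {
  suffix="$1"
  printf '%s\n' "-A KUBE-XLB-${suffix} -m comment --comment \"masquerade LOCAL traffic to LB IP\" -m addrtype --src-type LOCAL -j KUBE-MARK-MASQ"
  printf '%s\n' "-A KUBE-XLB-${suffix} -m comment --comment \"route LOCAL traffic to LB IP to service chain\" -m addrtype --src-type LOCAL -j KUBE-SVC-${suffix}"
}

# Example: rules for the chain shown in the description.
xlb_local_rules ECF5TUORC5E2ZCRD
```

The second rule mirrors the existing pod CIDR redirect, with the source match swapped from -s 10.8.0.0/14 to -m addrtype --src-type LOCAL.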

Which issue(s) this PR fixes:
Fixes #65387 #66607

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

iptables proxier: route local traffic to LB IPs to service chain
@andrewsykim

Member Author

commented May 6, 2019

@m1093782566 @Lion-Wei any ideas if we need this for IPVS proxier?

@andrewsykim

Member Author

commented May 7, 2019

/priority important-soon

@andrewsykim andrewsykim force-pushed the andrewsykim:fix-xlb-from-local branch from cfa210f to 4c1f5d5 May 7, 2019

@k8s-ci-robot k8s-ci-robot added size/M and removed size/S labels May 7, 2019

@k8s-ci-robot

Contributor

commented May 7, 2019

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: andrewsykim
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: thockin

If they are not already assigned, you can assign the PR to them by writing /assign @thockin in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

andrewsykim added some commits May 6, 2019

iptables proxier: route local traffic to LB IPs to service chain
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
add unit tests for -src-type=LOCAL from LB chain
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>

@andrewsykim andrewsykim force-pushed the andrewsykim:fix-xlb-from-local branch from 0cf0983 to 8dfd4de May 7, 2019

@andrewsykim

Member Author

commented May 7, 2019

Validated and tested this on a Kind cluster with metallb (thanks @mauilion!) and on GKE by applying the rules manually. @jcodybaker can you test this on DOKS please (re: #66607)?

@jcodybaker

commented May 8, 2019

@andrewsykim - I've tested this and unfortunately it doesn't seem to have changed the behavior seen in #66607. I've left my test cluster up and can provide any debug output that's helpful; just message me on Slack. I've started debugging by looking through the iptables counters. I didn't get the full picture tonight, but it does seem to be taking a different path than we were seeing before.

@andrewsykim

Member Author

commented May 8, 2019

@jcodybaker can you share the output for sudo iptables-save please?

@mauilion

Contributor

commented May 8, 2019

@andrewsykim
In my testing I was able to reproduce this case:
1. Bring up a 3-node cluster and schedule pods on 2 of the nodes.
2. Define a service of type LoadBalancer and ensure that the load balancer provides an IP address.
3. Modify the service and set externalTrafficPolicy to Local.
4. Bring up a pod with hostNetwork: true on the node where there are no endpoints.

Before your change I am not able to connect to the service from that pod.
After your change I can.
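
For anyone who wants to repeat the reproduction above, a minimal sketch of the client side might look like the following. The pod name, node name, service name, and the curlimages/curl image are hypothetical choices for illustration; the script itself only prints a manifest, and the kubectl steps are left as comments because they need a live cluster.

```shell
#!/bin/sh
# Sketch only: lb-test-client, my-lb-svc, the node name, and the image are
# hypothetical placeholders, not taken from the test setup described above.

# host_network_pod NODE
# Prints a Pod manifest pinned to NODE that shares the node's network namespace.
host_network_pod() {
  node="$1"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: lb-test-client
spec:
  hostNetwork: true
  nodeName: ${node}
  containers:
  - name: client
    image: curlimages/curl
    command: ["sleep", "3600"]
EOF
}

# Print the manifest for a node assumed to host no endpoints of the service.
host_network_pod worker-without-endpoints

# Against a real cluster, the remaining steps might look like this (not run here):
#   kubectl patch svc my-lb-svc -p '{"spec":{"externalTrafficPolicy":"Local"}}'
#   host_network_pod <node> | kubectl apply -f -
#   kubectl exec lb-test-client -- curl -sS http://<LB IP>
```

Pinning the pod with nodeName is just one way to land it on the node without endpoints; a nodeSelector would work as well.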

@andrewsykim

Member Author

commented May 8, 2019

I spoke with @jcodybaker on slack. What he was looking for was the ability to preserve source IP (via proxy protocol from the LB) for local traffic in the cluster via the LB IP. I'm not sure this is possible or even supported for internal traffic. Internal traffic should use cluster IP and the externalTrafficPolicy type shouldn't matter in that case.

With that said, it might be possible to remove the fw chain that detects the LB IP as the destination and actually route traffic to the LB instead of intercepting it with iptables. This probably needs a longer discussion/proposal though.

@mauilion

Contributor

commented May 9, 2019

@andrewsykim Here is the article I put together for testing this stuff:
https://mauilion.dev/posts/kind-k8s-testing/

Thanks!

@donbowman

commented May 9, 2019

> I spoke with @jcodybaker on slack. What he was looking for was the ability to preserve source IP (via proxy protocol from the LB) for local traffic in the cluster via the LB IP. I'm not sure this is possible or even supported for internal traffic. Internal traffic should use cluster IP and the externalTrafficPolicy type shouldn't matter in that case.
>
> With that said, it might be possible to remove the fw chain that detects the LB IP as the destination and actually route traffic to the LB instead of intercepting it with iptables. This probably needs a longer discussion/proposal though.

We are in the process of upstreaming all the PRs to do this with Envoy + Istio (it uses the sidecar to unpack the original public IP in the pod).
envoyproxy/envoy#4128 discusses that.

@andrewsykim

Member Author

commented May 9, 2019

/retest
/assign @thockin

@kinolaev

commented May 23, 2019

Maybe we can just ACCEPT packets to the loadBalancerIP instead of redirecting them to the SVC chain? You can test it with:

# XLB_CHAIN and LB_IP are placeholders: set them to the service's XLB chain name and its LB IP.
sudo iptables -t nat -I "${XLB_CHAIN}" 2 -m addrtype --src-type LOCAL -j ACCEPT
wget -Y off -O- "https://${LB_IP}"
sudo iptables -nvL --line-numbers -t nat | grep -A 4 "Chain ${XLB_CHAIN}"

Draft PR #78247, can anyone test it?
