In IPVS mode, host services become available on ClusterIPs #72236
/sig network
/assign @m1093782566
The ClusterIP addresses (and external addresses) are added to the local kube-ipvs0 interface.
The problem, however, seems to be the route that is automatically added to the local routing table.
Now I can ssh to my local machine using these addresses as described in this issue;
But if the entry in the local table is removed, these addresses no longer reach the host's services.
I think removing the entry in the local table could be a way to fix this.
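For reference, the interface and local-table entries being described can be inspected on a node running kube-proxy in IPVS mode with standard ip commands; the 10.96.0.0/12 service CIDR below is only an example:

```sh
# List the Service addresses that kube-proxy's IPVS mode assigns to the
# kube-ipvs0 dummy interface on every node.
ip addr show dev kube-ipvs0

# Show the matching entries in the kernel's "local" routing table;
# 10.96.0.0/12 is an assumed service CIDR, adjust it for your cluster.
ip route show table local | grep 10.96
```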
Related
Pod-to-pod traffic does not work when the local table entry is removed. So I must withdraw my proposal. Please read more in cloudnativelabs/kube-router#623
In my case, with Kubernetes v1.11.5 and kube-proxy in iptables mode, I get the very same behavior.
Where 10.149.0.1 is the ClusterIP of a service in the cluster.
You should use ping instead of telnet.
@m1093782566 maybe I misread the original description of the issue, but I took it to mean that it was possible to ssh to the sshd service running on a host from a pod, using the IP address of any service created in Kubernetes, when kube-proxy is in ipvs mode as opposed to iptables mode. In my case the behavior is the same with either mode. Why should I use ICMP and ping when @fasaxc was talking about ssh'ing as the provided example?
ping clusterip inside cluster. ssh/telnet clusterip:port outside cluster.
/shrug I can ping clusterip and I can telnet to host's sshd from a pod using clusterip:22 address. Regardless of kube-proxy mode.
All I wanted to say is that the behavior is the same regardless of the kube-proxy mode, while the issue starter said it was different between the iptables proxier and the ipvs proxier. Maybe it's different in Kubernetes 1.12, but with 1.11 here it's exactly the same.
@emptywee The issue is about accessing a service on the host (for example sshd) via a ClusterIP and port that no Kubernetes service defines, which is what happens in ipvs mode.
@song-jiang I understand that, and in my case even with iptables it's still working as though I am in the ipvs mode. As I've shown above: #72236 (comment)
@emptywee did you happen to switch from IPVS to iptables mode without rebooting? I'm wondering if there was an old kube-ipvs device hanging around to give the IPVS behaviour. I don't have a rig handy to check the iptables behaviour but I thought I'd seen that traffic get dropped before.
@fasaxc no, I have two clusters, one in ipvs mode and one that has never been in it, which I was going to switch to ipvs soon. Tested the described behavior in both of them and it worked identically.
@emptywee that's odd then, does the service's IP show up in
@fasaxc yeah it could be the case, as we have BGP routing set up so the services and pods IP space is routable in our network.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules: after 90 days of inactivity, lifecycle/stale is applied; after a further 30 days of inactivity, lifecycle/rotten is applied; after another 30 days of inactivity, the issue is closed. You can mark this issue as fresh with /remove-lifecycle stale, or close it with /close. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
I recently discovered this when testing IPVS mode on kube-proxy, wondering why the hell I was able to SSH into a load balancer and get into the underlying host. Not only does this apply to ClusterIPs, it applies to LoadBalancer IPs as well.
A quick hack to patch both these issues is, as mentioned above, to use iptables to ensure traffic bound for the LoadBalancer/ClusterIP subnets gets dropped if the port number doesn't match the one in the service definition. Obviously, depending on your iptables setup, the placement of this rule within your chain will be different, but the general gist is as mentioned above.
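The actual command from that comment was not captured in this copy of the thread; purely as an illustration of the kind of rule being described, with made-up values (10.96.0.0/12 as the ClusterIP CIDR, 192.0.2.0/24 as the load balancer pool, and 443 as the only port a Service actually defines), it could look like:

```sh
# Illustration only: allow the port a Service actually defines, then drop
# everything else addressed to the ClusterIP and load balancer ranges so
# host daemons (e.g. sshd on :22) are not reachable on those addresses.
# The CIDRs and port are assumptions; rule placement depends on your chains.
iptables -I INPUT 1 -d 10.96.0.0/12 -p tcp --dport 443 -j ACCEPT
iptables -I INPUT 2 -d 10.96.0.0/12 -j DROP
iptables -I INPUT 3 -d 192.0.2.0/24 -p tcp --dport 443 -j ACCEPT
iptables -I INPUT 4 -d 192.0.2.0/24 -j DROP
```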
I tried escalating this to the security team but they weren't able to help!
Unfortunately I can't easily reproduce it at the moment (too many mitigations), so I don't feel like I can strongly campaign to get this fixed on my own. Can you detail your setup a bit to help? Were the load balancers on internet-facing IPs? What's your CNI?
In my test rig I'm on bare metal; I put kube-proxy in ipvs mode and then use MetalLB to assign the VIPs from some private IP space. I have tried both Cilium and Calico for the CNI and both have this bug (my hunch would be that if I tried Cilium's kube-proxy replacement this wouldn't be an issue). I haven't looked into it, but I wouldn't be surprised if the cloud load balancer providers (e.g. https://github.com/kubernetes-sigs/aws-load-balancer-controller) put a security policy on the load balancer itself that only allows access on the configured ports, so this dodgy behaviour won't be seen, as the LB itself is dropping non-configured ports before traffic gets to the host. On bare metal, however, BGP just tells the switch "give me all traffic for this IP" and, if your network ACLs allow it, it will just forward all traffic for the LB IP onto the k8s host, which then happily exposes the host services on this IP.
Or (and I suspect this is more likely) none of the cloud providers support IPVS out of the box, so this isn't really on anyone's radar: Azure/AKS#1846
Thanks @jpiper. Yeah, that's basically my setup (Calico + MetalLB), with VIPs coming from public IP space (not RFC1918). In the original Calico ticket (projectcalico/calico#2279) @KashifSaadat had Calico-managed iptables rules that couldn't block this traffic. So there was a bug in kube-proxy AND Calico couldn't even mitigate it. I think Calico-managed pre-DNAT rules can block this traffic, but at this point in my environments I have too many other layers of filtering in place to draw and be confident in any experimental conclusions. And even if it does, it's not a replacement for the bug being fixed. kube-router's kube-proxy replacement did have this issue; their fix is here. I do wonder how many live systems are unknowingly exposed like this.
@Jc2k How do we go about proposing this as an agenda item then? This seems like an awful bug to have.
@jpiper I was given this link but that's all. That doc mentions someone called hanamant is involved in IPVS at that meeting; is that @hanamantagoudvk? Maybe they can help us get this security issue resolved.
@jpiper given you reproduced it on cilium, maybe worth dropping a line to security@cilium.io as well?
@Jc2k in the meantime, I'm just going to use
Oh that's a really good tip. I've been bitten by #75262 too.
The proposal in #72236 (comment) seems simple;
But in practice (in code) it is actually hard for several reasons;
That said, it is of course possible, but it will require a larger PR than you might think in the first place. I guess that's why it hasn't been done. A way to implement this may be;
This is repeated for IPv6, but that is a minor problem.
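As a rough sketch of the kind of filtering such a change implies (not the actual kube-proxy implementation), assuming the IPVS proxier's existing KUBE-CLUSTER-IP ipset of ClusterIP,port pairs and a 10.96.0.0/12 service CIDR:

```sh
# Sketch only: accept traffic that matches a real Service ClusterIP and port
# (KUBE-CLUSTER-IP is the ip,port ipset the IPVS proxier already maintains),
# then reject anything else addressed to the service CIDR.
iptables -A INPUT -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
iptables -A INPUT -d 10.96.0.0/12 -j REJECT --reject-with icmp-port-unreachable
```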
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules: after 90 days of inactivity, lifecycle/stale is applied; after a further 30 days of inactivity, lifecycle/rotten is applied; after another 30 days of inactivity, the issue is closed. You can mark this issue as fresh with /remove-lifecycle stale, or close it with /close. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules: after 90 days of inactivity, lifecycle/stale is applied; after a further 30 days of inactivity, lifecycle/rotten is applied; after another 30 days of inactivity, the issue is closed. You can mark this issue as fresh with /remove-lifecycle stale, or close it with /close. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
An open PR exists that fixes this problem. /remove-lifecycle stale
And there was much rejoicing 😄
What happened:
In IPVS mode, if a service on the host is listening on 0.0.0.0:<some port>, traffic from pods on the local host (or incoming traffic from other hosts, if routing allows it) to any ClusterIP on <some port> reaches the host service. For example, a pod can access its host's ssh service at something like 10.96.0.1:22.
What you expected to happen:
ClusterIPs should reject or drop traffic to unexpected ports (as is the behaviour in iptables mode).
How to reproduce it (as minimally and precisely as possible):
Run kube-proxy in IPVS mode with a host service such as sshd listening on 0.0.0.0, then, from a pod, connect to any ClusterIP on that port. The connection, e.g. to 10.96.0.1:22, should reach the host.
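A minimal check from inside a cluster, assuming the default kubernetes Service ClusterIP of 10.96.0.1 and sshd listening on the node (adjust the address for your cluster), might look like:

```sh
# Find a ClusterIP to test against (any Service's ClusterIP will do).
kubectl get svc kubernetes -o jsonpath='{.spec.clusterIP}'

# From a throwaway pod, poke that ClusterIP on a port only the host listens on.
# With kube-proxy in IPVS mode this prints the node's SSH banner; in iptables
# mode the connection should be refused or time out.
kubectl run ipvs-test --rm -it --image=busybox --restart=Never -- \
  nc -w 2 10.96.0.1 22
```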
Anything else we need to know?:
This unexpected behaviour has some security impact; it's an extra, unexpected way for packets to reach the host. In addition, it's very hard for a NetworkPolicy provider (such as Calico) to secure this path, because IPVS captures traffic that, to the kernel's policy engines, looks like it's going to be terminated at the host. For example, in iptables there's no way to tell the difference between traffic that is about to be terminated by IPVS and traffic that's going to go to a local service.
Related Calico issue; user was trying to block the traffic with Calico policy but was unable to (because Calico has to whitelist all potential IPVS traffic): projectcalico/calico#2279
Environment:
Kubernetes version (kubectl version): 1.12
Kernel (uname -a):
/kind bug