Unexpected behaviour using "OnlyLocal" annotation on NodePort with 1.6.1 and Weave-Net 1.6 #44963
/assign |
Thanks for the detailed info. It seems that the source address indeed got masqueraded from 212.9.183.78 to 10.40.0.0 when packets reached the wordpress pod (10.40.0.2).
Pinging more folks @bowei @kubernetes/sig-network-bugs |
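(As a quick, hedged way to confirm what the pod itself observes, one could tail the wordpress pod's apache access log; the pod name below is a placeholder.)

  kubectl logs <wordpress-pod> | tail    # the apache access log shows the client address as seen inside the pod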
I reinstalled the whole cluster to be sure there is nothing left from old installations and testing. Kubernetes is now 1.6.2; all other versions remain the same. The problem still persists, except that the masqueraded address has now moved to 10.32.0.1.
Pods and services are deployed as follows
As I can see on dt-kube-test-2 the NodePort 31062 is opened by kube-proxy
I identified the kube-proxy Pod running on dt-kube-test-02 and I can see the following in the logs.
As far as I understand this is a non-critical warning, because Type: NodePort does not set up and use healthcheck-nodeport? |
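(For reference, a minimal sketch of how the annotations discussed in this thread are typically set and inspected; the service name wordpress is an assumption based on the manifests mentioned later.)

  kubectl annotate service wordpress service.beta.kubernetes.io/external-traffic=OnlyLocal
  kubectl describe service wordpress | grep -i -E 'external-traffic|healthcheck'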
This message goes away if I create a service of "Type: LoadBalancer" (even without a cloud provider in place). I can see the healthcheck-nodeport annotation now:
If I access the nodePort I still have the MASQ IP:
If I curl the healthcheck-nodeport from an external host, I get
This looks good so far from my point of view. Putting it all together, I assume that kube-proxy takes the connection and forwards it to the backend, using the weave interface and its address 10.32.0.1 as the outgoing interface for connecting to the backend. 10.32.0.1 is the weave interface address on dt-kube-test-2.
Unfortunately I have no idea why and how to fix it :( |
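(A hedged way to confirm that 10.32.0.1 is the weave bridge address on that node and to see which nat rules touch this NodePort; exact chain names vary by version.)

  ip addr show weave                               # should show 10.32.0.1, as noted above
  iptables -t nat -S | grep -E '31062|MASQUERADE'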
Sorry about the delay, I don't have enough insight into weave to explain this issue. I'm now looking into their design and will hopefully have a brief answer soon. The nodeport warnings you saw in the logs are benign --- or rather, they shouldn't show up at all. #42888 was tracking this and it already got fixed upstream (#44578), though the fix is not in k8s 1.6. |
BTW two quick questions:
|
the wget output is as follows
the apache log from the wordpress pod
the 10.44.0.1 is the IP address of the mysql pod according to kubectl get pods -o wide
output of wget on the native port 80
apache logs from the wordpress pod
this request seems to be masqueraded. A request to the NodePort on the cluster IP does not work at all; the request times out.
|
Thanks for the experiments.
Yeah, this would not work: the nodePort is opened on node IPs but not on the service IP. To quickly summarize what you got this time:
This is surprising and sounds like the issue is wider than just the "OnlyLocal" service. I wonder how we ensure compatibility across the various k8s network plugins? cc @freehan |
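(To make the comparison concrete, the two requests roughly look like this; the node IP and cluster IP are placeholders, the port is the NodePort from earlier in the thread.)

  wget -O- http://<node-ip>:31062/       # answers, but the client source appears masqueraded inside the pod
  wget -O- http://<cluster-ip>:31062/    # times out, since the nodePort is only opened on node IPs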
Can you suggest another network plugin I can try to see if the problem goes away? It is not much work to redeploy the whole setup, even with new VMs or in parallel if necessary. As far as I can tell at the moment, all we need is a multi-node network overlay; we are not bound to weave if other solutions are available. What we badly need is the possibility to preserve the external IP in on-premises Kubernetes clusters. In the meantime I had some success using an NGINX ingress controller, but as it relies on X-Forwarded-For headers it will not solve the problem for other protocols, and TLS seems to be available only via SNI, which causes concerns in the project team. |
@tomte76 Could you file an issue against weaveworks/weave as well? The Weave folks might have more insight into this. |
Yes, I can file this. Maybe I should first check whether the problem disappears if I use another network plugin, to make sure it is related to weave? |
Sorry, your comment popped up after I sent that. Probably try flannel? |
Thank you. I redeployed and installed flannel using kube-flannel-rbac.yml and kube-flannel.yml, but at the moment the DNS pod does not start.
I'll look into that in the next few days. It's pretty late at night here in Germany. |
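(A hedged sketch of the usual flannel deployment with kubeadm, using the file names mentioned above and the pod network CIDR flannel expects, as noted further down in the thread.)

  kubeadm init --pod-network-cidr=10.244.0.0/16
  kubectl apply -f kube-flannel-rbac.yml
  kubectl apply -f kube-flannel.yml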
I made some mistakes. The iptables rules mentioned above seem to break the external source IP preservation mechanism:
The packet flow is as follows:
Though I don't have a theory yet for why a packet that goes through the service VIP would fail to preserve the source pod IP. |
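(A hedged way to trace where along the path the source address changes, run on the node that receives the external request; the interface name is the weave bridge from this setup.)

  conntrack -L | grep 31062      # shows the NAT translations recorded for connections to the NodePort
  tcpdump -ni weave port 80      # shows the source address with which packets enter the overlay towards the pod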
Hi, I work on Weave Net; just seeing this issue for the first time. I cannot see how it could ever work with an overlay network in the absence of those routing rules. How would the return packets get back to the original client?
In our case we are trying to find a way to preserve the client IP of connections in our on-premises setup. We have Kubernetes clusters deployed on VMs, e.g. on OpenStack or Proxmox running on bare metal. As far as I understand, I need the overlay network to spread the pods and their communication across the minions in my cluster. And we would need OnlyLocal to have some pods pinned to dedicated VMs that we can route IP space to, or port-NAT to their NodePorts on an external firewall. This would enable us to preserve the external IP in these pods and set up any kind of ingress there, even if we have to rely on the client's IP in some way (blacklists, whitelists, Geo-IP and the like). Behind these pods we would use the overlay to communicate with the backend pods running on different VMs in the cluster. We are aware that the client source will not be preserved in this step; it is in-cluster communication from the ingress nodes to the backend systems. This is comparable to the nginx ingress project, which uses the host network to preserve the client IP, but that ingress seems to be optimized for HTTP and supports TLS only with SNI, and the host network seems to have other side effects. This is just to clarify what we are trying to do: in summary, we are trying to set up something like the cloud providers' ingress for our on-premises clusters. |
cc @dnardo |
In the meantime I managed to deploy a working flannel instead of weave. Sorry for the delay; I had some trouble with the OpenStack security groups. The observed behaviour also exists on flannel. The MASQ IP is now taken from the flannel-required "--pod-network-cidr=10.244.0.0/16". The wget from a cluster-external IP to the OpenStack floating IP and the NodePort is as follows:
Logs from the wordpress pod.
The wordpress pod is located on dt-kube-minion-2. Looking there, I can find the MASQ IP assigned to the cni0 interface
The iptables rules on dt-kube-minion-2 look like this
The result looks very similar to weave. I assume the packet gets masqueraded while traversing cni0 to reach the pod, which has an IP address in the flannel node assignment 10.244.2.0/24, in this case 10.244.2.2.
|
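(The flannel-side equivalent of the earlier weave checks, hedged; run on dt-kube-minion-2.)

  ip addr show cni0                              # should show the address the MASQ source comes from, per the comment above
  iptables -t nat -S POSTROUTING | grep -i masq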
If I insert one iptables rule
it works as expected
iptables-save looks like this now
|
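(Without reproducing the exact rule inserted here, a hedged way to check where it landed relative to the masquerade rules, since the first match within a chain wins.)

  iptables-save -t nat | grep -n -E 'MASQUERADE|KUBE-MARK-MASQ'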
So what does the return path look like? Are the packets from the pod coming back to the original client with the pod's IP as source address, or is that getting rewritten to the service IP? |
In this case, the service IP shouldn't be involved, as the original destination IP is the node IP rather than the service IP. I'd expect the return packets' source address to be rewritten to the node IP.
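(A hedged way to answer the return-path question empirically, run on the external client; the node IP is a placeholder.)

  tcpdump -n host <node-ip> and port 31062   # the source address of the replies shows whether they come back rewritten to the node IP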
@bboreham Also to clarify, this |
OK, I marked the Weave Net issue as a feature request, since it is currently hard-coded to masquerade everything on and off the overlay; suggest you close this issue. |
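(For readers who want to see what bboreham describes, a hedged look at the nat chain Weave Net programs; the chain name may differ between versions.)

  iptables -t nat -S WEAVE    # lists the masquerade rules applied to traffic entering and leaving the overlay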
I have the same problem. I have a service that is listening on port 1813, and I have included the following configuration to preserve the source IP inside the container that runs in Kubernetes:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/external-traffic: "OnlyLocal"
  labels:
    name: connector-udp
  name: connector-udp
spec:
  ports:
  # The port that this service should serve on.
  - port: 1813
    name: accounting
    targetPort: 1813
    nodePort: 31813
    protocol: UDP
  # Label keys and values that must match in order to receive traffic for this service.
  externalIPs:
  - "172.19.18.72"
  selector:
    app: connector
  type: LoadBalancer

When I send UDP traffic from a simulator to the Kubernetes service, the source IP I see inside the container is not the IP address of the simulator's machine:

tcpdump -n dst port 1813
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:44:14.602461 IP 10.1.47.0.35686 > 10.1.67.2.1813: RADIUS, Accounting-Request (4), id: 0x01 length: 164
09:44:15.502967 IP 10.1.47.0.35686 > 10.1.67.2.1813: RADIUS, Accounting-Request (4), id: 0x02 length: 164
09:44:16.404781 IP 10.1.47.0.35686 > 10.1.67.2.1813: RADIUS, Accounting-Request (4), id: 0x03 length: 164
09:44:17.305984 IP 10.1.47.0.35686 > 10.1.67.2.1813: RADIUS, Accounting-Request (4), id: 0x04 length: 164
09:44:18.207565 IP 10.1.47.0.35686 > 10.1.67.2.1813: RADIUS, Accounting-Request (4), id: 0x05 length: 164
09:44:19.109202 IP 10.1.47.0.35686 > 10.1.67.2.1813: RADIUS, Accounting-Request (4), id: 0x06 length: 164

On the CoreOS host, however, I see the source IP correctly:

tcpdump -n dst port 1813
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:44:14.584830 IP 172.19.18.53.35686 > 172.19.18.72.1813: RADIUS, Accounting-Request (4), id: 0x01 length: 164
09:44:15.485514 IP 172.19.18.53.35686 > 172.19.18.72.1813: RADIUS, Accounting-Request (4), id: 0x02 length: 164
09:44:16.387202 IP 172.19.18.53.35686 > 172.19.18.72.1813: RADIUS, Accounting-Request (4), id: 0x03 length: 164
09:44:17.288524 IP 172.19.18.53.35686 > 172.19.18.72.1813: RADIUS, Accounting-Request (4), id: 0x04 length: 164
09:44:18.190003 IP 172.19.18.53.35686 > 172.19.18.72.1813: RADIUS, Accounting-Request (4), id: 0x05 length: 164
09:44:19.091797 IP 172.19.18.53.35686 > 172.19.18.72.1813: RADIUS, Accounting-Request (4), id: 0x06 length: 164

I'm using flannel and calico for the network configuration, so I don't think this is only a problem with Weave Net. Is there any solution to my problem? Thanks in advance. |
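(Two hedged checks for this setup, using the names from the manifest above: whether a connector pod actually runs on the node that receives the traffic, and whether the annotation was accepted.)

  kubectl get pods -l app=connector -o wide
  kubectl get svc connector-udp -o yaml | grep -i -E 'external-traffic|healthcheck'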
Before we close, can we make an equivalent Flannel bug? @caseydavenport who owns flannel/canal? |
Adding @tomdee for the flannel/canal bits. |
I've raised this issue against flannel: flannel-io/flannel#734. We can close this now. |
@caseydavenport Thanks! /close |
Environment and reproduction steps:
ii docker-engine 1.12.6-0~debian-jessie amd64
kubeadm init
ansible -become -i ansible-hosts all -a "kubeadm join --token=<token from init> 192.168.141.24"
kubectl apply -f https://git.io/weave-kube-1.6
kubectl create -f wordpress.yaml
kubectl create -f wordpress-service.yaml
@MrHohn: If you need any more information please let me know