
No local endpoints when service.beta.kubernetes.io/external-traffic: OnlyLocal configured #48437

Closed
liggetm opened this issue Jul 3, 2017 · 27 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@liggetm

liggetm commented Jul 3, 2017

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:
I'm running a service with NodePort and the OnlyLocal annotation (bare-metal/flannel) but receive no traffic on the local pod. When the annotation is applied to the service, packets are not marked for masquerading in iptables but are always dropped by a rule that states "servicexyz has no local endpoints". This is not the case, however - the service does indeed have local endpoints available.

What you expected to happen:
Route traffic to the local endpoints.

How to reproduce it (as minimally and precisely as possible):
Create a nodeport service with the onlylocal annotation:

apiVersion: v1
kind: Service
metadata:
  name: localsvc
  annotations:
    service.beta.kubernetes.io/external-traffic: OnlyLocal
spec:
  type: LoadBalancer
  ports:
  - name: snmp
    port: 1620
    protocol: UDP
    nodePort: 30162
  selector:
    k8s-app: localapp

Create a replication-controller for a single pod (restricting it to the master via a node-selector):

apiVersion: v1
kind: ReplicationController
metadata:
  name: localrc
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: localapp
    spec:
      containers:
      - name: localapp
        image: myImage
        ports:
        - containerPort: 1620
          protocol: UDP
      nodeSelector:
        runOn: master

Anything else we need to know?:
Relevant iptables nat rules:
From the KUBE-NODEPORTS chain:
KUBE-XLB-WGDEPLIALVG6VF4L udp -- 0.0.0.0/0 0.0.0.0/0 /* default/localsvc:snmp */ udp dpt:30162

Chain KUBE-XLB-WGDEPLIALVG6VF4L (1 references)
target prot opt source destination
KUBE-MARK-DROP all -- 0.0.0.0/0 0.0.0.0/0 /* default/localsvc:snmp has no local endpoints */

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: v1.5.4
    Server Version: v1.5.2
  • Cloud provider or hardware configuration:
    bare-metal
  • OS (e.g. from /etc/os-release):
    CentOS atomic host (CentOS Linux release 7.3.1611 (Core))
  • Kernel (e.g. uname -a):
    Linux atomic64.jnpr.net 3.10.0-514.16.1.el7.x86_64 #1 SMP Wed Apr 12 15:04:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Others:
@k8s-github-robot

@liggetm There are no sig labels on this issue. Please add a sig label by:
(1) mentioning a sig: @kubernetes/sig-<team-name>-misc
e.g., @kubernetes/sig-api-machinery-* for API Machinery
(2) specifying the label manually: /sig <label>
e.g., /sig scalability for sig/scalability

Note: method (1) will trigger a notification to the team. You can find the team list here and label list here

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 3, 2017
@liggetm
Author

liggetm commented Jul 3, 2017

@kubernetes/sig-network

@xiangpengzhao
Contributor

/sig network

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Jul 4, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 4, 2017
@MrHohn
Member

MrHohn commented Jul 5, 2017

@liggetm Have you confirmed the endpoint was up and ready? Like kubectl get ep?

My guess is that the health-check is failing. Is this likely the case, and if so, what do I need to do in order that the health-check succeeds?

Saw your comment on another thread, this shouldn't be related to health checks: health checks are used by LoadBalancers from cloud providers, but your cluster is running on bare metal and using NodePort.

From what you described above, it seems like the cause is that kube-proxy didn't notice there is a local pod running on the node, so it didn't set up the iptables rules properly.

As you mentioned, your backend pod is running on the master; have you tried running the pod on one of the nodes to see if it works?
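For comparison, when kube-proxy does see a local endpoint, the KUBE-XLB chain jumps to a per-endpoint KUBE-SEP chain instead of KUBE-MARK-DROP. A rough sketch of what a healthy chain would look like (the KUBE-SEP hash and exact comment text are illustrative, not taken from this issue, and may differ by version):

Chain KUBE-XLB-WGDEPLIALVG6VF4L (1 references)
target prot opt source destination
KUBE-SEP-XXXXXXXXXXXXXXXX all -- 0.0.0.0/0 0.0.0.0/0 /* Balancing rule 0 for default/localsvc:snmp */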

@MrHohn
Member

MrHohn commented Jul 5, 2017

I created a v1.5.2 k8s cluster but failed to reproduce your issue. However, I was running the cluster on GCE and the backend pod was on one of the nodes instead of the master.

@thockin
Member

thockin commented Aug 11, 2017

Is this still a live issue? Over a month old...

@lin99shen

Any updates on this issue? I've seen the same issue with 1.5.2.

@MrHohn
Member

MrHohn commented Aug 21, 2017

@lin99shen Any chance you could post more details? Is the endpoint ready? Logs from kube-proxy? iptables-save output?

@lin99shen

The endpoint is ready. The endpoint is running on the same node as the master, too.

I'm basically trying out the source IP preservation beta feature per https://kubernetes.io/docs/tutorials/services/source-ip/.

Without the OnlyLocal annotation in the service spec, curl goes through, but the source IP is MASQed. Once I add the annotation to the service spec, curl times out and tcpdump shows no packets reaching the endpoint any more. iptables-save shows that the same KUBE-MARK-DROP rule is generated:
From KUBE-NODEPORTS chain:
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/nodeport:" -m tcp --dport 7895 -j KUBE-XLB-XP7QDA4CRQ2QA33W

From KUBE-XLB-XP7QDA4CRQ2QA33W chain:
KUBE-MARK-DROP all -- 0.0.0.0/0 0.0.0.0/0 /* default/nodeport: has no local endpoints */

@MrHohn
Member

MrHohn commented Aug 21, 2017

@lin99shen So this seems to be an issue with onlyLocal endpoints on the master? Or does it happen on nodes as well?

Took a look at the code; the only thing kube-proxy really checks to decide whether an endpoint is local is (https://github.com/kubernetes/kubernetes/blob/v1.5.2/pkg/proxy/iptables/proxier.go#L591-L600):

isLocalEndpoint = *addr.NodeName == proxier.hostname

Just in case, could you check if the endpoint on master has nodeName set?

$ kubectl get ep $SERVICE_NAME -o yaml | grep nodeName
    nodeName: XXX
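As a rough cross-check (the hostname value below is a placeholder, not from this issue), that nodeName needs to match what os.Hostname() returns on the node where kube-proxy runs:

$ hostname
master-node-1        # placeholder; should equal the endpoint's nodeName above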

@lin99shen

kubectl get ep nodeport -o yaml | grep nodeName
nodeName: 192.168.125.85

the nodeName is just the ip address. guess I'll need to give it a name?

@MrHohn
Member

MrHohn commented Aug 21, 2017

kubectl get ep nodeport -o yaml | grep nodeName
nodeName: 192.168.125.85

the nodeName is just the ip address. guess I'll need to give it a name?

Hmm... this looks weird. I'm expecting something like nodeName: e2e-test-mrhohn-minion-group-7vch.

For proxier.hostname, I think it comes from os.Hostname() (https://github.com/kubernetes/kubernetes/blob/v1.5.2/pkg/util/node/node.go#L41-L51), so it wouldn't match the IP address in this case.

The endpoint's nodeName comes from the podSpec (https://github.com/kubernetes/kubernetes/blob/v1.5.2/pkg/controller/endpoint/endpoints_controller.go#L413); I wonder in which case that would be set to an IP address :/
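One generic way to see the mismatch directly (the pod name below is a placeholder) is to read the nodeName straight off the backend pod and compare it with the node objects registered in the API:

$ kubectl get pod <backend-pod> -o jsonpath='{.spec.nodeName}'
$ kubectl get nodes -o name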

@lin99shen

nodeName should be auto-filled, right? Not specified in podSpec by the user.

@lin99shen

I tried to delete the below rule manually just to verify, but the rule gets added back automatically right away. I had kube-proxy stopped. Any idea?

KUBE-NODEPORTS -p tcp -m comment --comment "default/nodeport:" -m tcp --dport 7895 -j KUBE-XLB-XP7QDA4CRQ2QA33W

@MrHohn
Member

MrHohn commented Aug 22, 2017

nodeName should be auto-filled, right? Not specified in podSpec by the user.

Yeah, nodeName should be auto-filled if the user doesn't specify it. I will need to dig a bit more into the auto-filling behavior later.

I tried to delete the below rule manually just to verify, but the rule gets added back automatically right away. I had kube-proxy stopped. Any idea?

That is odd, maybe kube-proxy somehow restarted? It should be the only process that writes to the KUBE-NODEPORTS chain.
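A couple of quick, generic checks (not specific to this setup) to confirm whether something is still rewriting the chain:

$ ps aux | grep [k]ube-proxy                        # is a kube-proxy process (or container) still running?
$ iptables -t nat -L KUBE-NODEPORTS --line-numbers  # watch whether the deleted rule reappears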

@lin99shen

I checked, it's inactive. Will confirm it again.

@lin99shen

Just some context, in case there are other options. We have an app that depends on the originating source IP. I'm trying to use the external-traffic annotation to prevent the source IP of incoming packets from being SNAT'ed. I followed the tutorial as-is and ran into this problem. Besides this approach, is there any other way I can achieve the same thing? For now, our k8s runs on a single VM.

@pnuzyf

pnuzyf commented Aug 23, 2017

I got the same issue. In my scenario, there are two minion nodes. The iptables rules on one node are normal, but those on the other node are weird (and there actually is a pod running on that node).

@MrHohn
Member

MrHohn commented Aug 28, 2017

@lin99shen After poking around, I believe pod.nodeName is filled in by the scheduler using node.name:

In PrioritizeNodes():

	for i := range nodes {
		result = append(result, schedulerapi.HostPriority{Host: nodes[i].Name, Score: 0})
		for j := range priorityConfigs {
			result[i].Score += results[j][i].Score * priorityConfigs[j].Weight
		}
	}

	return schedulerapi.HostPriority{
		Host:  node.Name,
		Score: 1,
	}, nil

In selectHost():

	return priorityList[ix].Host, nil

Or maybe I'm wrong... Is that IP address (nodeName: 192.168.125.85) also your node name?

kubectl get nodes -o yaml | grep " name: "

@MrHohn
Member

MrHohn commented Aug 28, 2017

@pnuzyf Are you having the same issue that your endpoints use the IP address as hostname (#48437 (comment)), or are you seeing a different behavior?

@lin99shen

Yes, the IP address is my node name. I restarted kube-proxy with --hostname-override (set to the IP address), and the feature works now.
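For anyone hitting the same mismatch, the workaround above amounts to something like this on the affected node (the flag value is whatever nodeName the endpoints report; the IP here comes from the earlier comment):

# restart kube-proxy with its existing flags, plus an override so that its
# hostname matches the endpoints' nodeName
kube-proxy --hostname-override=192.168.125.85 ...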

@tanjinfu

tanjinfu commented Sep 4, 2017

I have the same issue, and I'm using version 1.6. @lin99shen could you please share how you restarted kube-proxy with --hostname-override? From what I see on my cluster, kube-proxy is a pod; I can delete it to have k8s create a new one, but I can't pass the option to it.

@MrHohn
Member

MrHohn commented Sep 5, 2017

From what I see on my cluster, kube-proxy is a pod; I can delete it to have k8s create a new one, but I can't pass the option to it.

@tanjinfu It seems kube-proxy runs as static pods in your case. You should be able to modify kube-proxy's manifest file under the path specified by --pod-manifest-path (for kubelet) on each node.

Though it seems odd to me that the user needs to explicitly override the hostname. Note that kubelet also provides a --hostname-override flag and retrieves the hostname in the same way as kube-proxy. If you didn't override the hostname on kubelet, you may not need to override it on kube-proxy.
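If kube-proxy does run as a static pod, a sketch of the change (the manifest path is whatever kubelet's --pod-manifest-path points at; /etc/kubernetes/manifests is a common default, and the file name here is just an example):

# on each node, locate and edit the kube-proxy static pod manifest
ls /etc/kubernetes/manifests/
vi /etc/kubernetes/manifests/kube-proxy.manifest   # example name
# add --hostname-override=<node name as registered in the API> to the kube-proxy
# command; kubelet notices the manifest change and recreates the pod automatically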

@lin99shen

I'm not overriding the hostname on kubelet, at least not intentionally.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 4, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 9, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
