POD DNS reverse lookup #266

Closed
scrwr opened this issue Sep 25, 2018 · 44 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@scrwr

scrwr commented Sep 25, 2018

We have an increasing problem with Apache Hadoop-like services such as Spark, Flink, and co. These try to communicate via their hostnames in the cluster instead of their IPs, so they look up their own hostname and end up with the unresolvable pod name. We currently see two incomplete solutions:

a) Pod A records are created in a format like 1-2-3-4.namespace.pod.cluster.local. But the pod itself cannot be spec’ed to use this A record as its own hostname.

b) One can use hostname and subdomain together with a headless service in order to create an FQDN in KubeDNS and in the pod's hostname, but this requires static hostnames and won’t work with ReplicaSets or DaemonSets.

We are looking for a complete solution, e.g. optionally switching the pod hostname to its KubeDNS A record, or injecting the A record as the first entry in /etc/hosts. The type of pods heavily affected by the issue are Jobs. In the Spark context, these are driver pods/jobs. But the problem is not limited to Spark; we see the same effects all over recent Apache projects, e.g. Flink has a similar issue.

@thockin thockin added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Sep 25, 2018
@thockin
Member

thockin commented Sep 25, 2018

We have an increasing problem with Apache Hadoop-like services such as Spark, Flink, and co. These try to communicate via their hostnames in the cluster instead of their IPs, so they look up their own hostname and end up with the unresolvable pod name. We currently see two incomplete solutions:

Is this a new trend? Is it actually important to these workloads' correctness, or can we convince them they are following a bad pattern?

a) Pod A records are created in a format like 1-2-3-4.namespace.pod.cluster.local. But the pod itself cannot be spec’ed to use this A record as its own hostname.

In general, these names are not very useful. It's (in theory) possible to get those names into hostname -f, but it would be a significant change to the overall system.

b) One can use hostname and subdomain together with a headless service in order to create an FQDN in KubeDNS and in the pod's hostname, but this requires static hostnames and won’t work with ReplicaSets or DaemonSets.

We are looking for a complete solution, e.g. optionally switching the pod hostname to its KubeDNS A record, or injecting the A record as the first entry in /etc/hosts. The type of pods heavily affected by the issue are Jobs. In the Spark context, these are driver pods/jobs. But the problem is not limited to Spark; we see the same effects all over recent Apache projects, e.g. Flink has a similar issue.

To do "real" reverse lookups of pod names (the default hostname) would require a larger DNS architecture change - watching all pods is expensive, so it would have to be multi-level. Are the IP-based names sufficient for this sort of use-case?

What about other use-cases that have emerged for DNS PTR?

@bowei @kubernetes/sig-network-feature-requests

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 25, 2018
@krmayankk

I am told this already works in CoreDNS. I tried to make a PR to make this work for kube-dns, but that didn't move, mainly because I was not able to convince the team. Can you try CoreDNS and see if that helps? Also, if CoreDNS implements this, I am not sure whether it is in violation of the DNS spec in any way or not @johnbelamaric

@krmayankk

#232 is the PR. @thockin, if I make a PR to the DNS spec, would you be open to considering it?

@johnbelamaric
Member

It's not in violation of the spec, but it goes beyond what the spec prescribes. You need to use the endpoint_pod_names directive in your Corefile. See https://coredns.io/plugins/kubernetes/
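For illustration, a minimal Corefile sketch of where that option goes, assuming the default cluster.local zone (everything else here is a stock configuration, not something taken from this thread):

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        endpoint_pod_names   # use the pod name instead of the dashed-IP form for headless-service endpoint records
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
}

With this in place, a pod backing a headless service gets a record like <pod-name>.<service>.<namespace>.svc.cluster.local pointing at its pod IP.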

@scrwr
Author

scrwr commented Sep 26, 2018

Is this a new trend? Is it actually important to these workloads' correctness, or can we convince them they are following a bad pattern?

Spark changed it just recently with version 2.3. Other products like Hadoop have been doing it for ages. I don't really know what is driving them, as I have a hard time believing that even outside the Kubernetes world every datacenter has a solid DNS setup by default. I guess the majority of self-hosters have to set it up in order to comply with the Hadoop-ish software stack.

In general, these names are not very useful names. It's (in theory) possible to get those names into hostname -f but it would be a significant change to the overall system.

What's the purpose of these names then? Why do they exist when, on the other hand, we are afraid to make the pod names available in DNS for fear of spamming the DNS server?

Can you try coredns and see if that helps.
Excellent. I wasn't aware of this feature. We'll give it a try.

@realknorke

I don't really know what is driving them, as I have a hard time believing that even outside the Kubernetes world every datacenter has a solid DNS setup by default. I guess the majority of self-hosters have to set it up in order to comply with the Hadoop-ish software stack.

What "every datacenter" has is not the question. The (v-)servers in big ones all have hostnames that can be resolved. I think almost every root server or cloud node can be accessed by just using its host name.

Maybe the Apache guys generalized from that.

@scrwr
Author

scrwr commented Sep 27, 2018

I had a look and this is unfortunately not, what we are looking for. Let me give you an example.

A pod (job) connects to a Service master, telling this service that its own hostname is driver_85nfu (the pod name). The service now tries to connect back to the driver_85nfu pod via this hostname, but it is not resolvable.

Having endpoint names in CoreDNS doesn't help, because:
a) the pod doesn't use the endpoint name as its own hostname, and hence the master doesn't get this name in order to resolve it via CoreDNS.
b) it requires a Job pod to be part of a service, which feels conceptually wrong, and the setup to manipulate DNS search paths for the master pod so that it can resolve the name via CoreDNS is complex and, due to limitations on the number of search entries, sometimes even impossible.

What I would expect is that every pod hostname is available in DNS, similar to the IP-based hostname, under podname.namespace.pod.cluster.local, so I can simply add namespace.pod.cluster.local to the search path and be covered.

@bowei
Member

bowei commented Sep 27, 2018

Can someone give a pointer to the spark docs that reference this behavior? Maybe for spark (and associated jobs), we can give the job the synthetic pod IP as its "hostname"

@scrwr
Author

scrwr commented Sep 27, 2018

https://spark.apache.org/docs/latest/configuration.html#networking
spark.driver.host => defaults to (local hostname)
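For illustration, the driver host can also be pinned explicitly at submit time instead of being derived from the local hostname. A sketch, where the master URL, the dashed-IP pod name, the port and the job file are assumptions; spark.driver.host and spark.driver.port are the properties documented on the page above:

# hypothetical values for a driver running as a pod with IP 10.2.175.39 in namespace ops
spark-submit \
  --master spark://spark-master.ops.svc.cluster.local:7077 \
  --conf spark.driver.host=10-2-175-39.ops.pod.cluster.local \
  --conf spark.driver.port=7078 \
  my_job.py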

@scrwr
Author

scrwr commented Oct 6, 2018

@bowei:

Maybe ... we can give the job the synthetic pod IP as its "hostname"

How?

@realknorke

realknorke commented Nov 20, 2018

Is this issue dead?
Is there any way I can help? Providing more information or testing?
Currently it is not possible for me to use Spark 2.3 in Kubernetes (standalone mode, no k8s scheduler). Are @scrwr and I the only people with this kind of problem (so far)?
Thanks in advance!

@tsuna

tsuna commented Dec 6, 2018

@realknorke nope, there are other people with this problem too. Just ran into it today with Flink 1.7.

As far as @thockin's question on whether or not this is a new trend: unfortunately not, it's been like this in Apache projects for almost a decade now. These issues have impacted other environments before kubernetes was a thing, and yet have not been addressed. There is virtually no hope to fix all this giant corpus of horrible Java code. And I say this as someone that has been part of some of those Apache projects for years and has contributed... 😰

@krmayankk

Having endpoint names in CoreDNS doesn't help, because:
a) the pod doesn't use the endpoint name as its own hostname, and hence the master doesn't get this name in order to resolve it via CoreDNS.
b) it requires a Job pod to be part of a service, which feels conceptually wrong, and the setup to manipulate DNS search paths for the master pod so that it can resolve the name via CoreDNS is complex and, due to limitations on the number of search entries, sometimes even impossible.

@scrwr I didn't understand this. I thought CoreDNS makes pod names DNS-resolvable.

@realknorke

@scrwr I didn't understand this. I thought CoreDNS makes pod names DNS-resolvable.

@krmayankk Not from the "outside" (e.g. from within another pod).

@johnbelamaric
Member

johnbelamaric commented Dec 10, 2018 via email

@realknorke

realknorke commented Dec 11, 2018

@johnbelamaric Hi. Sadly I'm not much of a DNS guy, but I'd like to give you a detailed problem description and my findings. From a DNS perspective, everything written below relates only to A record lookups.

Please see a shorter version below

Long story

The situation is as follows (I use Apache Spark as an example to illustrate):

I have a running Spark cluster. That is, one master pod with a (k8s) service (some ports open to connect to), and several slave pods. Everything fine here.

In order to get work carried out, an external application (the "Spark program", called the driver) connects to the spark-master pod (via a K8S Service, e.g. spark-master.namespace.svc.cluster.local), retrieves the "workload" and distributes it in the cluster (to the worker nodes and executors). The Spark cluster is static; the service is up 24/7 and so are the pods in the Spark cluster. The jobs that feed the cluster (Spark drivers) are dynamic. They are K8S Jobs (i.e. run-to-completion pods) and are generated on demand somewhere. Together with the work description, the driver submits its own hostname.

When the slaves are done with the work (or parts of it) they want to directly connect to the driver and submit a) metadata (like progress) and b) results.

And here the whole thing hits the fan: the executor cannot connect to the driver, or to put it in K8S language, the spark-slave pod cannot connect to the driver job/pod, because the job's hostname is not DNS-resolvable from outside the job itself.

Short description

From within a pod a call to hostname returns a non-resolvable hostname (either .metadata.name or spec.hostname). But sometimes it is necessary to resolve this hostname from another pod to be able to connect to the pod.

Using a K8S Service here is not a solution, because it doesn't work with Jobs, and sometimes pods in a ReplicaSet have to communicate directly (e.g. MongoDB replication, multi-master architectures, or quorum situations).

Findings

Some findings without any order:

  • Services I found where a non-resolvable hostname is an issue: MongoDB, Flink, Spark (v2.3+), Kafka, Sendmail.
  • When the IP-ish domain name is made available in the pod, the problem with Spark is gone. But this A record is not part of /etc/hosts and thus not part of the result returned by gethostname().
  • The problem is not Spark-specific. In general it is not necessary for a hostname to be resolvable, but in K8S a hostname is never resolvable by asking CoreDNS. See also the comments on K8S issue 4825.

Final Thoughts

Manually adding the IP-ish hostname to a pod's /etc/hosts, right between the IP and the pod's "default" hostname, does the trick: from 10.1.2.3 driverjob to 10.1.2.3 10-1-2-3.namespace.pod.clustername.local driverjob. While editing the hosts file by hand is not an option, maybe a (future) K8S pod setting could make the IP-ish hostname be written to the hosts file?

Alternatively, adding the pod's current hostname to DNS so that it is resolvable by all (other) pods may be a solution (but then we would have to add search paths to the DNS config, like spec.dnsConfig.searches, and this is not dynamic "enough").
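For illustration, a sketch of the search-path variant using spec.dnsConfig.searches; the pod name, namespace and cluster domain here are assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: driverjob                  # hypothetical job pod
  namespace: ops
spec:
  dnsConfig:
    searches:
    - ops.pod.cluster.local        # lets the dashed-IP name (e.g. 10-1-2-3) resolve without the full suffix
  containers:
  - name: main
    image: busybox
    command: ["sleep", "3600"]

This only helps with the IP-ish names, though; the pod name itself would still not resolve, which is exactly the gap described here.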

Alternatively, making a hostname -f call return an FQDN including pod name, namespace and cluster name, and adding this to K8S' DNS, could do the job.

Please don't be mad if I mixed something up or misunderstood concepts. This issue is very important to my company and we have already spent a lot of time searching for workarounds. Please ask for more details if necessary.
Thanks!

@whipit

whipit commented Jan 29, 2019

Is there any update on this issue?

@chrisohaver
Contributor

Can this issue be illustrated with a specific example? I'm not quite grasping the issue from the explanations above, after having read them many times over.

@realknorke

Can this issue be illustrated with a specific example? I'm not quite grasping the issue from the explanations above, after having read them many times over.

Hi Chris,
which aspect is unclear?
Problem in one sentence: You cannot resolve the hostname of a pod from within another pod.

Start two Pods. Let's assume the pods' names are pod1 and pod2. Exec into pod2 and do a "ping pod1". It does not work, because the hostname pod1 cannot be resolved from within pod2. That's the problem in a nutshell.

@chrisohaver
Contributor

In your use case, are the hostnames of the pods predictable in any way, or are they completely random? Can you give an actual real example, with real hostnames?

@realknorke

realknorke commented Jan 30, 2019

Sure @chrisohaver

Let's assume I have two Pods in a ReplicaSet:

apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: dnstest
  namespace: ops
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: dnstest
  template:
    metadata:
      labels:
        k8s-app: dnstest
    spec:
      containers:
      - name: dnstestpod
        image: opensuse/tumbleweed
        ports:
        - containerPort: 12345
        command: ["/bin/bash"]
        args: ["-c", "zypper -q install -y netcat-openbsd net-tools && hostname -A && hostname -I && grep $(hostname -i) /etc/hosts && while :; do netcat -lvp 12345; done"]

I start the RS/Pods

[~] -> kubectl get pods -o wide |grep dnstest
dnstest-sdsjz                          1/1     Running     0          4s      10.2.175.39    ip-192-168-20-102.eu-west-1.compute.internal   <none>
dnstest-tnclr                          1/1     Running     0          4s      10.2.155.240   ip-192-168-20-167.eu-west-1.compute.internal   <none>

Let's check STDOUT of one of the pods:

[knorke@eddie] 2019-01-30 9:54:50
[~] ->  kubectl logs dnstest-sdsjz |cat -n
     1
     2  The following 4 NEW packages are going to be installed:
     3    hostname libbsd0 net-tools netcat-openbsd
     4
     5  4 new packages to install.
     6  Overall download size: 214.7 KiB. Already cached: 0 B. After the operation, additional 535.9 KiB will be used.
     7  Continue? [y/n/...? shows all options] (y): y
     8  dnstest-sdsjz 
     9  10.2.175.39 
    10  10.2.175.39     dnstest-sdsjz
    11  Listening on [unknown] (family 0, port -1899267359)

The pod with name dnstest-sdsjz has hostname dnstest-sdsjz and IP 10.2.175.39.

Now let's exec into the second Pod and try to connect to the first one.

[knorke@eddie] 2019-01-30 9:58:46
[~] ->  kubectl exec -it dnstest-tnclr bash
dnstest-tnclr:/ # echo foobar |netcat -w0 10.2.175.39 12345
dnstest-tnclr:/ # echo foobar |netcat -w0 dnstest-sdsjz 12345
netcat: getaddrinfo for host "dnstest-sdsjz" port 12345: Name or service not known
dnstest-tnclr:/ # echo ip-ish hostname |netcat -w0 10-2-175-39.ops.pod.k8s.local 12345

Connecting via IP works. Connecting via the IP-ish hostname (manually generated by me using the K8S DNS documentation!) works, too. Connecting via the hostname dnstest-sdsjz does not work, because the hostname is not resolvable.

Log of the receiving pod (already known lines removed):

[knorke@eddie] 2019-01-30 10:03:45
[/ram] ->  kubectl logs dnstest-sdsjz |cat -n |tail -n7
    11  Listening on [unknown] (family 0, port -1899267359)
    12  Connection from ip-10-2-155-240.eu-west-1.compute.internal 40492 received!
    13  foobar
    14  Listening on [unknown] (family 0, port 130939617)
    15  Connection from ip-10-2-155-240.eu-west-1.compute.internal 42842 received!
    16  ip-ish hostname
    17  Listening on [unknown] (family 0, port 1635764961)

Problem: hostname dnstest-sdsjz is not DNS-resolvable from within Pod dnstest-tnclr. Or, more generally: you cannot resolve a pod's hostname from within another pod.

Possible fix/workaround (yet to be implemented): Make the IP-ish hostname, WHICH IS ALWAYS RESOLVABLE, part of /etc/hosts, like this:

[knorke@eddie] 2019-01-30 10:17:28
[~] ->  kubectl exec -it dnstest-sdsjz bash
dnstest-sdsjz:/ # cat -n /etc/hosts
     1  # Kubernetes-managed hosts file.
     2  127.0.0.1       localhost
     3  ::1     localhost ip6-localhost ip6-loopback
     4  fe00::0 ip6-localnet
     5  fe00::0 ip6-mcastprefix
     6  fe00::1 ip6-allnodes
     7  fe00::2 ip6-allrouters
     8  10.2.175.39     dnstest-sdsjz
dnstest-sdsjz:/ # sed 's/^10.2.175.39.*/10.2.175.39 10-2-175-39.ops.pod.k8s.local dnstest-sdsjz/' /etc/hosts >newhosts && cat newhosts >/etc/hosts
dnstest-sdsjz:/ # cat -n /etc/hosts
     1  # Kubernetes-managed hosts file.
     2  127.0.0.1       localhost
     3  ::1     localhost ip6-localhost ip6-loopback
     4  fe00::0 ip6-localnet
     5  fe00::0 ip6-mcastprefix
     6  fe00::1 ip6-allnodes
     7  fe00::2 ip6-allrouters
     8  10.2.175.39 10-2-175-39.ops.pod.k8s.local dnstest-sdsjz
dnstest-sdsjz:/ # hostname -f
10-2-175-39.ops.pod.k8s.local
dnstest-sdsjz:/ # hostname -a
dnstest-sdsjz
dnstest-sdsjz:/ # hostname -A
10-2-175-39.ops.pod.k8s.local 

Please note the difference in the result of calling hostname -A!

Result achieved by this workaround: the pod can now retrieve its globally resolvable hostname by asking the OS's DNS subsystem (e.g. the hostname command or any other form of getaddrinfo()).

PS: The IP-ish hostname may not be the only way (or the best way) to make a Pod's hostname resolvable. Adding a Pod's hostname to the K8S DNS server could also do the trick, as discussed above.

@chrisohaver
Contributor

Thanks, is there a need to be able to connect to a specific pod in a replica set? Or would connecting to any pod in the replica set be ok?

@chrisohaver
Contributor

If connecting to any pod in the replica set is OK, then you could do this with Service selection and the CoreDNS rewrite plugin.

CoreDNS rewrite plugin: to re-write any request for dnstest-****.ops.svc.cluster.local. to dnstest.ops.svc.cluster.local..
K8s service: named dnstest that selects the pods in the replicaset.

When a Pod in the namespace ops does a lookup of dnstest-wxyz, it will first try the first search domain, resulting in a query for dnstest-wxyz.ops.svc.cluster.local.. CoreDNS would then rewrite this request to dnstest.ops.svc.cluster.local. and return the Service's cluster IP. When connecting to that IP, k8s then load-balances randomly to one of the Pods in the replica set.
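A sketch of what that rewrite could look like in the Corefile; the zone and the regex are assumptions for this example:

.:53 {
    # rewrite any dnstest-<suffix> lookup in the ops namespace to the dnstest Service name
    rewrite name regex dnstest-.*\.ops\.svc\.cluster\.local dnstest.ops.svc.cluster.local
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
}

Note that strict clients may also need the answer rewritten back to the name they asked for; the rewrite plugin has an answer name form for that.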

@realknorke

realknorke commented Jan 30, 2019

Sadly, it has to work for every container managed by K8S. In my particular case it has to work for Kubernetes Jobs. The idea of having a K8S Service for every Job sounds odd to me.

And for your first question: it's not ReplicaSet-specific. I used an RS in my example just to easily start two demo pods. In real life I have a lot of K8S Jobs (called drivers in the Spark world), and it is important to be able to connect to open TCP ports within these Job pods. Not like anycast, but to a specific one. Thus a rewrite rule for (in my example) dnstest-* doesn't do the job (because there are many of them, only distinguishable by the whole hostname/pod name).

@chrisohaver
Contributor

chrisohaver commented Jan 30, 2019

You could create a headless service that selects all the jobs, and (as @johnbelamaric says above) use the endpoint_pod_names directive in CoreDNS.

If the service is named driversvc, this would create records that look like ...

driver-wxyz.driversvc.ops.svc.cluster.local. -> pod-ip address of driver-wxyz
driver-abcd.driversvc.ops.svc.cluster.local. -> pod-ip address of driver-abcd

A CoreDNS rewrite rule could then rewrite driver-****.ops.svc.cluster.local. to driver-****.driversvc.ops.svc.cluster.local..

e.g. when a pod in ops namespace queries for driver-wxyz, it appends .ops.svc.cluster.local., coredns receives driver-wxyz.ops.svc.cluster.local., rewrites it to driver-wxyz.driversvc.ops.svc.cluster.local., and returns the pod IP.
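A sketch of the headless-Service half of this setup; the label, port and namespace are assumptions, and the rewrite rule would follow the same pattern as the sketch above, just targeting driversvc:

apiVersion: v1
kind: Service
metadata:
  name: driversvc
  namespace: ops
spec:
  clusterIP: None            # headless: DNS returns the selected pod IPs directly
  selector:
    app: spark-driver        # assumed label carried by every driver Job pod
  ports:
  - port: 7078               # illustrative driver port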

@realknorke

Thanks for the advice. Please give me some time to try this out. At the moment we're not using CoreDNS but Kubernetes-DNS. I'll get back to you later with results.
Thanks for your help!

@rdsubhas

rdsubhas commented Feb 11, 2019

EDIT: This was resolved and alternative was found, please see next comment

We would like to chip in here as well. We're running Couchbase in StatefulSet mode.

  • A Couchbase StatefulSet pod boots up; it has a stable pod name (say couchbase-0)
  • It's not in a live/ready state yet
  • It needs to join the cluster to get to a ready state
  • It tries to join the cluster, but then, there is a procedure for every node when joining the cluster where other members acknowledge the existence of this hostname
  • Joining the cluster fails, hence readiness probe also fails

If we try to use any Service-based Identity, like <pod-name>.<headless-service-name>.<namespace>.svc.cluster.local – It doesn't work, because the service is backed by endpoints, and endpoints are created only for Ready pods.

Our core problem: we need a stable Pod DNS identity to make it healthy. But it looks like the only way to get a stable Pod DNS identity is by making it healthy first. A chicken-and-egg problem.

We tried looking around:

  • We have created a headless service. And we have found that inside the pod a /etc/hosts entry is created to sidestep this readiness problem. But it doesn't help if other nodes want to validate the newly joined member
  • We set the Pod's hostname and subdomain fields as per the Kubernetes documentation. It doesn't work. That documentation page is out of date with the Kubernetes DNS specification followed by CoreDNS, which does not support any .pod. naming. We have validated that it does not work with kubeadm and vanilla Kubernetes 1.12 clusters

Generally speaking, if we boot a VM on almost any cloud provider (including AWS and Google Cloud), the VM has a name and an IP that are not tied to any other component. It has an identity of its own. Yes, that's too expensive in a container world, but at least it should be possible for StatefulSet pods. Having a pod's DNS identity depend on the load balancer (the service name, a totally external resource), which in turn depends on the endpoints, which in turn depend on the pod's readiness, and then having an injected name to avoid this cycle (so that a pod can resolve itself while it's unhealthy, but others can't resolve it until it is healthy) is very confusing for operators. Having a stable Pod "name" for StatefulSets, but a correspondingly unstable DNS name tied to a service with health checks, is very confusing and restrictive.

One could indeed say that this is a problem of Couchbase, Spark, or Hadoop (I found Zookeeper reported somewhere, but I'm not sure if it's related). But if we're able to run them on VMs, then a naive point of view is: we should not need to change their source code or fix Couchbase clustering just because StatefulSet pods don't have a stable DNS identity. We're not complaining, we really love Kubernetes, and we're trying our best to run clustered applications on Kubernetes, but somehow this looks increasingly like a Kubernetes limitation rather than a problem of the underlying software (Couchbase/Spark/Hadoop/ZK).

If readiness probes were available at the Service level instead of the Pod level (again, not really a hack but a standard cloud load-balancer pattern; both AWS and Google Cloud allow setting probes on load balancers), we might have created a headless service with no health checks and used that to provide a stable DNS name that always points to the current running pod no matter what its state is. But for good reasons that's not possible in k8s, and probes are directly on the pods, so we're really stuck here.

At the very least, this looks to be a documentation bug in the service resolution page. At best, it would be great to have a solution for a stable DNS identity corresponding to a stable Pod identity.

I'm not sure if what I'm saying makes any sense, but we've hit this chain of issues multiple times when trying to run stateful clustered resources, hence thought of adding it here.

@MrHohn
Member

MrHohn commented Feb 11, 2019

If we try to use any Service-based Identity, like <pod-name>.<headless-service-name>.<namespace>.svc.cluster.local – It doesn't work, because the service is backed by endpoints, and endpoints are created only for Ready pods.

@rdsubhas Would Service.spec.publishNotReadyAddresses help with this specific use case (ref https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.13/#servicespec-v1-core)? It configures that DNS implementations must publish the notReadyAddresses of subsets for the Endpoints associated with the Service.

@rdsubhas

rdsubhas commented Feb 12, 2019

Hi @MrHohn this looks like it could do the trick, will check it out and update (and will check out what version it's available from) 👍 Thank you so much for the tip!

@rdsubhas

rdsubhas commented Feb 12, 2019

@MrHohn quick update: It works flawlessly, and just what we needed!

Now we're moving to bootstrap every StatefulSet with serviceName pointing to a Service with publishNotReadyAddresses: true, used only to give every stateful pod a stable DNS name by default. We then create separate services for client connectivity (so that clients don't connect to non-ready endpoints). So far it's working as planned, albeit a roundabout way to get stable pod DNS ;)
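A sketch of that pattern; all names, ports and images below are assumptions, not taken from our setup:

apiVersion: v1
kind: Service
metadata:
  name: couchbase-dns                # used only for stable per-pod DNS names
spec:
  clusterIP: None                    # headless
  publishNotReadyAddresses: true     # publish endpoints even for not-yet-ready pods
  selector:
    app: couchbase
  ports:
  - port: 8091
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: couchbase
spec:
  serviceName: couchbase-dns         # gives each pod couchbase-N.couchbase-dns.<namespace>.svc.cluster.local
  replicas: 3
  selector:
    matchLabels:
      app: couchbase
  template:
    metadata:
      labels:
        app: couchbase
    spec:
      containers:
      - name: couchbase
        image: couchbase:community   # illustrative image
        ports:
        - containerPort: 8091

A second, regular Service without publishNotReadyAddresses then carries client traffic, so clients only ever reach ready pods.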

Thanks for the heads up again! To follow up: what do you think about this docs page, which still mentions the deprecated .pod naming that doesn't work anymore: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pods – should we consider that a documentation issue and update it to reflect the current scenario or an alternative?

@MrHohn
Member

MrHohn commented Feb 12, 2019

@rdsubhas Thanks for pointing that out, I will see if we can update https://kubernetes.io/docs/concepts/services-networking/dns-pod-service to match up with the latest DNS specification.

@uhjh-zz

uhjh-zz commented May 3, 2019

I just ran into the same issue when deploying a Jupyter notebook onto a Kubernetes cluster and trying to execute Spark jobs against the cluster. The executor pods spin up, then error out because they cannot talk back to the notebook: they were given the hostname of the Jupyter notebook, which is a pod name.
So I exposed the Jupyter notebook deployment via a headless service with publishNotReadyAddresses=true and a service name equal to the pod name, and it worked, but damn.
edit: just tried without setting that field; it works fine for exposing deployments as well

@rporres

rporres commented Jun 25, 2019

We have created a small operator that watches for pods with a certain annotation and creates a headless service for them. It helps us as a workaround for this issue: https://github.com/src-d/k8s-pod-headless-service-operator

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 23, 2019
@realknorke

So everybody with (relatively) long-running services has found a workaround here. But what about the short-lived fire-and-forget jobs (e.g. Spark jobs)? You don't want (headless) services for those, right?

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 23, 2019
@realknorke

We patched the Kubernetes-Plugin of CoreDNS. Now it is possible to resolve a podname. If someone is interested just give me a shout.

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@novakov-alexey-zz

We patched the Kubernetes-Plugin of CoreDNS. Now it is possible to resolve a podname. If someone is interested just give me a shout.

@realknorke I am quite interested in your patch for CoreDNS. Could you share it here?

@realknorke

We patched the Kubernetes-Plugin of CoreDNS. Now it is possible to resolve a podname. If someone is interested just give me a shout.

@realknorke I am quite interested in your patch for CoreDNS. Could you share it here?

Can you read the fork: https://github.com/smartclip/coredns ?
When resolving a pod name (the hostname from within a pod), the patched K8S CoreDNS plugin asks Kubernetes for the IP of the given pod name.

@novakov-alexey-zz

@realknorke yes, I can read the fork.

Perhaps I misunderstood your patch logic. I am looking for the pod name, which does not have a dashed IP in it. Currently in my AKS cluster with CoreDNS, my StatefulSet pods are resolved as
10-244-1-81.<service>.<namespace>.svc.cluster.local, but I need them to stay as
<pod_name>.<service>.<namespace>.svc.cluster.local. I use a headless service in the StatefulSet.

@realknorke

realknorke commented Nov 28, 2019

The IP-ish hostname is always resolvable. What is not resolvable is the pod name (this is NOT the service name; I mean the hostnames with $random in them for ReplicaSets and Deployments, or with numbers for StatefulSets).

We need pod names to be resolvable because for some (micro-)services the application in the pod determines its hostname, transfers that hostname to some other instance, and that instance then wants to connect back to the first pod. That is not possible when the hostname (= pod name) is not resolvable.

Our patch solves exactly this.

@novakov-alexey-zz

novakov-alexey-zz commented Nov 28, 2019

Understood your case now.

In our situation, we want StatefulSet pod hostnames to remain as
<pod_name>-<index>.<service>.<namespace>.svc.cluster.local,
so that we can configure our services in advance: by knowing the domain name and the number of replicas, we can generate the full list of addresses for the entire StatefulSet. Having an IP-ish hostname which is resolvable (yes, it is) does not allow us to generate the required configuration in advance.
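For illustration, with a known StatefulSet name, headless service and replica count, the full address list can be generated ahead of time; all names here are assumptions:

# e.g. a StatefulSet "db" with 3 replicas behind the headless service "db-hs" in namespace "prod"
for i in 0 1 2; do
  echo "db-${i}.db-hs.prod.svc.cluster.local"
done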
