Reverse lookup for pods / resolve a podname to IP address #3888

Closed
omitrowski opened this issue May 14, 2020 · 23 comments

Comments

@omitrowski

omitrowski commented May 14, 2020

What would you like to be added:
I would like to be able to do a reverse lookup for pods. I need the IP address for a given pod name. The "worker" pod reports its hostname to the "master" pod. The "master" pod is then unable to reach the "worker" pod, because it cannot resolve that hostname to an IP address. All would be fine if I could define the hostname/IP address via ENV variables. The problem is that there are applications (especially Apache/Java ones) that ignore environment variables for hostnames and IP addresses and insist on doing a reverse lookup. I found a corresponding request already filed against Kubernetes (see kubernetes/dns#266):

We have an increasing problem with Apache Hadoop-like services like Spark, Flink and co. These are trying to communicate via their hostnames in the cluster instead of their IPs. So they look up their own hostname and hence come up with the unresolvable podname. We see now two incomplete solutions:
a) Pod A records are created in a format like 1-2-3-4.namespace.pod.cluster.local. But the pod itself cannot be spec’ed to use this A record as its own hostname.
b) One can use hostname and subdomain together with a headless service in order to create an FQDN in KubeDNS and in the pod's hostname, but this requires static hostnames and won’t work with ReplicaSets or DaemonSets.
We are looking for a complete solution, e.g. by optionally switching the pod hostname to its KubeDNS A record, or by injecting the A record as the first entry in /etc/hosts. The pods most heavily affected by this issue are Jobs. In the Spark context, these are driver pods / jobs. But the problem is not limited to Spark. We see the same effects all over recent Apache projects. E.g. Flink has a similar issue.
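
For illustration, the IP-based naming from option a) can be sketched in Go. This is a minimal sketch, not CoreDNS code: the function name `podARecordName` and its signature are assumptions made up for this example, and only IPv4 is handled.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// podARecordName builds the IP-based pod name quoted above, e.g.
// 1-2-3-4.namespace.pod.cluster.local. The name and signature are
// illustrative assumptions, not part of CoreDNS or kube-dns.
func podARecordName(ip, namespace, zone string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", fmt.Errorf("parse pod IP %q: %w", ip, err)
	}
	if !addr.Is4() {
		return "", fmt.Errorf("only IPv4 is handled in this sketch, got %q", ip)
	}
	// Replace the dots of the address with dashes to form the first label.
	dashed := strings.ReplaceAll(addr.String(), ".", "-")
	return fmt.Sprintf("%s.%s.pod.%s", dashed, namespace, strings.TrimSuffix(zone, ".")), nil
}
```

Calling `podARecordName("1.2.3.4", "namespace", "cluster.local")` yields the name from the quote. The name is derived purely from the IP, which is why a pod that only knows its own hostname cannot use it.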

I also found a similar request in #2409, which points to a "workaround" that uses a headless service.

Why is this needed:
Because we are using applications that do not need a service but do need A records for other pods running in the cluster. I don't want to create objects in my kubernetes cluster that serve no purpose. Adding a service or headless service for applications that do not need one is a workaround. CoreDNS should be able to serve A records for all pods regardless of endpoints, services or anything else. This is exactly what an ordinary DNS server on the internet does.

The current documentation states (see https://github.com/coredns/coredns/blob/master/plugin/kubernetes/README.md):

.... It will handle all queries in that zone and connect to Kubernetes in-cluster. It will not provide PTR records for services or A records for pods.

@chrisohaver
Member

I think since this acts in the same domain as the existing k8s dns based service discovery spec, this kind of change should be proposed for the spec first. Then if approved, we would update coredns to meet the spec.

One problem with this as a spec requirement is that implementing it requires a watch on all pods, which is a significantly heavier burden on the k8s API than just watching service endpoints. see kubernetes/dns#266 (comment)

@omitrowski
Author

omitrowski commented May 14, 2020

@chrisohaver thank you for your advice! What do you think about a solution like this one I found on GitHub? master...smartclip:feature/resolve-pods
It looks like it works as expected for me and is even configurable. But I would have to build from a third-party branch. In your opinion, does this change have a chance of being merged?

@chrisohaver
Member

I didn't dwell on it for too long, but that clip seems to collide with the existing spec domain scheme, for pods named pod or svc. It also appears to not handle namespaces, so pods in different namespaces with the same name would collide with each other.

@chrisohaver
Member

To clarify, are you only requesting that we be able to do reverse lookups on the existing Pod A records we already provide? Or are you also requesting pod lookups by Pod name (i.e. Add new A records for Pods)?

@chrisohaver
Member

If the existing pod A records work for you, and all you are looking for are reverse lookups for those records, ... and you only need this to work within one known namespace, you could use the template plugin to build the PTR records for you ...
https://github.com/coredns/coredns/tree/master/plugin/template#resolve-aptr-for-example
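
As a rough illustration of the name mapping such a PTR rule would have to perform, here is a minimal Go sketch. It is not the template plugin's configuration or code: the function `ptrTargetForPod` is hypothetical, the namespace is passed in as a fixed value (matching the "one known namespace" restriction above), and the octets are not validated as numbers.

```go
package main

import (
	"fmt"
	"strings"
)

// ptrTargetForPod maps an IPv4 reverse-lookup name such as
// "4.3.2.1.in-addr.arpa." to the IP-based pod name
// "1-2-3-4.<namespace>.pod.<zone>.". Hypothetical helper for illustration.
func ptrTargetForPod(reverseName, namespace, zone string) (string, error) {
	const suffix = ".in-addr.arpa."
	if !strings.HasSuffix(reverseName, suffix) {
		return "", fmt.Errorf("%q is not an IPv4 reverse name", reverseName)
	}
	octets := strings.Split(strings.TrimSuffix(reverseName, suffix), ".")
	if len(octets) != 4 {
		return "", fmt.Errorf("%q does not contain four octets", reverseName)
	}
	// Reverse names list the octets last-to-first, so flip them back.
	for i, j := 0, len(octets)-1; i < j; i, j = i+1, j-1 {
		octets[i], octets[j] = octets[j], octets[i]
	}
	return fmt.Sprintf("%s.%s.pod.%s.",
		strings.Join(octets, "-"), namespace, strings.TrimSuffix(zone, ".")), nil
}
```

Because the namespace cannot be derived from the IP address alone, a purely pattern-based rule like this only works when the namespace is known in advance.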

@scrwr

scrwr commented May 14, 2020

Actually, because of the lack of interest in kubernetes/dns#266, we decided to create this patch ourselves. We have been running it stably in production for over a year now. Of course it is more expensive than service lookups, but since it lets so many clustered projects run on kubernetes without any tweaks, this is not just an acceptable trade-off for us but a no-brainer. Recent examples: Presto and Alluxio (the latter has an official workaround too, with emphasis on the term workaround).

@chrisohaver we decided to only resolve lookups of {podname}.{zone} and not if the domain contains the .svc subdomain. This is the reason for not considering the namespace so far, as the standard search path in Kubernetes only uses the namespace in combination with the svc subdomain. So we had to decide, whether to collide with the svc meaning or to potentially generate pod name collisions. I'm absolutely happy to discuss this approach.
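
The collision trade-off discussed here can be made concrete with a small Go sketch. Everything in it is an assumption for illustration: the `pod` struct is a stand-in for a Kubernetes Pod, and the `{podname}.{namespace}.{zone}` layout in the second function is not a scheme proposed in this thread.

```go
package main

import "fmt"

// pod is a hypothetical, minimal stand-in for a Kubernetes Pod object.
type pod struct {
	Namespace, Name, IP string
}

// indexByName keys records by pod name only, as in {podname}.{zone}.
// It also reports every name claimed by more than one pod.
func indexByName(pods []pod, zone string) (records map[string]string, collisions []string) {
	records = make(map[string]string)
	for _, p := range pods {
		key := fmt.Sprintf("%s.%s.", p.Name, zone)
		if _, taken := records[key]; taken {
			collisions = append(collisions, key)
			continue
		}
		records[key] = p.IP
	}
	return records, collisions
}

// indexByNamespacedName adds a namespace label (an illustrative layout only).
// Pod names are unique within a namespace, so these keys do not collide.
func indexByNamespacedName(pods []pod, zone string) map[string]string {
	records := make(map[string]string)
	for _, p := range pods {
		records[fmt.Sprintf("%s.%s.%s.", p.Name, p.Namespace, zone)] = p.IP
	}
	return records
}
```

Two pods named `worker-0` in different namespaces produce one collision in the first index and two distinct entries in the second. The second layout would still need care around namespaces named `pod` or `svc`, which is the other collision raised earlier in the thread.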

@omitrowski
Author

To clarify, are you only requesting that we be able to do reverse lookups on the existing Pod A records we already provide? Or are you also requesting pod lookups by Pod name (i.e. Add new A records for Pods)?

@chrisohaver I am requesting pod lookups by Pod name. I manage several k8s clusters for different customers on different cloud providers. As soon as software with a master-worker relationship is involved, resolving pod names to IPs becomes crucial. Given your concerns, and the apparently small number of people looking for a solution that is not a workaround, I am also very interested in discussing how to approach this topic. It is equally obvious to me that coredns is a much better solution than kubedns. Perhaps one could use pod annotations to advise coredns whether or not to add A/PTR records for pod names. That would make it easy to filter the pods that need special handling, and perhaps it could lower the impact on the API.
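
The annotation idea could look roughly like the following Go sketch. It is only an assumption of how such an opt-in might work: the `podMeta` type is a trimmed-down stand-in for pod metadata, and the annotation key `example.com/pod-name-dns` is invented for this example.

```go
package main

// podMeta is a hypothetical, trimmed-down view of a pod's metadata.
type podMeta struct {
	Namespace   string
	Name        string
	IP          string
	Annotations map[string]string
}

// dnsOptInAnnotation is an invented annotation key used only in this sketch.
const dnsOptInAnnotation = "example.com/pod-name-dns"

// podsWantingRecords returns the pods that opted in to name-based records
// and already have an IP address assigned.
func podsWantingRecords(pods []podMeta) []podMeta {
	var selected []podMeta
	for _, p := range pods {
		// Reading from a nil Annotations map is safe and yields "".
		if p.Annotations[dnsOptInAnnotation] == "true" && p.IP != "" {
			selected = append(selected, p)
		}
	}
	return selected
}
```

Note that this filters objects after they have been received. Kubernetes list/watch selectors operate on labels and fields rather than annotations, so an annotation check on its own would not shrink the watch itself.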

@chrisohaver
Member

When adding a feature to CoreDNS there are generally 3 options:

  1. extending an existing plugin, e.g. adding an option to it
  2. creating a new plugin in the coredns/coredns tree
  3. creating a new plugin as an external plugin

Creating an external plugin is always an option.

@omitrowski
Author

@chrisohaver got it, thank you!

@scrwr

scrwr commented May 15, 2020

As this functionality is just an extension of the existing kubernetes plugin, I would say options 2 and 3 mean a lot of code duplication. The question is what would be needed to get this into the existing plugin. We would be happy to provide this as a PR.

@johnbelamaric
Member

johnbelamaric commented May 15, 2020 via email

@chrisohaver
Member

Yeah - i've thought of this. E.g. having the api connection be a separate plugin in itself that all other k8s related plugins use ...

@miekg
Member

miekg commented May 15, 2020 via email

@scrwr

scrwr commented May 15, 2020

@johnbelamaric The schema explicitly talks only about service discovery. But we are talking about host/pod discovery here. Also, this implementation only enhances an already existing pod discovery mode, the one with IP-based hostnames.

Let's consider for a moment that the combination of random hostnames automatically assigned to pods in kubernetes and the missing resolution of those names confuses a lot of clustered applications (Spark, Flink, Presto, Alluxio, Akka, just to name a few). Shouldn't we try to address this on the resolution side or the assignment side instead of tweaking the system with additional services and search-path manipulations?

@chrisohaver
Member

The question is if the connection to the API server should/can be shared.

I think we'd rather share the k8s caches, not the api connection itself. We already share the caches in the existing k8s related plugins (autopath, k8s_external)

I've pondered splitting non-spec features out of the existing kubernetes plugin into a separate plugin, and also splitting out the api connection ... e.g.

  • k8s_watcher - The api connector. Connects to the k8s api, and watches objects registered by other plugins.
  • k8s_service_discovery - The K8s Spec. Provides the k8s dns service discovery spec only; registers namespace, service, and endpoints watches.
  • k8s_pods - Non-spec features. Registers a pod watch to provide non-spec pod-based features (e.g. pods verified pod A records).

Thus, a default k8s corefile would include the k8s_watcher and k8s_service_discovery plugins.
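
The registration idea behind this split can be sketched in Go as follows. This is a toy model under stated assumptions, not the design of any real plugin: the `watcher` type and its methods are hypothetical, and resource kinds are represented as plain strings.

```go
package main

import "sort"

// watcher is a hypothetical stand-in for the proposed k8s_watcher plugin:
// it would own the API connection and track which kinds were requested.
type watcher struct {
	kinds map[string][]string // resource kind -> plugins that registered it
}

func newWatcher() *watcher {
	return &watcher{kinds: make(map[string][]string)}
}

// register records that a plugin needs a watch on the given resource kinds.
func (w *watcher) register(plugin string, kinds ...string) {
	for _, k := range kinds {
		w.kinds[k] = append(w.kinds[k], plugin)
	}
}

// watched lists the kinds that would actually be watched, sorted by name.
func (w *watcher) watched() []string {
	out := make([]string, 0, len(w.kinds))
	for k := range w.kinds {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}
```

With only `k8s_service_discovery` registered, `watched()` contains namespaces, services and endpoints but no pods. The comparatively expensive pod watch appears only once something like `k8s_pods` registers it, which is the point of the proposed modularization.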

Splitting the api connection out into a separate plugin came to mind with the k8s_external plugin. k8s_external requires the kubernetes plugin to be enabled. But the kubernetes plugin also monitors endpoints, which is high cost, and k8s_external does not need them. Serendipitously, we happened to already have the noendpoints option, for unrelated reasons. But that got me thinking that other future plugins might want to "piggyback" on the k8s api connection without the need to monitor all the objects required to fulfill the k8s dns service discovery spec.

Perhaps it's over-modularizing.

@scrwr

scrwr commented May 20, 2020

I guess this concept would probably break backward compatibility, unless you maintained the kubernetes plugin in its current form in parallel with the several new plugins.

What we tried to achieve did not feel too alien, especially as I think that this piece (pod name resolution) is a real gap on the kubernetes side.

However, if you think this refactoring would be a huge benefit and is achievable, we would be more than happy to provide the pod name resolution PR against this new structure as well.

@omitrowski
Author

Hi @chrisohaver, any progress on this from your side? Could I help somehow?

@miekg
Member

miekg commented Aug 31, 2020

looks stale, closing soon

@chrisohaver
Member

chrisohaver commented Aug 31, 2020

any progress on this from your side?

I have made some progress here. I have split the kubernetes plugin into k8s_watcher and k8s_service_discovery plugins (as described above), and it's mostly working. A few small to-dos are left. Adding a k8s_pods plugin to work with the k8s_watcher would not require any changes to k8s_watcher or k8s_service_discovery.

@chrisohaver
Member

@omitrowski,

I've just shared https://github.com/chrisohaver/k8s_api - an external plugin that connects to the k8s api, and enables other plugins to register Kubernetes API watches with it. It includes an example of the coredns built-in kubernetes plugin refactored to use k8s_api to manage the API connection. In theory, one could write a new plugin that uses k8s_api to hook into the Pods index established by the refactored kubernetes plugin, and answers for pod names, falling-through to kubernetes to answer if nothing matches.
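
The "answer for pod names, otherwise fall through" behaviour can be sketched in Go like this. It is a simplified model and does not use the real CoreDNS plugin interface or the k8s_api plugin: the `resolver` interface, `podNames` and `staticResolver` are all hypothetical names for this example.

```go
package main

// resolver is a hypothetical, simplified handler: it returns an IP and true
// when it can answer a name. Real CoreDNS plugins use a richer interface.
type resolver interface {
	resolve(name string) (ip string, ok bool)
}

// podNames answers for pod names from an in-memory index.
type podNames struct {
	index map[string]string // fully qualified pod name -> IP
	next  resolver          // consulted when the index has no match
}

func (p podNames) resolve(name string) (string, bool) {
	if ip, ok := p.index[name]; ok {
		return ip, true
	}
	if p.next == nil {
		return "", false
	}
	// Nothing matched here, so fall through to the next resolver.
	return p.next.resolve(name)
}

// staticResolver stands in for the kubernetes plugin in this sketch.
type staticResolver map[string]string

func (s staticResolver) resolve(name string) (string, bool) {
	ip, ok := s[name]
	return ip, ok
}
```

Placing `podNames` in front of a `staticResolver` means pod names are answered first, while service names still reach the second resolver unchanged.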

@omitrowski
Author

@chrisohaver thank you very much! I guess this means that a solution for pod discovery in my case will have to come from a plugin and has no chance of becoming a core function?

@chrisohaver
Member

chrisohaver commented Sep 23, 2020

I guess this means that a solution for pod discovery in my case ... has no chance of becoming a core function?

IMO, Pod name lookups are not likely to become a core function of the Kubernetes DNS Service Discovery spec. The primary reason is that it would currently require CoreDNS to watch all Pods, which puts a high load on the Kubernetes API server at scale, profoundly degrading cluster performance. In the past, requests for adding features to Kubernetes DNS Service Discovery that would also require a watch on Pods have been rejected for this reason. The landscape could eventually change, but with the current state of things watching Pods isn't feasible at scale.

@omitrowski
Author

Okay, so for now I will have to stick with the fork for much longer than I had hoped ;(
