Verify correctness of service discovery for `metadata`, `metadata.google.internal`, etc. on GKE and AWS
#392
This, #62, #366, and #384 all make me think that we should do DNS resolution within the proxy, and not within the Destination service. Specifically, I think we need to do something like this: when we see a partially-qualified name like "metadata" or "metadata.$namespace", we use DNS to resolve both that name and "metadata.$namespace.svc.$zone". If and only if they resolve to the same IP address, we subscribe to the name using the control plane's Destination service. Otherwise, we use the IP addresses in the DNS response received (and probably skip our internal load balancing).
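The decision described above can be sketched roughly as follows. This is only an illustration, not the proxy's actual code: `choose_discovery`, the `Resolver` type, and the `"destination"`/`"dns"` labels are hypothetical names, and comparing the full sets of returned addresses is one assumed reading of "resolve to the same IP address". The resolver is passed in so the logic can be exercised without real DNS.

```python
from typing import Callable, FrozenSet, Tuple

# Hypothetical resolver type: maps a name to the set of IPs it resolves to,
# or to an empty set if resolution fails.
Resolver = Callable[[str], FrozenSet[str]]


def choose_discovery(
    name: str, namespace: str, zone: str, resolve: Resolver
) -> Tuple[str, FrozenSet[str]]:
    """Return ("destination", addrs) or ("dns", addrs) for a requested name."""
    labels = name.split(".")
    if len(labels) > 2:
        # Not of the form "svc" or "svc.namespace": just trust DNS.
        return ("dns", resolve(name))

    service = labels[0]
    ns = labels[1] if len(labels) == 2 else namespace
    fqdn = f"{service}.{ns}.svc.{zone}"

    as_given = resolve(name)
    as_service = resolve(fqdn)

    # Subscribe via the Destination service only if both lookups agree.
    if as_given and as_given == as_service:
        return ("destination", as_service)
    # Otherwise use the addresses DNS returned for the name as written.
    return ("dns", as_given)
```

In practice the `resolve` argument could wrap something like `socket.getaddrinfo`, run from inside the proxy so that the pod's own search path and node-local answers are used.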
The GCE documentation for the metadata service is https://cloud.google.com/compute/docs/storing-retrieving-metadata#querying. The documentation for AWS is at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html#instancedata-data-retrieval. The Stack Overflow answer at https://stackoverflow.com/a/42315582 is very helpful for understanding that our service discovery logic probably needs to treat link-local addresses specially.
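As a small illustration of the link-local point: both metadata services are reached at 169.254.169.254, which falls in the IPv4 link-local range 169.254.0.0/16. A minimal sketch of detecting such addresses, using Python's standard `ipaddress` module (the helper name `is_link_local` is hypothetical, and what the proxy should do once it detects one is left open here):

```python
import ipaddress


def is_link_local(addr: str) -> bool:
    """True if addr is an IPv4 (169.254.0.0/16) or IPv6 (fe80::/10) link-local address."""
    return ipaddress.ip_address(addr).is_link_local
```

A link-local answer is only meaningful on the node that received it, so one plausible policy is to never forward such names to a remote discovery service.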
In Conduit 0.3, "metadata" won't resolve to the right thing (it will resolve as if it were "metadata.$namespace.svc"), but "metadata.google.internal" should work correctly. I'll tentatively propose that we make "metadata" (IIUC, the deprecated legacy name for "metadata.google.internal") work in 0.4.
It would be good to get the following information from a shell in a conduit-proxy container in a GKE cluster:

    $ cat /etc/resolv.conf
    $ dig +showsearch metadata
Running in the myorg Google Cloud organization:
For what it's worth, this is not fixed on master (37434d0) yet. The following request times out:
whereas the FQDN works properly:
I think we basically understand how we want to do name resolution/discovery to resolve this sort of issue. I'll open a new issue to start documenting exactly what needs to change.
As of 0.4.1:
I believe this is resolved now?
See kubernetes/kubernetes#8512 (comment):
If that is still true, then we can't safely do DNS resolution for these hostnames from the controller's Destination service's pods, because the metadata returned would be the metadata intended for the Destination service's node, not the node that the proxied pod is running on.
See also kubernetes/kubernetes#8867.
/cc @olix0r @adleong