
fix: Use a FQDN for metadata.google.internal #117

Merged · 10 commits · Dec 11, 2018

Conversation

@crwilcox (Contributor) commented Oct 31, 2018

Minimize lookups by not expanding the search path when resolving metadata.google.internal.

A customer found that about 85 DNS requests were made at init of packages relying on gcp-metadata. Upon investigation, the issue was that we were triggering search-path expansion during DNS resolution. After this change, about 21 requests are made.
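For context, a minimal sketch of the idea (illustrative, not the actual diff): appending a trailing dot makes the hostname fully qualified, so the OS resolver skips search-path expansion entirely.

// The trailing dot marks the name as fully qualified, so the resolver
// will not append any search-path domains before looking it up.
const HOST = 'metadata.google.internal.';
const url = `http://${HOST}/computeMetadata/v1/instance/id`;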

@googlebot added the cla: yes label Oct 31, 2018
@JustinBeckwith added the do not merge label Oct 31, 2018
@wyardley commented Oct 31, 2018

It does result in slightly odd-looking requests, but I think it [appending the trailing dot] is the correct behavior; otherwise the resolver will first try metadata.google.internal prepended to every domain in the search path, e.g. (from tcpdump):

22:02:20.530118 IP 10.48.2.14.49662 > 10.51.240.10.53: 62251+ AAAA? metadata.google.internal.default.svc.cluster.local. (68)
22:02:20.530633 IP 10.48.2.14.52955 > 10.51.240.10.53: 49598+ A? metadata.google.internal.svc.cluster.local. (60)
22:02:20.530657 IP 10.48.2.14.52955 > 10.51.240.10.53: 49950+ AAAA? metadata.google.internal.svc.cluster.local. (60)
22:02:20.530957 IP 10.48.2.14.40507 > 10.51.240.10.53: 45306+ A? metadata.google.internal.cluster.local. (56)
22:02:20.530984 IP 10.48.2.14.40507 > 10.51.240.10.53: 45635+ AAAA? metadata.google.internal.cluster.local. (56)
22:02:20.531301 IP 10.48.2.14.50796 > 10.51.240.10.53: 62724+ A? metadata.google.internal.c.project-name.internal. (67)
22:02:20.531330 IP 10.48.2.14.50796 > 10.51.240.10.53: 63148+ AAAA? metadata.google.internal.c.project-name.internal. (67)
22:02:20.535246 IP 10.48.2.14.53737 > 10.51.240.10.53: 1695+ A? metadata.google.internal.google.internal. (58)
22:02:20.535284 IP 10.48.2.14.53737 > 10.51.240.10.53: 2075+ AAAA? metadata.google.internal.google.internal. (58)

The quad-A (AAAA) lookups also multiply the total number of lookups. Ideally, the result could be resolved once and then cached in memory, but this should be a big improvement.

In normal cases, this might not be so bad, but lots of apps on lots of pods running on GKE, with lots of domains in the (default, not trivially changeable) search path, results in a pretty heavy load on kube-dns / dnsmasq, which as of now has no caching layer.

@JustinBeckwith (Contributor) commented

Thanks for the patience, folks. Full disclosure: we are waiting on a DNS lookup fix to land in Google Cloud Functions before we can merge this in.

@JustinBeckwith (Contributor) left a comment

Need to wait for GCF fix to land

@JustinBeckwith added the kokoro:force-run label and removed the do not merge and kokoro:force-run labels Nov 19, 2018
@kokoro-team removed the kokoro:force-run label Nov 19, 2018
@JustinBeckwith (Contributor) commented

@crwilcox @ofrobots the system test did exactly what we wanted it to! 💃 I've never been so excited to see a failing test.

@stephenplusplus (Contributor) commented

We did this elsewhere: googleapis/google-cloud-node#2214, and we had to revert it because of this: googleapis/google-cloud-node#2249.

@ofrobots (Contributor) commented

After reviewing the above, I am forced to conclude that software is terrible. 😭

Let's just use an IP address and avoid DNS resolution altogether. I believe the Python client libs already use IPs.

@JustinBeckwith JustinBeckwith changed the title Use a FQDN for metadata.google.internal fix: Use IP address instead of metadata.google.internal Nov 21, 2018
@JustinBeckwith (Contributor) left a comment

Handing this over to someone else to approve :)

@ofrobots (Contributor) left a comment

This is going to be a bit more complicated, unfortunately. I should have thought of this earlier, but given the test:

const request = require('request');

const IP = '169.254.169.254';

const host = process.argv[2] || IP;
const url = `http://${host}/computeMetadata/v1/instance/id`;
console.log('requesting', url);
request(url, (err, res) => {
  // Only the failure mode matters for this experiment, so just dump the error.
  console.dir(err);
});

Consider the output:

time node ip.js metadata.google.internal
requesting http://metadata.google.internal/computeMetadata/v1/instance/id
{ Error: getaddrinfo ENOTFOUND metadata.google.internal metadata.google.internal:80
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:58:26)
  errno: 'ENOTFOUND',
  code: 'ENOTFOUND',
  syscall: 'getaddrinfo',
  hostname: 'metadata.google.internal',
  host: 'metadata.google.internal',
  port: 80 }
node ip.js metadata.google.internal  0.22s user 0.05s system 97% cpu 0.272 total

~/tmp/foo
❯ time node ip.js
requesting http://169.254.169.254/computeMetadata/v1/instance/id
{ Error: connect ETIMEDOUT 169.254.169.254:80
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1117:14)
  errno: 'ETIMEDOUT',
  code: 'ETIMEDOUT',
  syscall: 'connect',
  address: '169.254.169.254',
  port: 80 }
node ip.js  0.20s user 0.05s system 0% cpu 1:15.74 total

The good thing about the DNS lookup was that it failed before being subject to the OS-controlled TCP timeouts, not to mention retries.

@ofrobots (Contributor) commented

We need another fast way to detect if an environment is GCP before attempting the TCP connection.

@ofrobots (Contributor) commented Nov 21, 2018

The solution will need to achieve these attributes:

  • We cannot use a trailing dot, because software is terrible (we get Java IllegalArgumentExceptions on Android, as per "@google-cloud/storage file getSignedUrl returns trailing dot in URL", googleapis/google-cloud-node#2249).
  • Regardless of whether we are running on GCP or not, we should be able to discover the presence (or absence) of the metadata service fast (~seconds, ideally milliseconds).
  • We should not make repeated requests to resolve addresses (i.e. use DNS caching).

Concurrently:

  1. Try to resolve metadata.google.internal (no trailing dot).
  2. Open a TCP socket to 169.254.169.254. If the socket hasn't connected in N seconds, treat this as a failure to connect. N is a small number, say 5. (Is there an SLA for the metadata service response time?)

Whichever operation finishes first, cache the result as the hostname of the metadata service and use that going forward.

When running on GCP, step 2 will finish in milliseconds. When running off GCP, step 2 will fail, so the operation latency will be dependent on the first check passing. This adds no additional latency to our current behavior in the off-GCP environment.
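A minimal sketch of that race (hypothetical helper names, not this library's actual code): resolve the hostname and connect to the IP concurrently; the first success wins and is cached for every later request.

const dns = require('dns');
const net = require('net');

const HOST = 'metadata.google.internal';
const IP = '169.254.169.254';

let cachedHost; // pay the detection cost only once per process

function resolveHost() {
  return new Promise((resolve, reject) => {
    // dns.lookup goes through getaddrinfo, so the search path applies here.
    dns.lookup(HOST, err => (err ? reject(err) : resolve(HOST)));
  });
}

function connectToIp(timeoutMs) {
  return new Promise((resolve, reject) => {
    const socket = net.connect(80, IP);
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => { socket.destroy(); resolve(IP); });
    socket.once('timeout', () => { socket.destroy(); reject(new Error('connect timed out')); });
    socket.once('error', reject);
  });
}

// Promise.race would reject as soon as either attempt fails, so take the
// first *success* instead, and fail only when both attempts have failed.
function firstSuccess(promises) {
  return new Promise((resolve, reject) => {
    let failures = 0;
    promises.forEach(p => p.then(resolve, () => {
      if (++failures === promises.length) reject(new Error('no metadata server detected'));
    }));
  });
}

async function getMetadataHost() {
  if (!cachedHost) {
    cachedHost = await firstSuccess([resolveHost(), connectToIp(5000)]);
  }
  return cachedHost;
}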

WDYT?

@theacodes commented

Is there an SLA for the metadata service response time?

The metadata service has absolutely no SLA.

@wyardley commented

Try to resolve metadata.google.internal (no trailing dot).

Resolving it without the trailing dot will result in N× the number of DNS requests, one set for each entry in the OS resolver library's search path, before trying it as a FQDN. My suspicion is that resolving it should work on all platforms; it's actually using it in the Host: header that's probably causing a problem.
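To illustrate the distinction, a hypothetical sketch (not a tested or proposed fix): connect using the dotted FQDN, but present the undotted name in the Host header, since the header is all the remote software ever sees.

const http = require('http');

const req = http.get({
  host: 'metadata.google.internal.', // used only for DNS resolution
  path: '/computeMetadata/v1/instance/id',
  headers: {
    Host: 'metadata.google.internal', // what the server actually sees
    'Metadata-Flavor': 'Google',
  },
}, res => {
  res.resume();
  console.log('status:', res.statusCode);
});
req.on('error', console.error);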

@ofrobots (Contributor) commented

Actually, as worded, my strategy doesn't depend on the SLA of the metadata service at all (any more than it does today). The concurrent direct connection to the IP is merely a fast path.

@wyardley

Resolving it without the trailing dot will result in Nx # of DNS requests for each entry in the OS’s resolver library search path before trying it as a FQDN

In GCP environments, the IP connect will finish first, which means that these N DNS queries will be initiated but the result will be inconsequential.

@wyardley commented Nov 21, 2018

@ofrobots in GKE (where there are ~6 domains in the search path and the resolver makes both A and quad-A lookups), this will hit kube-dns heavily, which is the whole thing this is trying to fix, so I wouldn't say it's inconsequential. Also, wouldn't deciding too quickly that we're not in GCP potentially cause auth failures?

If it's only for the isGCP check, the number of lookups shouldn't matter as much; and if it's just a DNS lookup, rather than an HTTP request, the trailing dot shouldn't cause problems.

@ofrobots (Contributor) commented

Also, wouldn’t deciding too quickly that we’re not in GCP potentially could cause auth failures, no?

We don't decide that we are not on GCP until both the IP connect and the DNS lookups fail, so no: from an auth point of view, my proposal is the same as the current behavior.

in GKE (where there are ~ 6 domains in the search path and the resolver makes both A and quad-A lookups), this will hit kube-dns heavily which is the whole thing this is trying to fix, so I wouldn’t say it’s inconsequential.

To make sure I understand: is it the 6 lookups that are the problem, or the fact that there was no cache of the result?

My suspicion is that resolving it should work on all platforms; it’s actually using it in the Host: header that’s probably causing a problem.

I think we need more definitive information here. Anyone with Java chops willing to try OkHttp (possibly on Android)?

@wyardley commented

I think it'll be 12-14 lookups for each request or so, because there are both the A and quad-A lookups. @crwilcox has a POC I threw together that generates about 80 or so DNS lookups by connecting to a handful of Google services in GKE using the official libs.

Yes, once GKE rolls out better caching in front of dnsmasq / kube-dns, this should be less of a problem. I don't think that will be rolled out 'til sometime next year?

@ofrobots (Contributor) commented

I'm also proposing we cache the resolution result locally in this module – that is, this library will try to resolve metadata.google.internal only once (which will generate N DNS requests based on the DNS search path). However, once resolved, we would not see any more lookups regardless of whether kube-dns has caching. Would that address your concerns?

@ofrobots (Contributor) commented

As a further optimization, perhaps we can limit to A lookups only.
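A quick sketch of that (assuming Node's dns module; not this library's actual code):

const dns = require('dns');

// Asking getaddrinfo for IPv4 only avoids the parallel AAAA queries.
dns.lookup('metadata.google.internal', { family: 4 }, (err, address) => {
  if (err) throw err;
  console.log('A record:', address);
});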

@wyardley commented

Yes, resolving it once should definitely be a massive improvement, and would also be cleaner than hard-coding the IP.

I’m pretty confident, though, that if you’re using that approach, it should be safe (and most correct) to use the trailing dot as well.

@ofrobots (Contributor) commented

My suspicion is that resolving it should work on all platforms; it’s actually using it in the Host: header that’s probably causing a problem.

Staring at the Java exception a bit more, I think you're right. The error is an SSL handshake error, which implies that the DNS lookup was already done. It would be good to get something more definitive, though.

@JustinBeckwith JustinBeckwith changed the title fix: Use IP address instead of metadata.google.internal fix: Use a FQDN for metadata.google.internal Nov 26, 2018
@wyardley commented Nov 30, 2018

Some really interesting points are made in these links, pointed out to me by Google support:

nodejs/node#15780 (comment)
which then links to:
https://github.com/gliderlabs/docker-alpine/blob/master/docs/caveats.md
https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html

These discuss some of the finer points of why this can cause such headaches with GKE, especially with Alpine base images. I'm not sure if there's been any discussion about changing options ndots:5 to 1 or 2 in the GKE nodes' /etc/resolv.conf, but my quick read is that this would tend to mitigate some of these problems.

For me, adding this to the Kube deployment spec "fixes" the problem (with ndots: 2, a name containing two or more dots, like metadata.google.internal, is tried as an absolute name before the search path is applied):

      dnsConfig:
        options:
          - name: ndots
            value: "2"

@JustinBeckwith (Contributor) commented

It looks like this actually got fixed in GCF! @ofrobots mind taking another look?

@ofrobots (Contributor) left a comment

This change LGTM, as it will not cause googleapis/google-cloud-node#2249 to come back: that issue is related to service endpoint hostnames being used with the trailing dot in the headers.

In other libraries, I expect we will want to add a trailing dot too, but that needs to be done with more care.

@kalyanac commented Jun 6, 2019

We need another fast way to detect if an environment is GCP before attempting the TCP connection.

FWIW, here is how Go handles it:
https://github.com/googleapis/google-cloud-go/blob/master/compute/metadata/metadata.go#L119
