Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"context deadline exceeded" - Continuous delay when running with (broken?) DNS settings #289

Closed
fgsch opened this issue May 27, 2021 · 30 comments · Fixed by #824
Closed
Labels
bug Something isn't working

Comments

@fgsch
Copy link
Member

fgsch commented May 27, 2021

Version:

Fastly CLI version 0.30.0 (326882f)
Built with go version go1.16.4 linux/amd64

Description:

For some strange reason, when I have ipv6 enabled, updating the configuration fails in my box with a time out.
This causes a 5 seconds delay every time I run the CLI. This does not seem to happen if I build e.g. the v0.30.0 tag natively.
The output when this happens:

$ fastly -v

INFO:
Compatibility and versioning information for the Fastly CLI is being updated in the background.  The updated data will be used next time you execute a fastly command.

[..]

ERROR: there was a problem updating the versioning information for the Fastly CLI:

Get "...": context deadline exceeded.

[..]

When the update fails last_checked is not updated, so the next time the CLI is run the update process is triggered.
In my case, this means that every time I run the CLI I experience the delay as the update process keeps timing out.

Besides trying to find out why the update is timing out (I suspect it's both a network and a cross-compiling issue) it might be desirable to update the last_checked timestamp to avoid triggering the update continuously when there is a network issue.

@fgsch fgsch added the bug Something isn't working label May 27, 2021
@fgsch
Copy link
Member Author

fgsch commented May 27, 2021

On a related note, we might want to rephrase the error above (Get "...": context deadline exceeded) to something more descriptive.

@fgsch fgsch changed the title Continuous delay on running with (broken?) ipv6 Continuous delay when running with (broken?) ipv6 May 27, 2021
@fgsch
Copy link
Member Author

fgsch commented May 27, 2021

I can reproduce this issue outside the CLI so it looks like a cross compiling problem.

@fgsch
Copy link
Member Author

fgsch commented May 27, 2021

Everything indicates this is because cgo is disabled during cross compilation.

Quoting from https://golang.org/pkg/net/#hdr-Name_Resolution:

By default the pure Go resolver is used, because a blocked DNS request consumes only a goroutine, while a blocked C call consumes an operating system thread. When cgo is available, the cgo-based resolver is used instead under a variety of conditions: on systems that do not let programs make direct DNS requests (OS X), when the LOCALDOMAIN environment variable is present (even if empty), when the RES_OPTIONS or HOSTALIASES environment variable is non-empty, when the ASR_CONFIG environment variable is non-empty (OpenBSD only), when /etc/resolv.conf or /etc/nsswitch.conf specify the use of features that the Go resolver does not implement, and when the name being looked up ends in .local or is an mDNS name.

Running the binary built natively under macos and setting GODEBUG=netdns=go shows the same (incorrect) behaviour.

@peterbourgon
Copy link
Contributor

@fgsch

For some strange reason, when I have ipv6 enabled, updating the configuration fails in my box with a time out.

Can you confirm this is a function of the resolver used by the CLI, and not your system's IPv6 setup? What hostname/IP is the CLI trying to resolve? If you resolve that manually, does it work quickly?

@fgsch
Copy link
Member Author

fgsch commented May 27, 2021

@peterbourgon Yes, I can confirm it works using e.g. curl or running a binary built under macos from the same tag as the one released.

@peterbourgon
Copy link
Contributor

peterbourgon commented May 27, 2021

So, in summary, it appears that your system DNS resolver works correctly when you have IPv6 enabled, but Go's DNS resolver does not, and triggers this timeout?

To confirm (again 😉) could you try the following commands with the official version of the CLI? (The +1 adds DNS debug information, which may be useful.)

env GODEBUG=netdns=go+1  fastly -v  # should time out
env GODEBUG=netdns=cgo+1 fastly -v  # should be OK

I took those env vars from the package net docs, which also tell us that

[The pure Go resolver] sends DNS requests directly to the servers listed in /etc/resolv.conf . . . [The cgo-based resolver] calls C library routines such as getaddrinfo and getnameinfo.

So, if that ^^ does confirm the issue, it would suggest that the DNS servers in your /etc/resolv.conf aren't resolving IPv6 addresses correctly. The debug information provided by the above commands may be helpful in confirming.

If that's true, what should we do? It's possible to build the CLI in a way that forces the cgo-based resolver in all circumstances, but it's not obvious to me that that's the right solution.

@fgsch
Copy link
Member Author

fgsch commented May 27, 2021

The second command times out as well with the official version, I assume because the cross compiled version does not have the cgo codepath builtin - as with the first command it shows go package net: built with netgo build tag; using Go's DNS resolver.

If I try with a natively compiled version (which should include both the cgo and the go variants), this works as expected. The same with a test program cross compiled (added below).

So, if that ^^ does confirm the issue, it would suggest that the DNS servers in your /etc/resolv.conf aren't resolving IPv6 addresses correctly. The debug information provided by the above commands may be helpful in confirming.

Maybe, but then /etc/resolv.conf is not the preferred mechanism for macos, as stated at the top:

# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.

If that's true, what should we do? It's possible to build the CLI in a way that forces the cgo-based resolver in all circumstances, but it's not obvious to me that that's the right solution.

To me the right solution is to use whatever it works regardless of the OS or particular settings, in this case the native (cgo-based) resolution mechanism. After all this works for natively built binaries, and AFAICT it is a side effect of cross compiling the macos binary without CGO available.

@fgsch
Copy link
Member Author

fgsch commented May 27, 2021

Repro:

package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://v6.testmyipv6.com/", nil)
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp)
}

@peterbourgon
Copy link
Contributor

peterbourgon commented May 27, 2021

To me the right solution is to use whatever it works regardless of the OS or particular settings, in this case the native (cgo-based) resolution mechanism.

The fact that this is the preferred choice for you does not by itself mean it will be the preferred choice for everyone. It might mean that, but we don't have enough information right now to reach that conclusion. Is there any indication that many/most/all macOS systems won't be able to resolve IPv6 lookups to api.fastly.com correctly using the Go DNS resolver? It seems to me this is a function of your ISP and/or network settings, and not the OS itself. (I could be wrong!)

Maybe, but then /etc/resolv.conf is not the preferred mechanism for macos, as stated at the top:

That comment seems to be descriptive, not normative. It's not saying this file is or isn't preferred, it's saying most processes don't use it.

@peterbourgon
Copy link
Contributor

peterbourgon commented May 27, 2021

As another data point, my current ISP doesn't offer IPv6, but when I use my VPN, both pure-Go and CGO resolvers work fine.

package main

import (
        "context"
        "fmt"
        "log"
        "net/http"
        "time"
)

func main() {
        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
        defer cancel()
        req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://v6.testmyipv6.com/", nil)
        if err != nil {
                log.Fatal(err)
        }
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
                log.Fatal(err)
        }
        fmt.Println(resp.Status)
}
ugh ~ env GODEBUG=netdns=cgo+2 go run repro.go
      env GODEBUG=netdns=go+2  go run repro.go
go package net: using cgo DNS resolver
go package net: hostLookupOrder(v6.testmyipv6.com) = cgo
200 OK
go package net: GODEBUG setting forcing use of Go's resolver
go package net: hostLookupOrder(v6.testmyipv6.com) = files,dns
200 OK

@fgsch
Copy link
Member Author

fgsch commented May 28, 2021

The fact that this is the preferred choice for you does not by itself mean it will be the preferred choice for everyone. It might mean that, but we don't have enough information right now to reach that conclusion.

You make it sound like it's a personal preference. It is not.
Ensuring that the macOS binary is using the native resolver is by far the safest option to avoid any issues.

I'm really having a hard time understanding why you are so adamant about this.
But for argument's sake, let's discuss things further. So far we know that:

  • Happens with IPv6
  • Under macOS
  • With a cross compiled binary with CGO disabled
  • With different ISPs
  • With different routers
  • It does NOT happen with a natively built Go binary
  • It has NOT been seen with other (non Go based) binaries

When it comes to local network settings, under macOS these are quite limited AFAICT, and other than enabling or disabling IPv6, or changing your DNSs manually, there is no much you can do. My router is limited to these options as well. (In both cases I'm using the default / automatic settings). Beyond that it is the ISP, and it'd be hard to tell what is going on.

Is there any indication that many/most/all macOS systems won't be able to resolve IPv6 lookups to api.fastly.com correctly using the Go DNS resolver?

api.fastly.com does not have an IPv6 address :)

It seems to me this is a function of your ISP and/or network settings, and not the OS itself. (I could be wrong!)

Maybe but I have not encountered any issues with my ISP and/or network settings outside the CLI.
Another person with a different ISP has also seen this problem.

As another data point, my current ISP doesn't offer IPv6, but when I use my VPN, both pure-Go and CGO resolvers work fine.

Quoting my original comment "I suspect it's both a network and a cross-compiling issue".
Fixing either (or both) of these might resolve this problem. One options is under my direct control, the other is not.

@fgsch
Copy link
Member Author

fgsch commented May 28, 2021

With help from @peterbourgon we got to the bottom of the problem.

My access point is providing a non working primary nameserver and a working secondary one.

When the pure Go resolver is used, iterating through the list of nameservers and resolving the hostname takes more than 5 seconds (ConfigRequestTimeout value), so fetching the configuration fails.

When the CGO based (native) resolver is used, iterating through the list and resolving the hostname is much quicker (a few orders of magnitude faster than the pure Go implementation) and the CLI manages to complete the resolution and fetch within the allotted time.

@fgsch
Copy link
Member Author

fgsch commented May 28, 2021

For completion, in my box successfully resolving developer.fastly.com using the pure Go based resolver takes 5.04 seconds (best of 10 runs). With the native (CGO based) resolver it takes 0.01 seconds.

Code:

package main

import "net"

func main() {
	_, err := net.LookupHost("developer.fastly.com")
	if err != nil {
		panic(err)
	}
}

@ckaznocha
Copy link

+1 I'm hitting this issue as well. I've worked around it by hard coding the IPv4 address for developer.fastly.com in my /etc/hosts without having to recompile with CGO. It would appear to be a name resolution issue with IPv6.

@peterbourgon
Copy link
Contributor

peterbourgon commented Jun 1, 2021

@ckaznocha

It would appear to be a name resolution issue with IPv6.

As @fgsch revealed to me, developer.fastly.com does not have an IPv6 address. Oops! api.fastly.com, rather! My intuition is that this should not prevent it from being reachable unless you're on an exclusively IPv6 network or something, but (a) is that true? and (b) is that what you're doing?

@ckaznocha
Copy link

> dig AAAA developer.fastly.com                                                                                                                                                                                                        

; <<>> DiG 9.10.6 <<>> AAAA developer.fastly.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28204
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;developer.fastly.com.		IN	AAAA

;; ANSWER SECTION:
developer.fastly.com.	1800	IN	CNAME	devhub.map.fastly.net.
devhub.map.fastly.net.	30	IN	AAAA	2a04:4e42::313
devhub.map.fastly.net.	30	IN	AAAA	2a04:4e42:200::313
devhub.map.fastly.net.	30	IN	AAAA	2a04:4e42:400::313
devhub.map.fastly.net.	30	IN	AAAA	2a04:4e42:600::313

;; Query time: 27 msec
;; SERVER: 192.168.1.122#53(192.168.1.122)
;; WHEN: Tue Jun 01 14:22:53 PDT 2021
;; MSG SIZE  rcvd: 196

I have a dual stack, is anyone exclusively using IPv6?

wget is able to resolve and connect using IPv6

> wget "https://developer.fastly.com/api/internal/cli-config"

--2021-06-01 14:26:36--  https://developer.fastly.com/api/internal/cli-config
Resolving developer.fastly.com (developer.fastly.com)... 2a04:4e42::313, 2a04:4e42:200::313, 2a04:4e42:400::313, ...
Connecting to developer.fastly.com (developer.fastly.com)|2a04:4e42::313|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 855 [application/toml]
Saving to: ‘cli-config’

cli-config                  100%[==========================================>]     855  --.-KB/s    in 0s

2021-06-01 14:26:37 (22.6 MB/s) - ‘cli-config’ saved [855/855]

@peterbourgon
Copy link
Contributor

peterbourgon commented Jun 1, 2021

@ckaznocha What are the entries in your /etc/resolv.conf, and what happens if you query them directly with dig?

So far "it works with X but fails with Y" hasn't been high-value signal because it's at least one degree removed from the actual issue: the pure Go resolver queries the nameservers in your /etc/resolv.conf, so if they don't work then you have a problem. But they should work! (Right?)

@fgsch
Copy link
Member Author

fgsch commented Jun 1, 2021

As @fgsch revealed to me, developer.fastly.com does not have an IPv6 address.

Just to be clear, the one that does not have an IPv6 address is api.fastly.com.

@dkegel-fastly
Copy link
Collaborator

Ran into this repeatably here.

Disabling ipv6 works around it.

Happy to try any experiments.

@dkegel-fastly
Copy link
Collaborator

dkegel-fastly commented Jun 18, 2021

These work:

dig example.com
dig -4 example.com
dig @8.8.8.8 -6 example.com

These hang:

dig -6 example.com
dig @2001:4860:4860::8888 -6 example.com
dig @2603:8000:af01:e00:921a:caff:fe6a:5217 -6 example.com

/etc/resolv.conf contains

nameserver 2603:8000:af01:e00:921a:caff:fe6a:5217
nameserver 192.168.1.1

ifconfig says:

Dans-MacBook-Pro:sigsci-edge dkegel$ ifconfig en0
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=400<CHANNEL_IO>
        ether 14:7d:da:4c:7a:b7 
        inet 192.168.1.104 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::819:b02c:783a:4f20%en0 prefixlen 64 secured scopeid 0x6 
        inet6 2603:8000:af01:ef0:1445:3b0f:376c:2056 prefixlen 64 autoconf secured 
        inet6 2603:8000:af01:ef0:d49e:e3c3:29b:f2b8 prefixlen 64 autoconf temporary 
        nd6 options=201<PERFORMNUD,DAD>
        media: autoselect
        status: active

@peterbourgon
Copy link
Contributor

Based on that data, it seems that 2603:8000:af01:e00:921a:caff:fe6a:5217 is not actually a working nameserver. @dkegel-fastly Do you agree? Do you know what put it there? What is the output of this:

dig @2603:8000:af01:e00:921a:caff:fe6a:5217 example.com
dig @2603:8000:af01:e00:921a:caff:fe6a:5217 api.fastly.com

@dkegel-fastly
Copy link
Collaborator

So, it seems my router is advertising itself as a DNS server; that's its address. Will investigate further...

@kailan kailan changed the title Continuous delay when running with (broken?) ipv6 "context deadline exceeded" - Continuous delay when running with (broken?) ipv6 Aug 6, 2021
@kailan kailan pinned this issue Aug 6, 2021
@dkegel-fastly
Copy link
Collaborator

Still running into this here, yecch.

Since the broken DNS is from a large ISP that serves many people, perhaps we should add a check
to make sure ipv6 dns is operational... and if not, force ipv4 somehow?

@fgsch fgsch changed the title "context deadline exceeded" - Continuous delay when running with (broken?) ipv6 "context deadline exceeded" - Continuous delay when running with (broken?) DNS server Nov 9, 2021
@fgsch fgsch changed the title "context deadline exceeded" - Continuous delay when running with (broken?) DNS server "context deadline exceeded" - Continuous delay when running with (broken?) DNS settings Nov 9, 2021
@mandyscott
Copy link

Hello! Just confirming that I am also seeing this context deadline exceeded error when publishing/deploying even just updating the fastly cli app. Once I turn off ipv6 at my end it works no problems though.

@dkegel-fastly
Copy link
Collaborator

dkegel-fastly commented Dec 29, 2021

What ISP?
Also, what is your current DNS server, and is it broken (as my cable modem router's is) for ipv6?

@metadaddy
Copy link

One more data point: I experienced exactly this behavior today when tethering my MacBook Pro to my iPhone. The iPhone sets its IP4 and IP6 addresses as DNS servers, but does not respond to IP6 DNS requests.

@fgsch
Copy link
Member Author

fgsch commented Mar 17, 2022

I believe #528 is the best we can do without adding external dependencies or cross compiling.
Should we close this for now?

@dkegel-fastly
Copy link
Collaborator

It might be worth issuing a warning if the connection can't be made. Users will otherwise be surprised their cli is out of date.

@fgsch
Copy link
Member Author

fgsch commented Mar 17, 2022

You do get a warning:

ERROR: there was a problem updating the versioning information for the Fastly CLI (we won't check again until 5m have passed):

[..]

There is a fallback version of the configuration provided with the CLI install (see `fastly configure --display`) which enables the CLI to continue to be usable even though the config couldn't be updated.

@dkegel-fastly
Copy link
Collaborator

Good, then let's close. I didn't see the warning myself, but maybe I blinked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants