net: DNS lookup timeout only when using go resolver #28419

Closed
sithmein opened this issue Oct 26, 2018 · 7 comments

@sithmein
commented Oct 26, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.10.3 linux/amd64

Does this issue reproduce with the latest release?

Not tried, go 1.11 isn't available for my system yet.

What operating system and processor architecture are you using (go env)?

Linux, amd64

What did you do?

Run the following program:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	client := &http.Client{}
	resp, err := client.Get("https://api.media.atlassian.com")
	if err != nil {
		fmt.Printf("Got error: %v\n", err)
		return
	}
	// Close the body so the underlying connection is released.
	defer resp.Body.Close()
	fmt.Printf("Got status: %s\n", resp.Status)
}

Using the standard Go DNS resolver results in

Got error: Get https://api.media.atlassian.com: dial tcp: lookup api.media.atlassian.com on 172.17.21.1:53: read udp 172.17.21.51:34254->172.17.21.1:53: i/o timeout

Running the same program with GODEBUG=netdns=cgo returns almost immediately with

Got status: 200 OK

Attached are two straces, one using the Go resolver, one using the libc resolver.

Here's the full record returned by the DNS server:

[jenkins@knime414 tmp]$ dig api.media.atlassian.com

; <<>> DiG 9.9.4-RedHat-9.9.4-61.el7_5.1 <<>> api.media.atlassian.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11556
;; flags: qr rd ra; QUERY: 1, ANSWER: 11, AUTHORITY: 4, ADDITIONAL: 4

;; QUESTION SECTION:
;api.media.atlassian.com.       IN      A

;; ANSWER SECTION:
api.media.atlassian.com. 60     IN      CNAME   atlassian-mediaapi-sfo.d1.teridioncloud.net.
atlassian-mediaapi-sfo.d1.teridioncloud.net. 30 IN CNAME nz-europe-me-atlassian-mediaapi-sfo.d1.teridioncloud.net.
nz-europe-me-atlassian-mediaapi-sfo.d1.teridioncloud.net. 30 IN CNAME b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net.
b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN CNAME 2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net.
2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN A 159.122.189.80
2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN A 159.122.147.208
2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN A 159.122.147.200
2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN A 159.122.147.207
2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN A 159.122.189.12
2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN A 159.122.189.70
2781-b1db7e-softlayer-mil01-atlassian-m.d1.teridioncloud.net. 30 IN A 159.122.189.26

;; AUTHORITY SECTION:
atlassian.com.          607     IN      NS      ns-2018.awsdns-60.co.uk.
atlassian.com.          607     IN      NS      ns-595.awsdns-10.net.
atlassian.com.          607     IN      NS      ns-1388.awsdns-45.org.
atlassian.com.          607     IN      NS      ns-112.awsdns-14.com.

;; ADDITIONAL SECTION:
ns-2018.awsdns-60.co.uk. 4121   IN      A       205.251.199.226
ns-595.awsdns-10.net.   519     IN      A       205.251.194.83
ns-1388.awsdns-45.org.  4274    IN      A       205.251.197.108
ns-112.awsdns-14.com.   76269   IN      A       205.251.192.112

;; Query time: 55 msec
;; SERVER: 172.17.21.1#53(172.17.21.1)
;; WHEN: Fr Okt 26 16:52:17 CEST 2018
;; MSG SIZE  rcvd: 561

See git-lfs/git-lfs#3332 (comment) for a preliminary investigation of the problem.

What did you expect to see?

Successful DNS resolution.

What did you see instead?

I/O timeout

@bradfitz

Member

commented Oct 26, 2018

@bradfitz changed the title from "DNS lookup timeout only when using go resolver" to "net: DNS lookup timeout only when using go resolver" on Oct 26, 2018

@bradfitz

Member

commented Oct 26, 2018

> Not tried, go 1.11 isn't available for my system yet.

As a Deb/RPM package you mean? You can use a more modern Go alongside your system Go binaries. Just run:

$ go get golang.org/dl/go1.11.1
$ go1.11.1 download

And then use the go1.11.1 binary as if it were your normal go binary. (go1.11.1 install ....)

There's been a fair amount of DNS work in the past release & at tip.

Please try Go 1.11.x and report back.

@bradfitz added this to the Go1.12 milestone on Oct 26, 2018

@mikioh

Contributor

commented Oct 27, 2018

@sithmein,

I'm still not sure about your circumstances,

;; Query time: 55 msec
;; SERVER: 172.17.21.1#53(172.17.21.1)
;; WHEN: Fr Okt 26 16:52:17 CEST 2018
;; MSG SIZE  rcvd: 561

but it looks like:

  1. your own DNS recursive server (e.g., 172.17.21.1) doesn't support DNS transport over TCP (see https://tools.ietf.org/html/rfc7766),
  2. it also unconditionally replies with an answer larger than 512 octets over UDP (see https://tools.ietf.org/html/rfc1035#section-2.3.4).

If so, there may be no workaround other than reconfiguring your own DNS recursive server. IIRC, the built-in DNS stub resolver implementation is conservative and takes no risks with DNS transport over UDP, given the Kaminsky attack and its variants.

In general, implementing and operating DNS is hard, and all of us know that we need some compromises to accommodate various circumstances, including yours. If you have any good ideas, please open a new issue for the improvement of the DNS stub resolver implementation.

@iangudger

Contributor

commented Oct 27, 2018

The new DNS client in Go 1.11 might actually work here. It looks like the offending response is only a little bigger than the limit (561 vs 512), and the new DNS client incrementally parses the response. As long as all of the answers are in the first 512 bytes (which is almost certainly the case here given what was posted), the new DNS client will happily accept it.
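
Editorial sketch of the idea, not the Go 1.11 implementation: an incremental parser walks the header, the question, and then the answer records one at a time, so it can succeed on a cut-off buffer as long as every answer record ends before the cut. The names `skipName`, `completeAnswers`, and `sampleMessage` are assumptions, and the message is synthetic.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// skipName advances past a (possibly compressed) domain name starting at
// off and reports false if the name runs past the end of msg.
func skipName(msg []byte, off int) (int, bool) {
	for off < len(msg) {
		c := int(msg[off])
		switch {
		case c == 0: // root label ends the name
			return off + 1, true
		case c&0xC0 == 0xC0: // 2-byte compression pointer ends the name
			return off + 2, off+2 <= len(msg)
		default: // ordinary label: length byte plus c bytes
			off += 1 + c
		}
	}
	return 0, false
}

// completeAnswers returns how many answer records lie entirely inside msg
// (got) and how many the header announces in ANCOUNT (want).
func completeAnswers(msg []byte) (got, want int) {
	if len(msg) < 12 {
		return 0, 0
	}
	qd := int(binary.BigEndian.Uint16(msg[4:6]))
	want = int(binary.BigEndian.Uint16(msg[6:8]))
	off := 12
	for i := 0; i < qd; i++ { // each question: name, type, class
		n, ok := skipName(msg, off)
		if !ok || n+4 > len(msg) {
			return 0, want
		}
		off = n + 4
	}
	for got < want {
		n, ok := skipName(msg, off)
		if !ok || n+10 > len(msg) { // type, class, TTL, RDLENGTH
			break
		}
		end := n + 10 + int(binary.BigEndian.Uint16(msg[n+8:n+10]))
		if end > len(msg) {
			break
		}
		off, got = end, got+1
	}
	return got, want
}

// sampleMessage builds a synthetic reply: one question for "a." and n
// A records whose names point back at the question name.
func sampleMessage(n int) []byte {
	msg := make([]byte, 12)
	binary.BigEndian.PutUint16(msg[4:6], 1)         // QDCOUNT
	binary.BigEndian.PutUint16(msg[6:8], uint16(n)) // ANCOUNT
	msg = append(msg, 1, 'a', 0, 0, 1, 0, 1)        // QNAME, QTYPE A, QCLASS IN
	for i := 0; i < n; i++ {
		msg = append(msg, 0xC0, 0x0C, 0, 1, 0, 1, 0, 0, 0, 30, 0, 4, 192, 0, 2, byte(i+1))
	}
	return msg
}

func main() {
	msg := append(sampleMessage(2), make([]byte, 40)...) // stand-in for trailing sections
	got, want := completeAnswers(msg[:len(msg)-20])      // cut falls after the answers
	fmt.Println(got, want)                               // 2 2
}
```

When the cut falls inside an answer record instead, `got` comes back smaller than `want`, and a careful client would have to treat the reply as incomplete.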

That said, you would probably be better off fixing your DNS server/config.

@mikioh

Contributor

commented Oct 27, 2018

> If you have any good ideas, please open a new issue for the improvement of DNS stub resolver implementation.

FWIW, I'm now enjoying reading http://www.potaroo.net/ispcol/2018-10/oarc29.html, and it feels like it might be a good read as a first step.

@sithmein

Author

commented Oct 29, 2018

> As a Deb/RPM package you mean? You can use a more modern Go alongside your system Go binaries. Just run:
>
> $ go get golang.org/dl/go1.11.1
> $ go1.11.1 download

That didn't do anything for me but I managed to compile 1.11.1 on my own.

> Please try Go 1.11.x and report back.

The problem no longer occurs in that version. Considering the other comments, it was very likely #21160.

@saurabh-gupta7869


commented May 18, 2019

I was also facing the DNS timeout issue intermittently on go1.10. After changing the resolver from go to cgo, everything worked fine. Anyone else facing the issue can try this and check whether it works for them.
