net: issue with DNS response > 512 bytes (cannot unmarshal DNS message) #21160

Closed
kvaps opened this issue Jul 25, 2017 · 39 comments

6 participants
@kvaps

commented Jul 25, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.6.2 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/lib/go-1.6"
GOTOOLDIR="/usr/lib/go-1.6/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"

What did you do?

I have get-ip.go with this code:

package main

import (
    "fmt"
    "net"
)

func main() {
    addr, err := net.LookupHost("storage.googleapis.com")
    fmt.Println(addr, err)
}

What did you expect to see?

# ./get-ip 
[216.58.201.112 2a00:1450:4014:801::2010] <nil>

What did you see instead?

# ./get-ip 
dial tcp: lookup storage.googleapis.com on 10.36.1.10:53: cannot unmarshal DNS message

Additional information:

It works with GODEBUG=netdns=cgo, but not with GODEBUG=netdns=go

# GODEBUG=netdns=cgo ./get-ip 
[172.217.23.208 2a00:1450:4014:80c::2010] <nil>
# GODEBUG=netdns=go ./get-ip 
dial tcp: lookup storage.googleapis.com on 10.36.1.10:53: cannot unmarshal DNS message
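
The same comparison can be made from inside a program instead of through the environment variable. The following is a minimal sketch, assuming Go 1.9 or later (where the `PreferGo` field of `net.Resolver` exists); the helper name `newGoResolver` is made up for illustration.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newGoResolver returns a resolver that asks for Go's built-in DNS client,
// roughly what GODEBUG=netdns=go selects for the whole process.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Same lookup as get-ip.go, but pinned to the pure Go resolver.
	addrs, err := newGoResolver().LookupHost(ctx, "storage.googleapis.com")
	fmt.Println(addrs, err)
}
```

Running this next to the default `net.LookupHost` call shows whether a failure is specific to the built-in resolver.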

It also started working after flushing dns-cache on my dns-server.

So I captured the answer from the DNS server:

before: (got error)

Domain Name System (response)
    [Request In: 2]
    [Time: 0.000410000 seconds]
    Transaction ID: 0xc4c9
    Flags: 0x8180 Standard query response, No error
    Questions: 1
    Answer RRs: 2
    Authority RRs: 13
    Additional RRs: 14
    Queries
        storage.googleapis.com: type AAAA, class IN
    Answers
        storage.googleapis.com: type CNAME, class IN, cname storage.l.googleusercontent.com
        storage.l.googleusercontent.com: type AAAA, class IN, addr 2a00:1450:4014:80d::2010
    Authoritative nameservers
        com: type NS, class IN, ns h.gtld-servers.net
        com: type NS, class IN, ns c.gtld-servers.net
        com: type NS, class IN, ns k.gtld-servers.net
        com: type NS, class IN, ns b.gtld-servers.net
        com: type NS, class IN, ns d.gtld-servers.net
        com: type NS, class IN, ns f.gtld-servers.net
        com: type NS, class IN, ns g.gtld-servers.net
        com: type NS, class IN, ns m.gtld-servers.net
        com: type NS, class IN, ns i.gtld-servers.net
        com: type NS, class IN, ns j.gtld-servers.net
        com: type NS, class IN, ns e.gtld-servers.net
        com: type NS, class IN, ns l.gtld-servers.net
        com: type NS, class IN, ns a.gtld-servers.net
    Additional records
        storage.l.googleusercontent.com: type A, class IN, addr 216.58.201.112
        h.gtld-servers.net: type A, class IN, addr 192.54.112.30
        c.gtld-servers.net: type A, class IN, addr 192.26.92.30
        k.gtld-servers.net: type A, class IN, addr 192.52.178.30
        b.gtld-servers.net: type A, class IN, addr 192.33.14.30
        d.gtld-servers.net: type A, class IN, addr 192.31.80.30
        f.gtld-servers.net: type A, class IN, addr 192.35.51.30
        g.gtld-servers.net: type A, class IN, addr 192.42.93.30
        m.gtld-servers.net: type A, class IN, addr 192.55.83.30
        i.gtld-servers.net: type A, class IN, addr 192.43.172.30
        j.gtld-servers.net: type A, class IN, addr 192.48.79.30
        e.gtld-servers.net: type A, class IN, addr 192.12.94.30
        l.gtld-servers.net: type A, class IN, addr 192.41.162.30
        a.gtld-servers.net: type A, class IN, addr 192.5.6.30

after: (is working)

Domain Name System (response)
    [Request In: 1]
    [Time: 0.000746000 seconds]
    Transaction ID: 0x86fe
    Flags: 0x8180 Standard query response, No error
    Questions: 1
    Answer RRs: 2
    Authority RRs: 0
    Additional RRs: 0
    Queries
        storage.googleapis.com: type A, class IN
    Answers
        storage.googleapis.com: type CNAME, class IN, cname storage.l.googleusercontent.com
        storage.l.googleusercontent.com: type A, class IN, addr 216.58.201.112

And a hex dump of the packet:

before: (got error)

0000   86 ea 80 1a 4b c3 9a cc 60 09 5f 0b 08 00 45 00  ....K...`._...E.
0010   02 2e 05 b1 00 00 40 11 5c 59 0a 24 01 0a 0a 24  ......@.\Y.$...$
0020   01 64 00 35 b2 cc 02 1a 18 e1 82 0a 81 80 00 01  .d.5............
0030   00 02 00 0d 00 0d 07 73 74 6f 72 61 67 65 0a 67  .......storage.g
0040   6f 6f 67 6c 65 61 70 69 73 03 63 6f 6d 00 00 01  oogleapis.com...
0050   00 01 c0 0c 00 05 00 01 00 00 0a af 00 1e 07 73  ...............s
0060   74 6f 72 61 67 65 01 6c 11 67 6f 6f 67 6c 65 75  torage.l.googleu
0070   73 65 72 63 6f 6e 74 65 6e 74 c0 1f c0 34 00 01  sercontent...4..
0080   00 01 00 00 00 63 00 04 d8 3a c9 70 c0 1f 00 02  .....c...:.p....
0090   00 01 00 01 52 3a 00 14 01 61 0c 67 74 6c 64 2d  ....R:...a.gtld-
00a0   73 65 72 76 65 72 73 03 6e 65 74 00 c0 1f 00 02  servers.net.....
00b0   00 01 00 01 52 3a 00 04 01 68 c0 70 c0 1f 00 02  ....R:...h.p....
00c0   00 01 00 01 52 3a 00 04 01 63 c0 70 c0 1f 00 02  ....R:...c.p....
00d0   00 01 00 01 52 3a 00 04 01 6b c0 70 c0 1f 00 02  ....R:...k.p....
00e0   00 01 00 01 52 3a 00 04 01 62 c0 70 c0 1f 00 02  ....R:...b.p....
00f0   00 01 00 01 52 3a 00 04 01 64 c0 70 c0 1f 00 02  ....R:...d.p....
0100   00 01 00 01 52 3a 00 04 01 66 c0 70 c0 1f 00 02  ....R:...f.p....
0110   00 01 00 01 52 3a 00 04 01 67 c0 70 c0 1f 00 02  ....R:...g.p....
0120   00 01 00 01 52 3a 00 04 01 6d c0 70 c0 1f 00 02  ....R:...m.p....
0130   00 01 00 01 52 3a 00 04 01 69 c0 70 c0 1f 00 02  ....R:...i.p....
0140   00 01 00 01 52 3a 00 04 01 6a c0 70 c0 1f 00 02  ....R:...j.p....
0150   00 01 00 01 52 3a 00 04 01 65 c0 70 c0 1f 00 02  ....R:...e.p....
0160   00 01 00 01 52 3a 00 04 01 6c c0 70 c0 6e 00 01  ....R:...l.p.n..
0170   00 01 00 00 ca ea 00 04 c0 05 06 1e c0 8e 00 01  ................
0180   00 01 00 00 ca ea 00 04 c0 36 70 1e c0 9e 00 01  .........6p.....
0190   00 01 00 00 ca ea 00 04 c0 1a 5c 1e c0 ae 00 01  ..........\.....
01a0   00 01 00 00 ca ea 00 04 c0 34 b2 1e c0 be 00 01  .........4......
01b0   00 01 00 00 ca ea 00 04 c0 21 0e 1e c0 ce 00 01  .........!......
01c0   00 01 00 00 ca ea 00 04 c0 1f 50 1e c0 de 00 01  ..........P.....
01d0   00 01 00 00 ca ea 00 04 c0 23 33 1e c0 ee 00 01  .........#3.....
01e0   00 01 00 00 ca ea 00 04 c0 2a 5d 1e c0 fe 00 01  .........*].....
01f0   00 01 00 00 ca ea 00 04 c0 37 53 1e c1 0e 00 01  .........7S.....
0200   00 01 00 00 ca ea 00 04 c0 2b ac 1e c1 1e 00 01  .........+......
0210   00 01 00 00 ca ea 00 04 c0 30 4f 1e c1 2e 00 01  .........0O.....
0220   00 01 00 00 ca ea 00 04 c0 0c 5e 1e c1 3e 00 01  ..........^..>..
0230   00 01 00 00 ca ea 00 04 c0 29 a2 1e              .........)..

after: (is working)

0000   86 ea 80 1a 4b c3 9a cc 60 09 5f 0b 08 00 45 00  ....K...`._...E.
0010   00 7e 07 00 00 00 40 11 5c ba 0a 24 01 0a 0a 24  .~....@.\..$...$
0020   01 64 00 35 93 32 00 6a 17 31 86 fe 81 80 00 01  .d.5.2.j.1......
0030   00 02 00 00 00 00 07 73 74 6f 72 61 67 65 0a 67  .......storage.g
0040   6f 6f 67 6c 65 61 70 69 73 03 63 6f 6d 00 00 01  oogleapis.com...
0050   00 01 c0 0c 00 05 00 01 00 00 0a 40 00 1e 07 73  ...........@...s
0060   74 6f 72 61 67 65 01 6c 11 67 6f 6f 67 6c 65 75  torage.l.googleu
0070   73 65 72 63 6f 6e 74 65 6e 74 c0 1f c0 34 00 01  sercontent...4..
0080   00 01 00 00 00 26 00 04 d8 3a c9 70              .....&...:.p

I presume that it connected with:

@ianlancetaylor ianlancetaylor changed the title from "Golang issue with DNS response > 512 bytes (cannot unmarshal DNS message)" to "net: issue with DNS response > 512 bytes (cannot unmarshal DNS message)" Jul 25, 2017

@ianlancetaylor

Contributor

commented Jul 25, 2017

Have you tried the current Go release, 1.8.3?

@kvaps

Author

commented Jul 25, 2017

Yes, I first hit this bug with kubeadm, which is compiled with Go 1.8.3:

# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.1", GitCommit:"1dc5c66f5dd61da08412a74221ecc79208c2165b", GitTreeState:"clean", BuildDate:"2017-07-14T01:48:01Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

After that, I installed Go from the Ubuntu xenial repo and reproduced it there (as reported above).

@ianlancetaylor

Contributor

commented Jul 25, 2017

@iangudger

Contributor

commented Jul 25, 2017

Both of the hex dumps have prefixes before the message. As a result, neither is an RFC 1035 DNS message.

The first message does not match the text you provided, so let's focus on the second message, which both matches the text and has the same problem.

The first two bytes of the DNS message should be the message ID (0x86fe). If you look closely, those two bytes don't show up until most of the way through line three. That is where the actual DNS message starts.

I am not sure what the prefix is, but it isn't compatible with the current DNS library.

@kvaps

Author

commented Jul 25, 2017

Hi @iangudger, thanks for the answer. But the hex dump I provided is a dump of the whole tcp-package, not only the DNS reply.
If you want, I can provide a full dump of the TCP session.

@kvaps

Author

commented Jul 25, 2017

@iangudger

Contributor

commented Jul 25, 2017

@kvaps, what do you mean by tcp-package?

@kvaps

Author

commented Jul 25, 2017

@ianlancetaylor, I mean that this hex dump contains the TCP/IP headers, such as the source and destination IP addresses and other service information.

You can use Wireshark to open the cap files attached to my previous message and see the whole session between the DNS client and the DNS server.
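
To make the layout of those dumps concrete, here is a minimal sketch that skips the link, network and transport headers of a captured frame and reads the DNS transaction ID. It assumes an Ethernet frame carrying IPv4 and UDP, which is what the first bytes of the "after" capture look like; `dnsPayload` and `firstBytes` are names invented for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// dnsPayload strips the Ethernet, IPv4 and UDP headers from a captured frame
// and returns the bytes that follow, which is where the DNS message starts.
func dnsPayload(frame []byte) ([]byte, error) {
	const ethLen, udpLen = 14, 8
	if len(frame) < ethLen+20 {
		return nil, errors.New("frame too short for an IPv4 header")
	}
	if binary.BigEndian.Uint16(frame[12:14]) != 0x0800 {
		return nil, errors.New("not an IPv4 frame")
	}
	ip := frame[ethLen:]
	ihl := int(ip[0]&0x0f) * 4 // IPv4 header length in bytes
	if ip[9] != 17 {           // protocol 17 is UDP
		return nil, errors.New("not UDP")
	}
	start := ethLen + ihl + udpLen
	if len(frame) < start {
		return nil, errors.New("frame too short for a UDP header")
	}
	return frame[start:], nil
}

// firstBytes holds the first 48 bytes of the "after" capture in this issue.
var firstBytes = []byte{
	0x86, 0xea, 0x80, 0x1a, 0x4b, 0xc3, 0x9a, 0xcc, 0x60, 0x09, 0x5f, 0x0b, 0x08, 0x00, 0x45, 0x00,
	0x00, 0x7e, 0x07, 0x00, 0x00, 0x00, 0x40, 0x11, 0x5c, 0xba, 0x0a, 0x24, 0x01, 0x0a, 0x0a, 0x24,
	0x01, 0x64, 0x00, 0x35, 0x93, 0x32, 0x00, 0x6a, 0x17, 0x31, 0x86, 0xfe, 0x81, 0x80, 0x00, 0x01,
}

func main() {
	dns, err := dnsPayload(firstBytes)
	if err != nil {
		fmt.Println(err)
		return
	}
	// The first two bytes of the DNS message are the transaction ID.
	fmt.Printf("transaction ID: 0x%04x\n", binary.BigEndian.Uint16(dns[0:2]))
}
```

With 14 bytes of Ethernet header, 20 bytes of IPv4 header and 8 bytes of UDP header, the DNS message starts at offset 42, which is where `86 fe` appears in the dump.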

I can also attach the hex dump without tcp headers:

0000   5f fb 81 80 00 01 00 02 00 0d 00 0d 07 73 74 6f  _............sto
0010   72 61 67 65 0a 67 6f 6f 67 6c 65 61 70 69 73 03  rage.googleapis.
0020   63 6f 6d 00 00 01 00 01 c0 0c 00 05 00 01 00 00  com.............
0030   0a c1 00 1e 07 73 74 6f 72 61 67 65 01 6c 11 67  .....storage.l.g
0040   6f 6f 67 6c 65 75 73 65 72 63 6f 6e 74 65 6e 74  oogleusercontent
0050   c0 1f c0 34 00 01 00 01 00 00 00 75 00 04 d8 3a  ...4.......u...:
0060   c9 70 c0 1f 00 02 00 01 00 01 52 4c 00 14 01 69  .p........RL...i
0070   0c 67 74 6c 64 2d 73 65 72 76 65 72 73 03 6e 65  .gtld-servers.ne
0080   74 00 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 6a  t.........RL...j
0090   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 65  .p........RL...e
00a0   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 6c  .p........RL...l
00b0   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 61  .p........RL...a
00c0   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 68  .p........RL...h
00d0   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 63  .p........RL...c
00e0   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 6b  .p........RL...k
00f0   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 62  .p........RL...b
0100   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 64  .p........RL...d
0110   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 66  .p........RL...f
0120   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 67  .p........RL...g
0130   c0 70 c0 1f 00 02 00 01 00 01 52 4c 00 04 01 6d  .p........RL...m
0140   c0 70 c0 6e 00 01 00 01 00 00 ca fc 00 04 c0 2b  .p.n...........+
0150   ac 1e c0 8e 00 01 00 01 00 00 ca fc 00 04 c0 30  ...............0
0160   4f 1e c0 9e 00 01 00 01 00 00 ca fc 00 04 c0 0c  O...............
0170   5e 1e c0 ae 00 01 00 01 00 00 ca fc 00 04 c0 29  ^..............)
0180   a2 1e c0 be 00 01 00 01 00 00 ca fc 00 04 c0 05  ................
0190   06 1e c0 ce 00 01 00 01 00 00 ca fc 00 04 c0 36  ...............6
01a0   70 1e c0 de 00 01 00 01 00 00 ca fc 00 04 c0 1a  p...............
01b0   5c 1e c0 ee 00 01 00 01 00 00 ca fc 00 04 c0 34  \..............4
01c0   b2 1e c0 fe 00 01 00 01 00 00 ca fc 00 04 c0 21  ...............!
01d0   0e 1e c1 0e 00 01 00 01 00 00 ca fc 00 04 c0 1f  ................
01e0   50 1e c1 1e 00 01 00 01 00 00 ca fc 00 04 c0 23  P..............#
01f0   33 1e c1 2e 00 01 00 01 00 00 ca fc 00 04 c0 2a  3..............*
0200   5d 1e c1 3e 00 01 00 01 00 00 ca fc 00 04 c0 37  ]..>...........7
0210   53 1e                                            S.

0000   3c f9 81 80 00 01 00 02 00 00 00 00 07 73 74 6f  <............sto
0010   72 61 67 65 0a 67 6f 6f 67 6c 65 61 70 69 73 03  rage.googleapis.
0020   63 6f 6d 00 00 01 00 01 c0 0c 00 05 00 01 00 00  com.............
0030   04 f3 00 1e 07 73 74 6f 72 61 67 65 01 6c 11 67  .....storage.l.g
0040   6f 6f 67 6c 65 75 73 65 72 63 6f 6e 74 65 6e 74  oogleusercontent
0050   c0 1f c0 34 00 01 00 01 00 00 00 0a 00 04 ac d9  ...4............
0060   17 d0                                            ..
@iangudger

Contributor

commented Jul 25, 2017

@kvaps, I got confused when you were talking about TCP. The hex dumps you posted did not look like TCP DNS packets. I have confirmed from your packet capture that these messages were all exchanged over UDP, and they do look like UDP DNS packets.

I am actually working on a solution to this problem. I have confirmed that the problematic message parses just fine with the new DNS library (golang.org/x/net/dns/dnsmessage). I am working on converting the standard net package to use it for DNS resolution. I am aiming to get this in Go 1.10. Once that lands, this problem should go away.

@kvaps

Author

commented Jul 25, 2017

@iangudger, No problem, thank you!

@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Jul 26, 2017

@mikioh

Contributor

commented Jul 26, 2017

@iangudger,

What's the conclusion? Is this an issue with some recursive DNS server that doesn't support DNS transport selection including TCP (RFC 7766)? If so, what's your solution?

@iangudger

Contributor

commented Jul 26, 2017

@mikioh, good point, but this is still RFC 1035.

That longer packet is 530 bytes long with the UDP headers removed. The limit for UDP DNS messages is 512 bytes. Well-behaved DNS servers are supposed to truncate the message and set the truncated (TC) bit. See RFC 1035 section 4.2.1.
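
As a small illustration of that rule, the sketch below classifies a UDP response by its length and header flags. The constant and function names are invented for this example; the TC bit position follows the RFC 1035 header layout.

```go
package main

import "fmt"

const (
	maxUDPSize = 512    // RFC 1035 limit for a DNS message carried over UDP
	flagTC     = 0x0200 // TC (truncated) bit in the 16-bit header flags word
)

// checkUDPResponse describes how a UDP response of the given length and
// header flags relates to the RFC 1035 size rule.
func checkUDPResponse(length int, flags uint16) string {
	switch {
	case length > maxUDPSize:
		return "oversized: server should have truncated and set TC"
	case flags&flagTC != 0:
		return "truncated: retry over TCP"
	default:
		return "ok"
	}
}

func main() {
	fmt.Println(checkUDPResponse(530, 0x8180)) // the failing response in this issue
	fmt.Println(checkUDPResponse(98, 0x8180))  // the working response
}
```

The failing response carries flags 0x8180, so TC is not set even though the message is 18 bytes over the limit.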

@mikioh

Contributor

commented Jul 26, 2017

@iangudger,

I just wanted to know the details of the "solution to this problem" you mentioned above. I'm guessing that we have four options: a) marking #6464 for Go 1.10, b) relaxing the existing DNS stub resolver so that it can accept a DNS message over 512 octets long on UDP transport (I'd rather not), c) returning more user-friendly error values, d) other.

@iangudger

Contributor

commented Jul 26, 2017

@mikioh,

This should work with golang.org/x/net/dns/dnsmessage even without #6464 (at least in this case) because we will only parse out the answers and won't touch the missing part of the message.

@mikioh

Contributor

commented Jul 26, 2017

@iangudger,

Sorry, I still don't get it. I think we need to focus on the behavior of the builtin DNS stub resolver here instead of the message parser implementation, because RFC 1035 clearly states the UDP message size limit, and subsequent RFCs for DNS transport never try to change the limit for various reasons. Just for clarification, my question is: do you want to change the behavior of the existing DNS stub resolver in Go 1.10 even if that violates RFC 1035?

@iangudger

Contributor

commented Jul 26, 2017

@mikioh, is there a good argument for doing so?

@mikioh

Contributor

commented Jul 26, 2017

@iangudger,

is there a good argument for doing so?

Well, I'd like to ask what exactly "for doing so" means.

@iangudger

Contributor

commented Jul 26, 2017

@mikioh,

Is there a good reason to violate RFC 1035? For example, is there a later RFC that we should follow instead or is it commonly violated in a specific way?

@kvaps

Author

commented Jul 28, 2017

You can read this RFC on larger DNS UDP size
https://tools.ietf.org/html/rfc6891#section-4.3

@iangudger

Contributor

commented Jul 28, 2017

@kvaps,

Sure, but I don't think implementing RFC 6891 will help as the server did not appear to implement it. I reviewed the packet capture, and I did not see any evidence of an attempt to negotiate a larger datagram size.
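
For context, the way a client asks for larger UDP responses under RFC 6891 is by adding an OPT pseudo-record to the additional section of its query. The sketch below builds such a record by hand; `optRecord` is a name invented for this example and is not part of the net package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const typeOPT = 41 // OPT pseudo-RR type from RFC 6891

// optRecord builds the OPT pseudo-record a client appends to the additional
// section of a query to advertise the largest UDP payload it will accept.
func optRecord(udpSize uint16) []byte {
	rr := make([]byte, 11)
	rr[0] = 0                                    // owner name: root
	binary.BigEndian.PutUint16(rr[1:3], typeOPT) // TYPE
	binary.BigEndian.PutUint16(rr[3:5], udpSize) // CLASS carries the payload size
	// rr[5:9] holds the extended RCODE, version and flags, all zero here.
	// rr[9:11] is RDLENGTH, zero because no options are attached.
	return rr
}

func main() {
	fmt.Printf("% x\n", optRecord(4096))
}
```

A query with no such record in its additional section has not advertised anything beyond the 512-byte default.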

@kvaps

Author

commented Jul 29, 2017

@ianlancetaylor, this is the answer from my DNS server manufacturer to my message that their DNS server does not comply with RFC 1035:

Go to DNS settings and set max-udp-packet-size to 512.

Hi, thanks, it is working! :)
May be would better have this value by default?

Hello,
Unfortunately it will not be set by default. We increased this value because nova days DNS packet exceeds 512bytes and lower setting caused problems receiving larger packets.

Excuse me,
Can you explain please, what do you mean by "nova days" words?

Hello,
Modern days when data far exceeds 512bytes.
You can read this RFC on larger DNS UDP size
https://tools.ietf.org/html/rfc6891#section-4.3

I should also add that no other DNS client has this problem, unlike the DNS client built into Go.
You can see that it works fine with GODEBUG=netdns=cgo

@iangudger

Contributor

commented Jul 30, 2017

@kvaps,

As far as I can tell, the libc resolver makes no attempt to follow the RFCs on this matter:
https://github.com/lattera/glibc/blob/a2f34833b1042d5d8eeb263b4cf4caaea138c4ad/resolv/res_send.c#L1232

Using FIONREAD is inherently racy and using it reasonably correctly would probably not be possible with the net library. We could use 65536 byte buffers like libc's fallback behavior, but that does not seem like a good idea to me.

According to RFC 5966:

In the absence of EDNS0 (Extension Mechanisms for DNS 0) (see below),
the normal behaviour of any DNS server needing to send a UDP response
that would exceed the 512-byte limit is for the server to truncate
the response so that it fits within that limit and then set the TC
flag in the response header. When the client receives such a
response, it takes the TC flag as an indication that it should retry
over TCP instead.

This is the strategy used by Go's DNS resolver. You should suggest that your DNS server vendor follow the RFCs.
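
The client side of that strategy can be sketched in a few lines: check the TC bit in the UDP response, and if it is set, resend the query over TCP with the two-byte length prefix from RFC 1035 section 4.2.2. This is an illustration only, not the resolver's actual code; `shouldRetryOverTCP` and `frameForTCP` are invented names.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// shouldRetryOverTCP reports whether a UDP response has the TC bit set.
// The bit lives in the third byte of the header (mask 0x02).
func shouldRetryOverTCP(resp []byte) bool {
	return len(resp) >= 4 && resp[2]&0x02 != 0
}

// frameForTCP prefixes a DNS message with the two-byte length field that
// RFC 1035 section 4.2.2 requires on TCP connections.
func frameForTCP(msg []byte) []byte {
	out := make([]byte, 2+len(msg))
	binary.BigEndian.PutUint16(out, uint16(len(msg)))
	copy(out[2:], msg)
	return out
}

func main() {
	// Header of a response with TC set (flags 0x8380), as a compliant server
	// would send when the full answer does not fit in 512 bytes.
	truncated := []byte{0x86, 0xfe, 0x83, 0x80, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
	fmt.Println(shouldRetryOverTCP(truncated))

	query := []byte{0x86, 0xfe, 0x01, 0x00} // shortened stand-in for a real query
	fmt.Printf("% x\n", frameForTCP(query))
}
```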

@kvaps

Author

commented Jul 30, 2017

Yes, I see, but why does libc use such a large buffer? Maybe there were reasons for doing so?
In my opinion, the libraries should behave uniformly, and if you think that a large buffer is not a good idea, we should report this to libc so that they reduce the buffer.

For me, as a user, it is a real problem that all other tools work fine while tools written in Go behave differently.
In addition, I lost a few hours debugging a problem connected with this: kubernetes/kubeadm#359
I tried to download something from storage.googleapis.com using docker and kubeadm, and they told me x509: certificate has expired or is not yet valid, while every diagnostic tool (curl, openssl s_client, ping, telnet) showed that everything was fine.
I could only understand where the problem was after a detailed analysis of a packet capture: I found that tools written in Go try to connect to my search domain from /etc/resolv.conf instead of storage.googleapis.com. (I have a wildcard record like *.my-search-domain.tld in my DNS settings.) That's really a problem: you must return an error here instead of simply resolving the name to the wrong IP from my search domain.

@iangudger

Contributor

commented Jul 30, 2017

@kvaps,

glibc is, in my opinion, not very well written and should not be copied. In some modern versions of glibc, if the 65536 byte buffer code path is taken, glibc will allocate two such buffers, double free one, and leak the other. Basically they always use the FIONREAD code path. Using FIONREAD in the Go DNS client will likely be non-trivial, and, in my opinion, is a very bad idea.

As far as I am concerned, the Go DNS client is working as intended here. I can tell you from experience that other popular DNS clients have this same issue. For the DNS server in golang.org/cl/51631, I originally just implemented a UDP server without truncation. DNS responses longer than 512 bytes did not work with the version of Node.js I was testing with. As a result, the version I sent out for review includes truncation for the UDP server and a TCP server. This seems to work well in practice, and I suggest that you request that your DNS server provider do the same.

@kvaps

Author

commented Jul 30, 2017

Sure, I will ask them about it again.
But do I understand correctly that this behavior is wrong? Can you return an error here?

I found that tools written in Go try to connect to my search domain from /etc/resolv.conf instead of storage.googleapis.com. (I have a wildcard record like *.my-search-domain.tld in my DNS settings.) That's really a problem: you must return an error here instead of simply resolving the name to the wrong IP from my search domain.

@iangudger

Contributor

commented Jul 30, 2017

@kvaps,

I am not sure that I follow. Could you please elaborate? What are your actual DNS records? What is happening, and what do you think should happen?

@kvaps

Author

commented Jul 31, 2017

@iangudger , Sure.

  • I have the domain domain.tld, and I have an A record *.domain.tld with IP 1.2.3.4
  • Also, on the client machines I have the following /etc/resolv.conf file:
nameserver 10.10.10.10
search domain.tld
  • If you try to resolve any name with Go (for example storage.googleapis.com) and the DNS server's answer is larger than 512 bytes, Go does not return an error. Instead it resolves storage.googleapis.com to 1.2.3.4, which is wrong, because that is the IP of my search domain.
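
The sequence described in this list can be sketched as the list of names a stub resolver tries in order. This is a simplified model of resolv.conf search handling, not the net package's implementation; `candidates` is an invented helper and `ndots` is assumed to be 1, the usual default.

```go
package main

import (
	"fmt"
	"strings"
)

// candidates lists the fully qualified names a stub resolver would try, in
// order, for a name with a resolv.conf-style search list. Simplified: names
// with at least ndots dots are tried as-is first, then with each suffix.
func candidates(name string, search []string, ndots int) []string {
	if strings.HasSuffix(name, ".") {
		return []string{name} // already fully qualified, no search list
	}
	var out []string
	asIs := name + "."
	dots := strings.Count(name, ".")
	if dots >= ndots {
		out = append(out, asIs)
	}
	for _, s := range search {
		out = append(out, name+"."+s+".")
	}
	if dots < ndots {
		out = append(out, asIs)
	}
	return out
}

func main() {
	for _, n := range candidates("storage.googleapis.com", []string{"domain.tld"}, 1) {
		fmt.Println(n)
	}
}
```

If the answer for the first name is discarded, the second name, storage.googleapis.com.domain.tld., matches the *.domain.tld wildcard and returns 1.2.3.4.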
@iangudger

Contributor

commented Jul 31, 2017

@kvaps,

That is because of #13281. If the response doesn't unpack, we ignore it. This could be fixed by only unpacking the header and not progressing to other search domains if we find a matching header. That will require golang.org/x/net/dns/dnsmessage as the current parser does not support incremental unpacking.
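
A rough sketch of the header-only check suggested above: look at just the 12-byte header and decide whether the datagram is a response to the outstanding query, without unpacking the rest. The function name `headerMatches` is invented for this example, and the sample bytes are the header of the 530-byte response posted earlier.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// headerMatches reports whether resp starts with a DNS header that is a
// response (QR bit set) to the query with the given ID. Nothing after the
// 12-byte header is inspected.
func headerMatches(resp []byte, queryID uint16) bool {
	if len(resp) < 12 {
		return false
	}
	id := binary.BigEndian.Uint16(resp[0:2])
	isResponse := resp[2]&0x80 != 0
	return id == queryID && isResponse
}

func main() {
	// Header of the oversized response: ID 0x5ffb, flags 0x8180,
	// 1 question, 2 answers, 13 authority and 13 additional records.
	hdr := []byte{0x5f, 0xfb, 0x81, 0x80, 0x00, 0x01, 0x00, 0x02, 0x00, 0x0d, 0x00, 0x0d}
	fmt.Println(headerMatches(hdr, 0x5ffb))
}
```

A resolver that sees a matching header but cannot unpack the body could then report an error for that name instead of moving on to the next search domain.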

@mikioh

Contributor

commented Nov 29, 2017

Looks like we are on the same page. If we want to transmit/receive a DNS message over 512 octets:

Is my understanding correct?

@iangudger

Contributor

commented Nov 29, 2017

@mikioh, that all sounds good to me.

The other option is that we might be able to get away with doing partial unpacking of large responses. For example, if all of the answers fit in 512 bytes but the total length is longer, we can just unpack the answers and ignore the truncated part. This shouldn't be relied on in the general case, but I think it will fix the case mentioned earlier in this issue if memory serves.
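
To illustrate the partial-unpacking idea with the standard library only, the sketch below reads the header, skips the question, and unpacks just the answer records, never touching the authority or additional sections. It is a hand-rolled illustration and not the golang.org/x/net/dns/dnsmessage code; `skipName`, `answerAddrs` and `sampleMessage` are invented names. The sample is the short response posted earlier, with its header altered to claim 13 authority and 13 additional records that are never supplied.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// skipName advances past a (possibly compressed) domain name starting at off.
func skipName(msg []byte, off int) (int, error) {
	for {
		if off >= len(msg) {
			return 0, errors.New("name runs past end of message")
		}
		c := int(msg[off])
		switch {
		case c == 0: // root label terminates the name
			return off + 1, nil
		case c&0xC0 == 0xC0: // a compression pointer is two bytes and ends the name
			if off+2 > len(msg) {
				return 0, errors.New("truncated compression pointer")
			}
			return off + 2, nil
		default: // ordinary label: length byte followed by that many bytes
			off += 1 + c
		}
	}
}

// answerAddrs unpacks only the header, the questions and the answer records,
// and returns the A and AAAA addresses found in the answers.
func answerAddrs(msg []byte) ([]net.IP, error) {
	if len(msg) < 12 {
		return nil, errors.New("short header")
	}
	qd := int(binary.BigEndian.Uint16(msg[4:6]))
	an := int(binary.BigEndian.Uint16(msg[6:8]))
	off := 12
	var err error
	for i := 0; i < qd; i++ {
		if off, err = skipName(msg, off); err != nil {
			return nil, err
		}
		off += 4 // QTYPE and QCLASS
	}
	var ips []net.IP
	for i := 0; i < an; i++ {
		if off, err = skipName(msg, off); err != nil {
			return nil, err
		}
		if off+10 > len(msg) {
			return nil, errors.New("truncated answer record")
		}
		typ := binary.BigEndian.Uint16(msg[off : off+2])
		rdlen := int(binary.BigEndian.Uint16(msg[off+8 : off+10]))
		off += 10
		if off+rdlen > len(msg) {
			return nil, errors.New("truncated answer data")
		}
		if (typ == 1 && rdlen == 4) || (typ == 28 && rdlen == 16) {
			ips = append(ips, net.IP(append([]byte(nil), msg[off:off+rdlen]...)))
		}
		off += rdlen
	}
	return ips, nil
}

// sampleMessage returns the short response from this issue with NSCOUNT and
// ARCOUNT changed to 13, as if those sections had been cut off.
func sampleMessage() []byte {
	msg := []byte{0x3c, 0xf9, 0x81, 0x80, 0x00, 0x01, 0x00, 0x02, 0x00, 0x0d, 0x00, 0x0d}
	// Question: storage.googleapis.com, type A, class IN.
	msg = append(msg, "\x07storage\x0agoogleapis\x03com\x00\x00\x01\x00\x01"...)
	// Answer 1: CNAME to storage.l.googleusercontent.com (30 bytes of RDATA).
	msg = append(msg, "\xc0\x0c\x00\x05\x00\x01\x00\x00\x04\xf3\x00\x1e"...)
	msg = append(msg, "\x07storage\x01l\x11googleusercontent\xc0\x1f"...)
	// Answer 2: A record 172.217.23.208.
	msg = append(msg, "\xc0\x34\x00\x01\x00\x01\x00\x00\x00\x0a\x00\x04\xac\xd9\x17\xd0"...)
	return msg
}

func main() {
	ips, err := answerAddrs(sampleMessage())
	fmt.Println(ips, err)
}
```

Because parsing stops after the last answer, the missing sections do not matter here; a message whose answers themselves are cut off still fails.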

@mikioh

Contributor

commented Nov 29, 2017

@iangudger,

Yup, that might be an option as long as we cannot provide a one-fits-all solution (and we cannot.) I personally hope that the stuff in x/net/dns helps people under heterogeneous environments.

@gopherbot

commented Mar 14, 2018

Change https://golang.org/cl/37879 mentions this issue: net: use golang.org/x/net/dns/dnsmessage for DNS resolution

gopherbot pushed a commit that referenced this issue Mar 15, 2018

net: use golang.org/x/net/dns/dnsmessage for DNS resolution
Vendors golang.org/x/net/dns/dnsmessage from x/net git rev
892bf7b0c6e2f93b51166bf3882e50277fa5afc6

Updates #16218
Updates #21160

Change-Id: Ic4e8f3c3d83c2936354ec14c5be93b0d2b42dd91
Reviewed-on: https://go-review.googlesource.com/37879
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@iangudger

Contributor

commented Mar 16, 2018

@kvaps, can you retry with tip?

@kvaps

Author

commented Mar 16, 2018

Hi, I have already changed my DNS server, sorry :)
Before, it was a MikroTik one.

@bvitale

commented Mar 16, 2018

Here's a before/after with 1.9 and with bd85943.

Platform:

Linux xenial 4.13.0-37-generic #42~16.04.1-Ubuntu SMP Wed Mar 7 16:03:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Program:

package main

import (
    "fmt"
    "net"
)

func main() {
    addr, err := net.LookupHost("434313288222.dkr.ecr.us-east-1.amazonaws.com")
    fmt.Println(addr, err)
}

On 1.9:

bsv@xenial:~$ /usr/lib/go-1.9/bin/go version
go version go1.9.2 linux/amd64
bsv@xenial:~$ /usr/lib/go-1.9/bin/go build lookup.go
bsv@xenial:~$ ./lookup 
[] lookup 434313288222.dkr.ecr.us-east-1.amazonaws.com on 127.0.1.1:53: read udp 127.0.0.1:38519->127.0.1.1:53: i/o timeout
bsv@xenial:~$ GODEBUG=netdns=cgo ./lookup 
[52.1.70.15 52.20.176.179 52.20.8.105 52.21.22.157 52.2.44.95 52.3.111.158 52.3.140.172 52.22.226.253] <nil>

And with tip:

bsv@xenial:~$ ./go/bin/go version
go version devel +bd85943 Fri Mar 16 13:39:38 2018 +0000 linux/amd64
bsv@xenial:~$ ./go/bin/go build lookup.go
bsv@xenial:~$ ./lookup
[52.1.70.15 52.3.140.172 52.21.22.157] <nil>

The dig response for this name shows that it's over the limit:

<snip>
;; Query time: 64 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Fri Mar 16 11:15:19 EDT 2018
;; MSG SIZE  rcvd: 869
@iangudger

Contributor

commented Mar 16, 2018

Well, I am pretty sure that the original issue should be resolved in the specific case mentioned in this issue (the message was over the UDP limit, but the part we cared about was within the limit). The general case to allow arbitrary length UDP responses would require doing platform specific syscalls to allow these RFC violating DNS responses. I don't think that is something we want to do.

@bvitale


commented Mar 16, 2018

fwiw I make no commentary with my repro above, but have been following this issue since it affects us fairly frequently due to a misbehaving upstream DNS server. Far as I can tell it's resolved, but I can't comment on 'correctness' of the resolution.

@iangudger

Contributor

commented Apr 9, 2018

The new incremental parsing will still fail if the wanted parts of the message fail to parse. The difference is that it will succeed if parts of the message that it does not care about are unparsable. As a result, the incremental parsing should not affect correctness.

@ianlancetaylor

Contributor

commented Jun 27, 2018

As far as I can tell from the comments this issue is fixed. Please comment if you disagree. Thanks.
