net: Pure-Go DNS resolver does not properly Round-Robin DNS Names #13283

Closed
bmhatfield opened this issue Nov 17, 2015 · 14 comments

@bmhatfield

commented Nov 17, 2015

I was recently debugging an issue with an Amazon Elastic Load Balancer where our traffic was not being evenly balanced across ELB Availability Zones. AWS's ELB uses a number of DNS entries and low TTLs to balance "front-door" client traffic, expecting clients to properly round-robin the addresses. This technique works for a large number of clients on the internet.

Unfortunately, the new pure-Go DNS resolver in Go 1.5 appears to favor lower-numbered addresses instead of rotating through all of the returned addresses. The cgo resolver does not exhibit this behavior.

After tracing through some of the code, I believe I have a good reproduction case for this problem. The ELB in question returns 6 IP addresses.

I have written a small program to demonstrate the affinity behavior:

package main

import (
    "fmt"
    "net"
)

func main() {
    for i := 0; i <= 50; i++ {
        addrs, err := net.LookupHost("REMOVED-ELB-HOSTNAME")

        if err != nil {
            fmt.Println(err)
        } else {
            fmt.Println(addrs)
        }
    }
}

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=go
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]

An alternate form of this program highlights the issue:

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=go
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
23.21.50.150:80 resolved 26 times
23.23.134.56:80 resolved 16 times
23.23.172.185:80 resolved 8 times

However, switching the resolver to cgo (on Ubuntu 12.04) causes the resolution to properly round-robin:

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=cgo
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]

And again, the even resolution behavior is highlighted by an alternate form of this program:

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=cgo
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
23.23.172.185:80 resolved 9 times
23.21.50.150:80 resolved 9 times
54.83.193.112:80 resolved 8 times
184.72.238.214:80 resolved 8 times

I believe the pure-Go DNS resolver should be updated to round-robin across all returned addresses.

@bmhatfield

Author

commented Nov 17, 2015

I have an instinct that the problem lies in here (where returned addresses are being sorted) but I'm having a little trouble processing the code to understand the exact effects: https://golang.org/src/net/addrselect.go

@mdempsky

Member

commented Nov 17, 2015

I'd suspect you're getting bit by "Rule 9: Use longest matching prefix". Does your server happen to have a 23.21.x.x address itself?

@bmhatfield

Author

commented Nov 17, 2015

@mdempsky it does not - it is only aware of an RFC1918 address (10.x.x.x)

@mdempsky

Member

commented Nov 17, 2015

That at least explains why it's favoring the 23.x.x.x addresses over the 54.x.x.x, 75.x.x.x, and 184.x.x.x addresses: 23 shares 3 leading 0 bits with 10, whereas 54 and 75 share only 2 and 1 leading 0 bits with 10, respectively, and 184 shares none.

@mdempsky

Member

commented Nov 17, 2015

It looks like glibc doesn't strictly follow RFC 3484/6724 for IPv4 addresses:

  /* Outside of subnets, as defined by the network masks,
     common address prefixes for IPv4 addresses make no sense.
     So, define a non-zero value only if source and
     destination address are on the same subnet.  */

See http://bazaar.launchpad.net/~vcs-imports/glibc/master/view/head:/sysdeps/posix/getaddrinfo.c#L1710

@bmhatfield

Author

commented Nov 17, 2015

EDIT: I lost the race on this comment to @mdempsky - it was written before the comment about glibc :-)

Hrm. I'm not sure what to make of that.

Does the getaddrinfo/getnameinfo implementation on Linux not implement rule 9? Does it want a full octet match for a matching prefix? Or some other interpretation of the rule I am not understanding?

One thing I see in the RFC is this:

Rules 9 and 10 MAY be superseded if the implementation has other
   means of sorting destination addresses.  For example, if the
   implementation somehow knows which destination addresses will result
   in the "best" communications performance.

In practice, the behavior of routing by "best prefix" in this context is problematic, as it causes a significant amount of traffic to be pointed at a small subset of nodes that otherwise have no meaningful routing value over the others.

@bmhatfield

Author

commented Nov 17, 2015

Ah @mdempsky I think I would agree with their interpretation that it doesn't make sense once you're outside of the subnet.

@mdempsky

Member

commented Nov 17, 2015

It's unfortunate that UDPConns don't have a way to discover their local IPNet, only their IP. It looks like to match glibc's behavior, we'll need to call InterfaceAddrs and find the enclosing IPNet that way. (Which is basically how glibc finds IPv4 prefix lengths anyway.)

Alternatively, we just skip Rule 9 for IPv4 addresses. I would suspect in practice it doesn't matter.

CC @bradfitz

@mdempsky

Member

commented Nov 17, 2015

Actually, we can still apply it for RFC 1918 private networks (i.e., 10/8, 172.16/12, and 192.168/16) relatively easily.

@mdempsky mdempsky self-assigned this Nov 17, 2015

@mdempsky

Member

commented Nov 17, 2015

@bmhatfield Are you able to test whether https://go-review.googlesource.com/#/c/16995/ fixes the problem for you?

(Unfortunately I have to head out for a bit, hence the incomplete CL.)

@bmhatfield

Author

commented Nov 17, 2015

Yes, I can give it a shot.

@ianlancetaylor ianlancetaylor changed the title from "[1.5] Pure-Go DNS resolver does not properly Round-Robin DNS Names" to "net: Pure-Go DNS resolver does not properly Round-Robin DNS Names" Nov 17, 2015

@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Nov 17, 2015

@bmhatfield

Author

commented Nov 17, 2015

I canary-deployed this to a single host, and I can confirm that it is now including the other 3 non-23.x IP addresses in the connections that it's making.

@gopherbot

commented Nov 17, 2015

CL https://golang.org/cl/16995 mentions this issue.

@mdempsky mdempsky closed this in 4d4a266 Nov 17, 2015

@golang golang locked and limited conversation to collaborators Nov 16, 2016

@gopherbot

commented Jan 6, 2017

CL https://golang.org/cl/34914 mentions this issue.
