Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: custom DNS-over-TLS Resolver doesn't work anymore #26573

Closed
artyom opened this issue Jul 24, 2018 · 10 comments

Comments

Projects
None yet
6 participants
@artyom
Copy link
Contributor

commented Jul 24, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.11beta2 darwin/amd64

Does this issue reproduce with the latest release?

Yes, tested on

go version devel +4f9ae7739b Tue Jul 24 14:29:38 2018 +0000 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/artyom/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/artyom/go"
GOPROXY=""
GORACE=""
GOROOT="/Users/artyom/Repositories/go"
GOTMPDIR=""
GOTOOLDIR="/Users/artyom/Repositories/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/lb/3rk8rqs53czgb4v35w_342xc0000gn/T/go-build210171313=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

https://play.golang.org/p/g1be5wdYdgy

This is an example of using Cloudflare's DNS-over-TLS. This code works in go1.10.3

What did you expect to see?

repro ¶ time ./repro-go1.10.3 
resolver dial returned
resolver dial returned
[{93.184.216.34 } {2606:2800:220:1:248:1893:25c8:1946 }]

real	0m0.417s
user	0m0.199s
sys	0m0.054s

What did you see instead?

repro ¶ time ./repro-go1.11beta2 
resolver dial returned
resolver dial returned
resolver dial returned
resolver dial returned
i/o timeout

real	0m10.057s
user	0m0.250s
sys	0m0.053s
1 repro ¶ time ./repro-go+4f9ae7739b 
resolver dial returned
resolver dial returned
resolver dial returned
resolver dial returned
i/o timeout

real	0m10.067s
user	0m0.242s
sys	0m0.054s

Caught this issue first on linux/arm binary built with go version go1.11beta2 darwin/amd64.

@artyom

This comment has been minimized.

Copy link
Contributor Author

commented Jul 24, 2018

I modified example to log (unencrypted) data passed over TLS connections: https://play.golang.org/p/Lhb8bOP7pw6

Here's what results look like:

repro ¶ go version
go version go1.10.3 darwin/amd64
repro ¶ go run repro.go 
resolver dial returned
resolver dial returned
conn 1 Write:
00000000  00 1d 50 45 01 00 00 01  00 00 00 00 00 00 07 65  |..PE...........e|
00000010  78 61 6d 70 6c 65 03 63  6f 6d 00 00 1c 00 01     |xample.com.....|
conn 2 Write:
00000000  00 1d 31 8d 01 00 00 01  00 00 00 00 00 00 07 65  |..1............e|
00000010  78 61 6d 70 6c 65 03 63  6f 6d 00 00 01 00 01     |xample.com.....|
conn 2 Read:
00000000  00 2d                                             |.-|
conn 2 Read:
00000000  31 8d 81 80 00 01 00 01  00 00 00 00 07 65 78 61  |1............exa|
00000010  6d 70 6c 65 03 63 6f 6d  00 00 01 00 01 c0 0c 00  |mple.com........|
00000020  01 00 01 00 00 0d e3 00  04 5d b8 d8 22           |.........].."|
conn 1 Read:
00000000  00 39                                             |.9|
conn 1 Read:
00000000  50 45 81 80 00 01 00 01  00 00 00 00 07 65 78 61  |PE...........exa|
00000010  6d 70 6c 65 03 63 6f 6d  00 00 1c 00 01 c0 0c 00  |mple.com........|
00000020  1c 00 01 00 00 09 2a 00  10 26 06 28 00 02 20 00  |......*..&.(.. .|
00000030  01 02 48 18 93 25 c8 19  46                       |..H..%..F|
[{93.184.216.34 } {2606:2800:220:1:248:1893:25c8:1946 }]

This is for tip:

repro ¶ go version
go version devel +4f9ae7739b Tue Jul 24 14:29:38 2018 +0000 darwin/amd64
repro ¶ go run repro.go 
resolver dial returned
resolver dial returned
conn 2 Write:
00000000  17 ca 01 00 00 01 00 00  00 00 00 00 07 65 78 61  |.............exa|
00000010  6d 70 6c 65 03 63 6f 6d  00 00 1c 00 01           |mple.com.....|
conn 1 Write:
00000000  ec 77 01 00 00 01 00 00  00 00 00 00 07 65 78 61  |.w...........exa|
00000010  6d 70 6c 65 03 63 6f 6d  00 00 01 00 01           |mple.com.....|
resolver dial returned
resolver dial returned
conn 4 Write:
00000000  ac 10 01 00 00 01 00 00  00 00 00 00 07 65 78 61  |.............exa|
00000010  6d 70 6c 65 03 63 6f 6d  00 00 01 00 01           |mple.com.....|
conn 3 Write:
00000000  96 09 01 00 00 01 00 00  00 00 00 00 07 65 78 61  |.............exa|
00000010  6d 70 6c 65 03 63 6f 6d  00 00 1c 00 01           |mple.com.....|
i/o timeout
exit status 1
@artyom

This comment has been minimized.

Copy link
Contributor Author

commented Jul 24, 2018

Bisected this to 672729e

@artyom

This comment has been minimized.

Copy link
Contributor Author

commented Jul 24, 2018

@iangudger

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

That code is broken. It always returns a TCP connection even if a UDP one is requested. The TCP and UDP DNS protocols are different.

@artyom

This comment has been minimized.

Copy link
Contributor Author

commented Jul 24, 2018

@iangudger from the second paragraph of the doc for net.Resolver.Dial I assumed that UDP wire protocol is only used if Dial returned net.PacketConn, which tls.Conn does not implement, so RFC 7766 path expected here.

    // If the Conn returned is also a PacketConn, sent and received DNS
    // messages must adhere to RFC 1035 section 4.2.1, "UDP usage".
    // Otherwise, DNS messages transmitted over Conn must adhere
    // to RFC 7766 section 5, "Transport Protocol Selection".
    // If nil, the default dialer is used.

What am I missing here?

@iangudger

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2018

Wow, that is dumb. I will throw together a patch to fix compatibility.

@andybons andybons added this to the Go1.11 milestone Jul 24, 2018

@andybons

This comment has been minimized.

Copy link
Member

commented Jul 24, 2018

Assigning as release-blocking due to it being a regression.

@bradfitz

This comment has been minimized.

Copy link
Member

commented Jul 24, 2018

Wow, that is dumb.

Or by design. 🤷

@gopherbot

This comment has been minimized.

Copy link

commented Jul 24, 2018

Change https://golang.org/cl/125735 mentions this issue: net: fix handling of Conns created by Resolver.Dial

@gopherbot gopherbot closed this in 5f5402b Jul 24, 2018

@mikioh

This comment has been minimized.

Copy link
Contributor

commented Jul 25, 2018

Wow, that is dumb.
Or by design.

I think this is one of the design flaws of the net package API surface, unfortunately, from the beginning. Looks like representing various characteristics in a single interface, like net.Conn, is not good for people who don't care about the hidden hierarchy on the interface. Surely, we need a good solution, at least for application-layer transport API still increasing complexity by adding fancy features; security (TLS 1.3, 1.4 or above), multipath (MPTCP, QUIC), newly flow and congestion controls and platform-dependent control knobs, in Go 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.