net: TestDialerDualStack fails when net.inet.tcp.blackhole=2 #12052
$ pkg version -e go
This is the stock ports version used to bootstrap the build.
$ uname -m
With a freshly cloned ~/golang:
~/golang$ git checkout go1.5rc1
goroutine 166 [running]:
goroutine 1 [chan receive]:
goroutine 17 [syscall, locked to thread]:
goroutine 185 [IO wait]:
goroutine 195 [IO wait]:
I can also reproduce this using release-branch.go1.5, and on two different machines.
Yes, it does, with a few warnings that I think are clang-specific:
~/golang/src$ go test -test.short net
/usr/local/go/src/net/cgo_unix.go:53:31: warning: unknown attribute 'gcc_struct' ignored [-Wattributes]
cc: warning: argument unused during compilation: '-pthread'
Hardware is an i7 with 32GB RAM:
$ sysctl hw.model
This is a snapshot while building go:
The lines showing 100% idle were actually captured while the net tests were running. I also reran the test and made sure nothing is blocked by the firewall.
The process hanging around until the timeout hits seems to be this:
root 71283 0,0 0,0 46212 15896 11 S+ 1:44pm 0:00,33 /tmp/go-build754929443/net/http/_test/http.test -test.short=true -test.timeout=3m0s
Attaching truss to this process shows many repeats of the following:
$ truss -f -s 255 -p 71283 -d
I used the freshly compiled go tool version to do these tests instead of the bootstrap version:
~/golang/src/net$ GOROOT=/root/golang ~/golang/bin/go test -c
~/golang/src/net/http$ GOROOT=/root/golang ~/golang/bin/go test -c
There are a lot of http tests taking more than 600 seconds, but I think this is a local problem, as I can't reproduce them on another machine running the exact same OS version. So the only common failure is the TestDialerDualStack test from the net test suite.
Running tcpdump during the very long-running http tests reveals countless packets 8128 bytes in length filled with 'a's. But, as said, on another machine the http test suite runs fine without a failure...
I uploaded the full transcript of the net and http tests, sans interface addresses, here:
Debugging this a bit further, I extracted the dialClosedPort function from dial_test.go into a small standalone program. Using this and tcpdump, I saw that no RST packets were sent.
$ sysctl net.inet.tcp.blackhole
This usually prevents our SYN-flooded servers from responding with an RST flood, which is a good thing. Sadly, this is a global option and not per interface, so it also applies to the loopback device.
This fixes the original problem. It seems this test was introduced with 1.5, so it never blew up before.
I still get the http test errors, but as I only get them on one of two machines, I'll consider them a local problem for now.