net: net.Dialer.Dial() doesn't respect .KeepAlive #31490

Open
networkimprov opened this issue Apr 16, 2019 · 5 comments

Comments

@networkimprov

commented Apr 16, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12 linux/amd64

Does this issue reproduce with the latest release?

Haven't tried 1.12.1+

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/lib/go"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build601953146=/tmp/go-build -gno-record-gcc-switches"

What did you do/see?

For a TLS connection that has been dropped, c.Read() returns a net.Error with e.Timeout()==true after ~18 minutes, apparently due to keepalive failure. The server closes the connection while the client (laptop running VMware with Linux) is suspended. Error string:
"read tcp n.n.n.n:p->n.n.n.n:p: read: connection timed out"

The note on 5bd7e9c (discussed in #23459) says the default for net.Dialer keepalive failure is just 2-3min. With an explicit KeepAlive, I also see odd waits:

net.Dialer{KeepAlive: 30 * time.Second}   18min
net.Dialer{KeepAlive: 25 * time.Second}   18min
net.Dialer{KeepAlive: 10 * time.Second}   16min
net.Dialer{KeepAlive: 5 * time.Second}    <1min, but at least once 18min

Measurements aren't precise; 16-18 minutes could be 1000 or 1024 seconds. Code has:

   aDlr := net.Dialer{Timeout: 3*time.Second} // default KeepAlive
   aCfgTls := tls.Config{InsecureSkipVerify: true}
   aConn, err := tls.DialWithDialer(&aDlr, "tcp", "host:port", &aCfgTls)
   if err != nil { return err }
   _, err = aConn.Write(...) // <256 bytes; Write returns (n, err)
   if err != nil { return err }
   // brief exchange of short messages
   aLen, err := aConn.Read(...) // this is the call that returns the timeout error

Also filed #31449 to report that the error caused by keepalive failure doesn't match what the docs say about connection timeouts.

cc @FiloSottile @bradfitz

@bradfitz

Member

commented Apr 16, 2019

tls.DialWithDialer literally calls net.Dialer.Dial, so I don't think there's anything TLS-specific here.

@FiloSottile

Member

commented Apr 16, 2019

Yeah, I looked and while net/http used to mess with the keep-alives, crypto/tls does not.

Can you figure out the KEEPCNT value on your system?

@FiloSottile FiloSottile changed the title tls: DialWithDialer() doesn't respect .KeepAlive net: net.Dialer.Dial() doesn't respect .KeepAlive Apr 16, 2019

@networkimprov

Author

commented Apr 16, 2019

I have the standard defaults:

$ cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
$ cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
7200

@networkimprov

Author

commented Apr 16, 2019

A likely factor... After resuming the client, its date jumps forward to the current time.

On my system, that happens after 0-10 minutes (it isn't immediate due to a VMware or Linux driver issue). But that should cause a maximum wait of 13 minutes for the default KeepAlive.

@julieqiu julieqiu added this to the Go1.13 milestone Apr 22, 2019

@networkimprov

Author

commented Jun 23, 2019

cc @mikioh

@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
