Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/crypto/ssh: ssh.Dial can hang indefinitely #15113

a-robinson opened this issue Apr 4, 2016 · 6 comments

x/crypto/ssh: ssh.Dial can hang indefinitely #15113

a-robinson opened this issue Apr 4, 2016 · 6 comments


Copy link

@a-robinson a-robinson commented Apr 4, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
  2. What operating system and processor architecture are you using (go env)?
    linux, amd64
  3. What did you do?
    Called ssh.Dial()
  4. What did you expect to see?
    I'd expect ssh.Dial() to return either an opened client or an error within some reasonable amount of time, ideally no more than a few minutes.
  5. What did you see instead?
    ssh.Dial() has been hung for more than three hours.

We ran into this in Kubernetes. We aren't doing anything special--just SSHing to a known host and port over TCP with a specified user and public key. It appears as though the underlying problem is that a TCP connection was established (i.e. net.Dial() returned a connection), but then establishing an SSH connection using the underlying TCP connection hung. That means that #14941 wouldn't help here, since that just specifies a timeout for net.Dial(), not for establishing the client SSH connection after that.

Stack trace from the stuck goroutine:

goroutine 85851 [IO wait, 178 minutes]:
net.(*pollDesc).Wait(0xc20ae2e840, 0x72, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc20ae2e840, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).Read(0xc20ae2e7e0, 0xc20c38b2d0, 0x1, 0x1, 0x0, 0x7f20600e52a0, 0xc20c38b2d8)
    /usr/src/go/src/net/fd_unix.go:242 +0x40f
net.(*conn).Read(0xc20b7add18, 0xc20c38b2d0, 0x1, 0x1, 0x0, 0x0, 0x0)
    /usr/src/go/src/net/net.go:121 +0xdc
io.ReadAtLeast(0x7f20600eb598, 0xc20b7add18, 0xc20c38b2d0, 0x1, 0x1, 0x1, 0x0, 0x0, 0x0)
    /usr/src/go/src/io/io.go:298 +0xf1
io.ReadFull(0x7f20600eb598, 0xc20b7add18, 0xc20c38b2d0, 0x1, 0x1, 0x40, 0x0, 0x0)
    /usr/src/go/src/io/io.go:316 +0x6d, 0xc20b7add18, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/ +0x167, 0xc20b7add18, 0xc20c38b2c0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/ +0x2f1*connection).clientHandshake(0xc20cdb9100, 0xc20bdf6380, 0x11, 0xc20b56e1e0, 0x0, 0x0)
    /go/src/ +0x132, 0xc20b7add18, 0xc20bdf6380, 0x11, 0xc20b56e140, 0x0, 0x0, 0x0, 0xe, 0x0, ...)
    /go/src/ +0x140, 0x3, 0xc20bdf6380, 0x11, 0xc20b56e140, 0x11, 0x0, 0x0)
    /go/src/ +0xf9*SSHTunnel).Open(0xc20d0490e0, 0x0, 0x0)
    /go/src/ +0xbc*SSHTunnelList).createAndAddTunnel(0xc208228360, 0xc2085c6720, 0xe)
    /go/src/ +0x331
created by*SSHTunnelList).removeAndReAdd
    /go/src/ +0x3ae

cc @cjcullen

Copy link

@hanwen hanwen commented Apr 6, 2016

I don't think the SSH spec says anything about what constitutes "reasonable" timeout. The HTTP library also defaults to no timeouts.

ssh.Dial is a convenience method, and you can use NewClientConn with a TCP connection tweaked to your desire

I assume you know that the remote end is probably not an SSH server, or else it would hang on the version exchange.

Copy link

@a-robinson a-robinson commented Apr 6, 2016

Thanks @hanwen.

Just to make sure I'm on the same page, if we open the connection ourselves and customize it, the io.ReadFull call will give up depending on how we configure it. Is setting a hard deadline on the connection with SetDeadline() rather than a timeout on creating the ClientConn the only option? Setting a hard deadline on a long-lived connection SSH connection feels wrong to me, but I'm probably misunderstanding something. Would the recommended pattern be to set the deadline before establishing the ClientConn, then unset it afterwards?

Copy link

@hanwen hanwen commented Apr 7, 2016

SSH has support for keepalive. OpenSSH sends them, and the Go client can also issue them. Just send a "" request, either global or per-channel every once in a while. This ensures somethign gets sent over the connection so it appears live.

Copy link

@minusnine minusnine commented Jul 21, 2016

@hanwen Now that Go 1.7 is out, what do you think about having a new version of NewClientConn (and maybe Dial) support a context.Context?

The caller can do everything the library could do, but it's messy, and this would be a nice API for timing out/canceling session initialization.

Copy link

@hanwen hanwen commented Jul 22, 2016

Can you come up with a comprehensive list of things you want in this -ehrm- context? All of net/http/ goes through http.Request which provides a convenient place to add a context, but that is not the case for SSH, and I'd hate to have to duplicate all functions to provide Context flavors.

Copy link

@dsnet dsnet commented Mar 1, 2017

Apparently, this functionality has been adding in

@dsnet dsnet closed this Mar 1, 2017
@golang golang locked and limited conversation to collaborators Mar 1, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
6 participants
You can’t perform that action at this time.