x/crypto/ssh: client.NewSession can hang indefinitely #26643

Open
mborsz opened this Issue Jul 27, 2018 · 5 comments

mborsz commented Jul 27, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.9.3

Does this issue reproduce with the latest release?

I'm not able to verify this.

What operating system and processor architecture are you using (go env)?

linux, amd64

What did you do?

In the Kubernetes e2e tests we use SSH to fetch logs from Kubernetes nodes.
In kubernetes/kubernetes#66609 we see that the client.NewSession call quite frequently hangs for ~90 minutes (the stack trace is there).

Relevant code is available here: https://github.com/kubernetes/kubernetes/blob/master/test/e2e/framework/log_size_monitoring.go#L245

What did you expect to see?

The attempt to create a new session should fail with an error if the node doesn't respond over the SSH connection.

What did you see instead?

Attempt to create NewSession hung for ~90 minutes.

Relevant stacktraces are available in:

gopherbot added this to the Unreleased milestone Jul 27, 2018

ymmt2005 added a commit to cybozu-go/cke that referenced this issue Oct 24, 2018

[agent] set deadline for SSH connection
golang.org/x/crypto/ssh has some known issues that block clients
indefinitely when SSH server dies.

Ref: golang/go#26643
     golang/go#21420

To work around the problem, this commit creates TCP connection
to the server by itself then passes it to ssh.NewClientConn to
control the underlying TCP connection directly.

It adds deadlines against the TCP connection before any SSH
activity.  It also enables TCP keepalive with short period.

agnivade (Member) commented Jan 7, 2019

/cc @hanwen

hanwen (Contributor) commented Jan 7, 2019

What is the problem here? Your stack trace suggests it's waiting for the remote end to acknowledge the SSH session. If you want timeouts, you should implement them separately.

Arguably, the SSH package should support contexts to do this neatly, but I think it might be an invasive change, API wise.

mborsz (Author) commented Jan 7, 2019

Thanks for the response!

> What is the problem here? Your stack trace suggests it's waiting for the remote end to acknowledge the SSH session. If you want timeouts, you should implement them separately.

The problem is that there is no way (AFAIK) to prevent openChannel from blocking for hours if the remote end never acknowledges the SSH session.

I would like to see some timeout mechanism there. Could you give me a hint on how to implement a timeout?

> Arguably, the SSH package should support contexts to do this neatly, but I think it might be an invasive change, API wise.

ymmt2005 commented Jan 7, 2019

@mborsz
We could avoid the problem by making a raw TCP net.Conn first and wrapping it
with ssh.NewClientConn. You can set any deadline to the raw connection before
calling SSH methods.

https://github.com/cybozu-go/cke/pull/81/files is the fix.
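
For readers who want to see the shape of this workaround without opening the PR, here is a minimal sketch using only the standard library. It is not the code from the linked PR. The helper names `dialWithDeadline` and `readFromSilentServer`, the 5-second keepalive period, and the local "silent server" used to simulate an unresponsive node are all assumptions made for illustration. In real use, the connection returned by `dialWithDeadline` would presumably be handed to `ssh.NewClientConn` and then `ssh.NewClient`.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// dialWithDeadline (hypothetical helper) opens the raw TCP connection itself
// and arms an absolute I/O deadline on it before any SSH traffic is sent.
func dialWithDeadline(addr string, dialTimeout, ioTimeout time.Duration) (net.Conn, error) {
	conn, err := net.DialTimeout("tcp", addr, dialTimeout)
	if err != nil {
		return nil, err
	}
	// The referenced commit also enables TCP keepalive with a short period.
	// The 5s value is an arbitrary assumption; errors are ignored (best effort).
	if tc, ok := conn.(*net.TCPConn); ok {
		tc.SetKeepAlive(true)
		tc.SetKeepAlivePeriod(5 * time.Second)
	}
	// Every Read and Write on conn fails once this point in time has passed.
	if err := conn.SetDeadline(time.Now().Add(ioTimeout)); err != nil {
		conn.Close()
		return nil, err
	}
	return conn, nil
}

// readFromSilentServer simulates a node that accepts the TCP connection but
// never sends a byte, and reports whether the client read timed out.
func readFromSilentServer(ioTimeout time.Duration) (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()

	done := make(chan struct{})
	defer close(done)
	go func() {
		c, err := ln.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		<-done // stay silent until the demo is over
	}()

	conn, err := dialWithDeadline(ln.Addr().String(), time.Second, ioTimeout)
	if err != nil {
		return false, err
	}
	defer conn.Close()

	// Without the deadline this Read would block for as long as the peer stays silent.
	_, err = conn.Read(make([]byte, 1))
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout(), nil
}

func main() {
	timedOut, err := readFromSilentServer(100 * time.Millisecond)
	fmt.Println("timed out:", timedOut, "err:", err)
}
```

Note that the deadline is an absolute point in time, not an idle timeout: a long-lived connection has to move it forward before each operation or clear it with `conn.SetDeadline(time.Time{})`. Once it fires, all I/O on the connection fails, so the whole SSH connection is lost rather than a single channel.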

hanwen (Contributor) commented Jan 7, 2019

The SSH state machine doesn't support timeouts on a channel level. The only thing you can do is tear down the entire SSH connection if there is an error.
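
One way to act on this advice is to run the blocking call in a goroutine and close the whole connection if it does not return in time. The sketch below shows that pattern with the standard library only. `runOrClose`, `demoPeer`, and `errTimedOut` are hypothetical names, and a `net.Pipe` stands in for the SSH connection. With x/crypto/ssh, the closer would presumably be the `*ssh.Client` and the operation a wrapper around `client.NewSession()`, on the assumption (suggested by the comment above, not verified here) that tearing down the connection makes the pending call return.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// errTimedOut (hypothetical) signals that the connection was torn down
// because the operation did not finish in time.
var errTimedOut = errors.New("operation timed out; connection closed")

// runOrClose runs op in a goroutine. If op has not finished after d, it
// closes conn, which tears down the whole connection, and returns errTimedOut.
func runOrClose(conn io.Closer, d time.Duration, op func() error) error {
	// Buffered so the goroutine can still deliver its result and exit
	// after we have stopped listening.
	done := make(chan error, 1)
	go func() { done <- op() }()

	timer := time.NewTimer(d)
	defer timer.Stop()

	select {
	case err := <-done:
		return err
	case <-timer.C:
		conn.Close()
		return errTimedOut
	}
}

// demoPeer simulates a remote end that either answers or stays silent.
func demoPeer(respond bool, d time.Duration) error {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	if respond {
		go server.Write([]byte{1}) // the remote end acknowledges
	}
	return runOrClose(client, d, func() error {
		_, err := client.Read(make([]byte, 1)) // blocks until the peer writes
		return err
	})
}

func main() {
	fmt.Println("responsive peer:", demoPeer(true, time.Second))
	fmt.Println("silent peer:", demoPeer(false, 50*time.Millisecond))
}
```

The trade-off matches the comment above: the timeout cannot cancel just the one pending channel, so every other session multiplexed on the same connection is lost as well.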
