Description
step ssh proxycommand hangs for approximately 60 seconds when the SSH server closes the connection before the client has closed stdin. This occurs when the server rejects the connection mid-session (e.g. PAM account check failure after certificate authentication succeeds), sending an SSH2_MSG_USERAUTH_BANNER followed by a disconnect.
The process eventually exits due to an OS-level timeout, but the hang makes it appear to the user that the connection is still in progress.
Suspected Cause
The deadlock is in proxyDirect / proxyDirectWithIO in command/ssh/proxycommand.go:
var wg sync.WaitGroup
wg.Add(1)
go func() {
	io.Copy(conn, os.Stdin) // goroutine 1: blocks reading stdin
	conn.CloseWrite()
	wg.Done()
}()
wg.Add(1)
go func() {
	io.Copy(os.Stdout, conn) // goroutine 2: exits when server closes
	conn.CloseRead()
	wg.Done()
}()
wg.Wait() // waits for both; stays blocked for as long as goroutine 1 is stuck on stdin
When the server closes the TCP connection:
- Goroutine 2 exits and calls conn.CloseRead()
- Goroutine 1 is blocked reading from os.Stdin
- os.Stdin is a pipe from the SSH client process, which hasn't closed because it's waiting for the ProxyCommand to exit
- The ProxyCommand is waiting for both goroutines: deadlock
Calling os.Stdin.Close() from goroutine 2 does not reliably interrupt a blocked read() syscall on macOS when stdin is a pipe.
Reproduction
// Start a TCP server that sends data and immediately closes
ln, _ := net.Listen("tcp", "127.0.0.1:0")
_, port, _ := net.SplitHostPort(ln.Addr().String()) // assumes port is passed as a string
go func() {
	conn, _ := ln.Accept()
	conn.Write([]byte("hello"))
	conn.Close()
}()
// Simulate a stdin that never closes (SSH client waiting for ProxyCommand)
stdinR, _ := io.Pipe() // write end intentionally left open
// This hangs indefinitely
proxyDirectWithIO("127.0.0.1", port, stdinR, io.Discard)
Fix
Return as soon as either goroutine completes. When the server closes the connection, the function returns and the process exits, which also tears down the goroutine still blocked on stdin. This is safe because the ProxyCommand's only job is to proxy bytes; once one side closes, there is nothing more to do.
done := make(chan struct{}, 2)
go func() {
	io.Copy(conn, in)
	conn.CloseWrite()
	done <- struct{}{}
}()
go func() {
	io.Copy(out, conn)
	conn.CloseRead()
	done <- struct{}{}
}()
<-done
return nil
Test
A regression test is included in the linked PR that fails before the fix and passes after.
Environment
- macOS arm64 (Apple Silicon)
- step installed via Homebrew
- OpenSSH 9.9