x/crypto/ssh: deadlock during key renegotiation #18439

Closed
vito opened this Issue Dec 27, 2016 · 20 comments

@vito

vito commented Dec 27, 2016

What version of Go are you using (go version)?

1.7.x - though in this case the more relevant bit is probably the SHA of x/crypto/ssh: a8a4eed

What operating system and processor architecture are you using (go env)?

Linux, amd64

What did you do?

Establish a long-lived connection used for SSH tunneling, including sending arbitrary amounts of data over the wire (e.g. ~200MB sent every now and then via the same connection).

What did you expect to see?

SSH traffic should continue unhindered over an arbitrary amount of time with arbitrary amounts of data flowing over the connection.

What did you see instead?

I've observed that after some time all communication over SSH will hang. The connection is still "live" in the TCP sense, but SSH-level keepalives just block waiting for a reply.

I suspect this is actually tied to data volume rather than time, however; the default of 1 GB of transfer before key renegotiation sounds very likely given the stack inspection below.

Debugging

Context

(You can skip this if you just want the nitty-gritty.)

This package is used in a CI system wherein all workers register themselves with the master via a custom-built SSH server. They open a reverse SSH tunnel so that they can be reached from a private network - the SSH server acts as a proxy to the workers. These SSH connections are arbitrarily long-lived, and large amounts of data are transferred over them as various CI resources (e.g. git repos, build artifacts) are streamed to and from the workers.

Procedure

In my case both ends (client and server) are implemented using this package. I suspect this is a server-side issue, since when the server is bounced all clients immediately notice and reconnect. I've also looked at the stack dump of the server and (I think) found a smoking gun for the deadlock.

Code sleuthing

I observed that for every hanging client there was one goroutine stuck in chan send for the duration that the worker had been gone. The following is my theory as to how that got there:

  • (*handshakeTransport).readOnePacket and (*handshakeTransport).writePacket check whether the number of bytes exceeds t.config.RekeyThreshold, which defaults to 1 gigabyte
  • this means that after 1GB of data transfer, the crypto/ssh package will initiate a new key exchange and set t.sentInitMsg to a non-nil value
  • (*handshakeTransport).writePacket has a loop on t.sentInitMsg != nil using a condition variable (i.e. it's waiting for that to become nil, so the kex has finished, or for there to be a write error that will cause it to give up)
  • t.sentInitMsg only becomes nil in a call to (*handshakeTransport).readOnePacket
  • the only caller of readOnePacket is (*handshakeTransport).readLoop
  • in our stack, we observed readLoop is blocked on writing to the incoming channel, so it cannot call readOnePacket
  • interestingly, incoming is a buffered channel (length 16) - however in this case that didn't help, so the buffer must be full, or no one can read from it
  • who reads from t.incoming? (*handshakeTransport).readPacket does.
  • who calls readPacket? (*mux).onePacket
  • who calls onePacket? (*mux).loop - which isn't too interesting though since it literally just keeps calling that method as long as it works. so let's go back into onePacket and see what it's doing after reading the packet, as that'd be why it's not reading the next one.
  • in our case, it was in the call to (*channel).handlePacket - so what's that doing?
  • in our case, it's blocked on a call to (*channel).sendMessage. things are getting interesting here: our read loop is now writing.
  • following the stack down, this ends up back in (*handshakeTransport).writePacket! which depends on the read side making progress, as readOnePacket is what completes the key exchange.

@vito vito referenced this issue in concourse/concourse Dec 27, 2016

Closed

Workers stop heartbeating #750

@rsc

rsc commented Jan 4, 2017

/cc @agl

@rsc rsc added this to the Unplanned milestone Jan 4, 2017

@hanwen

hanwen commented Jan 9, 2017

can you send the stack trace dumps? Your description makes it hard to follow what is going on.

If you think key exchanges are to blame, you can make them happen more often by tuning down the rekeying data threshold.

Can you confirm that you are servicing all channels that the SSH library returns?
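
For reference, the threshold mentioned above is exposed as Config.RekeyThreshold. A minimal configuration fragment lowering it (the user and auth values are placeholders for illustration) might look like:

```go
package main

import (
	"golang.org/x/crypto/ssh"
)

func clientConfig() *ssh.ClientConfig {
	return &ssh.ClientConfig{
		// User and Auth are placeholders for illustration only.
		User: "worker",
		Auth: []ssh.AuthMethod{ssh.Password("secret")},
		Config: ssh.Config{
			// Rekey every 64 MB instead of the 1 GB default, e.g. to
			// reproduce the hang quickly while debugging.
			RekeyThreshold: 64 << 20,
		},
	}
}

func main() {
	_ = clientConfig()
}
```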

@databus23

databus23 commented Jan 9, 2017

Regarding your question about servicing all channels. I think that's the case:

@hanwen

hanwen commented Jan 9, 2017

if this is open source code, can you recreate the deadlock and SIGQUIT the process and send the resulting trace?

@vito

vito commented Jan 9, 2017

@hanwen Sure - I'll configure a lower threshold and see if it happens more quickly, and then grab a stack dump.

In the meantime you can find one linked from here: concourse/concourse#750 (comment) - they'll be the ones from "web", not "worker", though "worker" will show the client side so that may be of interest too.

@databus23

databus23 commented Jan 9, 2017

@hanwen Be sure to take one prefixed with concourse-web-issue-750- because I placed some stackdumps from other issues there as well.

@hanwen

hanwen commented Jan 9, 2017

I studied your explanation a bit more, and I think you are right: the deadlock is in the SSH package. To avoid deadlocks, we separate the reading and writing sides (see also https://codereview.appspot.com/62180043/).

Unfortunately, the spec requires that on reading a channelClose message, we should confirm it by sending a channelClose message back, so there is no way to make the read and write side independent.

I will have to think for a bit on how to fix this.

@hanwen

hanwen commented Jan 9, 2017

probably, we could have a

queuedPackets [][]byte

which can be written while a kex is in progress. We could then either let all writes succeed directly, or special case channelClose. The former will yield more testing coverage, but may not report I/O errors in the right place. The latter will be very hard to test for, since we'd have to make a test case that repros this hang.

@hanwen

hanwen commented Jan 9, 2017

I tried one simple-minded solution, which didn't work. I need to think this through better. For now, I recommend raising the rekey threshold to something really high.

The 1 GB default is suggested by the original RFC, which is from 2006 and is exceedingly conservative. We're 10 years on now, so that number can probably safely be multiplied by 100 or 1000.

@hanwen

hanwen commented Jan 9, 2017

summary of deadlock:

  1. some write via writePacket triggers a kex. All writes will now block.
  2. the remote end is slow or is sending lots of data. We happen to read a chanClose message before the remote end can send a kex message back.
  3. the chanClose handler tries to echo a chanClose via writePacket; this blocks, which stalls the read loop.

@databus23

databus23 commented Jan 9, 2017

@hanwen Thanks for taking a deeper look into this. Regarding your suggested workaround of increasing the rekey interval by a factor of 100 or more: I'm not sure I understand why it's now safe to increase it. Briefly looking at the spec, the main concern when choosing the recommended interval seems to have been the computational cost of asymmetric encryption. Wouldn't that be an argument to decrease the rekey interval 10 years later, with computers being a lot faster than when the spec was devised? In general, isn't it at least a theoretical security concern to increase the interval by huge amounts, since an attacker can observe far more data encrypted with the same key material?

@hanwen

hanwen commented Jan 10, 2017

from what I recall, there was no real justification given for the 1 GB number (it's also just a recommendation). I put it in because that's what the RFC said, but it seems ridiculously low compared to the throughput of networks nowadays and the delay of doing a new handshake.

@databus23

databus23 commented Jan 10, 2017

@hanwen You are right that the original rfc4253 did not provide detailed background on how the recommended value was chosen. But rfc4344 Sections 3 and 6.1 provide more detail on the rekeying interval, and while I can't fully assess the math, it definitely states that expanding the interval has negative implications for the security of the connection and that the recommended value should be used.

@hanwen

hanwen commented Jan 11, 2017

thanks for that pointer. You shouldn't encrypt more than 4G (2^32) packets because the counter wraps around.

L=128 for AES and the block size is 16 bytes, so you should be OK rekeying at 16 * 2^32 = 64 GB.

Upping the interval was intended as a short-term workaround. The CL at https://go-review.googlesource.com/c/35012/ fixes this more reliably, hopefully.

@databus23

databus23 commented Jan 11, 2017

@hanwen: Awesome, thanks for taking this on so quickly. Looking forward to the proper fix.

I have one more question: from your comments above it seems like the deadlock can only be caused by a channel close message while a key exchange is in flight? Is that really the case? In the CI system where this is happening I wouldn't expect channel close messages, as each worker sets up a single persistent connection that is not torn down until one of the two ends shuts down.

@hanwen

hanwen commented Jan 11, 2017

the stack traces I got for this bug were pointing to the line where the closeChannel message was being sent.

But it could potentially happen in any function that responds to global/channel requests.

@vito

vito commented Jan 11, 2017

@databus23 databus23 referenced this issue in concourse/concourse Jan 16, 2017

Merged

bump x/crypto to fix ssh deadlock #880

@hanwen

hanwen commented Jan 16, 2017

fixed in golang/crypto@2e74c77

@hanwen hanwen closed this Jan 16, 2017

@vito

vito commented Jan 18, 2017

@hanwen I'm now seeing handshakes fail occasionally with the following message:

input_userauth_error: bad message during authentication: type 20

Type 20 being KEXINIT. Maybe that's firing concurrently with the handshake or something? It appears to be flaky.

I can open another issue if you'd like.

@kardianos

kardianos commented Jan 18, 2017

@vito closed issues are not tracked. Open a new issue please.

@golang golang locked and limited conversation to collaborators Jan 18, 2018
