writes to half-closed streams stall when sendWindow is exhausted #133

Open
slingamn opened this issue Aug 19, 2024 · 1 comment

@slingamn

I'm working on an application that uses yamux for proxying. I ran into a case where the application leaks goroutines. Here's a simplified test case: https://gist.github.com/slingamn/1dafab6141e03d27a3f51bcbbcdb9972

This test case sets up a client and a server. The client sends 10 messages to the server over a *Stream. The server reads 5 messages, then closes its side of the stream. After this happens, the client side continues writing messages until sendWindow is exhausted. Then it gets stuck here:

yamux/stream.go, lines 227 to 232 at 8bd691f:

select {
case <-s.sendNotifyCh:
	goto START
case <-timeout:
	return 0, ErrTimeout
}

Since the other side has stopped reading, it will not send control messages that could unblock sendNotifyCh. So unless a write deadline has been set, Write() blocks until StreamCloseTimeout (5 minutes by default).
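The shape of the repro, roughly (a schematic sketch rather than the gist itself; setting up the client and server Sessions over the underlying connection is elided):

// clientSession and serverSession are assumed to be yamux Sessions over the same connection.
cStream, _ := clientSession.OpenStream()
sStream, _ := serverSession.AcceptStream()

// Server: read 5 messages, then half-close its side and stop reading.
buf := make([]byte, 64*1024)
for i := 0; i < 5; i++ {
	sStream.Read(buf)
}
sStream.Close() // sends FIN; the server never reads from this stream again

// Client: keep writing. Once the 256 KiB send window is used up, Write parks
// on sendNotifyCh waiting for a window update that will never arrive, and only
// returns after StreamCloseTimeout (or a write deadline, if one was set) fires.
msg := make([]byte, 64*1024)
for i := 0; i < 10; i++ {
	if _, err := cStream.Write(msg); err != nil {
		break
	}
}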

I naively tried to fix this by adding streamRemoteClose to the existing list of states (streamLocalClose, streamClosed) that cause writes to preemptively fail:

yamux/stream.go, lines 180 to 184 at 8bd691f:

case streamLocalClose:
	fallthrough
case streamClosed:
	s.stateLock.Unlock()
	return 0, ErrStreamClosed
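
Concretely, the attempted change looked roughly like this (sketch only, not the exact diff):

case streamLocalClose:
	fallthrough
case streamRemoteClose:
	fallthrough
case streamClosed:
	s.stateLock.Unlock()
	return 0, ErrStreamClosed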

This broke TestHalfClose, from which I understand that the current behavior is expected, at least in part. But if this is expected, then I'm not sure what the in-band mechanism is for signaling that the other side of the Stream has gone away. Is there an existing API in yamux to tell the other side to stop sending? (For comparison, (*net.TCPConn).Close() from the read side will cause writes to fail with EPIPE.)

Thanks very much for your time.

@schmichael
Member

Thanks for filing an issue! This is a tricky one!

I believe the existing code is correct, but I absolutely concede this is difficult-to-understand behavior. I think the key lies in section 2.3.6, Stream Half-Close, of the SPDY spec:

When one side of the stream sends a frame with the FLAG_FIN flag set, the stream is half-closed from that endpoint. The sender of the FLAG_FIN MUST NOT send further frames on that stream. When both sides have half-closed, the stream is closed.

In other words, Stream.Close is semantically similar not to TCPConn.Close but to TCPConn.CloseWrite.

To put it in protocol terms:

  • To set a FIN flag: yamux/Stream.Close or net/TCPConn.CloseWrite
  • To set a RST flag: net/TCPConn.Close

So in your example when the Server closes its stream, that does not imply the Client should consider the Stream closed. The Server stream closing only implies the Server will no longer send data.
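
To make the TCP analogy concrete (a rough sketch; the exact error the writing peer sees after a full Close depends on platform and timing):

// Assuming tcpConn is a connected *net.TCPConn:
tcpConn.CloseWrite() // sends FIN: the peer's reads see io.EOF, but the peer may keep writing to us
tcpConn.Close()      // tears down both directions: the peer's later writes eventually fail (e.g. EPIPE)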

This raises the question:

Should yamux differentiate Stream.Close and Stream.CloseWrite? SPDY has a RST_STREAM control frame to abort streams, but I don't see that implemented in yamux.

Yamux only implements the RST flag on window update frames, but perhaps we should treat that like SPDY's RST_STREAM? As far as I can tell, the only place Yamux RSTs individual Streams is when a timeout is hit.

Workaround: Read Closers

There is a workaround you should use while we consider the above:

go func() {
	// Block in Read solely to observe the peer half-closing its side;
	// a clean remote Close shows up here as io.EOF.
	_, err := cStream.Read([]byte{})
	if err != nil {
		log.Printf("Client.Stream.Read failed: %v", err)
		// Closing our side wakes up any Write parked waiting for window updates.
		cStream.Close()
	}
}()

I added the above to a fork of your original gist, and it fixes the deadlock-until-timeout.

Since yamux only exposes a way for streams to indicate they will no longer send data, you can start a goroutine whose express purpose is to read from the server and detect that state. Calling Stream.{Close,Read,Write} concurrently this way is safe. Stream.Read returns an error (io.EOF in the case of a clean Stream.Close on the other end), and then the Stream.Close notifies the blocked Stream.Write that the stream has been locally closed (meaning closed for sending).
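
On the write side, the net effect is that a blocked Write returns promptly instead of waiting out the timeout (illustrative sketch; messages stands in for whatever is being proxied):

for _, msg := range messages {
	if _, err := cStream.Write(msg); err != nil {
		// Once the reader goroutine calls cStream.Close, a Write parked on
		// sendNotifyCh wakes up and returns ErrStreamClosed instead of
		// stalling until StreamCloseTimeout.
		log.Printf("Client.Stream.Write failed: %v", err)
		break
	}
}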

You can see Nomad implementing this pattern in places like the log streaming RPC.
