streams: pipe is destructed by 'end' event from destination #2524

Closed
wants to merge 1 commit into
from

Projects

None yet

8 participants

@koichik

'end' event is emitted from a readable stream, not a writable stream (see also API docs). However, Stream.pipe() destructs a pipeline by 'end' event from destination that is a writable stream (source). This behavior causes a problem when duplex stream (i.e. net.Socket) with allowHalfOpen: true is used.

For example, here is three streams:

var reader = ...; // readable stream
var writer = ...; // writable stream
var socket = net.connect({port: 80, allowHalfOpen: true}); // duplex stream

Constructs two pipelines:

reader.pipe(socket); // pipeline (A)
socket.pipe(writer); // pipeline (B)
+--------+  (A)  +--------+
| reader | ----> |        |
+--------+       |        |
                 | socket |
+--------+  (B)  |        |
| writer | <---- |        |
+--------+       +--------+

When the socket receives a FIN packet, socket emits 'end' event. Then, Stream.pipe() destructs not only pipeline (B), but also pipeline (A). Please note that socket is still opened and writable.

After that, if reader emits 'data' event, the data is not written to socket. And if reader emits 'end' event, socket.end() is never called. It makes resource (fd) leaks.

I think that Stream.pipe() should not destruct pipeline by 'end' event from destination.

Please review.

@isaacs

@mikeal thoughts on this?

@mikeal
Node.js Foundation member

pipe's to HTTP and other streams don't emit "close", only "end", so this patch will break for streams that aren't sockets or files.

@mikeal
Node.js Foundation member

i guess "break" might be a strong word, i doubt it'll make them throw exceptions, it's just likely to cause memory leaks and other strange problems that might occur if you never teardown the pipe listeners.

@koichik

@isaacs, @mikeal - Thanks for your comments. Let's continue the discussion as a part of Streams2 (#1681).
However, what status is that? Is its target v0.8? :)

@isaacs

@mikeal The problem is that it's listening to the dest stream's "end" event, but when the dest and src are the same, or when you have something like:

socket  // A
  .pipe(filter) // B
  .pipe(socket) // C

Note that A and C are the same object. So, the socket that we're reading from (A) emits "end" to say that there's no more data coming, and this tears down the A -> B pipe.

However, this also means that the "end" event is emitted by the "dest" object in the B -> C pipe. Since we're tearing down on the dest "end" event, this means that the B -> C pipe will be torn down, even though the filter (B) might not be done emitting data.

I'm tempted to just take @koichik's patch here, but I'm wary because we've never gotten around to doing the full-scale inventory and cleanup of all our pipes and streams in core. Since it is a semantic change (albeit probably a correct one), I'm leaning towards putting this in no earlier than 0.8, so that we have a little bit of time to clean up stream usage internally first. I'd hate to break anyone's program that's "working by accident" because of this odd edge.

@mikeal
Node.js Foundation member

@isaacs i don't think the patch is complete as is. we can't leave pipe listeners hooked up indefinitely, which is what this does for half the streams in core.

before this goes in we either need to get the other core streams to emit "close" and message everyone in userland to do the same, or we need to get some better logic in pipe to understand when the source is piped back to the destination.

if you want to fix this bug in the short term, make "close" a writable stream only event. that way even if you route it back to itself "close" only relates to the writable end of the pipe loop.

@koichik

before this goes in we either need to get the other core streams to emit "close"

net.Socket, fs.WriteStream, http/https.ClientRequest, http/https.ServerResponse, tls.CleartextStream and tty.WriteStream already emit a 'close'. Probably, we need only fix zlib.Zlib.

if you want to fix this bug in the short term, make "close" a writable stream only event.

It will fix a symmetrical bug of this. The problem is that there is no event when Writable Stream (destination) is no longer writable.
example:

var reader = ...; // readable stream
var socket = net.connect({port: 80, allowHalfOpen: true}); // duplex stream
reader.pipe(socket);
socket.end();

In this case, because the socket is still opened and readable, 'close' is not emitted. So, Stream.pipe() does not tear down the pipe. @mikeal's suggestion solves this. However, 'close' means "the underlying file descriptor has been closed". Do we change this and do introduce a new event (@isaacs suggests 'destroy') with this meaning?

'end', 'close', 'destroy', 'abort', 'error'... Welcome to event hell :)

@isaacs

How do you guys feel about postponing this issue for further review and redesign in 0.8, so that we can address it from the ground up?

@felixge

How do you guys feel about postponing this issue for further review and redesign in 0.8, so that we can address it from the ground up?

Works for me. I agree that we need to have a deeper discussion about Streams and API changes at some point.

@koichik

@isaacs - I'm okay.

@substack

I just ran into this unexpected behavior with a duplex stream. Here's an idea to make this patch much more compatible with existing code:

if (!dest.readable) dest.on('end', cleanup);

dest is already expected to be writable, but if it is also readable then the 'end' events are presumed to be handled by the readable side.

Edit: published this as a separate module since I want this behavior for things https://github.com/substack/duplex-pipe

@dominictarr

+1 to substacks suggestion.

@bnoordhuis
Node.js Foundation member

James' proposal seems sensible enough to me. I'm generally +1.

One potential issue is that if the source stream never emits 'end' or 'close' (unlikely but perhaps not quite impossible), the event listeners on the destination stream never get cleaned up. Not sure how much of an issue that is in real life.

@dominictarr

The core streams all emit 'close' eventually, including zlib.
If the source emits 'end', 'close', or 'error' that will trigger cleanup also.
It's possible that some userland streams do not eventually emit close,
but I have certainly been advocating always emitting 'close' for several months now.

Maybe if someone is using {end: false} in production, hmm, but that should still cleanup when the source end/closes/errors.

@dominictarr

@mikeal reading through http.js looks like it does emit close https://github.com/joyent/node/blob/master/lib/http.js#L1307

@substack

I just ran into this bug again. It seems this bug is more serious than I had previously thought. If you have a duplex stream you will run into this bug. Raynos/duplexer#2

Duplex streams are basically fundamentally broken in node right now. This is probably causing inexplicable bugs in production code right now.

@mikeal
Node.js Foundation member

@dominictarr is that on the end of an http response or is it when the socket is closed?

@isaacs

What writable streams emit 'end'? I think this is a bug in the pipe() function.

HTTP Responses emit 'finish', not 'end'.
Zlib classes emit 'end' when there's no more output; this has nothing to do with the writable end. (Though, that can never happen until all the input has been consumed, of course.)
Sockets emit 'end' when no more 'data' events will occur; this has nothing to do with the writable direction.
FS writable streams don't emit 'end'. They emit 'close' when the fd is closed.

Are there any I'm missing? What's the reason to not just remove that line?

Update: If there are no compelling reasons posted here within a week, I'm taking @koichik's patch in master, and we can deal with the consequences. It doesn't make any tests fail, and I don't currently see any reason why we shouldn't take it.

@jjrdn

@isaacs I agree that is is a bug in the pipe function, but it looks like HTTP responses now emit 'end' instead of 'finish'. See 790d651

After looking at this issue, it seems obvious that readable and writable streams cannot share the same event names, so that should probably be changed back to 'finish', right?

@koichik

Closed by 016afe2. Thanks, @isaacs.

@jjrdn - Right, 790d651 is reverted by 836a06f.

@koichik koichik closed this Oct 13, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment