New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sockets can emit `close` before `end`, breaking #479. #490

Closed
vqvu opened this Issue Apr 28, 2016 · 8 comments

Comments

Projects
None yet
2 participants
@vqvu
Collaborator

vqvu commented Apr 28, 2016

In this week's installment "Node Streams Incompatibility", it turns out that Sockets can emit end after close even though the Readable streams docs say that

The [close] event indicates that no more events will be emitted, and no further computation will occur.

Emphasis mine.
#479 was written under the assumption that a close means no more data to emit. It will end the wrapper stream if it ever receives close without having seen end before. This seems compatible with the Readable docs.

However, for a Socket, close means the client disconnected. There may still be buffered data previously received that has not yet been emitted. In fact, it is possible to receive end after close (see below). The end result is that if you are slow at consuming the incoming data, and the client disconnects before you're done, the Highland stream will end, unpipe from the socket, and you lose any pending data.

Now, there's a reason why #479 tries to end the wrapper when it gets . It is possible to get close without end (see below). If we don't close, the wrapper stream will never end.

So now we have a case where we can't end on close or we lose data and a case where we must end on close or we hang. What should we do?

I'm of the opinion that it is better to hand (visible though difficult to debug) than to throw away data (possibly not visible). Especially since the former case seems to require an explicit call to destroy by the user.

Thoughts?


Close before end in Sockets

I discovered this when messing around with sockets from the discussion in #489. Start this server.

var i = 0;
var dest = _();
var server = net.createServer((socket) => {
    var id = i++;

    console.log(`Client ${id}: connected`);
    socket.on('end', () => console.log(`Client ${id}: ended`))
        .on('close', () => console.log(`Client ${id}: closed`));

    socket.setEncoding('utf-8');

    _(socket)
        .doto(x => {
            console.log(`Client ${id}: received '${x.trim()}'`);
            if (x === 'consume\r\n') {
                console.log(`Client ${id}: consuming`);
                dest.each(() => {});
            }
        })
        .pipe(dest, {end: false});
});

server.listen(8080, () => console.log('server up.'));

In this example, dest starts out blocked, simulating slowness. In a different terminal, use telnet to send some data.

$ telnet localhost 8080
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
abc
def
ghi
^]

telnet> Connection closed.

Close the connection, you should see the server print out

server up.
Client 0: connected
Client 0: received 'abc'
Client 0: closed

Open a different connection and unblock the stream by sending the word consume.

$ telnet localhost 8080
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
consume
^]

telnet> Connection closed.

You will see this under v2.7.4

Client 1: connected
Client 1: received 'consume'
Client 1: consuming
Client 0: received 'def'
Client 0: received 'ghi'
Client 0: ended         <-- `end` event from the first client!
Client 1: ended
Client 1: closed

and this under master. Note all the missing data from the first client.

server up.
Client 0: connected
Client 0: received 'abc'
Client 0: closed
Client 1: connected
Client 1: received 'consume'
Client 1: consuming
Client 1: ended
Client 1: closed

Close without end

Destroying an FS stream emits close but not end.

// The file here must be larger than the highWaterMark.
var read = fs.createReadStream('some-large-file');

var dest = _(read);

read.on('close', () => {
    console.log('closed');
    dest.done(() => console.log('I am done.'));
});

read.on('end', () => console.log('end'));

dest.pull((err, x) => {
    console.log("Received chunk.");
    read.destroy();
});

// Under master
// => Received chunk.
// => closed
// => I am done.

// Under 2.7.4
// => Received chunk.
// => closed

@vqvu vqvu added the discussion label Apr 28, 2016

vqvu added a commit to vqvu/highland that referenced this issue May 6, 2016

@vqvu

This comment has been minimized.

Show comment
Hide comment
@vqvu

vqvu May 6, 2016

Collaborator

Since silent dropping data seems way worse, I've gone ahead and reverted the relevant parts of the changes from #479. See 76d13ec. This is released as v2.8.1.

Feel free to reopen if anyone has anything to say.

Collaborator

vqvu commented May 6, 2016

Since silent dropping data seems way worse, I've gone ahead and reverted the relevant parts of the changes from #479. See 76d13ec. This is released as v2.8.1.

Feel free to reopen if anyone has anything to say.

@vqvu vqvu closed this May 6, 2016

@ronag

This comment has been minimized.

Show comment
Hide comment
@ronag

ronag May 6, 2016

@vqvu I was waiting for this fix. 'end' or 'error' isn't called when e.g. http requests to a server are aborted by the client... only 'close'...

ronag commented May 6, 2016

@vqvu I was waiting for this fix. 'end' or 'error' isn't called when e.g. http requests to a server are aborted by the client... only 'close'...

@vqvu

This comment has been minimized.

Show comment
Hide comment
@vqvu

vqvu May 6, 2016

Collaborator

Yeah...I'm not sure what we can do about it, honestly. We can't support both the IncomingMessage close behavior and the Socket close behavior at the same time. I guess it's too much to ask for node Readables to have consistent semantics...

The best we can do here, is to add some sort of new _.fromReadable method that takes an options object that defines how to proceed. So the IncomingMessage usage would be something like _.fromReadable(req, {endOnClose: true}). We could also add it directly to the _ constructor if I can figure out a way to describe the usage without being too confusing.

I assume something like that would work for you?

Collaborator

vqvu commented May 6, 2016

Yeah...I'm not sure what we can do about it, honestly. We can't support both the IncomingMessage close behavior and the Socket close behavior at the same time. I guess it's too much to ask for node Readables to have consistent semantics...

The best we can do here, is to add some sort of new _.fromReadable method that takes an options object that defines how to proceed. So the IncomingMessage usage would be something like _.fromReadable(req, {endOnClose: true}). We could also add it directly to the _ constructor if I can figure out a way to describe the usage without being too confusing.

I assume something like that would work for you?

@ronag

This comment has been minimized.

Show comment
Hide comment
@ronag

ronag May 26, 2016

@vqvu: Yes, that would work... Right now I do this:

    const stream = new PassThrough()
    req.on('error', err => stream.emit('error', err))
    req.on('close', err => stream.emit('end'))
    req.pipe(stream)
    const stream2 = _(stream)

ronag commented May 26, 2016

@vqvu: Yes, that would work... Right now I do this:

    const stream = new PassThrough()
    req.on('error', err => stream.emit('error', err))
    req.on('close', err => stream.emit('end'))
    req.pipe(stream)
    const stream2 = _(stream)
@ronag

This comment has been minimized.

Show comment
Hide comment
@ronag

ronag May 27, 2016

Can't we solve the socket case by testing whether more data can be retrieved after 'close' and if not 'end'.

ronag commented May 27, 2016

Can't we solve the socket case by testing whether more data can be retrieved after 'close' and if not 'end'.

@vqvu

This comment has been minimized.

Show comment
Hide comment
@vqvu

vqvu May 27, 2016

Collaborator

I'm not sure how. I don't know of any API that lets us check for buffered data.

Collaborator

vqvu commented May 27, 2016

I'm not sure how. I don't know of any API that lets us check for buffered data.

@ronag

This comment has been minimized.

Show comment
Hide comment
@ronag

ronag commented Jul 3, 2016

Not sure if this is any help? https://github.com/jshttp/on-finished

@vqvu

This comment has been minimized.

Show comment
Hide comment
@vqvu

vqvu Jul 4, 2016

Collaborator

That code does inspection on specific http request/response data structures, which we can't really do, since it doesn't apply to all Readables. However, it does give me an idea. See #505.

We can simply punt the issue to the user entirely by allowing the second argument to be an arbitrary handler which can be as simple or complex as necessary. People can then delegate this behavior to a library like on-finish if they'd like (e.g., _(req, onFinish)).

Collaborator

vqvu commented Jul 4, 2016

That code does inspection on specific http request/response data structures, which we can't really do, since it doesn't apply to all Readables. However, it does give me an idea. See #505.

We can simply punt the issue to the user entirely by allowing the second argument to be an arbitrary handler which can be as simple or complex as necessary. People can then delegate this behavior to a library like on-finish if they'd like (e.g., _(req, onFinish)).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment