http: Do not buffer writes to a destroyed socket #4775
Conversation
As a user of this OutgoingMessage, what is the best way to detect that your underlying socket is gone? |
Maybe it should raise EPIPE instead of being quiet like it is now. Otherwise, you'll just be leaking (much smaller) req objects instead of half a million long array of buffers. |
Sending up some kind of error would be nice, but that would also be a change in behavior. I'm happy to fix this in our application code with a simple check for socket.writable or something like that. Is that the best way to do this? |
Well, as I look into this further, it seems like it's actually a pretty subtle thing. If the connection gets destroyed at just the right time, you'll get no error at all, and it'll treat it as a "connecting" request. But, if it's destroyed just a bit later, then you'll get an EPIPE when you try to write to it, because the client won't have gotten the message yet. If it's destroyed a bit sooner, during the first request, then it'll trigger the createHangUpError() function, which will raise ECONNREFUSED on the client (like you probably would expect). So, this only happens when a socket is reused, and the connection gets dropped in between them. (Otherwise, you'd see it every time any of your servers goes down ever.) Maybe the correct fix is to just create a hangup error, and call it a day, since that's the behavior that happens most of the time in v0.8 when a socket dies anyway, and we could make the case that ignoring socket destruction is just a simple bug, and we should treat it like as if it happened while we were listening. |
@mranney Thoughts on 727cad5? This way, it just raises a connection hangup error no matter when it happens, which is probably more correct anyway. |
I agree that it seems more correct, so it's probably better. |
Prior to v0.10, Node ignored ECONNRESET errors in many situations. There *are* valid cases in which ECONNRESET should be ignored as a normal part of the TCP dance, but in many others, it's a very relevant signal that must be heeded with care. Exacerbating this problem, if the OutgoingMessage does not have a req.connection._handle, it assumes that it is in the process of connecting, and thus buffers writes up in an array. The problem happens when you reuse a socket between two requests, and it is destroyed abruptly in between them. The writes will be buffered, because the socket has no handle, but it's not ever going to GET a handle, because it's not connecting, it's destroyed. The proper fix is to treat ECONNRESET correctly. However, this is a behavior/semantics change, and cannot land in a stable branch. Fix nodejs#4775
Looks great. I love it. @isaacs, You're my hero. |
Prior to v0.10, Node ignored ECONNRESET errors in many situations. There *are* valid cases in which ECONNRESET should be ignored as a normal part of the TCP dance, but in many others, it's a very relevant signal that must be heeded with care. Exacerbating this problem, if the OutgoingMessage does not have a req.connection._handle, it assumes that it is in the process of connecting, and thus buffers writes up in an array. The problem happens when you reuse a socket between two requests, and it is destroyed abruptly in between them. The writes will be buffered, because the socket has no handle, but it's not ever going to GET a handle, because it's not connecting, it's destroyed. The proper fix is to treat ECONNRESET correctly. However, this is a behavior/semantics change, and cannot land in a stable branch. Fix #4775
Landed in v0.8. |
@isaacs It seems like this change is causing some people's "working" apps to die from socket hang up errors. I just saw a user report it at senchalabs/connect#750 while I was searching to see if any people were having the same issue as myself. For myself, I had to temporarily revert back to 0.8.19 while I figure out how to handle these. I am getting the socket hang up errors in a server application using express and and hang up will occur on the first write to a client. The main issue is that a lot of the clients are sadly within networks which have monitoring equipment that is trying to block clients from accessing the application. Since the application is working over SSL, this monitoring equipment just injects Do you have any suggestions, perhaps? |
@dougwilson Listening for error events should solve the problem, unless express is getting in the way somehow. On 0.8.19 you are probably leaking memory, but maybe that's better than the arms race that it sounds like you are already dealing with. |
@mranney Thanks, but AFAIK the memory leak was only for http clients, not servers. At least in my situation, it wasn't leaking, as since I have been able to work around it now by listening for errors high up and just reverting to the pre-0.8.20 behavior of "swallowing" |
Man, HTTP is hard. |
@dougwilson Can you provide a standalone example script/test that shows the bad behavior? If it is actually not a leak in the server case, and is causing new unwanted errors, then we will fix it, and a failing test will make that much easier. However, like @mranney, I suspect that you were actually leaking memory before, and ought to be listening to error events anyway. |
So I ended up using the following simple code which does not error pre-0.8.20 and does afterwards: // node --trace_gc --expose_gc
var http = require('http');
var server = http.createServer(function (req, res) {
res.once('error', function (error) {
// Only encountered in 0.8.20
console.error('sock: ' + error);
});
res.write('client, destroy connection\n\n');
setTimeout(function () {
res.write(Buffer(65536));
res.write(Buffer(65536));
res.write(Buffer(65536));
res.end();
process.nextTick(gc);
}, 2000);
});
setInterval(function () {
console.error('RSS: %d', process.memoryUsage().rss);
}, 2000);
server.listen(3000); And I do indeed see a memory leak pre-0.8.20 and not in 0.8.20. I repeatedly ran Anyway, thanks, I'm listening to error events now on |
I see. So, the memory leak definitely affects servers as well as clients. Potentially much worse, in fact. Check out this script: https://gist.github.com/isaacs/4973669 Run that script with node v0.8.19, and you'll see that it's much worse than just crisply emitting an error: the server keeps writing data and filling up the output buffer until it runs out of memory relatively quickly (or at least, would, if it was not just writing the same exact buffer repeatedly). There's no universe in which that kind of behavior is not a bug. An ECONNRESET on a socket that you're writing to is clearly an unexpected, erroneous event. v0.8.19 will raise a hangup error if the ECONNRESET occurs at other times, but ignores it once the 'request' event is emitted. Thereafter, it treats reset sockets as if they were in the process of connecting (which is, for servers, even more wrong!) It does pain me that the behavior changed. But choosing between leaking memory to slow death, vs emitting an error event (and crash your program if unhandled) with a clear and evident cause, it's a no-brainer in terms of overall stability and debuggability. Add error event handlers to every object you touch, or use domains. You can use the cluster module, and something like the |
Thanks. I am using clusters, so crashes were only taking out workers, but typically a worker would be handling lots of requests at once, so one client hanging up on the worker would cut off lots of other user's requests, which sucked. I mainly wanted to bring this up and hear what you though, so thanks a lot, I appreciate it! |
can docs be added about http://nodejs.org/api/http.html#http_class_http_serverresponse |
Doc patch welcome. |
This fix fills me with sadness. Here is why:
Prior to v0.10, Node ignored ECONNRESET errors in many situations.
There are valid cases in which ECONNRESET should be ignored as a
normal part of the TCP dance, but in many others, it's a very relevant
signal that must be heeded with care.
Exacerbating this problem, if the OutgoingMessage does not have a
req.connection._handle, it assumes that it is in the process of
connecting, and thus buffers writes up in an array.
The problem happens when you reuse a socket between two requests, and it
is destroyed abruptly in between them. The writes will be buffered,
because the socket has no handle, but it's not ever going to GET a
handle, because it's not connecting, it's destroyed.
The proper fix is to treat ECONNRESET correctly. However, this is a
behavior/semantics change, and cannot land in a stable branch. So, the
full-of-sad bandaid fix is to not put data into the output buffer if the
socket is destroyed, and also remove anything that is in the output
buffer when the HTTP request sees that it closes.
/cc @dpacheco @bcantrill @mranney @piscisaureus