New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connections stay in CLOSE_WAIT with Node 0.10.x #180

Closed
nicokaiser opened this Issue Apr 10, 2013 · 35 comments

Comments

Projects
None yet
10 participants
@nicokaiser
Contributor

nicokaiser commented Apr 10, 2013

After upgrading to Node 0.10 my servers keep accumulating TCP connections in the CLOSE_WAIT state until it hits the max-files limit.

Example server (based on the ws README example:

var WebSocketServer = require('ws').Server
  , wss = new WebSocketServer({port: 8080});

wss.on('connection', function(ws) {
  console.log('connection');
  ws.send('something');
  ws.on('close', function() {
    console.log('close');
    //ws.upgradeReq.socket.destroy();
  });
  ws.on('end', function() {
    console.log('end');
  });
});
  • Run the above server
  • Open Chrome console and execute new WebSocket('ws://localhost:8080/'); several times (I could not reproduce this issue with wscat).
  • Hit reload (all the connections close, you see some "close" message in the server output
  • lsof | grep CLOSE_WAIT lists all the closed connections in CLOSE_WAIT state
  • Terminate the server
  • lsof | grep CLOSE_WAIT is empty

If you modify the server and comment in the destroy line, there are no CLOSE_WAIT connections.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Apr 10, 2013

Contributor

websocket.io does not seem to have this problem, as all sockets are destroyed (!) after close: https://github.com/LearnBoost/websocket.io/blob/master/lib/socket.js#L76

(while this is not nice at all, this workarounds the above problem)

Contributor

nicokaiser commented Apr 10, 2013

websocket.io does not seem to have this problem, as all sockets are destroyed (!) after close: https://github.com/LearnBoost/websocket.io/blob/master/lib/socket.js#L76

(while this is not nice at all, this workarounds the above problem)

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Apr 11, 2013

Contributor

The problem is Socket.end() called in cleanupWebsocketResource does not really close the socket. The socket stays half-open forever.

https://github.com/einaros/ws/blob/master/lib/WebSocket.js#L680

Maybe this should be if (!error) this._socket.destroy(); as we cannot hope for broken clients properly closing the connection (at least Chrome does not seem to do so)...

Contributor

nicokaiser commented Apr 11, 2013

The problem is Socket.end() called in cleanupWebsocketResource does not really close the socket. The socket stays half-open forever.

https://github.com/einaros/ws/blob/master/lib/WebSocket.js#L680

Maybe this should be if (!error) this._socket.destroy(); as we cannot hope for broken clients properly closing the connection (at least Chrome does not seem to do so)...

@3rd-Eden

This comment has been minimized.

Show comment
Hide comment
@3rd-Eden

3rd-Eden Apr 11, 2013

Member

@nicokaiser did the change you suggested in #181 resolve this?

Member

3rd-Eden commented Apr 11, 2013

@nicokaiser did the change you suggested in #181 resolve this?

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Apr 11, 2013

Contributor

@3rd-Eden no, #181 did not help here. The destroy() call in #181 only kills broken connections, in this case the end() method is called (which does not end the socket).

Contributor

nicokaiser commented Apr 11, 2013

@3rd-Eden no, #181 did not help here. The destroy() call in #181 only kills broken connections, in this case the end() method is called (which does not end the socket).

@3rd-Eden

This comment has been minimized.

Show comment
Hide comment
@3rd-Eden

3rd-Eden Apr 11, 2013

Member

@nicokaiser according to the docs, destroy should only be called when an error occurred. So I'm wondering if there's something else that keeps these sockets open (which would be a leak?).

Member

3rd-Eden commented Apr 11, 2013

@nicokaiser according to the docs, destroy should only be called when an error occurred. So I'm wondering if there's something else that keeps these sockets open (which would be a leak?).

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Apr 11, 2013

Contributor

Yes, I know. So there seems to be a leak. Can you test the server I posted above with Chrome as WebSocket client?

I'll try to write a test for the ws module, but the tests seem broken for me (DEPTH_ZERO_SELF_SIGNED_CERT)

Contributor

nicokaiser commented Apr 11, 2013

Yes, I know. So there seems to be a leak. Can you test the server I posted above with Chrome as WebSocket client?

I'll try to write a test for the ws module, but the tests seem broken for me (DEPTH_ZERO_SELF_SIGNED_CERT)

@3rd-Eden

This comment has been minimized.

Show comment
Hide comment
@3rd-Eden

3rd-Eden Apr 11, 2013

Member

I'm not running 0.10 yet, i'm waiting until the dreadfull crypto performance is fixed.

Member

3rd-Eden commented Apr 11, 2013

I'm not running 0.10 yet, i'm waiting until the dreadfull crypto performance is fixed.

@einaros

This comment has been minimized.

Show comment
Hide comment
@einaros

einaros Apr 25, 2013

Contributor

Hrm. Is this a node-in-general type issue? Have you seen any discussion on it elsewhere?

Contributor

einaros commented Apr 25, 2013

Hrm. Is this a node-in-general type issue? Have you seen any discussion on it elsewhere?

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Apr 25, 2013

Contributor

I don't know. Did not find anything usable in other bug reports, and I did not find any information if the Socket.end() behavior has changed since 0.8.

Contributor

nicokaiser commented Apr 25, 2013

I don't know. Did not find anything usable in other bug reports, and I did not find any information if the Socket.end() behavior has changed since 0.8.

@tarunkohli

This comment has been minimized.

Show comment
Hide comment
@tarunkohli

tarunkohli Apr 25, 2013

I'm facing a similar issue in my app. Node 0.10 HTTP Server(read: NOT WebSocketServer) starts leaving the connection in CLOSE_WAIT stage when I simulate a load of more than 1000 concurrent users. I'm using ab(apache bench) to simulate the concurrent requests to our node.js app.

I initially thought that it was a problem with our app leaking memory while accessing a Redis cluster and MySQL databases but then I tested this behavior with just a "Hello World" app and results into similar behavior.

tarunkohli commented Apr 25, 2013

I'm facing a similar issue in my app. Node 0.10 HTTP Server(read: NOT WebSocketServer) starts leaving the connection in CLOSE_WAIT stage when I simulate a load of more than 1000 concurrent users. I'm using ab(apache bench) to simulate the concurrent requests to our node.js app.

I initially thought that it was a problem with our app leaking memory while accessing a Redis cluster and MySQL databases but then I tested this behavior with just a "Hello World" app and results into similar behavior.

@3rd-Eden

This comment has been minimized.

Show comment
Hide comment
@3rd-Eden

3rd-Eden Apr 25, 2013

Member

There aren't any incompatible changes described in the upgrade list as well; https://github.com/joyent/node/wiki/API-changes-between-v0.8-and-v0.10

@isaacs are you aware of any issues like this?

Member

3rd-Eden commented Apr 25, 2013

There aren't any incompatible changes described in the upgrade list as well; https://github.com/joyent/node/wiki/API-changes-between-v0.8-and-v0.10

@isaacs are you aware of any issues like this?

@isaacs

This comment has been minimized.

Show comment
Hide comment
@isaacs

isaacs Apr 25, 2013

@tarunkohli @3rd-Eden This sounds like potentially a bug in node itself. Can you please post your example program and test method in a new issue over at https://github.com/joyent/node/issues?

isaacs commented Apr 25, 2013

@tarunkohli @3rd-Eden This sounds like potentially a bug in node itself. Can you please post your example program and test method in a new issue over at https://github.com/joyent/node/issues?

@3rd-Eden

This comment has been minimized.

Show comment
Hide comment
@3rd-Eden

3rd-Eden May 9, 2013

Member

Was skimming through the node-redis changelog today and noticed; NodeRedis/node_redis#419 so i'm wondering if changing our stream.end calls to stream.destroySoon would resolve this.

@nicokaiser could you see if https://github.com/einaros/ws/blob/master/lib/WebSocket.js#L688 -> destroySoon resolves it for you?

Member

3rd-Eden commented May 9, 2013

Was skimming through the node-redis changelog today and noticed; NodeRedis/node_redis#419 so i'm wondering if changing our stream.end calls to stream.destroySoon would resolve this.

@nicokaiser could you see if https://github.com/einaros/ws/blob/master/lib/WebSocket.js#L688 -> destroySoon resolves it for you?

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser May 9, 2013

Contributor

@3rd-Eden net.Socket does not have destroySoon, does it?

Changing the socket.end calls to socket.destroy seems to solve the problem, however completely destroying the socket instead of cleanly ending them is not the nicest way, as @einaros said somewhere else...

Contributor

nicokaiser commented May 9, 2013

@3rd-Eden net.Socket does not have destroySoon, does it?

Changing the socket.end calls to socket.destroy seems to solve the problem, however completely destroying the socket instead of cleanly ending them is not the nicest way, as @einaros said somewhere else...

@3rd-Eden

This comment has been minimized.

Show comment
Hide comment
@3rd-Eden

3rd-Eden May 9, 2013

Member

@nicokaiser They do, but it's an internal method; https://github.com/joyent/node/blob/master/lib/net.js#L410-L418 what it does it try to end nicely when possible or destroy it when it's allowed to.

Member

3rd-Eden commented May 9, 2013

@nicokaiser They do, but it's an internal method; https://github.com/joyent/node/blob/master/lib/net.js#L410-L418 what it does it try to end nicely when possible or destroy it when it's allowed to.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser May 15, 2013

Contributor

@3rd-Eden don't know if it's a good idea to rely on internal node methods here...

Contributor

nicokaiser commented May 15, 2013

@3rd-Eden don't know if it's a good idea to rely on internal node methods here...

@3rd-Eden

This comment has been minimized.

Show comment
Hide comment
@3rd-Eden

3rd-Eden May 15, 2013

Member

I know, but forcefully destroying a socket isn't either.

Twitter: @3rdeden
Github: @3rd-Eden

On Wed, May 15, 2013 at 9:53 AM, Nico Kaiser notifications@github.com
wrote:

@3rd-Eden don't know if it's a good idea to rely on internal node methods here...

Reply to this email directly or view it on GitHub:
#180 (comment)

Member

3rd-Eden commented May 15, 2013

I know, but forcefully destroying a socket isn't either.

Twitter: @3rdeden
Github: @3rd-Eden

On Wed, May 15, 2013 at 9:53 AM, Nico Kaiser notifications@github.com
wrote:

@3rd-Eden don't know if it's a good idea to rely on internal node methods here...

Reply to this email directly or view it on GitHub:
#180 (comment)

@isaacs

This comment has been minimized.

Show comment
Hide comment
@isaacs

isaacs May 17, 2013

Using destroySoon is not a good idea. We've been planning for a long time to remove or rename that function, because it's really confusing. But the fact that it fixes the error is good information.

Again, can anyone point me at a minimized test case where this happens reliably? There's mention of such a thing in this thread, but I haven't been able to reproduce it. I'd really like to get this fixed properly.

isaacs commented May 17, 2013

Using destroySoon is not a good idea. We've been planning for a long time to remove or rename that function, because it's really confusing. But the fact that it fixes the error is good information.

Again, can anyone point me at a minimized test case where this happens reliably? There's mention of such a thing in this thread, but I haven't been able to reproduce it. I'd really like to get this fixed properly.

@brianseeders

This comment has been minimized.

Show comment
Hide comment
@brianseeders

brianseeders May 20, 2013

I happened across this thread while trying to diagnose a similar issue.

FYI, this is also happening for me, except using http.request. All of my connections sit around in CLOSE_WAIT and it seems that I have to forcibly close them. So, maybe it is indeed a problem in node itself?

brianseeders commented May 20, 2013

I happened across this thread while trying to diagnose a similar issue.

FYI, this is also happening for me, except using http.request. All of my connections sit around in CLOSE_WAIT and it seems that I have to forcibly close them. So, maybe it is indeed a problem in node itself?

@isaacs

This comment has been minimized.

Show comment
Hide comment
@isaacs

isaacs May 20, 2013

@brianseeders I'm sure that it is. Do you have a test case that can reproduce it? Are you the client or the server?

If you'er the client, what server are you connecting to? If you're the server, where are the connections coming from?

I'd love to fix this, but don't have anything more than anecdotal reports, so it's pretty hard to track down.

isaacs commented May 20, 2013

@brianseeders I'm sure that it is. Do you have a test case that can reproduce it? Are you the client or the server?

If you'er the client, what server are you connecting to? If you're the server, where are the connections coming from?

I'd love to fix this, but don't have anything more than anecdotal reports, so it's pretty hard to track down.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser May 20, 2013

Contributor

@isaacs The initial issue report can reproduce the problem.

I did not manage to reproduce the problem with node clients, maybe because node always closes connections cleanly. However if you use the code I posted and make some WebSocket connections from Chrome, then Quit Chrome, the connections stay in CLOSE_WAIT until I restart the server.

Contributor

nicokaiser commented May 20, 2013

@isaacs The initial issue report can reproduce the problem.

I did not manage to reproduce the problem with node clients, maybe because node always closes connections cleanly. However if you use the code I posted and make some WebSocket connections from Chrome, then Quit Chrome, the connections stay in CLOSE_WAIT until I restart the server.

@brianseeders

This comment has been minimized.

Show comment
Hide comment
@brianseeders

brianseeders May 20, 2013

I am the client. I'm connecting to 100s of different hosts (with 10-40 connections concurrent, depending on how I set it). I'm processing URLs coming out of Twitter's stream, so it's a lot of different hosts.

The problem actually ends up showing up as DNS errors for me, because DNS lookups don't work. That is because no more connections can be made, even DNS lookups apparently, because of reaching the open files limit.

I am trying to put together a test case right now.

brianseeders commented May 20, 2013

I am the client. I'm connecting to 100s of different hosts (with 10-40 connections concurrent, depending on how I set it). I'm processing URLs coming out of Twitter's stream, so it's a lot of different hosts.

The problem actually ends up showing up as DNS errors for me, because DNS lookups don't work. That is because no more connections can be made, even DNS lookups apparently, because of reaching the open files limit.

I am trying to put together a test case right now.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser May 20, 2013

Contributor

Example client.js that destroys the socket, leading to a CLOSE_WAIT connection:

var WebSocket = require('ws');

var ws = new WebSocket('ws://localhost:8080');

ws.on('message', function(data, flags) {
    // the server.js will send us "something" on connection,
    // so we close the connection immediately after we receive this

    //ws.close();
    ws._socket.destroy();
    // as we are not a nice client, we destroy the socket,
    // without telling the server via WebSocket protocol
});

To reproduce the problem, run server.js (the code in the initial issue report), and run client.js several times.

The server will notice that the connections are closed (the close event is fired), but the all the connections stay in CLOSE_WAIT.

If the client closes the connection with ws.close() everything works fine, but we cannot assume every client does this (Chrome does not).

Contributor

nicokaiser commented May 20, 2013

Example client.js that destroys the socket, leading to a CLOSE_WAIT connection:

var WebSocket = require('ws');

var ws = new WebSocket('ws://localhost:8080');

ws.on('message', function(data, flags) {
    // the server.js will send us "something" on connection,
    // so we close the connection immediately after we receive this

    //ws.close();
    ws._socket.destroy();
    // as we are not a nice client, we destroy the socket,
    // without telling the server via WebSocket protocol
});

To reproduce the problem, run server.js (the code in the initial issue report), and run client.js several times.

The server will notice that the connections are closed (the close event is fired), but the all the connections stay in CLOSE_WAIT.

If the client closes the connection with ws.close() everything works fine, but we cannot assume every client does this (Chrome does not).

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser May 20, 2013

Contributor

As @3rd-Eden mentioned, changing the socket.end() in https://github.com/einaros/ws/blob/master/lib/WebSocket.js#L693 to socket.destroySoon() fixes the problem.

But I agree on @isaacs using internal functions is not the best way, let's better fix node's socket.end behavior.

Contributor

nicokaiser commented May 20, 2013

As @3rd-Eden mentioned, changing the socket.end() in https://github.com/einaros/ws/blob/master/lib/WebSocket.js#L693 to socket.destroySoon() fixes the problem.

But I agree on @isaacs using internal functions is not the best way, let's better fix node's socket.end behavior.

@brianseeders

This comment has been minimized.

Show comment
Hide comment
@brianseeders

brianseeders May 20, 2013

I've created a gist: https://gist.github.com/brianseeders/4f93615c1182e7d76956

The IP in that file is a server I spun up, so feel free to execute this against it.

The problem seems to depend on the size of the response. Small responses did not seem to produce the issue, but a larger one does.

Run this code, and check lsof -l | grep -c "162.209.11.173.*CLOSE_WAIT". You will begin seeing EMFILE errors once the number of connections begins to reach the open files limit.

There's not an explicit call that must be made to end the request, is there? I looked through the documentation and could not find one.

brianseeders commented May 20, 2013

I've created a gist: https://gist.github.com/brianseeders/4f93615c1182e7d76956

The IP in that file is a server I spun up, so feel free to execute this against it.

The problem seems to depend on the size of the response. Small responses did not seem to produce the issue, but a larger one does.

Run this code, and check lsof -l | grep -c "162.209.11.173.*CLOSE_WAIT". You will begin seeing EMFILE errors once the number of connections begins to reach the open files limit.

There's not an explicit call that must be made to end the request, is there? I looked through the documentation and could not find one.

@qzaidi

This comment has been minimized.

Show comment
Hide comment
@qzaidi

qzaidi May 27, 2013

For the record, we are also seeing this issue in 0.10.3. We don't use websockets, use cluster+redis+express.

qzaidi commented May 27, 2013

For the record, we are also seeing this issue in 0.10.3. We don't use websockets, use cluster+redis+express.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser May 31, 2013

Contributor

I'm sorry, I still cannot reproduce this with net.Socket (without ws).

@einaros can you have a look at the example above? ws seems to use a special combination of callbacks that trigger this problem...

Contributor

nicokaiser commented May 31, 2013

I'm sorry, I still cannot reproduce this with net.Socket (without ws).

@einaros can you have a look at the example above? ws seems to use a special combination of callbacks that trigger this problem...

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Jun 5, 2013

Contributor

... same for node 0.10.10 ...

Contributor

nicokaiser commented Jun 5, 2013

... same for node 0.10.10 ...

@piscisaureus

This comment has been minimized.

Show comment
Hide comment
@piscisaureus

piscisaureus commented Jun 6, 2013

Does nodejs/node-v0.x-archive@e0519ac fix it for anyone?

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Jun 6, 2013

Contributor

@piscisaureus Seems like this fixes the CLOSE_WAIT problem.

However there is another thing: when I start the server.js and then client.js it works fine (apart from the CLOSE_WAIT problem for node < 0.10.11). When I start client.js again and again, after some time (maybe 2 or 3 times) the client does not get data from the server.

Don't know if this is a node bug or something else, would be nice if anyone could try to reproduce this. I'll try this on node 0.8 tomorrow.

Contributor

nicokaiser commented Jun 6, 2013

@piscisaureus Seems like this fixes the CLOSE_WAIT problem.

However there is another thing: when I start the server.js and then client.js it works fine (apart from the CLOSE_WAIT problem for node < 0.10.11). When I start client.js again and again, after some time (maybe 2 or 3 times) the client does not get data from the server.

Don't know if this is a node bug or something else, would be nice if anyone could try to reproduce this. I'll try this on node 0.8 tomorrow.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Jun 7, 2013

Contributor

Yes, nodejs/node-v0.x-archive@e0519ac seems to restore the behavior of node 0.8.x in this case.

Contributor

nicokaiser commented Jun 7, 2013

Yes, nodejs/node-v0.x-archive@e0519ac seems to restore the behavior of node 0.8.x in this case.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Jun 14, 2013

Contributor

This seems to be fixed with Node 0.10.11. @brianseeders can you confirm this?

Contributor

nicokaiser commented Jun 14, 2013

This seems to be fixed with Node 0.10.11. @brianseeders can you confirm this?

@mikestaszel

This comment has been minimized.

Show comment
Hide comment
@mikestaszel

mikestaszel Oct 4, 2013

I'm seeing this on Node 0.10.4, would like to know if bumping up to 0.10.11 or even 0.10.20 would help.

mikestaszel commented Oct 4, 2013

I'm seeing this on Node 0.10.4, would like to know if bumping up to 0.10.11 or even 0.10.20 would help.

@nicokaiser

This comment has been minimized.

Show comment
Hide comment
@nicokaiser

nicokaiser Oct 4, 2013

Contributor

@crimsonredmk For me upgrading to Node 0.10.11 helped, so I close this issue.

Contributor

nicokaiser commented Oct 4, 2013

@crimsonredmk For me upgrading to Node 0.10.11 helped, so I close this issue.

@nicokaiser nicokaiser closed this Oct 4, 2013

@mkamioner

This comment has been minimized.

Show comment
Hide comment
@mkamioner

mkamioner Aug 5, 2015

I think this fixes the issue: #540

mkamioner commented Aug 5, 2015

I think this fixes the issue: #540

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment