High memory usage (or memleak?) #43

Closed
nicokaiser opened this Issue Mar 21, 2012 · 103 comments

@nicokaiser
Contributor

Hi all!

I created a small example for the memory leak my ws (or WebSocket.IO) server has.

This small server only accepts WebSocket connections and sends one message.
Client count and memory usage is displayed every 2 seconds.

The client opens 10,000 connections, then every minute 2,000 new connections are opened and 2,000 are closed, so the count stays at about 10,000.

https://gist.github.com/2152933
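
For context, a minimal sketch of such a test server (not the exact gist code; the ws API shown is the 0.4-era one, and the actual scripts are in the gist above):

var WebSocketServer = require('ws').Server;

// Accept connections, send one message per client, and log client count
// and memory usage every 2 seconds.
var wss = new WebSocketServer({ port: 8080 });

wss.on('connection', function (ws) {
  ws.send('hello');
  ws.on('error', function () {});
});

setInterval(function () {
  var mem = process.memoryUsage();
  console.log('clients: %d, rss: %d MB, heapUsed: %d MB',
    wss.clients.length,
    Math.round(mem.rss / 1048576),
    Math.round(mem.heapUsed / 1048576));
}, 2000);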

Memory Graph

The graph is the output of server.js (run on an EC2 "large" instance with node 0.6.13 and ws 0.4.9).

  • Server startup: RSS is about 13 MB (0 clients)
  • When the first 10,000 clients are connected, RSS = 420 MB
  • During the next 10 minutes clients come and go (see client.js), RSS grows to 620 MB
  • Heap usage stays stable
  • Client is stopped => RSS falls to 220 MB (waited 1 minute for GC)
  • After 1 minute, the client is started again => RSS jumps to 440 MB (10,000 clients)
  • During the next minutes, RSS grows again up to 630 MB (after 10 minutes)
  • Client is stopped => RSS falls to 495 MB
  • When I start the client again, RSS usage seems stable (at least compared to the two runs before).

Questions:

  1. 400 MB for 10,000 clients (with no data attached) is a lot. I accept that JS, as an interpreted language, is not as memory-optimized as C. BUT why does opening and closing 20,000 connections (during the 10-minute period) consume another 200 MB?
  2. The process is at about 30% CPU, so the GC has a chance to kick in (and ws uses nextTick, so the GC really has a chance).
  3. Why is the GC unable to free the memory after the second run? It can't be the Buffer/SlowBuffer problem (fragmented small Buffers in different 8k SlowBuffers), as no Buffers are in use anymore...
  4. Why does the RSS usage remain pretty stable after the first two runs?

The effect is the same (but slower) with only 1,000 client connections per minute, but things get even worse when I don't stop the client after 10 minutes. Our production server runs at about 30-40% CPU constantly (30k connections), 1-5% at night (1-2k connections), but is never completely idle. The growth of RSS usage never seems to stop.

On the production server, RSS grows until node crashes (0.4.x crashes at 1 GB) or the process gets killed by the system (0.6 supports more than 1 GB).

I'll try two things tomorrow:

  • The same setup with Node 0.4 (as 0.4 seems much better than 0.6 in terms of memory consumption), and
  • with different WebSocket libraries, e.g. "websock", which is not 100% stable but only consumes a third of the memory (still leaking though), and a variant of WebSocket.IO with the original Socket.IO (not ws!) hybi parsers, see my "nows" branch.

The setup is a default Amazon EC2 large instance, so this must be an issue for anyone who runs a WebSocket server using ws (or most likely also WebSocket.IO with ws receivers) with some traffic. I refuse to believe Node is not capable of serving this.

Or am I missing something?

Nico

@nicokaiser
Contributor

Update: the 60-minute version is still running, I'll post the results tomorrow. RSS seems to stabilize around 900 MB if I don't stop the clients. Come on, 900 MB (500 of which is garbage!)?!

@nicokaiser
Contributor

This is 60 minutes with 1 minute pause after each 60 minute run:

ws 60 minutes

Maybe the "leak" is no leak but very very high memory consumption...

@einaros
Contributor
einaros commented Mar 22, 2012

Great report, Nico. You are onto something when you say that each connection consumes a lot of memory. I pool rather aggressively, in order to ensure high speed. The 'ws' library is written to be as fast as possible, and will at present outperform all other node.js websocket libraries, for small data amounts as well as large.

That said, I will deal with this, to push the memory use down. An alternative is to introduce a configuration option which allows the user to favor either speed or memory consumption.

In your test above, are you sending any data for the 10k clients? How large would you say the largest packet is?

@nicokaiser
Contributor

This is WebSocket.IO 10-minute (like the very first chart) with Node 0.6.
It's even worse than ws, as it never returns the memory during idle phases.

wsio-test-10min

@nicokaiser nicokaiser closed this Mar 22, 2012
@nicokaiser nicokaiser reopened this Mar 22, 2012
@nicokaiser
Contributor

@einaros The client is the one in the Gist – the client sends 10k at the beginning of each connection.

A configuration option would be amazing! I'll observe the production server, which is running Node 0.6 right now, to see whether the memory usage growth levels off somewhere ;)

I understand that a lot of memory is needed to ensure fast connection handling; however, the memory should be freed after some (idle?) time...

@einaros
Contributor
einaros commented Mar 22, 2012

For 10k users each sending an initial 10k (in a single payload), the consumption would be at least 10,000 * 10 kB, i.e. around 100 MB.

I agree that the pools should be released. Adding a set of timers to do that without negatively affecting the performance for high connection counts will require thorough consideration before being implemented.
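
(A hypothetical sketch of the idea – not the actual ws BufferPool API – where a pool grows to serve bursts and an idle timer shrinks it back down, trading some allocation speed for lower steady-state memory:)

// Hypothetical pool: grows while traffic is high; a timer can call reset()
// after an idle period to drop the grown backing buffer.
function Pool(initialSize) {
  this.initialSize = initialSize;
  this.buffer = new Buffer(initialSize);
  this.used = 0;
}

Pool.prototype.take = function (length) {
  if (this.used + length > this.buffer.length) {
    // grow: allocate a bigger backing buffer (outstanding slices keep the old one alive)
    this.buffer = new Buffer(Math.max(this.buffer.length * 2, length));
    this.used = 0;
  }
  var chunk = this.buffer.slice(this.used, this.used + length);
  this.used += length;
  return chunk;
};

Pool.prototype.reset = function () {
  // called from an idle timer: shrink back to the initial size
  this.buffer = new Buffer(this.initialSize);
  this.used = 0;
};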

Thank you for your continued work on this. We'll get to the bottom of it :)

@nicokaiser
Contributor

Well, I open 10,000 clients at startup, and then open and close 2,000 random clients every minute. So there should be 10,000 clients all the time, but about 20,000 "connection" events (and thus 20,000 initial 10k messages received) during the 10 minutes.
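
(Roughly, a sketch of the client behaviour just described – not the actual gist client:)

var WebSocket = require('ws');

var sockets = [];

function connect() {
  var ws = new WebSocket('ws://localhost:8080');
  ws.on('open', function () {
    ws.send(new Array(10 * 1024).join('x')); // ~10k initial message
  });
  ws.on('error', function () {});
  sockets.push(ws);
}

// open 10,000 connections at startup
for (var i = 0; i < 10000; i++) connect();

// every minute, close 2,000 random connections and open 2,000 new ones
setInterval(function () {
  for (var j = 0; j < 2000; j++) {
    var idx = Math.floor(Math.random() * sockets.length);
    var old = sockets.splice(idx, 1)[0];
    if (old) old.close();
    connect();
  }
}, 60000);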

Isn't the pool released after a client disappears?

@einaros
Contributor
einaros commented Mar 22, 2012

Each pool is released as the client closes, yes.

@nicokaiser
Contributor

@einaros then it must be something else (or poor GC), because after each 10-minute period all clients are disconnected, so there should be no buffers left (so even the shared SlowBuffers should be freed by V8)...

@nicokaiser
Contributor

For comparison, this is WebSocket.IO without ws (I backported the original Socket.IO hybi modules) and without client tracking (only clientsCount):

wsionows-test-10min

GC seems to run from time to time, but does not catch everything. Maybe interesting for @guille

@einaros
Contributor
einaros commented Mar 22, 2012

@nicokaiser, I wrote the "old" websocket.io hybi parsers as well, so this is all my thing to fix in either case :)

@einaros
Contributor
einaros commented Mar 30, 2012

I've found a couple of issues now, which take care of the leaks visible through the v8 profiler's heap dump. With my recent changes, I can spawn 5k clients against a server, send some data, then disconnect the users, and the resulting heap dump will look more or less as it did before any clients connected.

I can't yet get the rss to fall back to a reasonable size, though, but I'm looking into it.

@3rd-Eden
Member

And you are not leaking any buffers? As those show up in RSS and not in the V8 heap.


@einaros
Contributor
einaros commented Mar 30, 2012

@3rd-Eden, js Buffer/SlowBuffer instances should show up in the heap dump. I figured it might be due to allocations in my native extensions, but the results were exactly the same with those disabled.

@nicokaiser
Contributor

@einaros that sounds great! That's exactly the kind of issue I get – growing RSS that doesn't fall back (the difference between heapTotal and rss grows until the process crashes).
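
(For reference, the gap being described can be watched with something like this – a small sketch, nothing ws-specific:)

// Log how much memory the process holds outside of V8's own heap
// (Buffers, allocator pages not yet returned to the OS, etc.).
setInterval(function () {
  var m = process.memoryUsage();
  console.log('rss: %d MB, heapTotal: %d MB, outside V8 heap: %d MB',
    Math.round(m.rss / 1048576),
    Math.round(m.heapTotal / 1048576),
    Math.round((m.rss - m.heapTotal) / 1048576));
}, 10000);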

@einaros
Contributor
einaros commented Mar 30, 2012

I can't help but think that the remaining issues are caused by something within node leaking native resources, probably triggered by something I am doing. And I'm guessing that something is the buffer pool. Now the question is how the buffers are handled / used wrong, and how they wind up leaking native allocation blocks.

@nicokaiser
Contributor

On the production system, I disabled BufferPools (by using new Buffer – like BufferPool would do with an initial size of 0 and no growing/shrinking strategy) and validation (by, well, not validating utf8), but this does not solve the problem.
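
(Roughly the kind of patch meant here – the names are illustrative, not the actual ws internals:)

// Allocate a fresh Buffer per frame instead of slicing from a shared pool,
// and skip utf-8 validation entirely (hypothetical helpers, for illustration).
function allocateFrameBuffer(length) {
  return new Buffer(length); // no pooling, no growing/shrinking strategy
}

function isValidUTF8(buffer) {
  return true; // validation disabled
}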

So it must be either the bufferutil (mask) code (I cannot test the JS version from the Windows port, as I'm using Node 0.4) or something else...

@einaros
Contributor
einaros commented Mar 30, 2012

@nicokaiser, I'll have another look at the bufferutil and see what I can find.

@3rd-Eden
Member

That could certainly be the issue here, as there are also countless reports that socket.io on node 0.4 leaks less memory than socket.io on node 0.6.
So it could be that node is leaking buffers somewhere as well.


@einaros
Contributor
einaros commented Mar 30, 2012

@nicokaiser, I disabled my native extensions again, to no avail.

Will go pick a fight with the buffers now.

@nicokaiser
Contributor

Another observation:

  • When I make the server do something expensive (e.g. a long for-loop, or sending all connected client objects over the REPL) while there are many clients (and many connection open/close operations), the RSS tends to be higher after the expensive (blocking) operation. So something might get queued up and never freed...
@einaros
Contributor
einaros commented Mar 30, 2012

Well, it is definitely the buffers. The question is how / why they are released by v8 while the native backing isn't.

@nicokaiser
Contributor

Is there a way to completely avoid the Buffers and e.g. use Strings for testing (I know this is highly inefficient)?

@einaros
Contributor
einaros commented Mar 30, 2012

Well, I'm not saying it's my buffers, just node Buffer buffers. I'll just have to find out whether this is something specific to what I'm doing, or an actual node bug.

@nicokaiser
Contributor

@einaros yes, but on a high-traffic websocket server, ws is the place that naturally generates many, many Buffer objects. So if we could avoid this (by using strings and accepting some inefficient, slow operations on them), we could test whether the memory problem disappears.

@einaros
Contributor
einaros commented Mar 30, 2012

You can't avoid Buffer objects for binary network operations.

@3rd-Eden
Member

Node will transform strings to Buffers automatically anyway, so even if we didn't do binary network operations it would still be impossible (if I remember correctly).


@nicokaiser
Contributor

Hm, ok, none of the optimizations had any effect on the very high memory usage – I assume Node is failing to release unused Buffer memory:

http://cl.ly/1R442g3t2d1T3S152s0i

Do you think compiling node for ia32 (instead of x64) might change something?

@nicokaiser
Contributor

Seems like the garbage collector in Node 0.6 is the culprit:

https://gist.github.com/2362575

  • install the weak and ws modules
  • run the server: node --expose_gc server.js (to force GC – it works without, but takes longer)
  • run the client a few times: node client.js
  • watch the difference between used objects and GC'ed objects...
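
A minimal sketch along the lines of that gist (using the weak module's callback-on-collection API; the actual scripts are in the gist above):

var WebSocketServer = require('ws').Server;
var weak = require('weak');

var connected = 0, created = 0, collected = 0;

var wss = new WebSocketServer({ port: 8080 });
wss.on('connection', function (ws) {
  created++;
  connected++;
  weak(ws, function () { collected++; }); // fires once V8 actually collects this socket
  ws.on('close', function () { connected--; });
});

setInterval(function () {
  if (global.gc) global.gc(); // available when run with --expose_gc
  console.log('connected: %d, not garbage collected: %d',
    connected, created - collected);
}, 2000);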
@nicokaiser
Contributor

... it works as expected with Node 0.4.12, but 0.6 seems not to collect all objects and keeps a random number of objects referenced somewhere. This is bad...

@nicokaiser
Contributor

... btw, this is not a ws problem (websock, for example, has exactly the same result), but I don't know how to debug this in node ...

@3rd-Eden
Member

Maybe @mraleph could give us some information on why this is happening :) I have seen similar changes with node 0.4 before.

@nicokaiser
Contributor

@3rd-Eden you mean node 0.6? In 0.4 everything works fine for me...

@3rd-Eden
Member

Yeah, there have been countless reports on the Socket.IO mailing list that node 0.6 suffers from bigger memory leaks than 0.4.

@nicokaiser you could try compiling Node.js with the latest V8 which also uses the new incremental GC (I don't think node 0.6 has that yet)

@mraleph
mraleph commented Apr 11, 2012

@3rd-Eden the only suggestion I have at the moment is to repair node-inspector (if it is still broken) and figure out what retains those sockets using heap snapshots.

@nicokaiser
Contributor

@3rd-Eden same for node 0.7.7 (v8 3.9.24.7).

The incremental GC is in v8 3.6.5 (node 0.6.14 has 3.6.6) with some improvements in 3.7:
http://code.google.com/p/v8/source/browse/trunk/ChangeLog

So this problem will likely still be there in 0.8...

@c4milo
c4milo commented Apr 11, 2012

@mraleph node-inspector is still outdated. I recently wrote a module to use the WebKit built-in devtools; it should help track down this issue.

https://github.com/c4milo/node-webkit-agent

@nicokaiser
Contributor

@c4milo I'm still trying to get node-webkit-agent running. It uses ws itself – won't this distort the statistics (as I'm trying to debug ws...)?

@c4milo
c4milo commented Apr 11, 2012

Uh oh, maybe, although you will find out by analyzing the snapshots. In my tests running your scripts, server.js seems pretty stable – it didn't go over 28M; client.js is the one that dies. FWIW, I'm using node 0.6.14 and ws 0.4.12 to run those tests.

@c4milo
c4milo commented Apr 11, 2012

Or maybe I'm missing something – client.js seemed stable too, never went over 35M, and I took 8 samples, roughly the same as I did with server.js. BTW, I'm using these scripts: https://gist.github.com/2152933. Also, I think the client didn't even die, it just reached the 10-minute mark.

@nicokaiser
Contributor

@c4milo can you try these scripts? https://gist.github.com/2362575

The server fails to garbage collect some of the socket objects. On the live server (with many thousands of connections per minute) this leads to growing memory consumption and makes the process crash once it has consumed all RAM.

@c4milo
c4milo commented Apr 11, 2012

@nicokaiser I just profiled the server and it only leaked something like 1k.

FWIW, I'm on OS X.

connected: 0, not garbage collected: 0
connected: 200, not garbage collected: 200
connected: 200, not garbage collected: 200
connected: 200, not garbage collected: 200
connected: 200, not garbage collected: 200
connected: 200, not garbage collected: 200
connected: 0, not garbage collected: 13

second run:
connected: 200, not garbage collected: 213
connected: 200, not garbage collected: 213
connected: 200, not garbage collected: 213
connected: 200, not garbage collected: 213
connected: 200, not garbage collected: 213
connected: 0, not garbage collected: 26
connected: 0, not garbage collected: 26
connected: 0, not garbage collected: 26
connected: 0, not garbage collected: 26

@nicokaiser
Contributor

@c4milo: yeah, but the question is: in your output, where does the "13" come from? Why are there 13 objects not being collected by the GC?

@nicokaiser
Contributor

... or is this the reference counting errors @bnoordhuis mentions in nodejs/node-v0.x-archive@e9dcfd4 ?

@bnoordhuis

@nicokaiser: It's not related. That was about the event loop aborting prematurely.

@c4milo
c4milo commented Apr 11, 2012

@nicokaiser in devtools you can compare the snapshots and it will show you the constructors involved in the growth

@c4milo
c4milo commented Apr 11, 2012

This is what it looks like: http://i.imgur.com/odYea.png. I can help you set up node-webkit-agent if you want, @nicokaiser.

@c4milo
c4milo commented Apr 11, 2012

I just saw your messages in IRC, @nicokaiser – sorry, new irssi installation and I don't have Growl notifications yet. Anyway, this link may help you interpret the results you're getting in the webkit devtools front-end.

@nicokaiser
Contributor

Ok, I found it. It's the defineProperty functions here https://github.com/einaros/ws/blob/master/lib/WebSocket.js#L51

If I replace those with regular properties like this.bytesReceived (and access bytesReceived directly instead of through the getter), everything works without leaks. I'll check this tomorrow as I have to go now...

@nicokaiser
Contributor

@einaros: why are you using Object.defineProperty(this, ... instead of e.g. this.bytesReceived = ...?

@rauchg
Contributor
rauchg commented Apr 12, 2012

I think he's trying to make the properties non-enumerable. Maybe for spec compliance, but it's definitely not an object you would iterate the keys of.

@nicokaiser
Contributor

@guille Or to protect the keys and prevent them from being changed. However, if I change Object.defineProperty(this, 'bytesReceived', ... to this.bytesReceived = ..., all WebSocket objects seem to be garbage collected.

I'll try to file a node.js bug on this...

@einaros
Contributor
einaros commented Apr 12, 2012

Ah, all this fancy talk while I'm sleeping. I need to get my alarm clock hooked up to Github somehow.

@nicokaiser You actually confirmed that it doesn't leak without the bytesReceived property? That's, at first glance, completely haywire.

@nicokaiser
Contributor

@einaros No, removing only bytesReceived does not help. I need to replace all the Object.defineProperty calls in the constructor (and use the properties directly in the code). I assume this has something to do with 'self' being referenced in the getters:

nodejs/node-v0.x-archive#3092
https://gist.github.com/2364840
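
(A sketch of the pattern in question – not the exact ws source – showing a getter that closes over self versus a plain property:)

// Suspected pattern: the property getter closes over `self`, so the property
// descriptor keeps a reference back to the WebSocket instance.
function WebSocketWithGetter() {
  var self = this;
  this._bytesReceived = 0;
  Object.defineProperty(this, 'bytesReceived', {
    get: function () { return self._bytesReceived; }
  });
}

// The workaround being tested: a plain, directly assigned property, which
// node 0.6/0.7 had no trouble garbage collecting in these tests.
function WebSocketWithPlainProperty() {
  this.bytesReceived = 0;
}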

@einaros
Contributor
einaros commented Apr 15, 2012

Where do we stand on this now? Any news from your servers, @nicokaiser?

@nicokaiser
Contributor

I installed ws 0.4.13 on the servers; it seems at least a bit better now. The crashes are gone (default error handlers) and the memory does not grow as fast.

However, it's still growing over time, which I attribute to the inefficient Node garbage collector (see node #3097 and #3092 and v8 #2073). I'll keep observing this...

@crickeys

what happens if you run node without crankshaft on 0.6.* like this
node --nocrankshaft script.js

@nicokaiser
Contributor

@crickeys no improvement with --nocrankshaft, I added this switch some days ago.
The memory is still not freed – especially the gap between "idle server after many connections" (800 MB memory) and "idle server after restart" (50-100 MB memory) is too big:

http://cl.ly/0Z3r0g1y2V412D3A1Q2b

(I restarted the server by hand to see whether some improvements I made had any effect, but nothing worked. If I keep it running for some days the graph looks like a staircase, and the server crashes when the RAM is full (or starts to swap, which is really bad).)

@crickeys

What happens if you expose the garbage collector and call gc() every few seconds in a setInterval?
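
(i.e. something along these lines, run with node --expose_gc server.js – just a sketch:)

setInterval(function () {
  if (global.gc) global.gc(); // force a full collection every 10 seconds
}, 10000);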

@nicokaiser
Contributor

@crickeys calling gc() every minute does not help at all; now I'm trying to call it every 10 seconds – but I don't expect any change...
Again, it's mainly RSS memory growing, so I suspect node is failing to release Buffer memory properly (across all node versions: 0.4, 0.6, 0.7). I don't know if this is my application (which I reviewed very carefully) or ws or node. I don't know how to profile RSS memory consumption (heap dumps from webkit-devtools-agent do not help here)...

@crickeys

I've been trying to debug a similar problem in my app for months. If the same code runs on node 0.4 all is well, but as soon as it goes to 0.6 I have a slow RSS memory leak as well.

@nicokaiser
Contributor

@crickeys Currently the app leaks on 0.4 and 0.6 (but maybe the http/net/ws implementation for 0.4 is buggy).

But some months ago I saw the same thing (virtually NO leak at all on 0.4, leak on 0.6). As the current server serves about 20k concurrent clients with 2,000 connection handshakes/closes every minute, node leaks about 100 MB per hour, which is very bad for a server I don't want to restart every few hours (as 20k clients would have to reconnect)...

@crickeys

That's the same type of issue I'm running into. My app is socket.io and I don't think these ws libraries are used on the server side yet. Hoping for a common solution. Very hard to debug memory leaks in node :(

@nicokaiser
Contributor

Socket.IO uses a different websocket implementation (an early predecessor of ws). I tried to abstract its implementation and use it in WebSocket.IO, but this combination also leaked, as did the (completely different) websock implementation.

So I don't think it's a problem of those libraries or our applications; it must be node... but I cannot prove it or help the node devs find and fix the leaks.

@einaros
Contributor
einaros commented Apr 17, 2012

The parsers in socket.io are pulled from ws. The rest is specific to s.io.

@c4milo
c4milo commented Apr 24, 2012

I have the same issue in production as @nicokaiser. V8 heap looks fine and stable now but the RSS memory keeps growing. I guess it's gdb and valgrind time. :/

@einaros
Contributor
einaros commented Apr 24, 2012

Do note that the RSS rise is normal in itself. Unless the process actually crashes, this is all the same issue @nicokaiser reported to the v8 team.


@c4milo
c4milo commented Apr 24, 2012

Hm, it doesn't seem normal in my case: the V8 heap is 12 MB and stays around that value on every request, whereas the RSS is 200 MB and keeps growing. That definitely looks like a leak in C/C++ land.

@einaros
Contributor
einaros commented Apr 24, 2012

See the other notes I've made about the RSS in this thread. :)

@c4milo
c4milo commented Apr 24, 2012

Yeah, I'm not saying it's a ws thing, given that I'm not using it in my server. It seems to be nodejs C/C++ land leaking somewhere. I'll try to dig further once I have some spare time.

@einaros
Contributor
einaros commented Apr 24, 2012

What I mean is that high RSS isn't necessarily a sign of a memory leak. It may just be the allocator not immediately releasing pages. If and when the OS needs the memory for other processes, the RSS will fall again.

@c4milo
c4milo commented Apr 24, 2012

What you are saying makes sense, but it just isn't normal that a really small HTTP process is holding 1.9 GB of RSS memory with only 7 clients connected, sending messages every second. Something is broken somewhere. The service has been up for about a week; it hasn't crashed, but it is still holding 1.9 GB in RSS. It's just not normal.

@crickeys

Has anyone tried using this to debug?
https://github.com/Jimbly/node-mtrace

@nicokaiser
Contributor

@crickeys I think (!) mtrace only traces heap allocations.

As @einaros said, currently the "leak" doesn't seem that bad – memory in my server processes went from ~200 MB (with 20k clients) to 1.5 GB, suddenly fell to 1.2 GB for no reason, and now rises again. I'll observe whether it crashes or swaps.

However, I agree with @crickeys that this is not normal; the obviously unused memory should really be released by node...

@einaros
Contributor
einaros commented Apr 26, 2012

Any word on your other memleak reports yet, @nicokaiser?

@crickeys

could it be at all related to this?
nodejs/node-v0.x-archive#3179

@c4milo
c4milo commented Apr 26, 2012

that's a different scenario

@crickeys

not if the underlying code is using the http module

@c4milo
c4milo commented Apr 26, 2012

In this issue we already got rid of the V8 heap leak; the remaining problem is the weird RSS growth and retention. Issue nodejs/node-v0.x-archive#3179, besides using an external module (@mikeal's request) and the client portion of the HTTP module as opposed to the server, has its leak in the V8 heap space.

@nicokaiser
Contributor

Nothing new about the v8 bug: http://code.google.com/p/v8/issues/detail?id=2073 (but maybe this was something different).

I believe there are some leaks in node itself, in v8, and in node's http module. Although they are in an acceptable range (as I wrote, currently it's quite OK for me), it's not nice... I think a rewrite of the http module is planned for node 0.9, so let's wait and see.

@vvo
vvo commented Apr 26, 2012

Just a quick note: the nodejs/node-v0.x-archive#3179 issue no longer uses @mikeal's request module, just the plain http module.

@nicokaiser Are you saying that my issue is already discussed here: http://code.google.com/p/v8/issues/detail?id=2073 ?

thanks

@crickeys
crickeys commented May 7, 2012

What version of openssl are you using for these tests? Have you tried with node using a shared openssl 1.0 or greater?

@nicokaiser
Contributor

@crickeys can this make a difference if I'm not using https? (only some crypto functions)

I'm using the bundled libssl (0.9.8 I think).

@crickeys
crickeys commented May 8, 2012

According to this: nodejs/node-v0.x-archive#2653, yes.
Apparently the crypto functions rely on openssl. However, I tried doing this and still have a gradual memory leak with socket.io and websockets enabled. Those also seem to use the crypto functions, but this didn't seem to help there :(

@crickeys
crickeys commented May 8, 2012

Actually, I may not have properly updated libcrypto, even though I successfully built node with openssl 1.0.0i. It appears the older 0.9.8 libcrypto is still being used by my node. Let me keep playing with this; you may want to try it too, as I really don't know what I'm doing here :)

@einaros
Contributor
einaros commented May 11, 2012

@nicokaiser Have you had any memory-related crashes lately?

I think it's time to summarize where we are with this again. With the latest node version and the latest ws version – is anyone seeing actual crashes due to out-of-memory errors?

@nicokaiser
Contributor

@einaros We changed some things in the infrastructure (split the server, sticky sessions with HAProxy), so that each server handles no more than ~10,000 connections. The last 2 weeks look like this for one of the servers (node 0.6.16, ws 0.4.13), memory and clients:

Memory is at about 1.4 GB per server, with about 9k clients.

I'll try to keep the servers running (I needed to restart them because of an update after these screenshots) to see if memory rises above 2 GB (when swapping kicks in).

@einaros
Contributor
einaros commented May 11, 2012

The fact that it doesn't keep growing beyond the numbers seen there could suggest that there isn't actually any leakage anymore. At the same time, there could be a tiny leak (anywhere) which, over a longer period than measured here, could cause a crash.

At some point I need to build a stress-testing environment to see how it reacts to sustained loads from more than 50k clients...

@nicokaiser
Contributor

Update: the installed version I mentioned adds socket.destroy to every socket.end, see #64. Don't know if this makes a difference; I can change one of the servers to use vanilla ws 0.4.14 next week.

@3rd-Eden 3rd-Eden referenced this issue in sockjs/sockjs-node May 13, 2012
Closed

Memory usage under load #62

@einaros
Contributor
einaros commented Jun 14, 2012

Since this hasn't been brought back up, I'm (temporarily) concluding that the issue has been covered by the recent fixes. The lingering memory seen in graphs right now is due to freed memory not being released immediately.

Should anyone actually get a process crash due to running out of memory, we'll have to revisit it. Judging from the lack of talk here lately, that hasn't been the case in quite some time.

@einaros einaros closed this Jun 14, 2012
@nicokaiser
Contributor

Confirmed. The process takes lots of memory over time, but this seems to be a case of "node.js is lazy about cleaning up memory". A server that actually uses the memory it gets is OK as long as it does not take more than is available.

@heha heha referenced this issue in socketio/socket.io Oct 8, 2012
Closed

memory leak #1015

@nodenstuff

Confirmed that it's happening again. The server eventually crashes after running out of RAM.

node 0.8.9
socket.io 0.9.10
Express 3.0.0.0rc4

@kanaka
Contributor
kanaka commented Oct 8, 2012

I can confirm that our server uses more and more memory over time and eventually crashes when the memory use gets high. The more connections and data transferred the quicker the crash happens.

node 0.8.9
ws 0.4.21
AWS t1.micro running Ubuntu 12.04.1

@einaros
Contributor
einaros commented Oct 8, 2012

@nodenstuff, thanks for reporting. Since nothing much has been changed in ws over the last few months, I'm not immediately jumping to the conclusion that this is caused by the ws parsers. If there wasn't a crashing leak four months ago, but there is today, that's a regression in another component - or a breaking change in node which ws will have to be updated to work with.

@kanaka, are you using socket.io?

@nicokaiser, how has your situation been lately? Any crashes?

@kanaka
Contributor
kanaka commented Oct 8, 2012

@einaros, no socket.io, it's a very simple server that just uses the ws module directly:

https://github.com/n01se/1110/blob/6c90e0efc3a4afeb099f79d18d471a5936de1d3e/server.js

@nicokaiser
Contributor

@einaros No crashes, but this may be because we reduced the load on our servers and 2 GB of memory seems to be enough.

I cannot check this at the moment, but I'm pretty sure that, while Node allocates 1.5 GB of memory, a process will crash if I start another process on this server that uses more than 512 MB.

The fact is that Node, starting from 0.6.x, fails to return unused memory to the kernel and keeps the RSS (!!) allocated. This may be no problem if the Node process is the only thing running on the server (and is thus allowed to eat all of its memory), but it's not nice and definitely a bug.

@heha
heha commented Oct 10, 2012

I have just tried with NodeJS 0.9.2 (with a little modification to socket.io 0.9.10 because it cannot find the socket.io.js client). This problem is still going on. It seems like the only way to solve this is to use node 0.4.12. :( I want to use the native cluster module more, but it seems I have no other choice...

@nodenstuff

I enabled flashsocket on 0.9.10 and I didn't run out of RAM today. Might just be a fluke. But currently it holds steady at around 2 GB total in use, which includes redis and everything else running on the server. I will update if that changes.

@heha
heha commented Oct 21, 2012

Good news on the Buffer rewrite:
https://groups.google.com/forum/?fromgroups=#!topic/nodejs/Vjg7-VHGrnk

Branch:
https://github.com/joyent/node/tree/crypto-buffers

Bad news:
This branch cannot run with MySQL at all. It showed me "Access denied" even though the username and password were correct. A hack to fix the MySQL issue is needed, and I hope this branch can fix the memory issue.

@baryshev baryshev referenced this issue in nodejs/node-v0.x-archive Oct 30, 2012
Closed

RSS memory leak on highload #4217

@heha
heha commented Jan 4, 2013

I just set up stud to terminate SSL, so node no longer needs to process SSL itself. My memory consumption decreased to about a third. Maybe this bug is related to using wss via https?
