Echo server runs out of RAM #15
Comments
I'm not sure what you mean, but I could also use educating on stream stuff I've probably missed. A call like …
These might be helpful. Basically, if you just read from the read stream and immediately write to the write stream, as you do in your echo example, there is a danger that the write stream's buffer starts to grow without bound if the system can read faster than it can write. Remember: node.js writes are queued in a buffer, and the actual write occurs some time later.

To fix this, you need some kind of flow control. It works as follows: you write until the write queue is full, then you tell the read stream to pause; you also set an event handler on the write stream that gets called when the write queue has drained, at which point you resume the read stream. By doing this you can guarantee that the write queue buffer will never exceed a certain size, and thus you can avoid running out of RAM.

This is such a common thing to do that node.js provides helper functions for it (util.pump and stream.pipe), so you don't have to write all that code yourself. You'll notice that the node.js echo example on their website uses pump; it does not manually write the received buffer to the output:

```js
var net  = require('net');
var util = require('util');

var server = net.createServer(function (socket) {
  util.pump(socket, socket); // echo, with flow control handled by node
});
server.listen(8080);
```

If your WebSocket implemented read and write streams you'd be able to do the same with WebSockets, and avoid running out of RAM. Something like:

```js
var server = http.createServer();

server.addListener('upgrade', function (request, socket, head) {
  var ws = new WebSocket(request, socket, head);
  ws.pipe(ws); // echo, with node handling the flow control
});

server.listen(8080);
```

HTH :)
With WebSockets, even in an echo server, the bytes coming in are different from those going out: e.g. the incoming message is masked while the outgoing one, though it contains the same text payload, is not. i.e. instead of just a series of …

It sounds like I need to do something with the underlying stream object, i.e. this thing: https://github.com/faye/faye-websocket-node/blob/master/lib/faye/websocket.js#L36

On the …
If you implement the functions (pause(), …) as per the Streams API, node.js will take care of the actual pumping for you.
pause() is easy: just call pause on the underlying socket that you are reading the data from. Those are the key things; you may need a few other bits and pieces. Basically you need to implement both ReadableStream and WritableStream: http://nodejs.org/api/stream.html

Btw, I am not a node expert, but I implemented something similar here: https://github.com/purplefox/vert.x
I think I've been thinking about this all wrong: The library's abstraction model currently looks like this:
My initial reading of this is that I should implement this:
But I was confused: what if the user is only writing to the socket, and the 'source' of those writes is some other channel entirely, for example incoming HTTP requests or AMQP messages? There's no point pausing my internal writes to TCP, since I'd just have to buffer the messages myself in RAM behind the WebSocket API. What I now think you're suggesting is this:
i.e. expose the Stream API to the user on the WebSocket object, so it can be used with Node's stream utils, such that Node will stop reading from whatever input channel is causing writes to pile up on the WebSocket's output channel. Is this correct? My only concern is with adding to the standard WebSocket interface, but if this is the standard way of dealing with data flow in Node then so be it.
Partly my confusion stems from the fact that I assumed streams were all about byte sequences. TCP emits a stream of bytes, but then WebSocket chunks these into discrete text messages; on the way out, WebSocket takes discrete text messages and converts them into byte sequences to go out over TCP. Is this behaviour compatible with the Stream API? What about higher-level constructs? e.g. 'writing' to a Faye messaging client is …
Yes, that's it. In other places I have simply implemented the Stream interface and left it at that, but if you want to keep your WebSocket API then perhaps consider having some equivalent for flow control in that API too; e.g., return false if you can't write to the underlying socket. NB this is the reason I argued for using the …
Yes, interop is a big part of faye-websocket. The fact that it implements the standard API means I can reuse client-side socket code easily. Fortunately browsers seem to return …
With "invented their own API" I was taking a dig at socket.io; following the client-side WebSocket API is a different story. Yep, returning true/false from send and maintaining the bufferedAmount value* is good enough to build a …

*That is, buffering sends when the underlying socket is "blocked". Maybe you already do this.
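One way the send()/bufferedAmount idea could look, sketched against node's writable-stream contract (write() returns false when merely queued, 'drain' fires when the queue empties). `FlowControlledSocket` and its `transport` argument are hypothetical names, not faye-websocket internals:

```javascript
// transport: anything with write() returning true when flushed and
// false when queued, plus a 'drain' event when its queue empties.
function FlowControlledSocket(transport) {
  this._transport = transport;
  this.bufferedAmount = 0; // bytes queued but not yet on the wire
  var self = this;
  transport.on('drain', function () {
    self.bufferedAmount = 0; // queue flushed, nothing pending
  });
}

FlowControlledSocket.prototype.send = function (message) {
  var flushed = this._transport.write(message);
  if (!flushed) this.bufferedAmount += Buffer.byteLength(message);
  return flushed; // false tells the caller to back off
};
```

A caller can then treat `send() === false` exactly like a stream's `write() === false`: stop producing until the pressure clears.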
In node core all streams are streams of bytes or strings, but this is only because node only implements low-level I/O. There is nothing about the Stream API that prevents you sending any type of object down the pipe.
Looking at this again after someone else reported memory leaks. How are you generating the load to make the server run out of memory? Do you know where the memory leak comes from?
Sorry for my English, but I'll try to describe the setup, the errors, and the solutions.
@asp24 Ideally I'd like a script that creates load on the echo server and replicates the memory leak. All attempts I've tried so far have failed to produce excessive memory usage, so I need someone else to give me some code that reproduces the problem.
@jcoglan This is a problem because the users are real. But I am ready to give you any debug information you need.
@asp24 As I said, I need a script that creates load against the example echo server, that makes the server leak memory. I've run scripts that open thousands of sockets and fire messages at it as fast as possible, and memory usage remains low for me.
@jcoglan What if the tests only ever check the good scenario? In the real world the network is very unstable. Maybe the leak is there?
@squaremo Have you found any good WebSocket implementations that correctly adhere to the Stream API?
Shoe is sockjs-node with the addition of a client-side Stream API. SockJS's Connection class is a Stream, but it does not support flow control. However, I suppose I shouldn't start worrying about that until a real-world workload demonstrates that it is a problem.
Hmm, I thought that WebSocket#send() returns false if the buffer is full, which is what you need.
Adding stream support to this library has been on my mind for a while, I just haven't had time for it. My current plan is to make … I don't think this is possible right now because … I also have a question about symmetry: …
I've not, but then I've not really been looking ..
I don't understand what this entails -- what's "wrap" here, how would it look to someone using it?
In the scenario in which a WebSocket is both a Stream and, er, a WebSocket, I imagine …
You'd use it like this: `stream1.pipe(faye).pipe(stream2)`
Correct, send and write would be synonyms, …
Perhaps this is unrelated, but I was debugging a memory leak on my faye servers and found that there was a large buffer hanging around in memory. Tracing through its retainers, I found it inside a Faye.Engine.Connection instance. The underlying socket had its readyState set to 3 (CLOSED), but the connection was never deleted from the _connections object. The socket was hanging around in memory forever, with a buffer that had been filling up with stuff that would never be delivered.

I could not determine what was causing the root issue (a closed socket, but no flush/disconnect message being sent), but I was eventually able to reproduce it by having clients rapidly reload a page that makes a connection. Eventually (there was no reliable way to reproduce it instantly) one of the connections would stick around forever.

I am trying to mitigate the issue by checking the proxy's connection list for closed websockets in the engine's GC routine, and calling flush on any connection that is marked as closed and eligible for garbage collection. I'll let you know if this relieves any memory pressure.
@pselden4 The proxy that sits between the Server and Engine should already be listening for …
Yes, it isn't properly disconnecting in all cases, and the buffer is sticking around in memory forever. I just implemented an additional reaping process that flushes the closed ones, and I'll monitor whether it relieves the leak or not. It didn't make sense to me that the out-of-RAM issue is the result of just using buffers instead of streams, because you'd have to be pumping gigabytes through the system at one time to make the buffers grow that big. My hypothesis is that it leaks the connections (which contain the socket, which holds the buffer). I'll file a bug with Faye.
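A reaping pass in the spirit of the one described above might look roughly like this. The shape of the `connections` map, the `flush` method, and the `CLOSED` constant are assumptions for the sketch, not Faye's actual engine code:

```javascript
var CLOSED = 3; // WebSocket readyState value for a closed socket

// Sweep a map of connection objects, flushing and dropping any whose
// underlying socket reports CLOSED, so their buffers can be GC'd.
function reapClosedConnections(connections) {
  var reaped = [];
  Object.keys(connections).forEach(function (id) {
    var conn = connections[id];
    if (conn.socket && conn.socket.readyState === CLOSED) {
      conn.flush();           // release pending, undeliverable messages
      delete connections[id]; // drop the reference holding the buffer
      reaped.push(id);
    }
  });
  return reaped;
}
```

Calling something like this from a periodic GC routine bounds how long a dead connection (and its buffer) can survive.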
My reaping process has not significantly affected memory consumption (still gradual growth that never goes down), so it's safe to say this is not the root cause of the issue.
In case you were working on this now: changes for node 0.10 streams are on the horizon! http://blog.nodejs.org/2012/12/21/streams2/
So, with apologies for how long this has taken, I just pushed some changes that make this library support streams. This is done in two parts: …
I'd like a few of you to begin trying this new version out and let me know how it goes. Thanks to everyone who gave me pointers on how streams work. |
The 0.5.0 release of this library, combined with the new …
@jcoglan Did you use Node.js v0.10 "new style" streams or "old style" streams? It looks like the latter, but actually I think that's OK, because I'm not sure new-style streams really work with framed data. For instance, there's nothing stopping the Readable implementation from trying to concatenate the chunks in its read buffer, so far as I can tell.
They're old streams, which 0.10 still supports.
The echo server example under any significant load exceeds available RAM very quickly.
This appears to be because the example writes the received WebSocket frame directly back out without doing any flow control.
The normal way to implement this scalably in node is to use util.pump or stream.pipe; however, the Faye WebSocket object does not appear to be usable in this way.
Shouldn't the WebSocket implement readable and writable streams, as is normal with most node.js objects that read and write data?
http://nodejs.org/api/stream.html