Eliminate memory exhaustion on webserver with high rate, LARGE binary data #192
Conversation
…rver buffer from filling up when attached to slow clients.
Wouldn't this be just as big (if not bigger) a problem for high-frequency non-BSON data? I'm not quite comfortable with adding another place where messages can be dropped (there's already the send and receive queues; this adds another one), but I'm not sure there is a better way - it would be a lot more work to not read new messages from the topic as long as tornado is blocked, right?
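The "another place where messages can be dropped" being debated can be sketched as a size-1 outgoing queue, where a new message silently replaces any message not yet written to the socket. This is only an illustration of the tradeoff under discussion; `LatestOnlyQueue` and its methods are hypothetical names, not part of rosbridge:

```python
import threading

class LatestOnlyQueue:
    """Illustrative size-1 queue: a new message replaces any pending one,
    trading completeness (frames may be dropped) for bounded memory."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = None

    def put(self, message):
        """Store a message; returns True if an unsent message was discarded."""
        with self._lock:
            dropped = self._pending is not None
            self._pending = message
            return dropped

    def get(self):
        """Take the newest pending message, or None if nothing is waiting."""
        with self._lock:
            message, self._pending = self._pending, None
            return message
```

With a queue like this, memory stays bounded no matter how slow the client is, but intermediate messages vanish without any signal to either end, which is exactly the objection raised in this thread.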
On Wednesday, August 5, 2015, Nils Berg notifications@github.com wrote:
Possibly. Right now I know that many people might not be using the BSON
Maybe not a lot more work, but I'm not sure that it's functionally
```python
binary = type(message)==bson.BSON
IOLoop.instance().add_callback(partial(self.write_message, message, binary))
if topic == None or not binary:
```
Forgive me if these comments are too pedantic. Relatively new to community PRs and code reviews =)
`if topic is None or not binary` would be more Pythonic.
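For readers new to this convention: `is None` checks identity, while `== None` dispatches to `__eq__`, which a class may override. A minimal (contrived) demonstration of the difference:

```python
class AlwaysEqual:
    """A contrived object whose __eq__ claims equality with everything,
    as some comparison-proxy objects in real libraries do."""
    def __eq__(self, other):
        return True

topic = AlwaysEqual()
print(topic == None)   # True  -- equality can be overridden
print(topic is None)   # False -- identity cannot
```

This is why PEP 8 recommends `is None` / `is not None` for comparisons against singletons.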
Please see if this helps your problem: #203
While Issue #203 may help prolong runtime, the fact that you have to set this manually on the server side does not completely fix it. In a case where a client still cannot consume the websocket data fast enough, the server-side buffer will increase without bound. A skilled programmer could fiddle with delay_between_messages, but that shouldn't be required. The simple "queue size 1" solution that I proposed in Issue #192 "adapts" throughput to the client's consumption speed.
This would result in the client losing data without any warning on either end.
Agreed. I preferred that for my streaming application over coming back the next day to a core dump on my server.
rosbridge should not assume that binary messages are the large ones. For example, assume that rosbridge is streaming both uuid_msgs/UniqueID (small, but in binary format) and visualization_msgs/Marker (large, but in non-binary format) to the same client. Then it would start to block uuid messages even though the socket is actually clogged by marker messages. To solve the problem, how about using depthcloud_encoder and web_video_server to stream the pointcloud data? I think it makes more sense to separate the large, dense messages out onto another channel.
Depth cloud encoder seems to be hard-coded for the Kinect. It doesn't support denser, larger clouds from stereo devices, and it is not lossless. I agree with all your arguments against my fix as a general solution, but currently, when sending 250 MB or more a second, if the client does not consume quickly enough (perhaps because of slow 3D rendering of large data), it does not take long before the server fills up the computer's memory and crashes the machine. That was not acceptable in my application, whereas dropping some frames when attached to a slow client was acceptable (since the client couldn't consume them anyway).
@pbeeson can you confirm that the server memory overflows if and only if there is a slow client? If so, I would suggest using throttle_rate and queue_length to throttle down message sending to a slow client. See the rosbridge specification, section 3.4.4 (Subscribe); it describes how to use them. You can also check the ThrottleMessageHandler and QueueMessageHandler implementations to see how they manage message sending to the client. throttle_rate and queue_size are configurable in roslibjs.Topic.subscribe. If the server memory overflows even when there is no slow client, let me know; I will check the subscriber logic in rosbridge_library to see if something is handled inappropriately.
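For reference, the throttling fields mentioned above travel in the rosbridge protocol's subscribe operation (section 3.4.4). A minimal sketch of such a request is below; the topic name and values are illustrative, not taken from this PR:

```python
import json

# A subscribe request in the rosbridge v2 protocol. throttle_rate is the
# minimum interval between messages in milliseconds, and queue_length bounds
# how many messages are buffered server-side while throttling.
subscribe_msg = {
    "op": "subscribe",
    "topic": "/camera/depth/points",   # hypothetical topic name
    "throttle_rate": 100,              # at most one message every 100 ms
    "queue_length": 1,                 # keep only the newest message while waiting
}

# This JSON string is what a client such as roslibjs sends over the websocket.
print(json.dumps(subscribe_msg))
```

With queue_length set to 1, the server keeps only the newest message per topic while the client catches up, which is close in spirit to the "queue size 1" behaviour proposed in this PR, but opt-in per subscription.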
This cannot be merged anyway. I have created issue #212 to follow up on the problem. We will continue the discussion there. Closing this PR.
In using rosbridge, I was passing VERY dense, LARGE pointclouds at 20 Hz. I noticed that after a while, the tornado write buffer for the websocket was monotonically increasing because the web client wasn't pulling data as fast as it was written.
This simple change addresses that by blocking binary data on a topic until the tornado websocket buffer is flushed out. This keeps the tornado buffer from monotonically increasing and filling up system memory until the entire machine crashes.
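The gating idea the PR describes can be sketched in isolation as follows. This is not the actual rosbridge code; `GatedSender`, its method names, and the flush callback are all assumed for illustration:

```python
class GatedSender:
    """Sketch of the PR's idea: while a previous binary write is still
    pending in the transport buffer, new binary messages are dropped,
    so the outgoing buffer cannot grow without bound."""

    def __init__(self, transport_write):
        self._write = transport_write    # callable that queues bytes for the socket
        self._binary_in_flight = False

    def send(self, message, binary):
        """Returns True if the message was written, False if it was dropped."""
        if binary and self._binary_in_flight:
            return False                 # slow client has not flushed yet: drop
        if binary:
            self._binary_in_flight = True
        self._write(message)
        return True

    def on_flushed(self):
        # To be invoked when the websocket buffer drains,
        # e.g. from a write-complete callback on the IOLoop.
        self._binary_in_flight = False
```

Note this sketch reproduces the limitation raised in the review above: the gate keys on "binary", not on message size, so small binary messages get blocked while large non-binary ones pass through freely.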