-
Notifications
You must be signed in to change notification settings - Fork 910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xmlrpccpp spins forever when no more file handles #914
Comments
This is preventing my company from using the ROS profiling tools, and it's rather annoying. Does someone have time to take a look at this? |
I'm sorry to say, but I don't have time right now, and I wouldn't count on any of the maintainers looking into it before the end of the year, but I could be wrong about that. Maybe someone out in the community could dig into it, especially since you set up a reproducible example (thanks for that btw). Since you spent the time to make the reproducible example and make a suggestion of where the issue might be, I guess you guys spent some time trying to fix it and are stumped? Unfortunately, I'm not very familiar with the XMLRPC code that is embedded (😞) in I assume it's not an option for your company to spend more time to dig into the issue and find the solution? If so that's a shame, because it would help your company and everyone else in the ROS community! Alternatively, you could raise the visibility of this by posting it on http://discourse.ros.org/ and seeing if anyone else finds it annoying enough to help try and fix it. |
Well my exploration of ROS's xmlrpc for SROS has been mainly been confined to rospy, but I just tested example, and was able to reproduce the issue on my local machine: $ apt-cache policy ros-kinetic-ros-comm
ros-kinetic-ros-comm:
Installed: 1.12.6-0xenial-20161026-193907-0700
Candidate: 1.12.6-0xenial-20161026-193907-0700
Version table:
*** 1.12.6-0xenial-20161026-193907-0700 500
500 http://packages.ros.org/ros/ubuntu xenial/main amd64 Packages
100 /var/lib/dpkg/status Needed to make a small change to the cmake to compile: @trainman419 , I tried uncommenting out that line you pointed to in this dockerfile.
Here was my build steps:
|
We don't have a lot of time to work on this either. I may try to fix this on my company's copy of ros_comm, but the last time I submitted a pull request to ros_comm was not a pleasant experience, so I'm not motivated to submit that fix upstream. |
I'm sorry to hear that. If there's anything I can do to motivate you (or your company) to do so when and if you come up with a fix, let me know. |
I reproduced the problem today (Thanks @trainman419 for the reproducible example!). Basically I see here two different issues:
The first problem is another instance of #610, the second is not yet addressed and I'll look into this issue in the coming days. I've an open pull request for #610 in #834, but it's not yet ready to be merged as discussed with @dirk-thomas. Any contribution on #834 is highly welcome and I can confirm that it fixes the first of the two problems (i.e. you are no longer running out of available file descriptors with this fix). |
[xmlrpcpp] close socket when connection can't be accepted. Fixes #914
[xmlrpcpp] close socket when connection can't be accepted. Fixes ros#914
Add integration tests for the XmlRpcServer interacting with an XmlRpcClient in the same process, particularly around the behavior when the server process runs out of available file handles. Update the XmlRpcServer with two different mitigations for file handle exhaustion (Fixes ros#914, replaces ros#960 and ros#977): * Measure the number of free file handles and reject incoming connections if the pool of free file descriptors is too small. This actively rejects incoming clients instead of waiting for complete file handle exhaustion, which would leave clients in a pending state until a file descriptor becomes free. * If accept fails due to complete file handle exhaustion, temporarily stop calling accept on this socket. This prevents a busy-loop where poll() believes the listening socket is readable, but accept() fails to allocate a file descriptor and leaves the socket in a readable state.
Add integration tests for the XmlRpcServer interacting with an XmlRpcClient in the same process, particularly around the behavior when the server process runs out of available file handles. Update the XmlRpcServer with two different mitigations for file handle exhaustion (Fixes ros#914, replaces ros#960 and ros#977): * Measure the number of free file handles and reject incoming connections if the pool of free file descriptors is too small. This actively rejects incoming clients instead of waiting for complete file handle exhaustion, which would leave clients in a pending state until a file descriptor becomes free. * If accept fails due to complete file handle exhaustion, temporarily stop calling accept on this socket. This prevents a busy-loop where poll() believes the listening socket is readable, but accept() fails to allocate a file descriptor and leaves the socket in a readable state.
* Tests and bug fixes for XmlRpcServer Add integration tests for the XmlRpcServer interacting with an XmlRpcClient in the same process, particularly around the behavior when the server process runs out of available file handles. Update the XmlRpcServer with two different mitigations for file handle exhaustion (Fixes #914, replaces #960 and #977): * Measure the number of free file handles and reject incoming connections if the pool of free file descriptors is too small. This actively rejects incoming clients instead of waiting for complete file handle exhaustion, which would leave clients in a pending state until a file descriptor becomes free. * If accept fails due to complete file handle exhaustion, temporarily stop calling accept on this socket. This prevents a busy-loop where poll() believes the listening socket is readable, but accept() fails to allocate a file descriptor and leaves the socket in a readable state. * Add test depend on boost * Run astyle
When a ROS node runs out of available file handles, it can go into an infinite spin loop if it gets an incoming xmlrpc request.
We found this when executing the rosprofiler nodes against our system, so this is partially blocking us from running rosprofiler.
I've put a small test program that demonstrates this: https://github.com/trainman419/ros_infinite_loop ; compile and run the test.launch and
top
to see the loop node spin at 100%I've done some very informal profiling and it looks like the program is spending all of its time in
ros::CallbackQueue::callAvailable
.It also looks like the socket close here should not be commented out, and this may be contributing to leaked sockets:
ros_comm/utilities/xmlrpcpp/src/XmlRpcServer.cpp
Line 157 in ea13207
cc @Knio
The text was updated successfully, but these errors were encountered: