Timer hanging and high CPU load when using MultiThreadedExecutor #1223
Comments
I have the same issue, +1 for this.
From our weekly issue triage meeting: Assigning @clalancette so he can make a high-level tracking issue about known issues with Python's executors. Also, we think this issue will not make progress without a staffed engineer, which we don't currently have, or a dedicated community member to investigate and make a good suggestion on what to change and why. This is because executor-related issues tend to be very nuanced and complicated to work on. So we're also assigning this the "help wanted" label with that in mind.
Hi @JasperTan97, I recommend running py-spy to see where the high CPU load comes from. It helps immensely. Thanks.
@KKSTB, it looks like waiting for ready callbacks is the issue. Do you get the same on your end?
@JasperTan97 I ran py-spy on my Ubuntu ROS Iron machine and got this result (two py-spy captures, images omitted):
This shows the 4 worker threads were mostly idle, which makes sense because there is basically nothing to do inside the subscriber and timer node. The main thread, by contrast, was very busy retrieving tasks for the 4 worker threads to do. It seems that retrieving and distributing tasks and gathering results at high frequency (500 Hz) is close to the limit of what one core can handle, so the rate slows down considerably. I have no clue why your single core is much slower, though (I can achieve 300-400 Hz on my i7-9750H). With the single-threaded executor I can achieve 500 Hz, and CPU utilization is half that of the multi-threaded case. I have to push to 2500 Hz before the rate starts to drop. I believe the problem has to do with the cost of transferring tasks from the main thread to the worker threads, relative to the actual useful work done in the worker threads.
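To illustrate the point about dispatch overhead, here is a minimal sketch using only the Python standard library. It is not rclpy's executor code, just an analogy: `trivial_callback`, `run_inline`, and `run_dispatched` are hypothetical names, and the callback does almost nothing, like the timer and subscriber callbacks in this issue. It compares running the callback directly with handing each call to a 4-worker thread pool and collecting the result.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def trivial_callback():
    # Stand-in for a near-empty timer or subscription callback.
    return 1


def run_inline(n):
    # Call the callback n times directly on the calling thread.
    start = time.perf_counter()
    done = sum(trivial_callback() for _ in range(n))
    return done, time.perf_counter() - start


def run_dispatched(n, workers=4):
    # Submit each call to a thread pool and gather the results,
    # so every callback pays for a hand-off between threads.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(trivial_callback) for _ in range(n)]
        done = sum(f.result() for f in futures)
    return done, time.perf_counter() - start


if __name__ == "__main__":
    n = 20000
    for name, fn in (("inline", run_inline), ("dispatched", run_dispatched)):
        done, elapsed = fn(n)
        print(f"{name}: {done} callbacks in {elapsed:.3f} s")
```

When the callback itself is this cheap, the time per call in the dispatched case is dominated by the hand-off rather than by the callback. The exact numbers depend on the machine, so none are quoted here.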
Bug report
Required Info:
Operating System: Ubuntu 22.04
Installation type: Binaries
Version or commit hash: Iron
DDS implementation: eProsima’s Fast DDS (the default)
Client library (if applicable): rclpy
CPU info (if needed):
Steps to reproduce issue
My publisher:
my subscriber:
And my main function:
Expected behavior (which is what I get when using the SingleThreadedExecutor)
Actual behavior
Additional information
Similar issues have been brought up with rclcpp, but I have not seen any comments made about rclpy. I found an issue here: ros2/rclcpp#1487, with other people reporting something similar in ros2/rclcpp#1618. The fix is ros2/rclcpp#1516 and then ros2/rclcpp#1692.
Aside from the timer callback hanging (which I assume is what happens, based on some tic-toc timing), my CPU load becomes really high using the MultiThreadedExecutor, while the SingleThreadedExecutor does not cause any noticeable CPU load. I have also tried using both the MutuallyExclusiveCallbackGroup and the ReentrantCallbackGroup, with no change in behaviour.

I am not sure whether my QoS settings are the problem or whether this is an issue intrinsic to Python (because of the GIL, etc.), but either a more suitable example of how to use the MultiThreadedExecutor could be provided (if my usage is wrong), or the ROS wiki pages should reflect that this significant problem exists (if no fix is possible).
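For anyone who wants to reproduce the tic-toc style timing, here is a minimal sketch of a helper that could be called from a timer callback to measure the achieved rate and the longest stall. `RateMeter` and its methods are hypothetical names, not part of rclpy. The `now` argument exists only so timestamps can be injected for testing.

```python
import time


class RateMeter:
    """Records callback timestamps and reports the achieved rate."""

    def __init__(self):
        self._stamps = []

    def tick(self, now=None):
        # Call once per callback; `now` may be injected for testing.
        self._stamps.append(time.perf_counter() if now is None else now)

    def mean_hz(self):
        # At least two ticks are needed to define an interval.
        if len(self._stamps) < 2:
            return 0.0
        span = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / span if span > 0 else 0.0

    def worst_gap(self):
        # Largest interval between consecutive ticks, in seconds.
        gaps = [b - a for a, b in zip(self._stamps, self._stamps[1:])]
        return max(gaps, default=0.0)
```

Calling `tick()` at the top of the timer callback and logging `mean_hz()` and `worst_gap()` periodically would show whether the timer falls behind its nominal period, and by how much, under each executor.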
Thank you for helping!