Compare ros2 pub sub cpu usage to ros1 #3

Open
lucasw opened this issue Jan 28, 2019 · 6 comments

Comments

@lucasw
Owner

lucasw commented Jan 28, 2019

/lucasw/imgui_ros#67 and https://answers.ros.org/question/312964/ros2-megapixel-image-pubsub-cpu-usage-is-very-high/

lucasw added a commit that referenced this issue Jan 28, 2019
@lucasw
Owner Author

lucasw commented Jan 28, 2019

1024 x 1024 x 20Hz

ROS1

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                         
22688 lucasw    20   0  815816  52712  38136 S  12.9  0.3   0:04.56 pub_test                        
22735 lucasw    20   0  476552  16704   9804 S   3.3  0.1   0:01.08 sub_test         

Thread breakdown:

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                          
22688 lucasw    20   0  815816  52752  38136 S 12.0  0.3   0:30.30 pub_test                         
22697 lucasw    20   0  815816  52752  38136 S  3.3  0.3   0:09.35 pub_test                         
22698 lucasw    20   0  815816  52752  38136 S  0.0  0.3   0:00.08 pub_test                         
22699 lucasw    20   0  815816  52752  38136 S  0.0  0.3   0:00.00 pub_test                         
22700 lucasw    20   0  815816  52752  38136 S  0.0  0.3   0:00.08 pub_test                         
22701 lucasw    20   0  815816  52752  38136 S  0.0  0.3   0:00.62 pub_test  
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                          
22735 lucasw    20   0  476552  16704   9804 S  2.0  0.1   0:06.21 sub_test                         
22744 lucasw    20   0  476552  16704   9804 S  2.0  0.1   0:06.40 sub_test                         
22745 lucasw    20   0  476552  16704   9804 S  0.0  0.1   0:00.08 sub_test                         
22746 lucasw    20   0  476552  16704   9804 S  0.0  0.1   0:00.01 sub_test                         
22747 lucasw    20   0  476552  16704   9804 S  0.0  0.1   0:00.09 sub_test                         
22748 lucasw    20   0  476552  16704   9804 S  0.0  0.1   0:00.03 sub_test 

ROS2

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                         
23876 lucasw    20   0  869956  70964  40512 S  12.3  0.4   0:02.23 pub_test                        
23877 lucasw    20   0  497304  28548  12044 S   9.3  0.2   0:01.62 sub_test    

Publishing looks about the same, but the ROS2 subscriber uses roughly 3x the CPU of the ROS1 subscriber (9.3% vs 3.3%).

Thread breakdown:

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                          
23889 lucasw    20   0  873028  74188  40572 S  8.0  0.5   0:06.42 pub_test                         
23876 lucasw    20   0  873028  74188  40572 S  4.0  0.5   0:03.38 pub_test                         
23884 lucasw    20   0  873028  74188  40572 S  0.0  0.5   0:00.00 pub_test                         
23885 lucasw    20   0  873028  74188  40572 S  0.0  0.5   0:00.06 pub_test                         
23886 lucasw    20   0  873028  74188  40572 S  0.0  0.5   0:00.00 pub_test                         
23887 lucasw    20   0  873028  74188  40572 S  0.0  0.5   0:00.00 pub_test                         
23888 lucasw    20   0  873028  74188  40572 S  0.0  0.5   0:00.09 pub_test 
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                          
23882 lucasw    20   0  500376  31776  12108 S  5.0  0.2   0:07.02 sub_test                         
23877 lucasw    20   0  500376  31776  12108 S  2.7  0.2   0:03.90 sub_test                         
23879 lucasw    20   0  500376  31776  12108 S  2.0  0.2   0:02.70 sub_test                         
23878 lucasw    20   0  500376  31776  12108 S  0.0  0.2   0:00.00 sub_test                         
23880 lucasw    20   0  500376  31776  12108 S  0.0  0.2   0:00.00 sub_test                         
23881 lucasw    20   0  500376  31776  12108 S  0.0  0.2   0:00.00 sub_test                         
23883 lucasw    20   0  500376  31776  12108 S  0.0  0.2   0:00.00 sub_test  
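
The "3x" figure can be reproduced from the process-level `top` lines above. The following is a minimal sketch, assuming the default `top` column order (`%CPU` is the 9th field, `COMMAND` the 12th); the helper names `parse_top_line` and `cpu_by_command` are made up for illustration and are not part of the test code in this repo.

```python
# Parse process-level `top` lines and compare ROS1 vs ROS2 CPU usage.
# The sample lines are copied from the 1024 x 1024 x 20 Hz measurements above.

ROS1_TOP = """
22688 lucasw    20   0  815816  52712  38136 S  12.9  0.3   0:04.56 pub_test
22735 lucasw    20   0  476552  16704   9804 S   3.3  0.1   0:01.08 sub_test
"""

ROS2_TOP = """
23876 lucasw    20   0  869956  70964  40512 S  12.3  0.4   0:02.23 pub_test
23877 lucasw    20   0  497304  28548  12044 S   9.3  0.2   0:01.62 sub_test
"""


def parse_top_line(line):
    """Return (command, cpu_percent) for one line of default `top` output."""
    fields = line.split()
    # Assumed columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    return fields[11], float(fields[8])


def cpu_by_command(top_text):
    """Map command name -> %CPU for every non-empty line in top_text."""
    lines = [l for l in top_text.strip().splitlines() if l.strip()]
    return dict(parse_top_line(l) for l in lines)


ros1 = cpu_by_command(ROS1_TOP)
ros2 = cpu_by_command(ROS2_TOP)

pub_ratio = ros2["pub_test"] / ros1["pub_test"]
sub_ratio = ros2["sub_test"] / ros1["sub_test"]

print(f"pub ROS2/ROS1: {pub_ratio:.2f}")
print(f"sub ROS2/ROS1: {sub_ratio:.2f}")
```

These are single `top` samples, so the ratios are only indicative.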

@lucasw
Owner Author

lucasw commented Jan 28, 2019

30 Hz

26 - 30

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
27841 lucasw    20   0  821960  58916  38364 S  20.6  0.4   0:24.76 ros1_pub_test                    
27842 lucasw    20   0  476548  16696   9804 S   5.3  0.1   0:06.91 ros1_sub_test  
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
28283 lucasw    20   0  873028  73836  40480 S  17.2  0.5   0:04.42 ros2_pub_test                    
28284 lucasw    20   0  497304  28448  11956 S  12.3  0.2   0:03.18 ros2_sub_test  

40 Hz

36 - 40

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                          
28630 lucasw    20   0  815816  52628  38228 S  28.5  0.3   0:10.05 ros1_pub_test                    
28631 lucasw    20   0  476548  16620   9724 S   7.6  0.1   0:02.74 ros1_sub_test  
29064 lucasw    20   0  873028  74372  41008 S  22.5  0.5   0:11.45 ros2_pub_test                    
29065 lucasw    20   0  497304  28624  12124 R  16.9  0.2   0:08.48 ros2_sub_test  

50 Hz

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                         
30732 lucasw    20   0  815816  52596  38004 S  35.1  0.3   1:15.80 ros1_pub_test                   
30733 lucasw    20   0  479624  19676   9616 S   9.6  0.1   0:21.22 ros1_sub_test  
30506 lucasw    20   0  872640  73436  40648 S  29.2  0.4   0:09.94 ros2_pub_test                    
30507 lucasw    20   0  499992  31456  12348 S  20.9  0.2   0:07.08 ros2_sub_test  

Maybe what I'm seeing is just a roughly constant 4-6% extra CPU cost in the subscriber, and higher frame rates don't cause that extra cost to scale up. Some CPU cost also shifts from the publisher to the subscriber.

That may be acceptable, but in a high frame rate system it may be necessary to be aggressive about reducing frame rate and resolution where they aren't needed (for visualization and elsewhere), or to disable some subscribers entirely when they aren't needed.
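
One way to check the constant-overhead guess is to subtract the ROS1 subscriber figure from the ROS2 one at each frame rate. This is a rough sketch using only the subscriber `%CPU` values posted in this thread for 1024 x 1024 images; the `subscriber_gaps` helper is a hypothetical name.

```python
# (rate_hz, ros1_sub_cpu_percent, ros2_sub_cpu_percent) for 1024 x 1024 images,
# copied from the top output in this thread.
SUB_CPU = [
    (20, 3.3, 9.3),
    (30, 5.3, 12.3),
    (40, 7.6, 16.9),
    (50, 9.6, 20.9),
]


def subscriber_gaps(rows):
    """Return (rate_hz, absolute_gap, ratio) of ROS2 vs ROS1 subscriber CPU."""
    return [(hz, round(r2 - r1, 1), round(r2 / r1, 2)) for hz, r1, r2 in rows]


for hz, gap, ratio in subscriber_gaps(SUB_CPU):
    print(f"{hz} Hz: ROS2 - ROS1 = {gap:.1f} points, ROS2 / ROS1 = {ratio:.2f}x")
```

Taken at face value, these samples put the gap at about 6 to 11 percentage points, growing with frame rate, while the ratio stays a little above 2x. Each number is a single `top` reading, so this may just be noise.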

@lucasw
Owner Author

lucasw commented Jan 28, 2019

Resolution

2048 x 2048

20 Hz

31193 lucasw    20   0  849608  61828  38132 S  36.9  0.4   0:08.84 ros1_pub_test                   
31194 lucasw    20   0  485764  35048   9652 R   8.6  0.2   0:02.09 ros1_sub_test  
32035 lucasw    20   0  897216  97584  40276 R  31.5  0.6   0:06.61 ros2_pub_test                   
32036 lucasw    20   0  515348  46392  11964 S  17.9  0.3   0:03.53 ros2_sub_test   

40 Hz

32584 lucasw    20   0  837320  67572  38284 R  66.4  0.4   0:17.13 ros1_pub_test                   
32585 lucasw    20   0  498056  47496   9688 S  17.3  0.3   0:04.38 ros1_sub_test 
32480 lucasw    20   0  921792 122296  40440 R  57.5  0.7   0:10.57 ros2_pub_test                   
32481 lucasw    20   0  539928  71312  12340 S  32.9  0.4   0:05.92 ros2_sub_test 
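
To compare across resolutions and frame rates, the CPU figures can be normalized by pixel throughput. The sketch below uses megapixels per second so that no assumption about the pixel format (bytes per pixel) is needed; the function names are hypothetical and the numbers are the subscriber values posted above.

```python
# (label, width, height, rate_hz, cpu_percent), copied from this thread.
MEASUREMENTS = [
    ("ros1_sub", 1024, 1024, 20, 3.3),
    ("ros2_sub", 1024, 1024, 20, 9.3),
    ("ros1_sub", 2048, 2048, 20, 8.6),
    ("ros2_sub", 2048, 2048, 20, 17.9),
    ("ros1_sub", 2048, 2048, 40, 17.3),
    ("ros2_sub", 2048, 2048, 40, 32.9),
]


def megapixels_per_second(width, height, rate_hz):
    """Pixel throughput of an image stream in Mpx/s."""
    return width * height * rate_hz / 1e6


def cpu_per_mpx(cpu_percent, width, height, rate_hz):
    """%CPU spent per Mpx/s of image data."""
    return cpu_percent / megapixels_per_second(width, height, rate_hz)


for label, w, h, hz, cpu in MEASUREMENTS:
    mpx = megapixels_per_second(w, h, hz)
    print(f"{label} {w}x{h} @ {hz} Hz: {mpx:.1f} Mpx/s, "
          f"{cpu_per_mpx(cpu, w, h, hz):.3f} %CPU per Mpx/s")
```

If the per-megapixel cost drops as throughput rises, part of the cost is per-message rather than per-pixel, though single samples like these can't separate the two with any confidence.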

@lucasw
Owner Author

lucasw commented Jan 28, 2019

Multiple subscribers

If CPU costs are shifted to the subscribers in ROS2, this could be bad in a one-to-many pub/sub scenario.

1024 x 1024 x 40

  326 lucasw    20   0  815816  52516  38036 S  33.1  0.3   0:07.21 ros1_pub_test                   
  327 lucasw    20   0  479624  19776   9708 S   9.3  0.1   0:01.91 ros1_sub_test                   
  328 lucasw    20   0  479624  19788   9724 S   9.3  0.1   0:01.83 ros1_sub_test  
 1606 lucasw    20   0  873028  73816  40460 S  26.9  0.5   0:02.89 ros2_pub_test                   
 1607 lucasw    20   0  497304  28640  12132 S  17.9  0.2   0:01.96 ros2_sub_test                   
 1608 lucasw    20   0  497304  28756  12256 S  17.6  0.2   0:01.92 ros2_sub_test 

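So the CPU cost is shifted from the publisher to the subscriber, and the extra overhead is multiplied by the number of subscribers: each ROS2 subscriber costs about 18% here versus about 9% for each ROS1 subscriber.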
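
The effect on total CPU can be tallied by summing the publisher and all subscribers for each run. This is a small sketch using the 1024 x 1024 x 40 Hz numbers from this thread (one-subscriber values from the earlier 40 Hz comment, two-subscriber values from the table above); `total_cpu` is a hypothetical helper.

```python
# %CPU at 1024 x 1024 x 40 Hz, copied from the top output in this thread.
ONE_SUB = {
    "ros1": {"pub": 28.5, "subs": [7.6]},
    "ros2": {"pub": 22.5, "subs": [16.9]},
}
TWO_SUBS = {
    "ros1": {"pub": 33.1, "subs": [9.3, 9.3]},
    "ros2": {"pub": 26.9, "subs": [17.9, 17.6]},
}


def total_cpu(entry):
    """Publisher %CPU plus the sum of all subscriber %CPU."""
    return entry["pub"] + sum(entry["subs"])


for name, runs in (("1 subscriber", ONE_SUB), ("2 subscribers", TWO_SUBS)):
    ros1_total = total_cpu(runs["ros1"])
    ros2_total = total_cpu(runs["ros2"])
    print(f"{name}: ROS1 total {ros1_total:.1f}%, ROS2 total {ros2_total:.1f}%, "
          f"difference {ros2_total - ros1_total:.1f} points")
```

With these samples the ROS2 total is only a few points above ROS1 with one subscriber, and the difference widens with the second subscriber, which is consistent with the per-subscriber overhead described above.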

lucasw added a commit that referenced this issue Jan 28, 2019
@lucasw
Owner Author

lucasw commented Jan 29, 2019

Building in release mode doesn't make a difference, probably because the CPU usage is in the ros2/rtps libraries, which are already built (in release mode or not).

@lucasw
Owner Author

lucasw commented Mar 24, 2019

https://answers.ros.org/question/319218/how-does-ros2-implement-its-network-design/

Basically, the idea is that DDS's configuration options allow it to be many things between TCP and simple UDP, including a more flexible version of TCP, which in turn allows you to fine tune your communication settings to better work on lossy networks.

This comes at the cost of complexity and some performance (TCP on the local host is really good), but should allow knowledgeable users to get good results in more situations.

[bold emphasis mine]
