Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call conversion functions directly #10

Merged
merged 12 commits into from
Sep 18, 2018
Merged

Call conversion functions directly #10

merged 12 commits into from
Sep 18, 2018

Conversation

martins-mozeiko
Copy link
Contributor

@martins-mozeiko martins-mozeiko commented Aug 20, 2018

This makes generated convert to and from function to call C functions directly instead of going through Python capsule API - that currently requires importing module, which can lead to a lot of imports for nested types in array. Now conversion functions are placed in separate libraries that link to each other. Python module libraries link to these new libraries.

See #9 for more details.

Sorry, I cannot provide any specific performance numbers right now, I can try to do that a bit later.

I have tested this change on Ubuntu 16 and Windows.

Connects to #9

@tfoote tfoote added the in review Waiting for review (Kanban column) label Aug 20, 2018
@mikaelarguedas
Copy link
Member

A run of CI on all platforms. It looks like the libraries of the dependencies cannot be found on MacOS.

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

if(WIN32)
target_link_libraries(${_target_name_lib} "${_pkg_base}/Lib/${_pkg_filename}.lib")
else()
target_link_libraries(${_target_name_lib} "${_pkg_base}/lib/lib${_pkg_filename}.so")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On macos the generated libraries have the .dylib extension and not .so. This seems to be the reason it fails to build

Copy link
Member

@mikaelarguedas mikaelarguedas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @martins-mozeiko got the improvement!

The code looks good to me overall, there are 2 main things to fixup before rerunning CI:

  • library extension for macos
  • generate a single "*_python" library per package

More details below

add_dependencies(
${_target_name_lib}
${rosidl_generate_interfaces_TARGET}${_target_suffix}
${rosidl_generate_interfaces_TARGET}__rosidl_typesupport_c
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the dependency on the typesupport_c lib needed for the new libraries?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along the same lines, if we don't need to depend on specific typesupport, we may also be able to generate a single _python_ library instead of 1 per typesupport

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I gave that a shot at 7fbb832 and it seems to work fine with a single python library per package

# define IMPORT_API __declspec(dllimport)
#else
# define IMPORT_API
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a header containing visibility macros in included on the next line (#include <rosidl_generator_c/visibility_control.h>), I'd recommend using the macros deinfed in that file instead of redefining new ones here.
The same logic could be applied to the other template below

Copy link
Member

@mikaelarguedas mikaelarguedas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

address review comments (used "comment" instead of "request changes" during my review :/)

@dirk-thomas
Copy link
Member

@martins-mozeiko Is there any update in this?

martins-mozeiko and others added 4 commits September 6, 2018 18:11
This makes generated convert to and from function to call C functions
direclty instead of going through Python capsule API. Conversion
functions are placed in separate libraries that link to each other.
Python module libraries links to these new libraries.

See #9 for more details.
@martins-mozeiko
Copy link
Contributor Author

I've added changes requested - now it uses dylib extension on macOS (I have not Mac available currently, so I cannot test it), and it uses visibility macros from visibility_control.h header instead of defining its own macros.

I've also integrated changes from Mikael and rebased all commits on top of master.

Tried building & running demo_nodes_py talker/listener - it works fine on Windows & Linux.

@mikaelarguedas mikaelarguedas dismissed their stale review September 10, 2018 16:35

Code has been updated, needs new review

@mikaelarguedas
Copy link
Member

Thanks @martins-mozeiko for iterating.

Tried building & running demo_nodes_py talker/listener - it works fine on Windows & Linux.

It looks like this change fails the linter tests on CI, you should be able to test the changes by running colcon test --packages-select <PACKAGES_TO_TEST>.

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

I pushed some fixups:

  • spurious line change from my CMake code shuffling: 77009ae
  • remove duplication of visibility_control include: 617e0c6
  • remove new blank line in generated files to fix linter failures: f931dbd

With fixups (remainung rcl test failure expected and tracked at ros2/rmw_fastrtps#226):

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@mikaelarguedas
Copy link
Member

Sorry, I cannot provide any specific performance numbers right now, I can try to do that a bit later.

I didnt get a chance to test the performances improvements either. Do you happen to have more data in that matter?

@mikaelarguedas
Copy link
Member

Change looks good to me 👍

@dirk-thomas do you want to provide a second review on this ? or should I merge as is ?

Copy link
Member

@dirk-thomas dirk-thomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beside the one comment about naming the new functions I think it would be good to reproduce the performance impact (since that is the whole reason for this patch).

rosidl_generator_py/resource/_msg_support.c.em Outdated Show resolved Hide resolved
@mikaelarguedas
Copy link
Member

think it would be good to reproduce the performance impact (since that is the whole reason for this patch).

I agree that having numbers regarding the performance improvement would be valuable. However, I think we agree that the new approach is cleaner / preferable to the current one regardless of the exact measured performance gain.

I think this function should use a double underscore between the package name and the field type in order to clearly separate the two from each other (same as between the field type and the fixed name).

I agree that would be are better naming pattern. Though this PR doesn't change the function names that are already implemented using this scheme. I am fine renaming them in here but would not be opposed in doing it in a follow-up as it's not strictly in the scope of this PR

@mikaelarguedas
Copy link
Member

Renamed the functions in 18d2af1

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@mikaelarguedas
Copy link
Member

@martins-mozeiko when you get a chance, can you please test the latest state of this PR and provide feedback on the (hopefully) time improvements compared to master?

@martins-mozeiko
Copy link
Contributor Author

I'll try to do make some comparison. The issue is that difference is mostly visible only on weaker platforms like Pi. On powerful desktop I don't remember if I saw significant change (but I can recheck). And building for Pi using system we have here takes 4+ hours (arm qemu in docker). And I need to do two builds with and without patch. Also builds we have works only for ardent, I'll need to check how can I upgrade it to bouncy & colcon. So it'll take a while, but I'll look into it next week.

@martins-mozeiko
Copy link
Contributor Author

I've made a following test - publisher sends geometry_msgs/PoseArray messages pretty much as fast as it can. And subscribe counts how long does it take to receive 100 messages.

Here is publisher.py code:

#!/usr/bin/env python3

import rclpy, rclpy.node
from geometry_msgs.msg import Pose, PoseArray

count = 0

def cb():
  global pub
  global count
  global msg

  pub.publish(msg)

  count += 1
  print("Sent", count)

msg = PoseArray()
msg.header.stamp.sec = 100
msg.header.stamp.nanosec = 200
msg.header.frame_id = "frame"
for i in range(1024):
  p = Pose()
  p.position.x = 1.0
  p.position.y = 2.0
  p.position.z = 3.0
  p.orientation.x = 4.0
  p.orientation.y = 5.0
  p.orientation.z = 6.0
  p.orientation.w = 7.0
  msg.poses.append(p)

rclpy.init()
n = rclpy.node.Node("publisher")
pub = n.create_publisher(PoseArray, "/test")
timer = n.create_timer(0.001, cb)

rclpy.spin(n)

And here is subscriber.py code:

#!/usr/bin/env python3

import rclpy, rclpy.node
from time import time
from geometry_msgs.msg import PoseArray

count = 0

def cb(msg):
  global count
  count += 1
  if count == 100:
    global start
    print(time() - start)
    exit()

rclpy.init()
n = rclpy.node.Node("listener")
sub = n.create_subscription(PoseArray, "/test", cb)

start = time()
rclpy.spin(n)

Running this on Raspberry Pi 3 with ROS2 built from master gives me ~63 seconds.

After applying this pull request the time decreases to ~37 seconds.

@mikaelarguedas
Copy link
Member

mikaelarguedas commented Sep 18, 2018

Thanks @martins-mozeiko for testing and the example 👍

Tried a similar thing on amd64 with a few changes:

  • setting the start time when the first received message so that it's easier to reproduce
  • printing to console only once every 100th
  • stop after receiving a thousand message

Without this patch

$ PYTHONOPTIMIZE=0 python3 receiver.py 
16.82235884666443

With this patch:

$ PYTHONOPTIMIZE=0 python3 receiver.py 
14.596002578735352
Code

sender.py:

import rclpy, rclpy.node
from geometry_msgs.msg import Pose, PoseArray

count = 0

def cb():
    global pub
    global count
    global msg

    pub.publish(msg)

    count += 1
    if count % 100 == 0:
        print("Sent", count)

msg = PoseArray()
msg.header.stamp.sec = 100
msg.header.stamp.nanosec = 200
msg.header.frame_id = "frame"
for i in range(1024):
    p = Pose()
    p.position.x = 1.0
    p.position.y = 2.0
    p.position.z = 3.0
    p.orientation.x = 4.0
    p.orientation.y = 5.0
    p.orientation.z = 6.0
    p.orientation.w = 7.0
    msg.poses.append(p)

rclpy.init()
n = rclpy.create_node("publisher")
pub = n.create_publisher(PoseArray, "/test")
timer = n.create_timer(0.001, cb)

rclpy.spin(n)

receiver.py:

import rclpy
from time import time
from geometry_msgs.msg import PoseArray

count = 0
start = None


def cb(msg):
    global count
    global start
    if start is None:
        start = time()

    count += 1
    if count == 1000:
        global start
        print(time() - start)
        exit()


rclpy.init()
n = rclpy.create_node('listener')
sub = n.create_subscription(PoseArray, '/test', cb)

rclpy.spin(n)

Not as drastic as what is observed on the RaspberryPi but there is definitely an improvement

@mikaelarguedas
Copy link
Member

Thanks @martins-mozeiko for the patch 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants