How to support standard messages with zero-copy transports #2201

Open
alsora opened this issue Jun 2, 2023 · 9 comments

Comments

@alsora
Collaborator

alsora commented Jun 2, 2023

Most of the available RMW implementations now support some form of zero-copy transport between processes on the same machine.
However, to use this feature, there's an important limitation: the size of the messages must be known at compile time, so they can't contain variable-length sequences.

This effectively makes it impossible to use the vast majority of ROS 2 standard messages: the presence of a header (with its string frame_id field) already violates this requirement.
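The limitation can be illustrated with a minimal C++ sketch. Both structs below are hypothetical stand-ins, not the generated std_msgs types, and the 256-character cap is an arbitrary value chosen for illustration. The point is only that a struct holding a std::string keeps its characters on the heap, so its bytes alone are not the whole message, whereas a struct with an inline character array can be copied or shared as one fixed-size block.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a header with an unbounded string field.
struct UnboundedHeader
{
  std::int32_t sec;
  std::uint32_t nanosec;
  std::string frame_id;  // owns heap memory that lives outside the struct
};

// Hypothetical fixed-size variant: the characters live inside the struct.
struct FixedHeader
{
  std::int32_t sec;
  std::uint32_t nanosec;
  std::array<char, 256> frame_id;  // 256 is an arbitrary illustrative cap
};

// Only the fixed-size variant can be treated as a plain block of bytes.
static_assert(
  !std::is_trivially_copyable<UnboundedHeader>::value,
  "a struct holding std::string is not plain data");
static_assert(
  std::is_trivially_copyable<FixedHeader>::value,
  "a struct with an inline char array is plain data");
```

Both assertions hold at compile time, which is why a size known at compile time rules out unbounded strings and sequences.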

We are working with eProsima to find a solution to this problem.
Our plan is to modify the rosidl code so that it can automatically bound unbounded elements.

For example, assume that all strings are capped at X characters.
This approach wouldn't affect the application code, which would still work with strings.
The bound would be applied under the hood, only when a message containing a string needs to be published via shared memory.

We are discussing different implementations, with an upper bound that can be defined either at runtime or at build time and with different possible fallback mechanisms (i.e. what to do if the string exceeds the upper bound?).
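As a rough sketch of what such a capped string and its fallback could look like, assuming two hypothetical policies (truncate or throw) that are not taken from any actual rosidl design: the names FixedString and OverflowPolicy below are invented for illustration. Application code keeps using std::string, and the conversion happens only at the boundary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical fallback policies for a string that exceeds the bound.
enum class OverflowPolicy { Truncate, Throw };

// Hypothetical fixed-capacity string whose characters are stored inline.
template<std::size_t Capacity>
class FixedString
{
public:
  // Copy src into the inline buffer, applying policy if it does not fit.
  void assign(const std::string & src, OverflowPolicy policy)
  {
    std::size_t n = src.size();
    if (n > Capacity) {
      if (policy == OverflowPolicy::Throw) {
        throw std::length_error("string exceeds the configured bound");
      }
      n = Capacity;  // Truncate: keep only the first Capacity characters
    }
    assert(n <= Capacity);
    std::memcpy(data_.data(), src.data(), n);
    size_ = n;
  }

  // Rebuild the std::string that application code works with.
  std::string str() const {return std::string(data_.data(), size_);}

  std::size_t size() const {return size_;}

private:
  std::array<char, Capacity> data_{};
  std::size_t size_ = 0;
};
```

With a FixedString<4>, assigning "base_link" under the Truncate policy stores "base", while the Throw policy raises std::length_error and leaves the previous content untouched. Other fallbacks, such as switching to a non-zero-copy path for oversized messages, are equally possible and are not shown here.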

We would like to gather feedback from the community about this, which in our opinion is a required feature in order to enable multi-process ROS 2 applications.

alsora changed the title from "Discussion: how to support standard messages with zero-copy transports" to "How to support standard messages with zero-copy transports" on Jun 2, 2023
@clalancette
Contributor

Our plan is to modify the rosidl code so that it can automatically bound unbounded elements.

I don't think we should do this, or, at least, we should not do this by default.

I definitely understand the desire for these things to be bounded in the zero-copy case. But silently making them bounded in the background is going to lead to a lot of confused users later on. There are a number of ways I could see us going here:

  1. Automatically generate bounds for unbounded types. This would make the zero-copy case better, at the expense of unbounded uses not actually being unbounded.
  2. Have a special rosidl "mode" where we automatically generate bounds for unbounded types, and leave the default as-is. This keeps the existing behaviour intact, while allowing those who want true zero-copy to have it. The downside here is that true zero-copy users will have to compile from source always.
  3. Start changing most of the ROS 2 core messages to actually be bounded. For instance, we could change std_msgs/msg/Header to have a bounded string for the frame_id directly in the message. I think this is superior in most ways to option 1 above, as we are explicit about where the bounds are. The downside here is that we would need a long transition period to convert all of the messages over to be bounded, and this really only helps the messages in the ROS 2 core.

There may be other ways to go. But I think we should have a conversation about this, either in the ROS 2 weekly meeting or in the client libraries working group, as it may have wide-ranging implications.

@alsora
Collaborator Author

alsora commented Jun 4, 2023

Have a special rosidl "mode" where we automatically generate bounds for unbounded types, and leave the default as-is. This keeps the existing behaviour intact, while allowing those who want true zero-copy to have it. The downside here is that true zero-copy users will have to compile from source always.

This is the type of approach that we are investigating.
We are exploring both compile-time and runtime configurations, but always keeping the default behavior unchanged.
(For example, an environment variable that defines the upper bound and defaults to "no bound".)
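A runtime configuration of this kind could be read roughly as follows. This is only a sketch: the variable name ROS_STRING_UPPER_BOUND is made up for illustration and is not an existing ROS 2 setting, and treating unset, empty, zero, or malformed values as "no bound" is an assumption that keeps the default behavior unchanged. It requires C++17 for std::optional.

```cpp
#include <cctype>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <optional>

// Parse an upper bound from text. Returns std::nullopt ("no bound") when the
// text is missing, empty, zero, out of range, or not a plain decimal number.
inline std::optional<std::size_t> parse_string_bound(const char * text)
{
  if (text == nullptr || !std::isdigit(static_cast<unsigned char>(*text))) {
    return std::nullopt;
  }
  char * end = nullptr;
  errno = 0;
  const unsigned long long value = std::strtoull(text, &end, 10);
  if (errno != 0 || *end != '\0' || value == 0) {
    return std::nullopt;
  }
  return static_cast<std::size_t>(value);
}

// ROS_STRING_UPPER_BOUND is a hypothetical variable name used for this sketch.
inline std::optional<std::size_t> string_bound_from_env()
{
  return parse_string_bound(std::getenv("ROS_STRING_UPPER_BOUND"));
}
```

A caller would check has_value() on the result and fall back to the regular unbounded path when it is empty.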

Start changing most of the ROS 2 core messages to actually be bounded.

This seems problematic to me.
It would likely not be backward compatible with the existing implementation.

I think we should have a conversation about this, either in the ROS 2 weekly meeting or in the client libraries working group

Definitely!
We'll try to come up with a more concrete idea in the coming weeks and then present it to the community.

@MiguelCompany
Contributor

MiguelCompany commented Jul 18, 2023

We have been working on a PoC for this. See ros2/rosidl#758 and ros2/rosidl_typesupport_fastrtps#106 for the relevant changes.

I'm attaching a ZIP file prepared by my colleague @EduPonz with a README.md and a docker compose project that demonstrates the usage of the Zero-Copy compatible ROS 2 types with strings. Getting the demo up and running is just a matter of running docker compose up.

The ZIP also contains two .repos files for VCS, one with the three repos that are needed for the feature, and another one for re-building the common interfaces and the demos so you can see the feature in action.

Please do let us know what you think!

ros2_fixed_strings.zip

@MiguelCompany
Contributor

@allenh1 I've seen this blog post of yours, and I think it aligns with the work being done here.

Are you planning on open-sourcing the work described in that post?
Would you be willing to contribute changes to the relevant ROS 2 repos (namely rosidl)?

@allenh1
Contributor

allenh1 commented Jul 28, 2023

Are you planning on open-sourcing the work described in that post?

@MiguelCompany At the moment, there is no plan to open source the work described in that post.

Our implementation relies on a number of assumptions we can make about the memory layout of the typesupport representation for the C++ messages. Specifically, it relies on them being the same. So any middleware without a way to ensure that the typesupport representation has the same memory layout as the C++ generated messages will encounter issues.

The other important detail is the StorageBase class mentioned in that blog post. This mechanism allows us to wrap an array as a bounded vector, and use that for bounded sequences and bounded strings (which is slightly nicer than fixed strings and arrays). Since this makes the vectors contiguous, they can be allocated in the middleware and used in the C++ messages directly.

I think a great first step would be to modify the rosidl_runtime_cpp bounded vector implementation to not use std::vector as a base, and do something like the StorageBase implementation described in my blog post, as well as creating an analog for strings.
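To make the idea of an array-backed bounded vector concrete, here is a minimal sketch. It is not the StorageBase implementation from the blog post (which is not open source) and not the rosidl_runtime_cpp BoundedVector; the class name InlineBoundedVector and its interface are hypothetical. It only shows that storing elements in an inline array, rather than deriving from std::vector, keeps the whole object in one contiguous, fixed-size block.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Hypothetical bounded vector whose elements live inline in a fixed array.
template<typename T, std::size_t UpperBound>
class InlineBoundedVector
{
  static_assert(std::is_trivially_copyable<T>::value, "elements must be plain data");

public:
  // Append an element, refusing to grow past the compile-time bound.
  void push_back(const T & value)
  {
    if (size_ == UpperBound) {
      throw std::length_error("bounded vector is full");
    }
    storage_[size_++] = value;
  }

  T & operator[](std::size_t i) {assert(i < size_); return storage_[i];}
  const T & operator[](std::size_t i) const {assert(i < size_); return storage_[i];}

  std::size_t size() const {return size_;}
  static constexpr std::size_t capacity() {return UpperBound;}

  // Iteration covers only the elements in use, not the full capacity.
  T * begin() {return storage_.data();}
  T * end() {return storage_.data() + size_;}
  const T * begin() const {return storage_.data();}
  const T * end() const {return storage_.data() + size_;}

private:
  std::array<T, UpperBound> storage_{};
  std::size_t size_ = 0;
};

// No heap pointers inside: the object can be treated as a block of bytes.
static_assert(
  std::is_trivially_copyable<InlineBoundedVector<int, 8>>::value,
  "inline storage keeps the container plain data");
```

The same layout could back a bounded string by instantiating it with char, which is one way to read the "analog for strings" mentioned above.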

One thing to keep in mind here is that the C++ messages, as well as the generated typesupport representation of the message, both depend on the same vector implementation. This implementation is then a dependency of the middlewares, as well as the message packages, so rosidl_runtime_cpp might not be the best place for that implementation.

@MiguelCompany
Contributor

@clalancette Could you take a look at the PoC mentioned in #2201 (comment)?

It would be nice to check whether the approach seems correct before going further with the implementation.

@homalozoa
Contributor

2. Have a special rosidl "mode" where we automatically generate bounds for unbounded types, and leave the default as-is. This keeps the existing behaviour intact, while allowing those who want true zero-copy to have it. The downside here is that true zero-copy users will have to compile from source always

Hi @clalancette! May I know how to change the rosidl mode for zero-copy use cases? Thank you!

@ZhenshengLee

May I know how to change rosidl mode for zero-copy use cases?

@homalozoa

I think what @clalancette said in #2201 (comment) is only a proposal that has not been implemented yet.

@Zard-C
Contributor

Zard-C commented Nov 21, 2023

Hi guys, I would like to try it with the rclc demo nodes if a C version of the typesupport were available. (I've checked ros2/rosidl_typesupport_fastrtps#106 and ros2/rosidl#758, but I couldn't work out how to use them with rclc 😸)


7 participants