Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add intra process tracepoint #30

Merged
merged 1 commit into from
Apr 13, 2023

Conversation

ymski
Copy link
Contributor

@ymski ymski commented Jan 25, 2023

Hi, I'm a member of CARET development team. We would like to officially support some tracepoints added by caret in the rclcpp fork and LD_PRELOAD. I am thinking about adding tracepoints to evaluate the time from publish to subscription callback when using intra-process communication. The following link discusses this topic.

action-ros-ci-repos-override: https://gist.githubusercontent.com/christophebedard/e6525cda7d3f79d419374eb833fa328d/raw/b241df85e837804804adb32907c3900ace45a205/ros2.repos

@christophebedard
Copy link
Member

Thanks for the PR @ymski! We'll start by reviewing the rclcpp PR.

@ymski ymski marked this pull request as ready for review January 26, 2023 02:38
Copy link
Member

@christophebedard christophebedard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed some mistakes and tried to make everything more consistent with the rest of the instrumentation.

It would be great if you could update the design document to add a new Flow description subsection about the intra-process instrumentation. For example, see the Message publishing section and the Subscriptions callback section. You could probably create a single section for intra process publishing and intra process callbacks with a single diagram. I think you already have some diagrams here, although they might need to be updated or adapted: ros2/rclcpp#2091 (comment).

Ideally, adding a Python test for this instrumentation would also be great, see here: https://github.com/ros2/ros2_tracing/tree/rolling/test_tracetools/test. However, I understand if you don't want to do it right away in this PR.

Also, after applying the suggestions, can you rebase on rolling and fix the conflicts?

tracetools/include/tracetools/tracetools.h Outdated Show resolved Hide resolved
tracetools/include/tracetools/tracetools.h Outdated Show resolved Hide resolved
tracetools/include/tracetools/tracetools.h Outdated Show resolved Hide resolved
tracetools/include/tracetools/tp_call.h Outdated Show resolved Hide resolved
tracetools/src/tracetools.c Outdated Show resolved Hide resolved
tracetools/src/tracetools.c Outdated Show resolved Hide resolved
tracetools_trace/tracetools_trace/tools/names.py Outdated Show resolved Hide resolved
tracetools_trace/tracetools_trace/tools/tracepoints.py Outdated Show resolved Hide resolved
tracetools/include/tracetools/tracetools.h Outdated Show resolved Hide resolved
tracetools/src/tracetools.c Outdated Show resolved Hide resolved
@ymski
Copy link
Contributor Author

ymski commented Mar 22, 2023

@christophebedard
Thank you for your review.

I fixed some mistakes and tried to make everything more consistent with the rest of the instrumentation.

I was also concerned about consistency with existing code. Thank you for fixing it.

It would be great if you could update the design document to add a new Flow description subsection about the intra-process instrumentation.

I will create this one.

Ideally, adding a Python test for this instrumentation would also be great

OK. If you have any concerns about the code I have created, please let me know.

Also, after applying the suggestions, can you rebase on rolling and fix the conflicts?
OK.

@christophebedard
Copy link
Member

christophebedard commented Mar 23, 2023

There are some suggestions left.

Also, if you look at the changes, this PR introduces a lot of changes that are already on rolling: https://github.com/ros2/ros2_tracing/pull/30/files. Can you please rebase your branch on rolling instead of adding a "merge" commit (34b1abe)? I can do it if you prefer.

@ymski ymski force-pushed the add-intra-process-tracepoint branch 3 times, most recently from b1c77fb to 5907532 Compare March 27, 2023 07:40
@ymski
Copy link
Contributor Author

ymski commented Mar 27, 2023

The errors pointed out have been corrected and the branch has been rebased. Please wait a little longer for the test code and documentation.

@christophebedard
Copy link
Member

The errors pointed out have been corrected and the branch has been rebased. Please wait a little longer for the test code and documentation.

Let me know if you need help with the tests or documentation. The Python tests in test_tracetools (e.g., test_pub_sub.py) are a bit hard to read and take a bit of time to write, but I think they are definitely useful.

@ymski
Copy link
Contributor Author

ymski commented Mar 29, 2023

The pub-sub test in inter-process involves a single topic communication and checking the fields and order of the events. In the intra-process test case, rcl_interfaces/msg/parameterevent messages are intra-published without definition in the code. So multiple communication events are recorded, even if only one pub-sub communication is performed.

Here is one question. Do you want to test against all recorded communications? Or is it sufficient to test on events excluding the rcl_interfaces/msg/parameterevent above? In the latter case, the target of the test is the communication defined in the code and the number of communications is one.

@christophebedard
Copy link
Member

Yeah, you can definitely exclude messages like parameter event and only validate the trace data resulting from the single topic.

However, it might be a good idea to validate using more than 1 message. I think 2 messages would be fine (e.g., two consecutive messages on the same topic). The goal is to validate that you can indeed link the published message to the received message using the trace data, and that it works with more than just 1 random message.

(test_pub_sub.py is unfortunately not doing that, because it needs instrumentation in the DDS implementation in order to link published messages to received messages, and that instrumentation is only available in a fork. However, if that is merged eventually, I'll update the test to include that.)

@christophebedard
Copy link
Member

@ymski FYI the feature freeze for Iron is next week, on April 17th. Since it would be nice to get this in for the upcoming release, if you don't think you can finish writing the test in the next few days, just let me know and I can probably help out.

@ymski
Copy link
Contributor Author

ymski commented Apr 10, 2023

@christophebedard
I apologize for the delay. I have been busy and could not make progress on the work as quickly as I had hoped. Thank you for the announcement on 4/17. I would be happy if I could make progress by that day.

I added documentation and tests.
I described the documentation by adding the following sections, but please feel free to make changes as necessary:

  • IntraProcessBuffer creation
  • Intra-process callback
  • Intra-process message publishing

There is one thing I would like to discuss. I am not sure where to put the explanation of the tracepoint for ring_buffer_clear. If you have any ideas, please let me know.

I added two tests as follows:

  • test_buffer.py
    • Test for buffer creation
    • Check the link between buffer and subscription
  • test_intra_pub_sub.py
    • Test from publish to callback_end in intra-process communication
    • Since there are multiple publishes, I checked the tracepoints from rclcpp_intra_publish to callback_end for all of them.

I have one question at the end. I think it's necessary to add a copyright notice to the new files I added this time, but is it okay if I put the name of the company I work for?

@christophebedard christophebedard force-pushed the add-intra-process-tracepoint branch 2 times, most recently from 14be1a8 to 0a70ca2 Compare April 11, 2023 01:40
@christophebedard
Copy link
Member

christophebedard commented Apr 11, 2023

Thank you for the update and the new changes! I made some minor changes quickly (see below) and rebased on rolling.

I described the documentation by adding the following sections, but please feel free to make changes as necessary:

Great! I made some minor fixes. The description in the "Intra-process callback" section wasn't quite right, but the sequence diagram is good.

There is one thing I would like to discuss. I am not sure where to put the explanation of the tracepoint for ring_buffer_clear. If you have any ideas, please let me know.

You can just put it in the "IntraProcessBuffer creation" section.

I added two tests as follows:

Thank you, that looks good. I only made some very minor fixes.

I think it's necessary to add a copyright notice to the new files I added this time, but is it okay if I put the name of the company I work for?

Yes of course! That is how it usually works.


One last thing popped into my mind: tracepoint names usually start with the name of the corresponding layer, e.g., rclcpp_, rcl_, rmw_. So I think the tracepoints (ipb_to_subscription, buffer_to_ipb, construct_ring_buffer, ring_buffer_enqueue, ring_buffer_dequeue, ring_buffer_clear) should be renamed to start with rclcpp_. Then you can also include them in the big table/list of tracepoints in the design document, under the "Instrumentation design" section. Sorry for only mentioning this only now. I think it should be quick to rename them, though! Then this should be good!

@christophebedard
Copy link
Member

Also, you should create a pull request to add a release note for Iron to mention the intra-process instrumentation.

See for example this PR: ros2/ros2_documentation#3414

@ymski
Copy link
Contributor Author

ymski commented Apr 12, 2023

@christophebedard
I have added the prefix rclcpp to the newly added tracepoints and also updated the documentation accordingly. Thank you for providing an example of how to create a pull request for the release notes.

Copy link
Member

@christophebedard christophebedard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some final minor fixes, this looks good to me! Thank you again so much for this new instrumentation!

I'll wait until someone else reviews the rclcpp PR. Then I'll run CI and squash+merge eventually.

Copy link
Contributor

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just minor comment for design doc.

doc/design_ros_2.md Outdated Show resolved Hide resolved
Signed-off-by: Kodai Yamasaki <114902604+ymski@users.noreply.github.com>
Co-authored-by: Christophe Bedard <christophe.bedard@apex.ai>
@christophebedard
Copy link
Member

CI for this PR and ros2/rclcpp#2091, testing --packages-above tracetools and rclcpp:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@christophebedard
Copy link
Member

Alright, CI looks good. I'll merge this and ask someone to merge the rclcpp PR.

@christophebedard christophebedard merged commit 3a1a0ae into ros2:rolling Apr 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants