New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ros2_tracing tracepoints #120
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cwecht, thank you for your contribution! The changes look ok, I only had a few minor comments.
One thing to consider is that adding a single new argument to the write()
functions is fine, but we might be venturing on a "slippery slope", and it might be better to refactor the functions to accept a single struct
argument that contains all extra arguments. E.g.
// New structure for extra arguments. We could include the `serialized` flag too,
// but that's already part of RMW_Connext_Message
struct RMW_Connext_WriteParams {
DDS_Time_t timestamp{DDS_TIME_INVALID};
int64_t * sn_out{nullptr};
};
// Updated write() functions
rmw_connextdds_write_message(
RMW_Connext_Publisher * const pub,
RMW_Connext_Message * const message,
RMW_Connext_WriteParams * const params);
class RMW_Connext_Publisher {
write(
const void * const ros_message,
const bool serialized,
RMW_Connext_WriteParams * consts params = nullptr);
}
@@ -2973,6 +2994,14 @@ RMW_Connext_Service::send_response( | |||
rr_msg.gid.implementation_identifier = RMW_CONNEXTDDS_ID; | |||
rr_msg.payload = const_cast<void *>(ros_response); | |||
|
|||
DDS_Time_t timestamp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are using Connext's C API we should initialize the variable to a known state, e.g.
DDS_Time_t timestamp; | |
DDS_Time_t timestamp = DDS_TIME_INVALID; |
if (DDS_RETCODE_OK != | ||
DDS_DomainParticipant_get_current_time(this->reply_pub->dds_participant(), ×tamp)) | ||
{ | ||
RMW_CONNEXT_LOG_ERROR_SET("failed to write message to DDS") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change the message to be a bit more specific to this failure, e.g.
RMW_CONNEXT_LOG_ERROR_SET("failed to write message to DDS") | |
RMW_CONNEXT_LOG_ERROR_SET("failed to get current time from DDS DomainParticipant") |
I'm not very familiar with the clock configuration in Connext, and I'm looking into possible alternative solutions. I'll get back after I have a better idea of available options. |
@asorbini Thank you for the review! I just applied the changes you requested and (forced) pushed them. |
@cwecht are you sure you pushed the change? The code looks unchanged. In general, I would prefer if you didn't force push but rather pushed an incremental commit so that the additional changes can be reviewed without having to re-scan through all of the code. One reason to force push in this case would be to fix the commit message, which should include a DCO (e.g. "Signed-off-by: Your Name your@email"). I guess we are not strictly enforcing this rule and we are missing a CONTRIBUTING document in this repository (like this one), but the convention has been followed by all contributions, as far as I'm aware. |
024ad33
to
b2a6939
Compare
@asorbini now it's actually pushed including the signoff. |
I've got some news regarding DDS_DomainParticipant_get_current_time. I was able to test this PR with connext micro and using DDS_DomainParticipant_get_current_time in this case is just fine; there is basically no significant performance overhead. For Connext PRO it is a different story:
@asorbini could you double check this conclusion? |
94f0287
to
02773b9
Compare
@asorbini I've added now |
02773b9
to
4236143
Compare
@asorbini May I ask you kindly to have another look at this PR? Thank you very much! |
Signed-off-by: Christopher Wecht <cwecht@mailbox.org>
Signed-off-by: Christopher Wecht <cwecht@mailbox.org>
610afee
to
1acd284
Compare
Running full CI with repos for this PR, ros2/rmw_cyclonedds#454, ros2/rmw_fastrtps#694, ros2/ros2_tracing#74: ETA: Build failed because of typo in dds_api.hpp. |
Signed-off-by: Christopher Wecht <cwecht@mailbox.org>
1acd284
to
037a59d
Compare
I'm note quite sure why the latest build failed. Looks more like a jenkins hiccup to me. |
@cwecht agreed, seems like an unrelated failure. Running Linux again to check: . The other two builds are yellow because of a formatting error unrelated to this PR. |
I will point out that the nightlies have no issues: https://ci.ros2.org/view/nightly/job/nightly_linux-aarch64_release/2582/ . So it is worth investigating those before merging. |
@clalancette thank you for the clarification. I totally agree that those issues should be resolved, but they are C++ linter errors and they stem from a different PR (ros2/rmw_cyclonedds#454), hence why I didn't think they'd affect this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI passed (only failure comes from ros2/rmw_cyclonedds#454).
Changes can be merged once ros2/ros2_tracing#74 is in.
Depends on ros2/ros2_tracing#74; implements: ros2/ros2_tracing#73
This PR introduces ros2_tracing tracepoints which are really helpful for examining system performance. The dependency to the ros2_tracing PR stems from the fact, that one of the original tracepoints had to be extended in order to get the timestamp during writing to DDS.
In order to do that,
DDS_DataWriter_write_untypedI
had to be replaced withDDS_DataWriter_write_w_timestamp_untypedI
. At first it seemed reasonable to use` in order to get the respective timestamp.This seems to work, but
DDS_DomainParticipant_get_current_time
introduces a massive performance penality. The flamegraph below shows the result of profiling it using perf.From https://community.rti.com/kb/how-clocks-are-used-connext-dds I'd conclude, that we need the "external clock" used by Connext here. Looking at the perf profile, it seems that there is a method/function
DDS_DomainParticipant_get_external_clockI
which would provide to us said clock, but it seems that this function is not part of Connexts public API right? Is there any way to get this clock or the timestamp which Connext would use ifDDS_DataWriter_write_untypedI
is used without paying such a massive performance penalty?