
Fix topic statistics for IPC subscriptions #3130

Open
jayyoung wants to merge 1 commit into ros2:rolling from botsandus:fix-topic-stats-when-using-ipc

@jayyoung

Continuing from the discussion on the original fix, #2913 (comment), which addresses #2911, we are opening this PR at @fujitatomoya's request.

Description

The root issue: We discovered that when use_intra_process_comms is enabled, messages are delivered via SubscriptionIntraProcess::execute_impl(), which previously never called the topic statistics handler. This causes all statistics to report NaN values, making it impossible for us to distinguish a healthy IPC subscription from an unhealthy one.

This fix: We add a type-erased StatsHandlerFn function to SubscriptionIntraProcess and connect it from the Subscription via a lambda. Using the lambda rather than a pointer to SubscriptionTopicStatistics avoids a circular include chain that would occur if subscription_topic_statistics.hpp were included directly from subscription_intra_process.hpp.
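As a rough illustration of the wiring described above, here is a minimal, self-contained sketch. It is not the code in this PR: MessageInfo, TopicStatistics, IntraProcessSubscription, and connect_statistics are hypothetical stand-ins, and the handler signature is an assumption. The point it shows is that the intra-process side depends only on a std::function alias, so it never needs to include the statistics header.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the message info struct; only the field used here.
struct MessageInfo
{
  std::int64_t source_timestamp = 0;  // nanoseconds
};

// Hypothetical stand-in for SubscriptionTopicStatistics.
class TopicStatistics
{
public:
  void handle_message(const MessageInfo & info, std::chrono::nanoseconds now)
  {
    ++count_;
    last_age_ns_ = now.count() - info.source_timestamp;
  }
  std::size_t count() const {return count_;}
  std::int64_t last_age_ns() const {return last_age_ns_;}

private:
  std::size_t count_ = 0;
  std::int64_t last_age_ns_ = 0;
};

// Hypothetical stand-in for SubscriptionIntraProcess. It knows only about a
// callable type, so it has no compile-time dependency on TopicStatistics.
class IntraProcessSubscription
{
public:
  // Assumed signature; the real StatsHandlerFn may differ.
  using StatsHandlerFn =
    std::function<void (const MessageInfo &, std::chrono::nanoseconds)>;

  void set_stats_handler(StatsHandlerFn fn) {stats_handler_ = std::move(fn);}

  void execute(const MessageInfo & info, std::chrono::nanoseconds now)
  {
    // An empty std::function means statistics are disabled: do nothing.
    if (stats_handler_) {
      stats_handler_(info, now);
    }
  }

private:
  StatsHandlerFn stats_handler_;
};

// Subscription-side wiring: the lambda captures the statistics object, so
// only this translation unit needs to see both types.
inline void connect_statistics(
  IntraProcessSubscription & ipc, std::shared_ptr<TopicStatistics> stats)
{
  assert(stats != nullptr);
  ipc.set_stats_handler(
    [stats](const MessageInfo & info, std::chrono::nanoseconds now) {
      stats->handle_message(info, now);
    });
}
```

In this sketch, calling execute() before connect_statistics() is a no-op, which mirrors a subscription that has topic statistics disabled.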

In the code, we set source_timestamp to the receive time so that message_age reports 0ms rather than an uninitialised value. Our reasoning is that IPC delivery has little or no transport latency, so a near-zero age is acceptable and expected. As discussed in the original fix PR, a one-time warning is emitted to inform users that message_age is not meaningful for IPC subscriptions.
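The timestamp handling can be sketched as follows. This is an illustration under assumptions, not the PR's code: IpcMessageInfo, make_ipc_message_info, and message_age_ms are hypothetical names, and the sketch assumes received_timestamp is stamped with the same value as source_timestamp, which is what makes the computed age exactly zero.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the message info struct (timestamps in ns).
struct IpcMessageInfo
{
  std::int64_t source_timestamp = 0;
  std::int64_t received_timestamp = 0;
  bool from_intra_process = false;
};

// Convert a system_clock time point to nanoseconds since the epoch.
inline std::int64_t to_nanoseconds(std::chrono::system_clock::time_point tp)
{
  return std::chrono::duration_cast<std::chrono::nanoseconds>(
    tp.time_since_epoch()).count();
}

// Stamp both timestamps with the receive time (assumption: both fields are
// set), so the source timestamp is never left uninitialised.
inline IpcMessageInfo make_ipc_message_info(
  std::chrono::system_clock::time_point receive_time)
{
  IpcMessageInfo info;
  info.from_intra_process = true;
  const std::int64_t nanos = to_nanoseconds(receive_time);
  info.source_timestamp = nanos;
  info.received_timestamp = nanos;
  return info;
}

// Message age in milliseconds, as receive time minus source time.
inline double message_age_ms(const IpcMessageInfo & info)
{
  assert(info.received_timestamp >= info.source_timestamp);
  return static_cast<double>(
    info.received_timestamp - info.source_timestamp) / 1.0e6;
}
```

With a zero-initialised source_timestamp and a real receive time, the same subtraction would instead yield the full time since the epoch, which is the invalid large value mentioned above.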

In summary, this PR is a revised implementation of the original PR #2913 by @roman-y-wu, with the following differences:

  • Uses std::function type erasure to avoid the circular include issue we found
  • Sets source_timestamp to prevent message_age reporting an invalid large value (uninitialised timestamp)

From the original PR, we kept the one-time RCLCPP_WARN_ONCE log message to ensure users are not silently surprised by message_age reading ~0ms when using IPC subscriptions.

Fixes #2911

Is this user-facing behavior change?

Yes, it is for us, and it addresses an issue we found in production. With this fix, topic statistics report valid values for subscriptions using IPC. Previously all statistics were NaN when IPC was enabled, and this behaviour bit us.

Did you use Generative AI?

Not on our side. This PR is a small adaptation of an older fix that already existed: #2913

Additional Information

This fix was first validated as a backport onto ROS 2 Kilted (rclcpp == 29.5.6) and has been running in production at Dexory across a fleet of 126 autonomous warehouse robots for over one month with no issues. Our robots use IPC-enabled lidar pipelines and rely on topic stats to monitor sensor health. message_period statistics are now correct and our monitoring correctly detects legitimate sensor failures with this fix.

If use_intra_process_comms is enabled, messages are delivered via
SubscriptionIntraProcess::execute_impl(), which previously never called
the topic statistics handler. This causes all statistics to report NaN values.

The fix here puts a type-erased StatsHandlerFn in SubscriptionIntraProcess
and wires it up from the Subscription via a lambda. Using std::function avoids
a circular include chain that would occur if subscription_topic_statistics.hpp
were included directly from subscription_intra_process.hpp via
publisher.hpp/callback_group.hpp.

The source_timestamp is set to the receive time so that message_age reports
0ms rather than an uninitialised value. IPC delivery has little or no expected
transport latency by definition, so near-zero age is expected.

Fixes ros2#2911

Signed-off-by: jayyoung <jay.young@gmail.com>
msg_info.publisher_gid = {0, {0}};
msg_info.from_intra_process = true;

std::chrono::time_point<std::chrono::system_clock> now;
Contributor


I propose declaring the variable now inside the scope of the if below, since it is only used there. I also propose declaring nanos outside that scope so that you can reuse it later without casting again, since now doesn't change.
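One possible reading of this suggestion, as a hedged sketch rather than the actual diff: SketchMessageInfo, SketchStatsHandler, and deliver_intra_process are hypothetical names, and the surrounding control flow is assumed, since only a few lines of the real function are visible here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the message info struct.
struct SketchMessageInfo
{
  std::int64_t source_timestamp = 0;  // nanoseconds
  bool from_intra_process = false;
};

// Assumed handler shape: message info plus the receive time in nanoseconds.
using SketchStatsHandler =
  std::function<void (const SketchMessageInfo &, std::int64_t)>;

inline SketchMessageInfo deliver_intra_process(
  const SketchStatsHandler & stats_handler)
{
  SketchMessageInfo msg_info;
  msg_info.from_intra_process = true;

  // Declared outside the branch so later code can reuse the value
  // without repeating the duration_cast.
  std::int64_t nanos = 0;
  if (stats_handler) {
    // `now` is only needed to derive `nanos`, so it lives in this scope.
    const auto now = std::chrono::system_clock::now();
    nanos = std::chrono::duration_cast<std::chrono::nanoseconds>(
      now.time_since_epoch()).count();
    msg_info.source_timestamp = nanos;
  }

  // ... the user callback would be dispatched here ...

  if (stats_handler) {
    // Reuse the cached value instead of reading the clock or casting again.
    assert(msg_info.source_timestamp == nanos);
    stats_handler(msg_info, nanos);
  }
  return msg_info;
}
```

When no handler is installed, the clock is never read and source_timestamp keeps its zero default in this sketch.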

#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>
Contributor

@tonynajjar Apr 15, 2026


#include <chrono> ?

@tonynajjar
Contributor

Could you also add a test that verifies that, with IPC enabled, the stats handler fires and reports a non-NaN age?

@tonynajjar
Contributor

@fujitatomoya we'd love to get this merged before the rclcpp freeze, which I believe is on 20 April. Is there anything else you would like modified?

Development

Successfully merging this pull request may close these issues.

use_intra_process_comms bypasses topic statistics computation
