-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check QoS policy when configuring intraprocess, skip interprocess publish when possible #664
Check QoS policy when configuring intraprocess, skip interprocess publish when possible #664
Conversation
rclcpp/src/rclcpp/publisher.cpp
Outdated
@@ -231,6 +231,19 @@ PublisherBase::setup_intra_process( | |||
IntraProcessManagerSharedPtr ipm, | |||
const rcl_publisher_options_t & intra_process_options) | |||
{ | |||
// Skip intraprocess configuration if the "durability" qos policy is not "volatile". | |||
// TODO(ivanpauno): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good, in general, to have methods to check what is the real value of RMW_QOS_*_SYSTEM_DEFAULT. Or maybe, just delete all these defines and only support explicit configurations. Having unknown default values doesn't seem like a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct snippet of the TODO:
rclcpp/rclcpp/src/rclcpp/publisher.cpp
Lines 235 to 246 in c31b273
// TODO(ivanpauno): | |
// RMW_QOS_POLICY_DURABILITY_SYSTEM_DEFAULT is for FastRTPS and Connext also | |
// equal to "volatile". It would be good to add a | |
// is_durability_policy_volatile(rcl_publisher_options_t options) | |
// method to rcl, and corresponging methods in rmw and the rmw vendor | |
// implementations, which will also checks if "system_default" is "volatile" or not. | |
if (intra_process_options.qos.durability != RMW_QOS_POLICY_DURABILITY_VOLATILE) { | |
RCLCPP_WARN( | |
rclcpp::get_logger("rclcpp"), | |
"Enabled intraprocess communication with incompatible QoS policy."); | |
return; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I agree with you that we should avoid implicit settings, which is what the SYSTEM_DEFAULT
gives you, but there's a use case that some people want supported, which allows you to configure QoS settings from external XML files. That is what this allows for, so I don't know if we could realistically get rid of those.
We definitely should have a way to check the effective QoS settings after the publisher is created.
Also, I think this (re)raises the issue that use_intra_process_comms
needs to be an option on publishers/subscriptions, and not on the node itself. Otherwise you'd be unable to use intra process and have a non-volatile durability on one of your publishers.
Once we have that, I'd say it should be an error (not a warning) to have both intra process enabled and transient local durability on the same publisher/subscription.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree about having use_intra_process_comms option at publish/subscription level. Switching the warning to an error after that also sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still find this point unresolved, until we have the ability to know what the durability is when SYSTEM_DEFAULT
is in use and until we have the ability to control which publishers and subscriptions have intra-process enabled.
To be clear about the first point, I think we need a way to inspect a publisher or subscription after creation to see which durability it is using. Something like rcl_ret_t rcl_publisher_get_actual_qos(const rcl_publisher_t *, rmw_qos_profile *)
, which may need an rmw
equivalent.
I know this increases the scope of this issue considerably, but without it the warning is only going to work part of the time and would produce very subtle bugs for the user.
The alternative is an even bigger scope increase, which is to actually support transient-local durability correctly when intra-process is enabled.
For the second point, what I imagine is that the creation publishers and subscriptions have a three-way option which is "explicitly enable intra-process", "explicitly disable intra-process", or "default". Where "default" would be based on the intra-process setting at the node level.
I think this is important to do now (again I know it increases the scope of this issue), because otherwise you have to use a warning rather than an error to let the user know that they have conflicting configurations. Without it, you also have scenarios which are impossible to achieve, i.e. using intra-process and transient-local durability at the same time but on different publishers and/or subscriptions.
rclcpp/src/rclcpp/publisher.cpp
Outdated
@@ -231,6 +231,19 @@ PublisherBase::setup_intra_process( | |||
IntraProcessManagerSharedPtr ipm, | |||
const rcl_publisher_options_t & intra_process_options) | |||
{ | |||
// Skip intraprocess configuration if the "durability" qos policy is not "volatile". | |||
// TODO(ivanpauno): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I agree with you that we should avoid implicit settings, which is what the SYSTEM_DEFAULT
gives you, but there's a use case that some people want supported, which allows you to configure QoS settings from external XML files. That is what this allows for, so I don't know if we could realistically get rid of those.
We definitely should have a way to check the effective QoS settings after the publisher is created.
Also, I think this (re)raises the issue that use_intra_process_comms
needs to be an option on publishers/subscriptions, and not on the node itself. Otherwise you'd be unable to use intra process and have a non-volatile durability on one of your publishers.
Once we have that, I'd say it should be an error (not a warning) to have both intra process enabled and transient local durability on the same publisher/subscription.
@@ -169,7 +169,7 @@ class PublisherBase | |||
|
|||
using IntraProcessManagerWeakPtr = | |||
std::weak_ptr<rclcpp::intra_process_manager::IntraProcessManager>; | |||
bool use_intra_process_; | |||
bool intra_process_is_enabled_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the naming of the variable I see why you don't want to use use_
anymore since the decision about if it is actually being used is delayed. I think the order of the parts of the variable name was better before. I would suggest allow_intra_process_
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't agree about this name change suggestion, but the other one is fine.
if (!allow_intra_process_ || has_inter_process_subscriptions) {
Isn't as clear to read in my opinion, as it reads to me one of "not allow intra process" or "allow not intra process", neither of which make sense to me, you have to look at it a while to get to "intra process is not allowed" and even then allow implies to me that it may or may not be enabled, which isn't the case. At this point it is either enable or it is not.
Whereas:
if (!intra_process_is_enabled_ || has_inter_process_subscriptions) {
Is very clear immediately, as it reads one of "not intra process is enabled" or "intra process is not enabled", which are better for me than the alternative you suggested.
So 👎 from me to change this variable name.
@@ -216,7 +216,11 @@ class Publisher : public PublisherBase | |||
virtual void | |||
publish(std::unique_ptr<MessageT, MessageDeleter> & msg) | |||
{ | |||
this->do_inter_process_publish(msg.get()); | |||
bool inter_process_subscriptions_exist = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the above I would suggest naming this one has_inter_process_subscriptions
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't find this name any better or worse.
rclcpp/src/rclcpp/publisher.cpp
Outdated
if (intra_process_options.qos.durability != RMW_QOS_POLICY_DURABILITY_VOLATILE) { | ||
RCLCPP_WARN( | ||
rclcpp::get_logger("rclcpp"), | ||
"Enabled intraprocess communication with incompatible QoS policy."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the warning should be more specific about the consequence:
Skipping intraprocess communication with incompatible QoS policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivanpauno are you not planning to add the publisher/subscription option to enable intra-process and therefore make this an error? If you are planning to do so, then I think this warning does not need to be reworded as it will be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do it in another PR. For the moment, I will keep the warning reworded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really agree, I think warning isn't really sufficient, as I said in my other comment this basically prevents you from using intra-process and transient-local durability at the same time (even if on different entities). So after merging this pr we'll be potentially breaking existing use cases for a very narrow optimization. I would prefer to see the improvement done completely in this pr than have two pr's.
rclcpp/src/rclcpp/publisher.cpp
Outdated
@@ -231,6 +231,19 @@ PublisherBase::setup_intra_process( | |||
IntraProcessManagerSharedPtr ipm, | |||
const rcl_publisher_options_t & intra_process_options) | |||
{ | |||
// Skip intraprocess configuration if the "durability" qos policy is not "volatile". | |||
// TODO(ivanpauno): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still find this point unresolved, until we have the ability to know what the durability is when SYSTEM_DEFAULT
is in use and until we have the ability to control which publishers and subscriptions have intra-process enabled.
To be clear about the first point, I think we need a way to inspect a publisher or subscription after creation to see which durability it is using. Something like rcl_ret_t rcl_publisher_get_actual_qos(const rcl_publisher_t *, rmw_qos_profile *)
, which may need an rmw
equivalent.
I know this increases the scope of this issue considerably, but without it the warning is only going to work part of the time and would produce very subtle bugs for the user.
The alternative is an even bigger scope increase, which is to actually support transient-local durability correctly when intra-process is enabled.
For the second point, what I imagine is that the creation publishers and subscriptions have a three-way option which is "explicitly enable intra-process", "explicitly disable intra-process", or "default". Where "default" would be based on the intra-process setting at the node level.
I think this is important to do now (again I know it increases the scope of this issue), because otherwise you have to use a warning rather than an error to let the user know that they have conflicting configurations. Without it, you also have scenarios which are impossible to achieve, i.e. using intra-process and transient-local durability at the same time but on different publishers and/or subscriptions.
Ok, I will add those methods.
Supporting non-volatile durability is a really big change, specially in the case HISTORY=KEEP_ALL. I think we can do a big improvement in the intraprocess comm without tackling that.
I was thinking in adding a bool option to them, but a three-way option sounds quite better. |
Thank you, I know it's a bit of a task, so if you need help let me know.
I agree, that's why I want to instead properly plug this hole in the API until we can do the improvement. Which also brings up the point, if there are any other QoS the intra-process are incompatible with at the moment, we should use these new "actual QoS" accessors to disable it in those cases as well.
Again, a little more work, but I think it will make it easier for people to transition to from the current state. |
…p interprocess publish when only having intraprocess subscriptions Signed-off-by: ivanpauno <ivanpauno@ekumenlabs.com>
Signed-off-by: ivanpauno <ivanpauno@ekumenlabs.com>
Signed-off-by: ivanpauno <ivanpauno@ekumenlabs.com>
Signed-off-by: ivanpauno <ivanpauno@ekumenlabs.com>
1ee7c94
to
8a176bf
Compare
Signed-off-by: ivanpauno <ivanpauno@ekumenlabs.com>
I merged this over ivanpauno/publisher_get_actual_qos unintentionally. |
Wait did you revert it too? Please be careful to only merge things that have been reviewed and pass CI... |
Wait was it merged into something other than master? That makes more sense 😀 |
Yes, thanks for the correction! I merged this into ivanpauno/publisher_get_actual_qos branch unintentionally. I've already reverted the changes in that branch, and opened a follow-up PR here. |
* Implement a generic way to change logging levels This allows a per-logger configuration of the logging level, along with the global level. Signed-off-by: Chen Lihui <Lihui.Chen@sony.com> Signed-off-by: Chris Lalancette <clalancette@openrobotics.org> Co-authored-by: Tomoya.Fujita <Tomoya.Fujita@sony.com> Co-authored-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
This is part of the intraprocess comm improvement (ros2/ros2#649, https://docs.google.com/document/d/1dM8O34GM-SbFps1vd1rocPMAg_7O9_-OG4NF4axOz7E/edit#)
As the current and proposed IPM doesn't complain with a non-volatile 'Durability' QoS policy, intraprocess comm shouldn't be configured in those cases.
e.g.: If a publisher and subscriber in the same process had been configured with QoS policy
'durability=transient_local" and "history=keep_last", a late subscriber would receive the last message
published, as this was not supported by the intra process manager.
Previously, interprocess publish was always done. Now, it is skipped when only having intraprocess subscriptions.
e.g.: Process B subscribes to topic /topic.
Process A publishes to topic /topic just after process B subscriber.
Since maybe the count updating time it's not the same as how the publish method checks its subscribers, a message is not received when before it was.
(This hasn't been checked, it's only a possible drawback of this change)