Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support to publish as loaned message #981

Merged

Conversation

Barry-Xu-2018
Copy link
Contributor

Address #914

In order to avoid judgement on which publish function is called while sending a message, I use a new class to wrap rclcpp::GenericPublisher.
If you have a better idea, please let me know.

Signed-off-by: Barry Xu <barry.xu@sony.com>
@Barry-Xu-2018 Barry-Xu-2018 requested a review from a team as a code owner March 29, 2022 10:09
@Barry-Xu-2018 Barry-Xu-2018 requested review from gbiggs and MichaelOrlov and removed request for a team March 29, 2022 10:09
@Barry-Xu-2018
Copy link
Contributor Author

The failure of Rpr__rosbag2__ubuntu_jammy_amd64 as below

03:11:35 /tmp/ws/src/rosbag2/rosbag2_transport/include/rosbag2_transport/player.hpp:69:60: error: ‘publish_as_loaned_msg’ is not a member of ‘rclcpp::GenericPublisher’
03:11:35    69 |       publish_func_ = std::bind(&rclcpp::GenericPublisher::publish_as_loaned_msg, publisher_, _1);
03:11:35       |                                                            ^~~~~~~~~~~~~~~~~~~~~
03:11:35 gmake[2]: *** [CMakeFiles/rosbag2_transport.dir/build.make:90: 

So rclcpp used by CI is old. New interface publish_as_loaned_msg (ros2/rclcpp#1856) isn't included.

@fujitatomoya
Copy link
Contributor

@MichaelOrlov @gbiggs could you take a look at this when you have time? thanks!

Copy link
Contributor

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

originally i was expecting we are going to provide ros2 bag play option that user can choose to use LoanedMessage. (#914 (comment)) what would you say?

i think option makes sense as following.

  • more options for user application.
  • IF anything happens in rmw implementation related to LoanedMessage, fallback would be really useful to avoid the problem in that distribution release.

@Barry-Xu-2018
Copy link
Contributor Author

originally i was expecting we are going to provide ros2 bag play option that user can choose to use LoanedMessage. (#914 (comment)) what would you say?

i think option makes sense as following.

  • more options for user application.
  • IF anything happens in rmw implementation related to LoanedMessage, fallback would be really useful to avoid the problem in that distribution release.

Yes. I miss this requirement.
I will update code.

Signed-off-by: Barry Xu <barry.xu@sony.com>
@Barry-Xu-2018
Copy link
Contributor Author

Done.
I updated code to add an option to enable loaned message.

Copy link
Contributor

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implementation looks good to me, it would be nice to add test.

Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Barry-Xu-2018 Thank you for implementation.

I think It will be better to have opposite command line parameter aka disable-loan-message. To be able to force disabling loaned messages and by default use loaned messages when it's possible.
In most cases it will be beneficial to use loaned messages and could be some exceptional rare cases when need to overcome such behavior via command line parameter.

Also IMO it's over-engineering creating separate class DataPublisher just to check one flag and choose appropriate method to call. It might be be rational if we would be using publish method from multiple places. However we already have abstraction layer for this purpose in form of dedicated method of the Player class Player::publish_message(rosbag2_storage::SerializedBagMessageSharedPtr message)

In my opinion the implementation could be as simple and clean as just modifying those publish_message(..) method as following.

bool Player::publish_message(rosbag2_storage::SerializedBagMessageSharedPtr message)
{
  bool message_published = false;
  auto publisher_iter = publishers_.find(message->topic_name);
  if (publisher_iter != publishers_.end()) {
    auto & current_publisher{publisher_iter->second};
    try {
      if (!play_options_.disable_loan_message && current_publisher->can_loan_messages()) {
        current_publisher->publish_as_loaned_msg(
          rclcpp::SerializedMessage(*message->serialized_data));
      } else {
        current_publisher->publish(rclcpp::SerializedMessage(*message->serialized_data));
      }
      message_published = true;
    } catch (const std::exception & e) {
      RCLCPP_ERROR_STREAM(
        get_logger(), "Failed to publish message on `" << message->topic_name <<
          "` topic. \nError: %s" << e.what());
    }
  }
  return message_published;
}

@Barry-Xu-2018
Copy link
Contributor Author

implementation looks good to me, it would be nice to add test.

Yes. I try to add tests.

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov

Thanks for your review.

I think It will be better to have opposite command line parameter aka disable-loan-message. To be able to force disabling loaned messages and by default use loaned messages when it's possible.
In most cases it will be beneficial to use loaned messages and could be some exceptional rare cases when need to overcome such behavior via command line parameter.

Okay. I will update code.

Also IMO it's over-engineering creating separate class DataPublisher just to check one flag and choose appropriate method to call. It might be be rational if we would be using publish method from multiple places. However we already have abstraction layer for this purpose in form of dedicated method of the Player class Player::publish_message(rosbag2_storage::SerializedBagMessageSharedPtr message)

In my opinion the implementation could be as simple and clean as just modifying those publish_message(..) method as following.

Yes. I also consider that only publish_message(..) is changed.
But I think check (decide which function is called) will be executed at each sending. This can seriously affect the performance while the sending frequency is high. And which function can be chosen can be decided while creating publisher.
About the impact of performance, what do you think ?

BTW, I don't find a simple way except using new class. Do you have any better idea ?

Signed-off-by: Barry Xu <barry.xu@sony.com>
@Barry-Xu-2018
Copy link
Contributor Author

implementation looks good to me, it would be nice to add test.

After consideration, it is hard to add test if we don't add new interface for test.

@fujitatomoya
Copy link
Contributor

@Barry-Xu-2018 i am good to go with maintainer approval, thanks!

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov friendly ping.

Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Barry-Xu-2018 Thank you for changing command line option to --disable-loan-message.

As regards to the your comment:

I also consider that only publish_message(..) is changed.
But I think check (decide which function is called) will be executed at each sending. This can > seriously affect the performance while the sending frequency is high. And which function can > be chosen can be decided while creating publisher.
About the impact of performance, what do you think ?

I understand your concern about performance in theory yes using class wrapper will be faster. But how faster? And In what cases? And does it really affect final publishing performance?
Let's do comprehensive performance analysis.
if do

if (!play_options_.disable_loan_message && current_publisher->can_loan_messages())

for each publish_message(message)
We will have one extra "if" statement with two conditions inside if conditions recalculated one extra if statement cost nothing in terms of performance. Let's figure out the cost for calculating conditions inside if.
In general taking pointer and variable value from memory which is in cache processor doing very fast just in one or two cycles. Let's assume that average embedded processor has 1 CPU core with 1GHz frequency. It does means that one processor operation or cycle takes around 1 nano second.
A few extra processor cycles or operations will take no more than a few nanoseconds. Let's round it very roughly to 50 nano seconds.
In our case I don't think that it will be possible to see or measure difference in performance if message publishing will be delayed on 50 nSec.
However random memory access with misses in cache produces the most delays and penalty on system performance. Let's consider if it's possible in our case.

  • Consider performance for checking !play_options_.disable_loan_message
    This is check for boolean value which is stored in struct which doesn't change during runtime, only during class construction. Usually compilers smart enough and putting such values to the register directly. Therefore accessing to the memory and hitting cache misses very very unlikely.
  • Consider evaluation value for current_publisher->can_loan_messages().
    First of all I want to mention that we have the same call for can_loan_messages() inside constructor of the loaned message. It does means that even if we will get cache misses during this branch all data will be cached and reused on the next step on publish operation in case if publisher can loan. i.e. no impact on performance if would have cache misses we will have them even without this extra check on the next operation.
  • We have one last case to consider when publisher can't loan.
    rclcpp has very tiny wrapper for can_loan_messages which is translates to the rcl_publisher_can_loan_messages
bool
PublisherBase::can_loan_messages() const
{
  return rcl_publisher_can_loan_messages(publisher_handle_.get());
}

Ok rcl_publisher_can_loan_messages(..) translates to the

bool rcl_publisher_can_loan_messages(const rcl_publisher_t * publisher)
{
  if (!rcl_publisher_is_valid(publisher)) {
    return false;  // error message already set
  }

  bool disable_loaned_message = false;
  rcl_ret_t ret = rcl_get_disable_loaned_message(&disable_loaned_message);
  if (ret == RCL_RET_OK && disable_loaned_message) {
    return false;
  }

  return publisher->impl->rmw_handle->can_loan_messages;
}

Evaluation for the if (!rcl_publisher_is_valid(publisher)) uses in every other rcl function related to the publisher and consist inside from number of checks for null pointers which we will certainly go in to the cache.
As regards to the evaluation of the publisher->impl->rmw_handle->can_loan_messages statement. It's also pretty much straight forward and reduces to the accessing to the can_loan_messages boolean field in rmw_publisher_t structure which also has type erased pointer to this publisher's data. Which is using during many function calls to the publisher. can_loan_messages is the last in the data structure it means it's highly likely it will go to the cache during any other rcl calls for the publisher. Other words there are very very low chances that this value will not be in cache.
It's more interesting evaluation of the rcl_get_disable_loaned_message(&disable_loaned_message) it has inside call for the rcutils_get_env(..) which is reading out environment variable.
This operation is going to be a bummer and could more significantly affect performance.

Conclusion: meaningful difference in performance could be only for the case when publisher can't loan messages for some circumstances.

It turns out that rcl_get_disable_loaned_message was added relatively recently in ros2/rcl#949 by @fujitatomoya as implementation of the feature request in ros2/rcl#945.
It's very obvious that calling getenv from cstdlib each time when we are borrowing loaned message will affect all operations for publishing and getting loaned messages from subscribers.

@fujitatomoya I would recommend to revert ros2/rcl#949 and add disable_loaned_messages in rcl publisher options data structure and set this parameter via options up to the final publisher constructor on the rclcpp layer. And avoid readout for environment variable for each operation with loaned message.

@Barry-Xu-2018 Given comprehensive performance analysis mentioned above and current sub-optimal implementation of the rcl_publisher_can_loan_messages(..) I will let you to decide if keep extra DataPublisher class wrapper or not.
I would rather fix rcl_publisher_can_loan_messages(..) and have more clean transparent logic in Rosbag2 Player without extra DataPublisher class wrapper.

Just in case if you will decide to keep DataPublisher class, please make it inner class for rosbag2_transport::Player class. And I think it will make more sense to rename it to the PlayerPublisher. At least it will be more self explained rather than more abstract.
Also I would like to kindly ask you to wrap internals of the Player::publish_message(rosbag2_storage::SerializedBagMessageSharedPtr message) in try - catch statement. As I suggested in my previous review.
In ApexAI we have caught situation when publisher can throw exception on publishing message in some circumstance due to the failure on the subscriber side. In this case we shouldn't crash in Rosbag2 playback and rather just report about failure and continue.

@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov

Thank you very much for the detailed analysis.
I did not think that so deeply.
I agree with your analysis.
If no issue in rcl_get_disable_loaned_message (reading environment variable), I incline to do (without extra class wrapper)

if (!play_options_.disable_loan_message && current_publisher->can_loan_messages()) {
        current_publisher->publish_as_loaned_msg(
          rclcpp::SerializedMessage(*message->serialized_data));
} else {
        current_publisher->publish(rclcpp::SerializedMessage(*message->serialized_data));
}

Now, loaned message is enabled by default. If I added above code with current code, reading environment variable affect the performance.

The change for rcl_get_disable_loaned_message() needs to be discussed and the interface freeze for Humble is on April 4th. So I guess this work cannot be done quickly.

I'm not sure what I should do now. Wait for the change for rcl_get_disable_loaned_message() or directly add above code.
I would like to hear your advice.

@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 Don't want to delay this PR due to the sub-optimal implementation in rcl_get_disable_loaned_message(..) .
Let's go with your class wrapper for publisher. Although please make it inner class for rosbag2_transport::Player class. And I think it will make more sense to rename it to the PlayerPublisher

@Barry-Xu-2018
Copy link
Contributor Author

Don't want to delay this PR due to the sub-optimal implementation in rcl_get_disable_loaned_message(..) .
Let's go with your class wrapper for publisher. Although please make it inner class for rosbag2_transport::Player class. And I think it will make more sense to rename it to the PlayerPublisher

Okay. I update code.

Signed-off-by: Barry Xu <barry.xu@sony.com>
@Barry-Xu-2018
Copy link
Contributor Author

@MichaelOrlov Please review again.

Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Barry-Xu-2018 Now looks good.
Approve.

@MichaelOrlov
Copy link
Contributor

Running CI:
Gist:https://gist.githubusercontent.com/MichaelOrlov/54bf70e774446555d29ab8b2f31680e7/raw/6df300d26d4187c47e62153dc64bfff19cc0999b/ros2.repos
BUILD args: --packages-up-to rosbag2_transport rosbag2_tests
TEST args: --packages-select rosbag2_transport rosbag2_tests
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/10136/

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MichaelOrlov MichaelOrlov merged commit fada54c into ros2:master Apr 5, 2022
@MichaelOrlov
Copy link
Contributor

@Barry-Xu-2018 Thank you for your contribution.
Much appreciated!

@fujitatomoya
Copy link
Contributor

@MichaelOrlov sorry to be late to catch up with you. so after all, this is already in mainline. thanks for your review 👍

btw,

@fujitatomoya I would recommend to revert ros2/rcl#949 and add disable_loaned_messages in rcl publisher options data structure and set this parameter via options up to the final publisher constructor on the rclcpp layer. And avoid readout for environment variable for each operation with loaned message.

appreciate for your comment!
originally, parameter for publisher option would not be our choice here. we would like to have global switch that can manage by 3rd party externally if anything happens in rmw implementation. i believe it would be better to keep this implementation in rcl instead of specific client frontend.
but your point makes sense to me about performance aspect, it should avoid readout environment variable every single time.

i will make dedicated issue for rcl to keep tracking this ❗

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants