
Be more proactive about skipping work if possible #685

Merged: 4 commits merged into rolling on Apr 11, 2023

Conversation

@mjcarroll (Member) commented Apr 7, 2023

This makes rmw_wait more aggressive about how it skips work.

In the case that a condition is already met, this will skip attaching/detaching conditions from the underlying waitset and skip waiting.

In the case that a wait is required, it generally looks like this:

[profiling trace image: wait required]

and when a wait isn't needed:

[profiling trace image: wait skipped]
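As a rough illustration of the pattern being described, here is a minimal, self-contained sketch; the condition and waitset types are stand-ins, not the actual rmw_fastrtps or Fast DDS classes:

// Sketch only: stand-in types, not the real rmw_fastrtps implementation.
#include <chrono>
#include <functional>
#include <vector>

struct Condition
{
  // e.g. a guard condition flag or a "data available" check on a reader
  std::function<bool()> is_triggered;
};

// Returns true if at least one condition ended up ready.
template<typename WaitSetT>
bool wait_or_skip(
  WaitSetT & wait_set,
  std::vector<Condition *> & conditions,
  std::chrono::nanoseconds timeout)
{
  // First pass: if anything is already triggered, skip attach/wait/detach entirely.
  bool skip_wait = false;
  for (const Condition * c : conditions) {
    if (c->is_triggered()) {
      skip_wait = true;
      break;
    }
  }

  bool something_ready = skip_wait;
  if (!skip_wait) {
    // Only pay for attaching (one mutex lock/unlock each) and blocking
    // when nothing is ready yet.
    for (Condition * c : conditions) {
      wait_set.attach(c);
    }
    something_ready = wait_set.wait(timeout);
    for (Condition * c : conditions) {
      wait_set.detach(c);
    }
  }

  // Final pass: null out the entries that have no work, so the caller
  // only sees the ready ones.
  for (Condition *& c : conditions) {
    if (c != nullptr && !c->is_triggered()) {
      c = nullptr;
    }
  }
  return something_ready;
}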

@mjcarroll (Member Author)

The "should skip wait" check has a relatively constant cost:

[profiling trace image: skip-wait check]

And the postprocessing pass:

[profiling trace image: postprocess]

@@ -49,6 +49,9 @@ find_package(fastcdr REQUIRED CONFIG)
find_package(fastrtps 2.3 REQUIRED CONFIG)
find_package(FastRTPS 2.3 REQUIRED MODULE)

find_package(Threads REQUIRED)
find_package(tracy_vendor REQUIRED)
Contributor:

What's with the tracy_vendor here? Do we need this to get this work in?

@mjcarroll (Member Author):

No, there are some profiling points in this branch; I will remove them before the final merge if we determine this is a valid approach.

@mjcarroll (Member Author)

@MiguelCompany I am also looking at adding a method to the waitset to allow attaching/detaching multiple conditions. I think this can save a bit of churn in locking/unlocking mutexes: https://github.com/mjcarroll/Fast-DDS/tree/mjcarroll/attach_multiple_conditions
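For illustration, the kind of batched API being proposed might look roughly like the sketch below; the names and signatures are illustrative only, and the real prototype lives on the linked branch:

// Illustrative only: not the Fast DDS WaitSet API, just the shape of the idea.
#include <mutex>
#include <vector>

class Condition;  // stand-in for a DDS condition type

class BatchedWaitSet
{
public:
  // Attach a whole set of conditions while holding the internal mutex once,
  // instead of locking/unlocking per condition as attach_condition() does.
  void attach_conditions(const std::vector<Condition *> & conditions)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    attached_ = conditions;  // replaces the previous set, so no per-element duplicate check
  }

  // Detach everything under a single lock.
  void remove_all_conditions()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    attached_.clear();
  }

private:
  std::mutex mutex_;
  std::vector<Condition *> attached_;
};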

Signed-off-by: Michael Carroll <michael@openrobotics.org>
mjcarroll and others added 2 commits April 10, 2023 13:26
Signed-off-by: Michael Carroll <michael@openrobotics.org>
Co-authored-by: Miguel Company <miguelcompany@eprosima.com>
Signed-off-by: Michael Carroll <michael@openrobotics.org>
@mjcarroll (Member Author)

@clalancette this one is ready for you.

@mjcarroll (Member Author)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette (Contributor) left a comment:

I've left just a few nits inline. All of these are small enough that an "it compiles for me" comment is good enough, so we don't have to restart CI.

I went through this with @mjcarroll in some detail. In short, what this does is change things so that we no longer call wait_set->wait() if there is already work waiting for us. In that case, we also don't need to attach/detach conditions to the wait_set, and we can just go through and null out the entries that don't have work waiting. If we don't already have work waiting for us, then we need to do the whole dance.

I will note that this is much faster in the best case, where the first guard condition we are waiting on is triggered. In that case, we'll do much less work than before. However, if there is no work for us to do, then this is actually somewhat slower than before since we need to iterate over the lists 3 times instead of 2. I think this is an acceptable tradeoff for now, but it bears looking into further to see if we can reduce how expensive get_first_untaken_info is.

Anyway, with my nits fixed, I'm happy to approve this. It looks like a net win for now.

if (!skip_wait) {
timeout = (wait_timeout) ?
// Attach all of the conditions to the wait set.
// \TODO(mjcarroll) When upstream has the ability to attach a vector of conditions,
Contributor:

Suggested change:
- // \TODO(mjcarroll) When upstream has the ability to attach a vector of conditions,
+ // TODO(mjcarroll): When upstream has the ability to attach a vector of conditions,

wait_result = (ret_code == ReturnCode_t::RETCODE_OK);

// Detach all of the conditions from the wait set.
// \TODO(mjcarroll) When upstream has the ability to detach a vector of conditions,
Contributor:

Suggested change:
- // \TODO(mjcarroll) When upstream has the ability to detach a vector of conditions,
+ // TODO(mjcarroll): When upstream has the ability to detach a vector of conditions,

* \param[in] events events to check
* \return true if any condition is triggered, false otherwise
*/
bool has_triggered_condition(
Contributor:

Suggested change:
- bool has_triggered_condition(
+ static bool has_triggered_condition(

(or an anonymous namespace, I really don't have a preference)
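Either spelling gives the helper internal linkage; a minimal generic illustration (not the actual has_triggered_condition signature):

// Two equivalent ways to keep a file-local helper out of the global namespace.

// Option 1: mark it static.
static bool helper_static(int value)
{
  return value > 0;
}

// Option 2: define it inside an anonymous namespace.
namespace
{
bool helper_anonymous(int value)
{
  return value > 0;
}
}  // namespace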

auto custom_subscriber_info = static_cast<CustomSubscriberInfo *>(data);
eprosima::fastdds::dds::SampleInfo sample_info;
if (ReturnCode_t::RETCODE_OK ==
custom_subscriber_info->data_reader_->get_first_untaken_info(&sample_info))
Contributor:

I think it is worthwhile to add a comment saying that this is relatively slow, so it is worth skipping it if possible.
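The requested comment might read something like this (wording is illustrative, applied to the snippet quoted above):

auto custom_subscriber_info = static_cast<CustomSubscriberInfo *>(data);
eprosima::fastdds::dds::SampleInfo sample_info;
// Note: get_first_untaken_info() is relatively expensive (it fills in a full
// SampleInfo that we otherwise ignore), so skip this check whenever possible.
if (ReturnCode_t::RETCODE_OK ==
  custom_subscriber_info->data_reader_->get_first_untaken_info(&sample_info))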

@mjcarroll (Member Author)

> I will note that this is much faster in the best case, where the first guard condition we are waiting on is triggered. In that case, we'll do much less work than before. However, if there is no work for us to do, then this is actually somewhat slower than before since we need to iterate over the lists 3 times instead of 2. I think this is an acceptable tradeoff for now, but it bears looking into further to see if we can reduce how expensive get_first_untaken_info is.

Based on my testing with https://github.com/irobot-ros/ros2-performance, there is less of a tradeoff than you would think at first. We do add a second set of iterations before the wait call, but in the case where we actually wait, that time would have been spent blocked on the waitset anyway. The amount of work after the wait is over remains the same, so the event-to-callback latency is the same.

In the case where something is already ready, we skip attaching the conditions, which is somewhat expensive (each attach requires a lock and unlock of a mutex), as well as the get_first_untaken_info call, which is also expensive.

@mjcarroll (Member Author)

Two potential enhancements here (both out of scope for this PR):

  • Add a Fast DDS waitset API that allows adding and removing many conditions at once. Since each time a condition is inserted the waitset API iterates over the list of all attached conditions to check whether it is already present, this can add up when there are many conditions. I'm not exactly sure what we would want the API to look like, but I have prototyped remove_all_conditions and add_conditions on a branch with good results. The add_conditions that takes a sequence clears the waitset first, so we don't have to check for already-inserted conditions either. Additionally, this can be done under a single lock, rather than having to re-take the lock on every iteration.
  • Add or find a waitset API that is cheaper for checking whether data is available. In this case, I think we only need a boolean indicating whether there is data in need of servicing, rather than populating a SampleInfo structure that we don't do anything else with (see the sketch after this list). Maybe @MiguelCompany has a better idea of an API to use here.
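For the second bullet, the current pattern versus the kind of query being wished for might look like this; the boolean call is purely hypothetical and not an existing Fast DDS API, and the ReturnCode_t/header visibility is assumed to match the surrounding rmw_fastrtps sources:

// Current pattern (mirrors the snippet reviewed above): a full SampleInfo is
// populated even though only the return code is used.
bool data_available_now(eprosima::fastdds::dds::DataReader * reader)
{
  eprosima::fastdds::dds::SampleInfo sample_info;
  return ReturnCode_t::RETCODE_OK == reader->get_first_untaken_info(&sample_info);
}

// Wished-for alternative (hypothetical, not an existing Fast DDS call): a
// plain boolean "is there data in need of servicing?" query.
// bool data_available_cheap(eprosima::fastdds::dds::DataReader * reader)
// {
//   return reader->has_unserviced_data();  // illustrative name only
// }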

Signed-off-by: Michael Carroll <michael@openrobotics.org>
@mjcarroll (Member Author) commented Apr 10, 2023

> All of these are small enough that an "it compiles for me" comment is good enough

It compiles for me. But for good measure, I'm throwing in a RHEL and Windows Debug build:

RHEL: Build Status
Windows Debug: Build Status

@mjcarroll (Member Author)

Alright, everything but Windows debug is green, so I'm calling this good.

@mjcarroll merged commit d93f410 into rolling on Apr 11, 2023
The delete-merged-branch bot deleted the mjcarroll/faster_rmw_wait branch on April 11, 2023 at 02:05
eholum pushed a commit to eholum/rmw_fastrtps that referenced this pull request May 17, 2023
In the case that a condition is already triggered, skip attaching/detaching conditions as well as the actual wait call.  This reduces call time in many cases where work is already queued.

Signed-off-by: Michael Carroll <michael@openrobotics.org>
Co-authored-by: Miguel Company <miguelcompany@eprosima.com>