Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wait for service in client example #93

Closed
wants to merge 1 commit into from
Closed

Conversation

gerkey
Copy link
Member

@gerkey gerkey commented Mar 4, 2016

The test that uses this client example can fail if the server isn't yet up and ready to hear the client's request. In that situation, the client waits forever and the test times out, as we've seen with FastRTPS. This PR adds a wait_for_service() call that the client uses to delay its request until the time is right.

This function should not live here in the long term, but rather be improved and promoted into client API (ros2/rclcpp#201). But this is a decent place to park it for now.

@gerkey gerkey added the in progress Actively being worked on (Kanban column) label Mar 4, 2016
@dirk-thomas
Copy link
Member

Since the problem probably affects all of the service tests does it make sense to only patch it for this single one?

@gerkey
Copy link
Member Author

gerkey commented Mar 4, 2016

@dirk-thomas The right fix is to implement the wait_for_service call in the client library and use it everywhere, including here. My implementation isn't yet ready to be promoted in that way (we may need some changes to the rmw API), and I don't want to copy this code everywhere that services are called in our tests. If, before we address ros2/rclcpp#201, we see other failures caused by the same underlying problem, I'll look at factoring it out for reuse, at least within this package.

// return true, hoping for the best.
rmw_reset_error();
success = true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a fairly blanket error case, maybe you could either check the error message contents or at least print it out so that when there is an error that doesn't match what you expected you can see it in the log? That might break your example test though.

@wjwwood
Copy link
Member

wjwwood commented Mar 4, 2016

+1 The change makes sense to me. It's unfortunate that FastRTPS doesn't implement the solution I suggested to you 😢.

@gerkey
Copy link
Member Author

gerkey commented Mar 9, 2016

This change made more tests fail on Linux and it didn't even build on Windows. Fun!

@gerkey gerkey force-pushed the wait_for_service branch 2 times, most recently from 7156b7f to fafad35 Compare March 18, 2016 01:18
@gerkey
Copy link
Member Author

gerkey commented Mar 18, 2016

Windows build fixed and same logic added to async client (it'll be removed from both examples once we resolve ros2/rclcpp#201). New CI:

@gerkey
Copy link
Member Author

gerkey commented Mar 18, 2016

CI shows only failures and warning that are unrelated to this change.

@gerkey gerkey added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Mar 18, 2016
@gerkey
Copy link
Member Author

gerkey commented Mar 18, 2016

@wjwwood You +1ed previously, but can you have a second look before I merge?

success = true;
break;
}
} catch (std::runtime_error & e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue here is that node->count_subscribers should never throw a C++ exception. This is actually a problem with the fast rtps implementation, but ideally it would catch the exception internally and convert it to a return code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have said, it should never throw a C++ exception because it is a C function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The underlying function rmw_count_subscribers() is C (for fastrtps, it's hardcoded to return an error: https://github.com/eProsima/ROS-RMW-Fast-RTPS-cpp/blob/master/rmw_fastrtps_cpp/src/functions.cpp#L1793-L1801), but here I'm calling a C++ class method: https://github.com/ros2/rclcpp/blob/master/rclcpp/src/rclcpp/node.cpp#L317-L329.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, my bad. Sorry for the noise 🔔.

@gerkey
Copy link
Member Author

gerkey commented Mar 18, 2016

When the logic is fixed to actually require that > 0 subscribers are hooked up, the test fails for opensplice and both flavors of connext. I suspect that the problem is that, as @wjwwood previously suggested, you can't assume that services are implemented atop topics and check for subscribers to infer that a service server is up. This approach is a dead end.

@gerkey gerkey closed this Mar 18, 2016
@gerkey gerkey removed the in review Waiting for review (Kanban column) label Mar 18, 2016
@wjwwood wjwwood deleted the wait_for_service branch March 18, 2016 23:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants