-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timer tests #163
Timer tests #163
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to me
rcl/test/rcl/test_timer.cpp
Outdated
@@ -0,0 +1,110 @@ | |||
// Copyright 2015 Open Source Robotics Foundation, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a new file and not a moved file? If so should this be 2017?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a new file and indeed should be updated 👍
…meout" This reverts commit f44a93b.
this also fixes a bug where the wait would always return when all the timers were canceled fixes rclcpp#319
Offline @mikaelarguedas and I looked at this and came up with 370b7fe. This should fix the issues in It should also replace #160 and makes ros2/rclpy#108 not necessary anymore. |
hmmm looks like we add more test exhibiting the timer inaccuracy on windows. I'll wait a bit to merge this to see if we can improve the windows situation |
👍 to only merge this once it doesn't introduce new failing / flaky tests. |
Not sure if I'll have time to investigate the windows timing issue that has been biting us for some time. The new tests are testing features that were not tested before. We can remove them from this PR and add them later when we figure out the windows timing issue. I do think that the code change should get in before beta3 though because it fixes existing bugs and provide a better behavior overall. |
I would suggest to split the PR into a part which can be merged now (with the same or less failing tests) and a second part which needs further work in order to not introduce new failing / flaky tests. |
new (flaky) tests have been removed in 30a53cb and pushed to the timer_tests2 branch |
rcl/src/rcl/wait.c
Outdated
} | ||
} | ||
|
||
bool using_provided_timeout_not_timer_timeout = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With an inverted logic this variable could be named is_timer_timeout
which sounds a bit more readable / intuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we could come up with a good name at the time of the writing. I'll update this to be clearer, thanks for the variable name suggestion
rcl/src/rcl/wait.c
Outdated
// Check for error. | ||
if (ret != RMW_RET_OK) { | ||
if (using_provided_timeout_not_timer_timeout) { | ||
return RCL_RET_TIMEOUT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I fully understand the current semantic correctly. In which cases does rmw_wait
return RMW_RET_TIMEOUT
? Only when the passed timeout has happened? Does it matter if any of the conditions has been triggered?
With this change the semantic of RMW_RET_TIMEOUT
and RCL_RET_TIMEOUT
seem to be slightly different which seems confusing to me. Could instead the behavior of the rmw interface be changed (if necessary)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I fully understand the current semantic correctly. In which cases does rmw_wait return RMW_RET_TIMEOUT? Only when the passed timeout has happened?
correct, if a condition has been triggered rmw_wait
would return RMW_RET_OK
and the condition being non-null.
With this change the semantic of
RMW_RET_TIMEOUT
andRCL_RET_TIMEOUT
seem to be slightly different which seems confusing to me.
Before this PR:
rmw_wait
returns RMW_RET_TIMEOUT
only when the provided timeout has been reached without any condition being triggered.
rcl_wait
returns RCL_RET_TIMEOUT
when the provided timeout has been reached or any timer (even canceled ones) has reached its deadline.
After this PR:
rmw_wait
returns RMW_RET_TIMEOUT
only when the provided timeout has been reached without any condition being triggered.
rcl_wait
returns RCL_RET_TIMEOUT
only when the provided timeout has been reached. If any condition has been triggered including timers it will return RCL_RET_OK
with the corresponding timer non-nullified
The new new implementation matches my understanding of how wait should behave. Could you give more details (or an example) to support the inconsistency of the new behavior, and what type of change in the rmw interface would make more sense here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we are clear that both functions should:
- (1) return
RET_TIMEOUT
when the timeout expires and no other condition being triggered and - (2) return
RET_OK
when at least one condition has been triggered and the function returned before the given timeout.
The question is what should be returned when (3) the function only returns after a given timeout has expired and at the same time at least one condition has been triggered.
Without this patch there was a clear correlation: when rmw
returned RMW_RET_TIMEOUT
than rcl
returned RCL_RET_TIMEOUT
(https://github.com/ros2/rcl/pull/163/files#diff-8fda46206fc38bfdbcac0e5213bffddaL594).
With this patch the rmw
API seems to return RMW_RET_TIMEOUT
in the third case but the rcl
API will return RCL_RET_OK
(https://github.com/ros2/rcl/pull/163/files#diff-8fda46206fc38bfdbcac0e5213bffddaR626). Am I reading that correctly? If yes, the question is if that different semantic is intentional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I have the same reading here.
(3) the function only returns after a given timeout has expired and at the same time at least one condition has been triggered.
In this case:
the "function" is rcl_wait
the "conditions" are "all rmw_waitset entities" + "rcl timers".
"a given timeout" is the user specified timeout
the function only returns after a given timeout has expired
so we reach the user specified timeout and not a timer, correct? In that case we do return RCL_RET_TIMEOUT
regardless of if a condition has been met or not (same behavior as before)
So I don't think we changed anything in the existing semantic here.
Then if we use a timer timeout and not the one specified by the user, that means that the function will return before the user specified timeout, thus should return RCL_RET_OK regardless of the return code of rmw_wait.
Is that correct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dirk-thomas @wjwwood can you confirm if my reading is correct here or if I need to change something in this PR to keep the correct semantic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this patch there was a clear correlation: when rmw returned RMW_RET_TIMEOUT than rcl returned RCL_RET_TIMEOUT (https://github.com/ros2/rcl/pull/163/files#diff-8fda46206fc38bfdbcac0e5213bffddaL594).
I think that was what was inconsistent with it. A condition may have been met (i.e. a timer was ready), but timeout was returned by rcl too despite the fact rcl had more conditions to consider. To me this makes rcl and rmw more consistent with each other, i.e. ret timeout only returned when the given timeout was exceeded and not when a condition was met instead.
@mikaelarguedas I think it looks right too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some offline discussion we were able to resolve the misunderstanding. rmw
returns TIMEOUT
when it wakes up by the timeout specified rather than any other condition. rcl
might either pass the user provided timeout to rmw
or the shortest timer - whichever is lower.
So before rcl
was returning a TIMEOUT
even though the timeout signaled by rmw
was due to a timer and not due to the user provided timeout. With this patch rcl
will not return TIMEOUT
anymore when the timeout coming from rmw
was due to a timer.
With this being said the change in the test_wait.cpp
to expect OK
rather than TIMEOUT
in this case makes a lot more sense 😉
👍 ➕ 🚢
* dont return timeout errcode if we didnt hit user specified timeout * update axisting test accordingly * add timer test * Revert "dont return timeout errcode if we didnt hit user specified timeout" This reverts commit f44a93b. * alternative fix to not return timeout when timer is ready this also fixes a bug where the wait would always return when all the timers were canceled fixes rclcpp#319 * fixups * more tests * another one just for fun * remove new tests * clearer variable name * update comment
* pass context to wait set Signed-off-by: William Woodall <william@osrfoundation.org> * add fini for context Signed-off-by: William Woodall <william@osrfoundation.org>
wip: opening this for visibility, replaces #160. Still need to write some tests for this and make sure it doesnt break anything
Note: following recommendation in #160 (comment) wait will wake up if a timer reached it's deadline even if it has been canceled. It's up to the client library to make sure to not add canceled timers to the waitset