This first checks `diff < UPPER_BOUND` and then asserts `diff > UPPER_BOUND` (with `UPPER_BOUND = PERIOD - TOLERANCE`)? How can the second condition hold after the first check?
This sometimes fails in CI: https://ci.ros2.org/view/nightly/job/nightly_linux_repeated/1858/testReport/junit/rclcpp/TestMultiThreadedExecutor/timer_over_take/.
If I understand my previous comment (#907 (comment)) correctly, I think the intention is that if `diff < UPPER_BOUND`, then we want the test to fail. I admit it is confusing, and we could probably clarify it with a comment.
Oh, so in that case we want the test to fail unconditionally, right?
I think that either adding a comment or using `FAIL()` would be clearer (now that we're checking only one of the conditions).
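For reference, a minimal sketch of the check being discussed, written without gtest so it stands alone. The `PERIOD` and `TOLERANCE` values and the `fired_too_early` helper are assumptions for illustration; the real test defines its own constants and does the check inside the timer callback.

```cpp
#include <chrono>

using std::chrono::nanoseconds;
using namespace std::chrono_literals;

// Hypothetical values; the actual test uses its own PERIOD and TOLERANCE.
constexpr nanoseconds PERIOD = 100ms;
constexpr nanoseconds TOLERANCE = 10ms;
constexpr nanoseconds UPPER_BOUND = PERIOD - TOLERANCE;

// Hypothetical helper: true when the time between two consecutive timer
// callbacks is shorter than the period minus the tolerance.
constexpr bool fired_too_early(nanoseconds diff)
{
  return diff < UPPER_BOUND;
}

// Inside a gtest timer callback, the single remaining condition could then
// be expressed as an unconditional failure, roughly:
//
//   if (fired_too_early(diff)) {
//     FAIL() << "timer fired before PERIOD - TOLERANCE had elapsed";
//   }
```

With only one condition left, `FAIL()` states the intent directly, instead of an assertion that can never pass once the `if` has been entered.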
Makes sense to me.
Regarding the test failure, I'm still not sure whether the bug lies in the test code or not. Timing-dependent tests are tricky.
I have taken a look, and it seems to be a flaky test (though the timer is triggered MUCH EARLIER than expected).
I've read the original PR #383 that motivated these changes; the only change I see in the multithreaded executor since then is #836, which doesn't seem to be related.
The test seems to be super flaky. I have checked the last few repeated jobs, and it has only failed once.
IMO, it sounds like we can ignore it.
@wjwwood do you think that any of the reset changes could be related to this? Or is it just a flaky test?
I proposed a PR increasing the period and tolerance by an order of magnitude, and it failed too (#1105 (comment)).
There seems to be a real bug, so it should be triaged and fixed.
It might also be connected to this other failure: ros2/rcl#640.
(edit) Doesn't sound like it's really related.
Thanks!
Until it's fixed, the test should be either commented out or marked as `xfail` so it stops showing up in CI results. What do you think?
I don't know if `xfail` is intended to be used only for flaky tests, or also for tests that fail because something is actually broken.
If that mark is also intended for the second case, then it's OK to do it.
I believe so: ros2/ros2#900