
retest-until-fail vs repeat-until-fail #19

Closed
mikaelarguedas opened this issue on Dec 12, 2016 · 7 comments
Labels: enhancement (New feature or request)

@mikaelarguedas (Member) commented on Dec 12, 2016:

While investigating flaky tests I noticed that the repeated nightlies use the --retest-until-fail <N> option, whereas when testing locally I run ament test with the option --ctest-args --repeat-until-fail <N> --

I'm wondering if --retest-until-fail <N> has the desired behavior for our repeated jobs. The difference in behavior between the two options is explained below:

  • ctest --repeat-until-fail <N>:
    Ensures that every test is run until it fails or reaches N runs
for test in tests_of_this_package:
    for i in range(N):
        rc = run_test(test)
        if rc:
            break  # move on to the next TEST
  • ament test --retest-until-fail <N>:
    Stops retesting a package as soon as a single test of this package fails
for i in range(N):
    rc = 0
    for test in tests_of_this_package:
        rc += run_test(test)
    if rc:
        break  # move on to the next PACKAGE

This could explain why the reported flaky tests are different every night, especially on packages with a lot of tests like test_communication.

Options we may want to consider:

  • change the default test argument for the repeated nightlies to use the ctest option repeat-until-fail
  • modify ament behavior to blacklist the failed tests but repeat the passed ones up to N times
  • do a run or two with the ctest --repeat-until-fail <N> argument to have an exhaustive list of the flaky tests

To get an idea of how long the nightlies would take with ctest --repeat-until-fail 20, I launched http://ci.ros2.org/job/ci_windows/2030/; on Windows it took 5h30min instead of 2h.

mikaelarguedas added the question (Further information is requested) label on Dec 12, 2016
@dirk-thomas (Member) commented:

The option you use locally only applies to CMake packages. It does not cover non-ctest tests like nose. So you are not checking if those are flaky.

The nightly job was designed to catch whether there are any flaky tests and "fail fast". Since the goal is to not have any flaky tests, there was no focus on enumerating all of them. As you have noticed, doing so requires a lot of time.

@mikaelarguedas (Member, Author) commented:

Good point about the non-CMake packages.

Thanks for the explanation; I was not sure if this was intentional behavior or an overlooked side effect.
I didn't know that the motivation was to find out whether there is a flaky test in a package, rather than repeating every test to see which ones are flaky.
I guess I'll tag this as wontfix then.

@dirk-thomas (Member) commented:

@mikaelarguedas what is the status of this ticket? Should it be closed?

@mikaelarguedas (Member, Author) commented:

If there is no solution at all it can be closed or marked as untargeted.
I still think that the current retesting logic is suboptimal and doesn't allow us to identify all the flaky tests, because it stops at the first one it encounters in a package.

@dirk-thomas (Member) commented:

@mikaelarguedas Can you update this issue into a todo (rather than a question) describing what should be done?

mikaelarguedas added the enhancement (New feature or request) label and removed the question (Further information is requested) label on Apr 12, 2017
@mikaelarguedas (Member, Author) commented:

TODO: Change the behaviour of repeat-until-fail to:

  • for each package
    • create an empty "skip list"
    • while the number of reruns is less than the specified N
      • for each test not in the skip list
        • run the test
        • if the test fails, mark it as failed and add it to the skip list

I am not sure how doable this is or how much work it will require, but it would be a more accurate way to identify all flaky tests.
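A minimal sketch of that loop, in the same pseudocode style as the snippets above (tests_of_this_package, run_test and N are placeholders, not actual ament API):

skip_list = set()  # tests that already failed once
for i in range(N):
    for test in tests_of_this_package:
        if test in skip_list:
            continue  # do not rerun a test that is already known to be flaky
        rc = run_test(test)
        if rc:
            skip_list.add(test)  # mark as failed, keep rerunning the remaining tests
# skip_list now contains every test that failed at least once within N reruns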

@mikaelarguedas (Member, Author) commented:

I believe this can be closed as it has been addressed by the switch to colcon. From my understanding, colcon runs and repeats tests one by one within a package (whereas ament_tools ran the entire test suite of a package N times), so tests in a package are no longer impacted by the result of other tests from the same package.
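In the same pseudocode style as above, the per-test behavior described here would look roughly like this (a sketch of my understanding, not colcon's actual implementation):

for test in tests_of_this_package:
    for i in range(N):
        rc = run_test(test)
        if rc:
            break  # this test is flaky; the other tests of the package still get their own N runs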

@dirk-thomas Feel free to comment or re-open if I missed something here.
