
add actual test time to xUnit result files #269

Closed
ruffsl opened this issue Aug 10, 2020 · 9 comments · Fixed by #270
Labels: enhancement (New feature or request)

Comments

@ruffsl (Contributor) commented Aug 10, 2020

When generating .xml xUnit result files for tests, the ament_cmake_test Python module records the number of seconds the test run took as a constant zero, and thus does not accurately reflect the real-world duration the runner needed to complete the test:

return """<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="%s" tests="1" failures="%d" time="0" errors="%d" skipped="%d">
<testcase classname="%s" name="%s.missing_result" time="0">
%s%s%s
</testcase>
</testsuite>\n""" % \

This inaccuracy can impede the discovery of performance regressions by falsifying test statistics that are consumed by test result aggregation tools, e.g. continuous integration frameworks. Another second-order effect is the misallocation of CI resources, as test jobs can no longer be optimally split across pools of CI runners based on expected test timing.

I'd suggest extending _generate_result with, say, a default argument test_time=0, and updating the respective call sites to provide the elapsed time measured while running the tested subprocess command.

def _generate_result(result_file, *, failure_message=None, skip=False, error_message=None):
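
For illustration, a minimal sketch of that extension, assuming the template shown above (the body is simplified stand-in code, not the actual module):

def _generate_result(
    result_file, *, failure_message=None, skip=False,
    error_message=None, test_time=0.0,  # hypothetical new keyword argument
):
    # Stand-in values: the real module derives these from the test outcome.
    failures = 1 if failure_message else 0
    errors = 1 if error_message else 0
    skipped = 1 if skip else 0
    name = 'example_test'
    content = """<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="%s" tests="1" failures="%d" time="%.3f" errors="%d" skipped="%d">
  <testcase classname="%s" name="%s.missing_result" time="%.3f"/>
</testsuite>
""" % (name, failures, test_time, errors, skipped, name, name, test_time)
    with open(result_file, 'w') as h:
        h.write(content)

Existing call sites would keep working unchanged, since test_time defaults to zero.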

Some suggestions on how timing for subprocess commands may be recorded:

https://stackoverflow.com/questions/13889066/run-an-external-command-and-get-the-amount-of-cpu-it-consumed/13933797#13933797
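
For the wall-clock approach, a sketch along these lines should work on every platform (the command below is a placeholder):

import subprocess
import time

start = time.monotonic()  # monotonic clock is immune to system clock adjustments
completed = subprocess.run(['python3', '-c', 'pass'])  # placeholder test command
test_time = time.monotonic() - start
print('elapsed: %.3f s, exit code: %d' % (test_time, completed.returncode))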

dirk-thomas added the enhancement label Aug 10, 2020
dirk-thomas changed the title from "ament_cmake_test incorrectly records test time in generated result files" to "add actual test time to xUnit result files" Aug 10, 2020
@dirk-thomas (Contributor) commented:
Please consider contributing a pull request to provide the actual test run time in the result.

@ruffsl (Contributor, Author) commented Aug 10, 2020

What would you suggest with respect to the means of measurement? Simple elapsed wall-clock time from start to stop of the Python subprocess, or something like the resource module to measure CPU time?

The first would simply work on all platforms, but could be inaccurate if the CI runner is heavily loaded with many parallel jobs competing for CPU time via the OS scheduler. The second would account for that, but I'm not sure how it could be done on non-Unix platforms.
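
For reference, a rough sketch of the resource-based variant (Unix-only; the workload is a placeholder):

import resource
import subprocess

before = resource.getrusage(resource.RUSAGE_CHILDREN)
subprocess.run(['python3', '-c', 'sum(range(10**6))'])  # placeholder workload
after = resource.getrusage(resource.RUSAGE_CHILDREN)
# User + system CPU time consumed by the waited-for child,
# independent of how loaded the scheduler is.
cpu_time = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
print('child CPU time: %.3f s' % cpu_time)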

@dirk-thomas (Contributor) commented:
> What would you suggest with respect to the means of measurement?

Measuring the delta between the start and end time seems like the only option which makes sense. IMO the CPU usage has nothing to do with the reported test time. A test could easily sleep for an hour, which wouldn't show up in the test time.

Also, other tools like CTest report the time between start and end.

@ruffsl (Contributor, Author) commented Aug 10, 2020

Side note: it looks like the tests= and failures= attributes are hardcoded to 1 or 0, and would not increment if repeated tests were conducted, e.g. if --retest-until-fail N was used to repeat certain failing tests. Related: colcon/colcon-core#368

Would there be an appropriate way to accumulate the previous tests, failures, and time attributes so they are included in the summation of test results?

@dirk-thomas (Contributor) commented:
Atm the logic generating these result files has no knowledge of whether the test invocation is a first-time run or a re-run triggered by colcon arguments like --retest-until-fail. So without significant changes I don't think the information to make that decision is available.

@ruffsl (Contributor, Author) commented Aug 10, 2020

> Atm the logic generating these result files has no knowledge of whether the test invocation is a first-time run or a re-run

Presumably, this code could just read the same result_file it wrote to disk from the previous tests?

set(result_file "${AMENT_TEST_RESULTS_DIR}/${PROJECT_NAME}/${testname}.xunit.xml")

And then just increment the test time and the tests/failures counts with the current measurements.
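
For illustration, a rough sketch of that accumulation (the helper name and signature are hypothetical):

import xml.etree.ElementTree as ElementTree

def _accumulate_result(result_file, *, tests=1, failures=0, test_time=0.0):
    # Fold a re-run's counters and elapsed time into an existing result file.
    tree = ElementTree.parse(result_file)
    suite = tree.getroot()
    suite.set('tests', str(int(suite.get('tests', '0')) + tests))
    suite.set('failures', str(int(suite.get('failures', '0')) + failures))
    suite.set('time', '%.3f' % (float(suite.get('time', '0')) + test_time))
    tree.write(result_file, encoding='UTF-8', xml_declaration=True)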

@dirk-thomas (Contributor) commented:
> Presumably, this code could just read the same result_file it wrote to disk from the previous tests?

An existing result file could be from an independent colcon test invocation, in which case you certainly don't want to increment the time.

@ruffsl (Contributor, Author) commented Aug 10, 2020

I was going to say that for xUnit, I think it makes sense to aggregate test time over the number of tests. If you wanted to reset the test results, then I think one would want to explicitly clean the test results for that package/workspace. But I guess I'm not confident whether a repeated test should be considered a part of the same test suite that the JUnit schema mentions.

http://windyroad.com.au/dl/Open%20Source/JUnit.xsd
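
As a hypothetical illustration, keeping re-runs in the same suite could mean one testcase element per attempt (all values invented for the example):

<testsuite name="example" tests="2" failures="1" time="3.140">
  <testcase classname="example" name="flaky_case" time="1.500">
    <failure message="first attempt failed"/>
  </testcase>
  <testcase classname="example" name="flaky_case" time="1.640"/>
</testsuite>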

Should intermittent test results when using --retest-until-pass be recorded to convey the flakiness of a test, or is that forfeited when one elects to use --retest-until-pass rather than repeating the test at the CI tooling level? I guess I'm not sure how most CI infrastructures measure/record flakiness, or how they attribute the amount of resources consumed by a given test.

@dirk-thomas (Contributor) commented:
> I was going to say that for xUnit, I think it makes sense to aggregate test time over the number of tests.

I don't think that this is how any existing testing framework does it. I would also find it extremely confusing to see a test result which shows that the test ran for 100s when the configured timeout is e.g. 30s.

> Should intermittent test results when using --retest-until-pass be recorded to convey the flakiness of a test, or is that forfeited when one elects to use --retest-until-pass rather than repeating the test at the CI tooling level?

I guess it is up to the underlying tool (like pytest-rerunfailures) how to report it in the xUnit file. But from a colcon point of view, if you asked to rerun a test until it passed and the final result says passed, it is "ok" that there is no reference to the previous failed attempts.

ruffsl added a commit to ruffsl/ament_cmake that referenced this issue Aug 11, 2020
Fixes ament#269

Signed-off-by: ruffsl <roxfoxpox@gmail.com>
dirk-thomas pushed a commit that referenced this issue Aug 12, 2020
* Add actual test time to xUnit result files
Fixes #269

Signed-off-by: ruffsl <roxfoxpox@gmail.com>

* Report test_time even with skipped test

Signed-off-by: ruffsl <roxfoxpox@gmail.com>

* Set time attribute for testcase element

Signed-off-by: ruffsl <roxfoxpox@gmail.com>