
add actual test time to xUnit result files #269

Closed
ruffsl opened this issue Aug 10, 2020 · 9 comments · Fixed by #270
Labels: enhancement (New feature or request)

Comments

@ruffsl (Contributor) commented Aug 10, 2020

When generating .xml xUnit result files for tests, the ament_cmake_test Python module records the number of seconds the test run took as a constant zero, and thus does not accurately reflect the real-world duration the runner needed to complete the test:

return """<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="%s" tests="1" failures="%d" time="0" errors="%d" skipped="%d">
<testcase classname="%s" name="%s.missing_result" time="0">
%s%s%s
</testcase>
</testsuite>\n""" % \

This inaccuracy can impede the discovery of performance regressions by falsifying test statistics that are consumed by test result aggregation tools, e.g. continuous integration frameworks. Another second-order effect is the misallocation of CI resources, as test jobs can no longer be optimally split across pools of CI runners based on expected test timing.

I'd suggest extending _generate_result with, say, a default argument test_time=0, and updating the respective call sites to provide the elapsed time measured while running the tested subprocess command.

def _generate_result(result_file, *, failure_message=None, skip=False, error_message=None):
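
For illustration, a minimal sketch of that extension, assuming the template shown above (the body is simplified stand-in code, not the actual module):

def _generate_result(
    result_file, *, failure_message=None, skip=False,
    error_message=None, test_time=0.0,  # hypothetical new keyword argument
):
    # Stand-in values: the real module derives these from the test outcome.
    failures = 1 if failure_message else 0
    errors = 1 if error_message else 0
    skipped = 1 if skip else 0
    name = 'example_test'
    content = """<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="%s" tests="1" failures="%d" time="%.3f" errors="%d" skipped="%d">
  <testcase classname="%s" name="%s.missing_result" time="%.3f"/>
</testsuite>
""" % (name, failures, test_time, errors, skipped, name, name, test_time)
    with open(result_file, 'w') as h:
        h.write(content)

Existing call sites would keep working unchanged, since test_time defaults to zero.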

Some suggestions on how timing for subprocess commands may be recorded:

https://stackoverflow.com/questions/13889066/run-an-external-command-and-get-the-amount-of-cpu-it-consumed/13933797#13933797
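
For the wall-clock approach, a sketch along these lines should work on every platform (the command below is a placeholder):

import subprocess
import time

start = time.monotonic()  # monotonic clock is immune to system clock adjustments
completed = subprocess.run(['python3', '-c', 'pass'])  # placeholder test command
test_time = time.monotonic() - start
print('elapsed: %.3f s, exit code: %d' % (test_time, completed.returncode))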

dirk-thomas added the enhancement label Aug 10, 2020
dirk-thomas changed the title from "ament_cmake_test incorrectly records test time in generated result files" to "add actual test time to xUnit result files" Aug 10, 2020
@dirk-thomas (Contributor) commented:
Please consider contributing a pull request to provide the actual test run time in the result.

@ruffsl (Contributor, Author) commented Aug 10, 2020

What would you suggest with respect to the means of measurement? Simple elapsed wall-clock time from start to stop of the Python subprocess, or something like the resource module to measure CPU time?

The first would simply work on all platforms, but could be inaccurate if the CI runner is heavily loaded with many parallel jobs competing for CPU time via the OS scheduler. The second would account for that, but I'm not sure how it could be done on non-Unix platforms.
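
For reference, a rough sketch of the resource-based variant (Unix-only; the workload is a placeholder):

import resource
import subprocess

before = resource.getrusage(resource.RUSAGE_CHILDREN)
subprocess.run(['python3', '-c', 'sum(range(10**6))'])  # placeholder workload
after = resource.getrusage(resource.RUSAGE_CHILDREN)
# User + system CPU time consumed by the waited-for child,
# independent of how loaded the scheduler is.
cpu_time = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
print('child CPU time: %.3f s' % cpu_time)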

@dirk-thomas (Contributor) commented:
> What would you suggest with respect to the means of measurement?

Measuring the delta between the start and end time seems like the only option which makes sense. IMO the CPU usage has nothing to do with the reported test time. A test could easily sleep for an hour, which wouldn't show up in the test time.

Also, other tools like CTest report the time between start and end.

@ruffsl (Contributor, Author) commented Aug 10, 2020

Side note: it looks like the tests= and failures= attributes are hardcoded to 1 or 0, and would not increment if repeated tests were conducted, e.g. if --retest-until-fail N was used to repeat certain failing tests. Related: colcon/colcon-core#368

Would there be an appropriate way to accumulate the previous tests, failures, and time attributes so they are included in the summation of test results?

@dirk-thomas (Contributor) commented:
Atm the logic generating these result files has no knowledge of whether the test invocation is a first-time run or a re-run triggered by colcon arguments like --retest-until-fail. So without significant changes I don't think the information to make that decision is available.

@ruffsl (Contributor, Author) commented Aug 10, 2020

> Atm the logic generating these result files has no knowledge of whether the test invocation is a first-time run or a re-run

Presumably, this code could just read the same result_file it wrote to disk from the previous tests?

set(result_file "${AMENT_TEST_RESULTS_DIR}/${PROJECT_NAME}/${testname}.xunit.xml")

And then just increment the test time and the tests/failures counts with the current measurements.
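
For illustration, a rough sketch of that accumulation (the helper name and signature are hypothetical):

import xml.etree.ElementTree as ElementTree

def _accumulate_result(result_file, *, tests=1, failures=0, test_time=0.0):
    # Fold a re-run's counters and elapsed time into an existing result file.
    tree = ElementTree.parse(result_file)
    suite = tree.getroot()
    suite.set('tests', str(int(suite.get('tests', '0')) + tests))
    suite.set('failures', str(int(suite.get('failures', '0')) + failures))
    suite.set('time', '%.3f' % (float(suite.get('time', '0')) + test_time))
    tree.write(result_file, encoding='UTF-8', xml_declaration=True)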

@dirk-thomas (Contributor) commented:
> Presumably, this code could just read the same result_file it wrote to disk from the previous tests?

An existing result file could be from an independent colcon test invocation, in which case you certainly don't want to increment the time.

@ruffsl (Contributor, Author) commented Aug 10, 2020

I was going to say that for xUnit, I think it makes sense to aggregate test time over the number of tests. If you wanted to reset the test results, then I think one would want to explicitly clean the test results for that package/workspace. But I guess I'm not confident whether a repeated test should be considered a part of the same test suite that the JUnit schema mentions.

http://windyroad.com.au/dl/Open%20Source/JUnit.xsd
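
As a hypothetical illustration, keeping re-runs in the same suite could mean one testcase element per attempt (all values invented for the example):

<testsuite name="example" tests="2" failures="1" time="3.140">
  <testcase classname="example" name="flaky_case" time="1.500">
    <failure message="first attempt failed"/>
  </testcase>
  <testcase classname="example" name="flaky_case" time="1.640"/>
</testsuite>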

Should intermittent test results when using --retest-until-pass be recorded to convey the flakiness of a test, or is that forfeited when one elects to use --retest-until-pass rather than repeating the test at the CI tooling level? I guess I'm not sure how most CI infrastructures measure/record flakiness, or how they attribute the amount of resources consumed by a given test.

@dirk-thomas (Contributor) commented:
> I was going to say that for xUnit, I think it makes sense to aggregate test time over the number of tests.

I don't think that this is how any existing testing framework does it. I would also find it extremely confusing to see a test result which shows that the test ran for 100s when the configured timeout is e.g. 30s.

> Should intermittent test results when using --retest-until-pass be recorded to convey the flakiness of a test, or is that forfeited when one elects to use --retest-until-pass rather than repeating the test at the CI tooling level?

I guess it is up to the underlying tool (like pytest-rerunfailures) how to report it in the xUnit file. But from a colcon point of view, if you asked to rerun a test until it passed and the final result says passed, it is "ok" that there is no reference to the previous failed attempts.

ruffsl added a commit to ruffsl/ament_cmake that referenced this issue Aug 11, 2020
Fixes ament#269

Signed-off-by: ruffsl <roxfoxpox@gmail.com>
dirk-thomas pushed a commit that referenced this issue Aug 12, 2020
* Add actual test time to xUnit result files
Fixes #269

Signed-off-by: ruffsl <roxfoxpox@gmail.com>

* Report test_time even with skipped test

Signed-off-by: ruffsl <roxfoxpox@gmail.com>

* Set time attribute for testcase element

Signed-off-by: ruffsl <roxfoxpox@gmail.com>