prerelease: catkin_test_results return codes get suppressed #367

Closed
mathias-luedtke opened this issue Feb 13, 2017 · 20 comments

@mathias-luedtke
Contributor

mathias-luedtke commented Feb 13, 2017

Why does catkin_test_results get called with set +e?
(https://github.com/ros-infrastructure/ros_buildfarm/blob/master/ros_buildfarm/templates/devel/devel_script_test_results.sh.em)

IMHO failing tests should stop devel and prerelease tests.
The current behavior forces the prerelease test user to call catkin_test_results again with the correct workspace paths.
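For reference, the pattern in question boils down to something like this (a condensed sketch, not the exact template code; the path is illustrative):

set +e                                          # stop aborting on non-zero exit codes
catkin_test_results "$WORKSPACE/test_results"
set -e                                          # re-enable abort-on-error
# -> a non-zero return code from catkin_test_results never propagates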

@dirk-thomas
Member

If a user invokes a build together with the tests, it needs to be distinguishable whether the build failed, only some tests failed, or everything passed.

Currently this means that when you invoke the tests and they have been run successfully (which doesn't imply that all tests have passed), the return code is 0. If you then want to find out whether any tests have failed, you can use catkin_test_results.

If the invocation returned an error code, the user (as well as any automated tool) couldn't distinguish whether the build failed for some other reason or only some tests failed.
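In other words, the convention being defended is roughly (a sketch; the result path is illustrative):

catkin_make run_tests              # non-zero -> the build or the test execution itself broke
catkin_test_results test_results   # non-zero -> execution was fine, but some tests failed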

@mathias-luedtke
Contributor Author

If the invocation returned an error code, the user (as well as any automated tool) couldn't distinguish whether the build failed for some other reason or only some tests failed.

The automated test does not care which part failed.
And most users do not check the exit code; they will read the last line with the number of failures.

So in order to get proper exit codes, the user (or the CI system) needs to call

catkin_test_results $WORKSPACE/catkin_workspace/test_results
catkin_test_results $WORKSPACE/catkin_workspace_overlay/test_results

This is not really user-friendly, and in total catkin_test_results gets called four times.
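As a sketch, a CI wrapper that wants a single meaningful exit code currently has to do something like this itself (paths taken from the snippet above):

result=0
catkin_test_results "$WORKSPACE/catkin_workspace/test_results" || result=1
catkin_test_results "$WORKSPACE/catkin_workspace_overlay/test_results" || result=1
exit $result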

@dirk-thomas
Member

dirk-thomas commented Feb 14, 2017

I am open to any suggestions if you could describe an approach which works for the existing Jenkins jobs as well as for the user / automated scripts running a prerelease.

It might also be good to consider how other tools running tests behave. Users might expect a similar behavior from our tools.

@mathias-luedtke
Contributor Author

mathias-luedtke commented Feb 14, 2017

which works for the existing Jenkins jobs

I guess the Jenkins jobs should continue to ignore test failures?

I am open to any suggestions

Something involving an environment variable to make the script fail on test errors?

It might also be good to consider how other tools running tests behave. Users might expect a similar behavior from our tools.

industrial_ci currently fails for any error, but I am not sure if the users expect this ;)
Travis runs all script lines, but exits with an error code if any of them fail.
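The Travis behavior described above corresponds to roughly this shell sketch (the array is a stand-in for the configured script lines):

SCRIPT_LINES=("make" "make test")   # stand-ins for the lines of the 'script' section
rc=0
for cmd in "${SCRIPT_LINES[@]}"; do
    $cmd || rc=1                    # later lines still run if an earlier one fails
done
exit $rc                            # ... but the job fails if any line failed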

mathias-luedtke changed the title prerelease: catkin_test_results erros gets suppressed → prerelease: catkin_test_results errors get suppressed Feb 14, 2017
@dirk-thomas
Member

You mentioned "failing tests should stop devel and prerelease tests" before, and I probably don't understand what exactly that means. It might be good if you could describe for each case (devel job, devel script, prerelease script underlay / overlay) how you imagine the behavior to change in detail. If possible, expressing the desired change in a PR might help, since it might be difficult to describe clearly.

dirk-thomas changed the title prerelease: catkin_test_results errors get suppressed → prerelease: catkin_test_results return codes get suppressed Feb 16, 2017
@gerkey

gerkey commented Mar 8, 2017

To separate the two cases, user and Jenkins:

  • If I, as a user, invoke catkin_make run_tests, and it causes a build step to happen before the test step happens, I expect to get non-zero if the build fails or if any test fails (including both a failure to invoke a test and a test that runs successfully but registers a failure). I'm perfectly happy with the ambiguity of what failed (build vs. test). I guess that this expectation comes from my view of catkin_make as an aggregator of the underlying tools that it invokes: I expect an individual test program (e.g., gtest) to return non-zero if any test case fails; I expect make test to return non-zero if any of the programs that it runs return non-zero; and I expect catkin_make to return non-zero if make test on any of the packages that it processes returns non-zero.

  • For Jenkins and other automated tools, would it be sufficient to structure the usage of catkin_make so that the build and test steps are separate (see the sketch below)? E.g., if a given job will run tests, have it first explicitly invoke catkin_make and check that return code to determine whether the build failed, and if so stop there. If the build succeeded, then proceed to call catkin_make run_tests and ignore the return code, counting on the test reporting / collation system to do the right thing.
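A minimal sketch of that two-step structure (assuming the catkin_make targets discussed in this thread):

catkin_make || exit 1           # build only; abort the job if the build itself fails
catkin_make run_tests || true   # run the tests, deliberately ignoring the return code
# the test reporting / collation system interprets the result files afterwards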

@dirk-thomas
Member

If the build succeeded, then proceed to call catkin_make run_tests and ignore the return code, counting on the test reporting / collation system to do the right thing.

How do you detect a failure in test execution in this case (not a failing test but something failing to run the tests as requested by the command)?

@gerkey

gerkey commented Mar 8, 2017

How do you detect a failure in test execution in this case (not a failing test but something failing to run the tests as requested by the command)?

Are these two cases being distinguished currently?

@dirk-thomas
Member

How do you detect a failure in test execution in this case (not a failing test but something failing to run the tests as requested by the command)?

Are these two cases being distinguished currently?

Yes, if the test invocation returns non-zero, something failed. If the test_results invocation returns non-zero, one or more tests have failed.
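A consumer can therefore tell the cases apart along these lines (a sketch; the result path is illustrative):

if ! catkin_make run_tests; then
    echo "the build or the test execution itself broke"
elif ! catkin_test_results test_results; then
    echo "the tests ran to completion, but one or more failed"
else
    echo "everything passed"
fi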

@mathias-luedtke
Contributor Author

IMHO the split between run_tests (fails on execution errors) and test_results (an actual test failed) is fine.

It might be good if you could describe for each case (devel job, devel script, prerelease script underlay / overlay) how you imagine the behavior to change in detail. If possible, expressing the desired change in a PR might help, since it might be difficult to describe clearly.

If the exit code of catkin_test_results gets exposed (as an opt-in?), the calling instance can handle it accordingly. My preference is to not allow any failures (execution or result) for my test jobs. In that case the failure might come from run_tests or test_results, and the user has to read the job log.

@dirk-thomas
Member

Currently catkin_test_results already returns 1 if any test failed. There is no option to disable that - not sure if that is necessary for any use case.

@gerkey

gerkey commented Mar 8, 2017

Yes, if the test invocation returns non-zero, something failed. If the test_results invocation returns non-zero, one or more tests have failed.

Ah, interesting. Who is currently taking advantage of that distinction? Is it just for users, or does something like Jenkins rely on it?

@dirk-thomas
Member

Yes, if the test invocation returns non-zero, something failed. If the test_results invocation returns non-zero, one or more tests have failed.

Ah, interesting. Who is currently taking advantage of that distinction? Is it just for users, or does something like Jenkins rely on it?

Anyone automating the process will likely use the return value. E.g. when running a build and test cycle via Travis.

@mathias-luedtke
Contributor Author

Currently catkin_test_results already returns 1 if any test failed. There is no option to disable that - not sure if that is necessary for any use case.

Yes, but this exit code 1 gets dropped in devel jobs and the prerelease test scripts.

@dirk-thomas
Member

Currently catkin_test_results already returns 1 if any test failed. There is no option to disable that - not sure if that is necessary for any use case.

Yes, but this exit code 1 gets dropped in devel jobs and the prerelease test scripts.

Indeed it does.

For a devel job the rationale is the following: The Jenkins build should not fail. Instead the test results are interpreted, and if any of them failed the build is marked as "unstable".

For a prerelease script the reason is very similar. If a test in the underlay fails it should continue building and testing the overlay.
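As a sketch of the prerelease flow described above (the step names are hypothetical):

run_underlay_build_and_tests || true   # a test failure here is deliberately not fatal ...
run_overlay_build_and_tests  || true   # ... so the overlay still gets built and tested
# failed tests in either workspace only show up in the collected results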

@gerkey

gerkey commented Mar 8, 2017

Thanks, I think that I have enough context to proceed; I'll move my thoughts into a proposed enhancement for ament.

@mathias-luedtke
Contributor Author

For a prerelease script the reason is very similar. If a test in the underlay fails it should continue building and testing the overlay.

From my point of view the test job should fail after the underlay test. Unit testing should detect code breakage, so there is no reason to go on with the overlay tests and waste time if the underlay is broken.
Of course this is a matter of preferences and policies.
For now industrial_ci calls the prerelease scripts individually and checks with catkin_test_results after each script.
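A sketch of that workaround (the script names are hypothetical; the result paths are the ones from earlier in this thread):

./prerelease_underlay.sh
catkin_test_results "$WORKSPACE/catkin_workspace/test_results" || exit 1   # stop if underlay tests failed
./prerelease_overlay.sh
catkin_test_results "$WORKSPACE/catkin_workspace_overlay/test_results" || exit 1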

@dirk-thomas
Member

From my point of view the test job should fail after the underlay test.

@ipa-mdl That might very well be the case for your applications. But others might want to use it differently, and we shouldn't close the door on that. E.g. if the underlay reports a linter error, the user might still be interested in how the test results look for the overlay. Please see ament/ament_tools#140 for a proposed command line option to choose the desired return code behavior and make both use cases easily possible.

@mathias-luedtke
Contributor Author

mathias-luedtke commented Mar 9, 2017

Please see ament/ament_tools#140 for a proposed command line option to choose the desired return code behavior and make both use cases easily possible.

I think the current implementation in catkin is fine; all cases can be handled properly this way.
I just find it confusing that the exit code gets dropped in the prerelease scripts.
Our work-around is okay for now, but it feels odd.
It would be nice if we could opt out of this default behaviour, perhaps by setting an environment variable.

Something like

$catkin_test_results_CMD || ${HANDLE_TEST_RESULT_FAILURE:-true}

(without the set +e / set -e wrapping)

So we could call it with HANDLE_TEST_RESULT_FAILURE=exit
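Spelled out, the proposed semantics would be (a sketch; the prerelease script name is hypothetical):

# HANDLE_TEST_RESULT_FAILURE unset  -> degrades to '|| true', the current behaviour
# HANDLE_TEST_RESULT_FAILURE=exit   -> becomes '|| exit', aborting on test failures
HANDLE_TEST_RESULT_FAILURE=exit ./prerelease.sh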

@dirk-thomas
Member

I created PR #399 with a similar feature. You can set ABORT_ON_TEST_FAILURE=1 to exit with a non-zero return code. Optionally, for a prerelease you can also treat the underlay and the overlay separately if needed (ABORT_ON_TEST_FAILURE_UNDERLAY, ABORT_ON_TEST_FAILURE_OVERLAY). For a prerelease, if the underlay fails a test and either ABORT_ON_TEST_FAILURE or ABORT_ON_TEST_FAILURE_UNDERLAY is set, the script will abort and not do anything for the overlay.
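Usage would then look like this (a sketch; the prerelease script name is hypothetical, the variables are the ones from #399):

ABORT_ON_TEST_FAILURE=1 ./prerelease.sh            # abort on any test failure
ABORT_ON_TEST_FAILURE_UNDERLAY=1 ./prerelease.sh   # abort only if an underlay test fails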

Another alternative, as of #396, is to check the return code after the script invocation. To do so you need to source the generated script instead of invoking it. After it finishes, the environment variables catkin_test_results_RC (for devel jobs) or catkin_test_results_RC_underlay and catkin_test_results_RC_overlay (for prerelease jobs) provide the corresponding return codes. The script itself still exits with a zero return code as long as it was able to perform its task. See the .travis.yml file for how this is being used. The rationale is again that by seeing which line fails it is immediately clear whether the script failed for some reason (non-zero rc for . job.sh -y) or a test failed (non-zero rc for (exit $catkin_test_results_RC)).
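Based on that description, the sourcing variant looks roughly like this (the job.sh name and the -y flag are taken from the comment above; the surrounding lines are a sketch):

. job.sh -y                        # source the generated script; non-zero only if the run itself broke
(exit $catkin_test_results_RC)     # devel jobs: this line fails exactly when a test failed
# prerelease jobs expose $catkin_test_results_RC_underlay and
# $catkin_test_results_RC_overlay instead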

Please try the patch and let me know if that addresses your use case.
