prerelease: catkin_test_results return codes get suppressed #367

Closed
mathias-luedtke opened this issue Feb 13, 2017 · 20 comments

@mathias-luedtke
Contributor

mathias-luedtke commented Feb 13, 2017

Why does catkin_test_results get called with set +e?
(https://github.com/ros-infrastructure/ros_buildfarm/blob/master/ros_buildfarm/templates/devel/devel_script_test_results.sh.em)

IMHO failing tests should stop devel and prerelease tests.
The current behavior forces the prerelease test user to call catkin_test_results again with the correct workspace paths.
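For reference, the pattern in question boils down to something like this (a condensed sketch, not the exact template code; the path is illustrative):

set +e                                          # stop aborting on non-zero exit codes
catkin_test_results "$WORKSPACE/test_results"
set -e                                          # re-enable abort-on-error
# -> a non-zero return code from catkin_test_results never propagates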

@dirk-thomas
Member

If a user invokes a build together with the tests, it needs to be distinguishable whether the build failed, only some tests failed, or everything passed.

Currently this means that when you invoke the tests and they have been run successfully (which doesn't imply that all tests have passed), the return code is 0. If you then want to find out whether any tests have failed, you can use catkin_test_results.

If the invocation returned an error code, the user (as well as any automated tool) couldn't distinguish whether the build failed for some other reason or only some tests failed.
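In other words, the convention being defended is roughly (a sketch; the result path is illustrative):

catkin_make run_tests              # non-zero -> the build or the test execution itself broke
catkin_test_results test_results   # non-zero -> execution was fine, but some tests failed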

@mathias-luedtke
Contributor Author

If the invocation returned an error code, the user (as well as any automated tool) couldn't distinguish whether the build failed for some other reason or only some tests failed.

The automated test does not care which part failed.
And most users do not check the exit code; they will read the last line with the number of failures.

So in order to get proper exit codes, the user (or the CI system) needs to call

catkin_test_results $WORKSPACE/catkin_workspace/test_results
catkin_test_results $WORKSPACE/catkin_workspace_overlay/test_results

This is not really user-friendly, and in total catkin_test_results gets called four times.
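As a sketch, a CI wrapper that wants a single meaningful exit code currently has to do something like this itself (paths taken from the snippet above):

result=0
catkin_test_results "$WORKSPACE/catkin_workspace/test_results" || result=1
catkin_test_results "$WORKSPACE/catkin_workspace_overlay/test_results" || result=1
exit $result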

@dirk-thomas
Member

dirk-thomas commented Feb 14, 2017

I am open to any suggestions if you could describe an approach which works for the existing Jenkins jobs as well as for the user / automated scripts running a prerelease.

It might also be good to consider how other tools running tests behave. Users might expect a similar behavior from our tools.

@mathias-luedtke
Contributor Author

mathias-luedtke commented Feb 14, 2017

which works for the existing Jenkins jobs

I guess the Jenkins jobs should continue to ignore test failures?

I am open to any suggestions

Something involving an environment variable to make the script fail on test errors?

It might also be good to consider how other tools running tests behave. Users might expect a similar behavior from our tools.

industrial_ci currently fails for any error, but I am not sure if the users expect this ;)
Travis runs all script lines, but exits with an error code if any of them fail.
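The Travis behavior described above corresponds to roughly this shell sketch (the array is a stand-in for the configured script lines):

SCRIPT_LINES=("make" "make test")   # stand-ins for the lines of the 'script' section
rc=0
for cmd in "${SCRIPT_LINES[@]}"; do
    $cmd || rc=1                    # later lines still run if an earlier one fails
done
exit $rc                            # ... but the job fails if any line failed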

mathias-luedtke changed the title prerelease: catkin_test_results erros gets suppressed → prerelease: catkin_test_results errors get suppressed Feb 14, 2017
@dirk-thomas
Member

You mentioned "failing tests should stop devel and prerelease tests" before, and I probably don't understand what exactly that means. It might be good if you could describe for each case (devel job, devel script, prerelease script underlay / overlay) how you imagine the behavior to change in detail. If possible, expressing the desired change in a PR might help, since it might be difficult to describe clearly.

dirk-thomas changed the title prerelease: catkin_test_results errors get suppressed → prerelease: catkin_test_results return codes get suppressed Feb 16, 2017
@gerkey

gerkey commented Mar 8, 2017

To separate the two cases, user and Jenkins:

  • If I, as a user, invoke catkin_make run_tests, and it causes a build step to happen before the test step happens, I expect to get non-zero if the build fails or if any test fails (including both a failure to invoke a test and a test that runs successfully but registers a failure). I'm perfectly happy with the ambiguity of what failed (build vs. test). I guess that this expectation comes from my view of catkin_make as an aggregator of the underlying tools that it invokes: I expect an individual test program (e.g., gtest) to return non-zero if any test case fails; I expect make test to return non-zero if any of the programs that it runs return non-zero; and I expect catkin_make to return non-zero if make test on any of the packages that it processes returns non-zero.

  • For Jenkins and other automated tools, would it be sufficient to structure the usage of catkin_make so that the build and test steps are separate (see the sketch below)? E.g., if a given job will run tests, have it first explicitly invoke catkin_make and check that return code to determine whether the build failed, and if so stop there. If the build succeeded, then proceed to call catkin_make run_tests and ignore the return code, counting on the test reporting / collation system to do the right thing.
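A minimal sketch of that two-step structure (assuming the catkin_make targets discussed in this thread):

catkin_make || exit 1           # build only; abort the job if the build itself fails
catkin_make run_tests || true   # run the tests, deliberately ignoring the return code
# the test reporting / collation system interprets the result files afterwards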

@dirk-thomas
Member

If the build succeeded, then proceed to call catkin_make run_tests and ignore the return code, counting on the test reporting / collation system to do the right thing.

How do you detect a failure in test execution in this case (not a failing test but something failing to run the tests as requested by the command)?

@gerkey

gerkey commented Mar 8, 2017

How do you detect a failure in test execution in this case (not a failing test but something failing to run the tests as requested by the command)?

Are these two cases being distinguished currently?

@dirk-thomas
Member

How do you detect a failure in test execution in this case (not a failing test but something failing to run the tests as requested by the command)?

Are these two cases being distinguished currently?

Yes, if the test invocation returns non-zero, something failed. If the test_results invocation returns non-zero, one or more tests have failed.
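A consumer can therefore tell the cases apart along these lines (a sketch; the result path is illustrative):

if ! catkin_make run_tests; then
    echo "the build or the test execution itself broke"
elif ! catkin_test_results test_results; then
    echo "the tests ran to completion, but one or more failed"
else
    echo "everything passed"
fi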

@mathias-luedtke
Contributor Author

IMHO the split between run_tests (fails on execution errors) and test_results (an actual test failed) is fine.

It might be good if you could describe for each case (devel job, devel script, prerelease script underlay / overlay) how you imagine the behavior to change in detail. If possible, expressing the desired change in a PR might help, since it might be difficult to describe clearly.

If the exit code of catkin_test_results gets exposed (as an opt-in?), the calling instance can handle it accordingly. My preference is to not allow any failures (execution or result) for my test jobs. In that case the failure might come from run_tests or test_results, and the user has to read the job log.

@dirk-thomas
Member

Currently catkin_test_results already returns 1 if any test failed. There is no option to disable that - not sure if that is necessary for any use case.

@gerkey

gerkey commented Mar 8, 2017

Yes, if the test invocation returns non-zero, something failed. If the test_results invocation returns non-zero, one or more tests have failed.

Ah, interesting. Who is currently taking advantage of that distinction? Is it just for users, or does something like Jenkins rely on it?

@dirk-thomas
Member

Yes, if the test invocation returns non-zero, something failed. If the test_results invocation returns non-zero, one or more tests have failed.

Ah, interesting. Who is currently taking advantage of that distinction? Is it just for users, or does something like Jenkins rely on it?

Anyone automating the process will likely use the return value. E.g. when running a build and test cycle via Travis.

@mathias-luedtke
Contributor Author

Currently catkin_test_results already returns 1 if any test failed. There is no option to disable that - not sure if that is necessary for any use case.

Yes, but this exit code 1 gets dropped in devel jobs and the prerelease test scripts.

@dirk-thomas
Member

Currently catkin_test_results already returns 1 if any test failed. There is no option to disable that - not sure if that is necessary for any use case.

Yes, but this exit code 1 gets dropped in devel jobs and the prerelease test scripts.

Indeed it does.

For a devel job the rationale is the following: The Jenkins build should not fail. Instead the test results are interpreted, and if any of them failed the build is marked as "unstable".

For a prerelease script the reason is very similar. If a test in the underlay fails it should continue building and testing the overlay.
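As a sketch of the prerelease flow described above (the step names are hypothetical):

run_underlay_build_and_tests || true   # a test failure here is deliberately not fatal ...
run_overlay_build_and_tests  || true   # ... so the overlay still gets built and tested
# failed tests in either workspace only show up in the collected results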

@gerkey

gerkey commented Mar 8, 2017

Thanks, I think that I have enough context to proceed; I'll move my thoughts into a proposed enhancement for ament.

@mathias-luedtke
Contributor Author

For a prerelease script the reason is very similar. If a test in the underlay fails it should continue building and testing the overlay.

From my point of view the test job should fail after the underlay test. Unit testing should detect code breakage, so there is no reason to go on with the overlay tests and waste time if the underlay is broken.
Of course this is a matter of preferences and policies.
For now industrial_ci calls the prerelease scripts individually and checks with catkin_test_results after each script.
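A sketch of that workaround (the script names are hypothetical; the result paths are the ones from earlier in this thread):

./prerelease_underlay.sh
catkin_test_results "$WORKSPACE/catkin_workspace/test_results" || exit 1   # stop if underlay tests failed
./prerelease_overlay.sh
catkin_test_results "$WORKSPACE/catkin_workspace_overlay/test_results" || exit 1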

@dirk-thomas
Member

From my point of view the test job should fail after the underlay test.

@ipa-mdl That might very well be the case for your applications. But others might want to use it differently, and we shouldn't close the door on that. E.g. if the underlay reports a linter error, the user might still be interested in how the test results look for the overlay. Please see ament/ament_tools#140 for a proposed command line option to choose the desired return code behavior and make both use cases easily possible.

@mathias-luedtke
Contributor Author

mathias-luedtke commented Mar 9, 2017

Please see ament/ament_tools#140 for a proposed command line option to choose the desired return code behavior and make both use cases easily possible.

I think the current implementation in catkin is fine; all cases can be handled properly this way.
I just find it confusing that the exit code gets dropped in the prerelease scripts.
Our work-around is okay for now, but it feels odd.
It would be nice if we could opt out of this default behaviour, perhaps by setting an environment variable.

Something like

$catkin_test_results_CMD || ${HANDLE_TEST_RESULT_FAILURE:-true}

(without the set +e / set -e wrapping)

So we could call it with HANDLE_TEST_RESULT_FAILURE=exit
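Spelled out, the proposed semantics would be (a sketch; the prerelease script name is hypothetical):

# HANDLE_TEST_RESULT_FAILURE unset  -> degrades to '|| true', the current behaviour
# HANDLE_TEST_RESULT_FAILURE=exit   -> becomes '|| exit', aborting on test failures
HANDLE_TEST_RESULT_FAILURE=exit ./prerelease.sh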

@dirk-thomas
Member

I created PR #399 with a similar feature. You can set ABORT_ON_TEST_FAILURE=1 to exit with a non-zero return code. Optionally, for a prerelease you can also treat the underlay and the overlay separately if needed (ABORT_ON_TEST_FAILURE_UNDERLAY, ABORT_ON_TEST_FAILURE_OVERLAY). For a prerelease, if the underlay fails a test and either ABORT_ON_TEST_FAILURE or ABORT_ON_TEST_FAILURE_UNDERLAY is set, the script will abort and not do anything for the overlay.
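Usage would then look like this (a sketch; the prerelease script name is hypothetical, the variables are the ones from #399):

ABORT_ON_TEST_FAILURE=1 ./prerelease.sh            # abort on any test failure
ABORT_ON_TEST_FAILURE_UNDERLAY=1 ./prerelease.sh   # abort only if an underlay test fails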

Another alternative, as of #396, is to check the return code after the script invocation. To do so you need to source the generated script instead of invoking it. After it finishes, the environment variables catkin_test_results_RC (for devel jobs) or catkin_test_results_RC_underlay and catkin_test_results_RC_overlay (for prerelease jobs) provide the corresponding return codes. The script itself still exits with a zero return code as long as it was able to perform its task. See the .travis.yml file for how this is being used. The rationale is again that by seeing which line fails it is immediately clear whether the script failed for some reason (non-zero rc for . job.sh -y) or a test failed (non-zero rc for (exit $catkin_test_results_RC)).
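Based on that description, the sourcing variant looks roughly like this (the job.sh name and the -y flag are taken from the comment above; the surrounding lines are a sketch):

. job.sh -y                        # source the generated script; non-zero only if the run itself broke
(exit $catkin_test_results_RC)     # devel jobs: this line fails exactly when a test failed
# prerelease jobs expose $catkin_test_results_RC_underlay and
# $catkin_test_results_RC_overlay instead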

Please try the patch and let me know if that addresses your use case.
