Build not stopping after script command errors #1066

Open
arr2036 opened this Issue Apr 23, 2013 · 84 comments

arr2036 commented Apr 23, 2013

We recently changed our script command from:

script: ./configure && make -j8 && make travis-test

To

script:
- ./configure
- make -j8
- make travis-test

We noticed if make -j8 fails, the tests are still run... this seems like a defect.
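The difference between the two forms can be reproduced in plain shell, outside Travis. The following is a minimal sketch of the shell semantics only, not of Travis' implementation; the three step functions are hypothetical stand-ins for ./configure, make -j8 and make travis-test.

```shell
# Hypothetical stand-ins for ./configure, make -j8 and make travis-test.
configure_step() { echo "configure: ok"; }
build_step()     { echo "build: FAILED"; return 2; }
test_step()      { echo "tests: running"; }

# Chained form: && short-circuits, so test_step is skipped once build_step fails.
chained_status=0
chained_output=$(configure_step && build_step && test_step) || chained_status=$?

# Separate commands with errexit off (the shell default): every step runs,
# and the overall status is that of the last command.
separate_status=0
separate_output=$(set +e; configure_step; build_step; test_step) || separate_status=$?

printf 'chained (status %d):\n%s\n\n' "$chained_status" "$chained_output"
printf 'separate (status %d):\n%s\n' "$separate_status" "$separate_output"
```

In the chained run the test step never starts and the failure status is preserved; in the separate run the tests start even though the build step failed.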

Contributor

sarahhodne commented Apr 23, 2013

Hey,

I believe this is the correct behavior as our build system works right now, but I agree with you that this should probably be changed to fail faster.

Author

arr2036 commented Apr 24, 2013

Yeah, it also helps avoid irrelevant output. If your configure scripts fail partway through, chances are the compilation stage is going to fail too, and the tests are probably also going to fail.

You've now got three errors to deal with, and have to figure out which one caused the actual issue.

It's the same sort of issue as with test suites like Test::More, that fail on the first test, but blindly continue producing screens and screens of irrelevant output.

electrical commented Nov 14, 2013

Is this something that will be implemented anytime soon?

Member

BanzaiMan commented Jan 15, 2014

To make the script bail on the first error, you can add set -e. (Until travis-ci/travis-build#192 is deployed, however, this can make your build behave incorrectly due to #891.)

The way I see it, there is a tension here: sometimes we want the build to fail fast, but sometimes we don't (e.g., if your tests consist of several test suites, you might want to run them all anyway, even when one fails early).

It is easier to make tests fail early when the default is to keep going than the other way around, so I tend to prefer the current behavior.

Member

BanzaiMan commented Jan 15, 2014

You can sandwich the stuff in script with set -e and set +e. That should work around the issue I mentioned above.
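A minimal sketch of the sandwich, run here in a separate bash process rather than on Travis. In .travis.yml this would presumably mean set -e as the first script entry and set +e as the last; that placement is an assumption based on the comment above. The step functions are hypothetical stand-ins for the real build commands.

```shell
# The build is run in a separate bash process so that errexit ends only that
# process. The three step functions are hypothetical stand-ins for
# ./configure, make -j8 and make travis-test.
build_script='
configure_step() { echo "configure: ok"; }
build_step()     { echo "build: FAILED"; return 2; }
test_step()      { echo "tests: running"; }

set -e            # from here on, the first failing command ends the script
configure_step
build_step
test_step         # never reached
set +e            # restore the default for whatever runs afterwards
'

sandwich_status=0
sandwich_output=$(bash -c "$build_script") || sandwich_status=$?

printf '%s\nexit status: %d\n' "$sandwich_output" "$sandwich_status"
```

The script stops at the failing build step and exits with that step's status, so the test step never runs.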

alexcrichton added a commit to alexcrichton/rust that referenced this issue Feb 24, 2014

Run the travis build as one large command
It appears that travis doesn't stop running script commands after the first one
fails (see travis-ci/travis-ci#1066), so chain all our
commands together with && for now.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Feb 25, 2014

Run the travis build as one large command
It appears that travis doesn't stop running script commands after the first one
fails (see travis-ci/travis-ci#1066), so chain all our
commands together with && for now.

jxs added a commit to jxs/rust that referenced this issue Apr 2, 2014

Run the travis build as one large command
It appears that travis doesn't stop running script commands after the first one
fails (see travis-ci/travis-ci#1066), so chain all our
commands together with && for now.

Contributor

roidrage commented May 1, 2014

What do you guys think of changing this behaviour? I'm in favor of the build failing quicker, though it could also be an option.

Member

joshk commented May 2, 2014

I tend to agree about changing the behaviour.

It would be also good to provide an option where we wait for all script commands to finish.

Contributor

roidrage commented May 2, 2014

@travis-ci any more opinions on this? Like, either add an option to fail fast or add an option to run all if one fails and change the default behaviour to fail on the first failing command??

techthumb commented Jun 23, 2014

👍 for changing this behaviour to halt the build when any command exits with a non zero exit code.

sun commented Sep 1, 2014

As others have mentioned, "fail fast" is a paradigm, which your code/script may or may not support.

How about a new setting at the top-level or in the build matrix? (cf. fast_finish)

fail_fast: true

The flag would have an impact on before_install, install, before_script, and script:

  1. Any exit code > 0 in those sections triggers the fail fast behavior to cancel the job.
  2. If a before_install or install command fails, the entire job halts immediately.
  3. If a before_script command fails, all script commands are skipped, and the job immediately proceeds to the after_script section.
    (Is this safe in all cases?)
  4. If a script command fails, the job immediately skips to the after_failure section.
    (More or less like now, but not executing subsequent commands after the failing one.)
  5. after_script is executed last.
    (Is this safe in all cases?)

Given these considerations, I think that a build script using fail_fast has to actively support the error conditions; otherwise, it might perform unintended/unexpected operations.

Typical examples would be build/job notifications, as well as after_script commands that are trying to send job results (e.g., code coverage data) to an external web service.

Therefore, it would be sensible to make fail_fast an opt-in.
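The five rules above can be sketched as a small shell driver. This is only an emulation of the proposal: fail_fast is not an existing Travis setting, and run_phase, run_job, the phase variables and the stub commands are all hypothetical names introduced for illustration.

```shell
# Run the commands of one phase in order, stopping at the first failure.
# Usage: run_phase NAME CMD...  (each CMD is a single word, e.g. a function name)
run_phase() {
  phase=$1; shift
  for cmd in "$@"; do
    echo "[$phase] $cmd"
    "$cmd" || return 1          # rule 1: any non-zero exit status triggers fail fast
  done
}

run_job() {
  # Rule 2: a failing before_install or install command halts the job at once.
  run_phase before_install $BEFORE_INSTALL || return 1
  run_phase install $INSTALL || return 1
  if run_phase before_script $BEFORE_SCRIPT; then
    if run_phase script $SCRIPT; then
      result=0
    else
      result=1
      run_phase after_failure $AFTER_FAILURE   # rule 4: skip straight to after_failure
    fi
  else
    result=1                                   # rule 3: script is skipped entirely
  fi
  run_phase after_script $AFTER_SCRIPT         # rule 5: after_script runs last
  return "$result"
}

# Hypothetical stub commands; build_step simulates a compile failure.
install_deps()   { :; }
prepare_env()    { :; }
configure_step() { :; }
build_step()     { return 2; }
test_step()      { :; }
report_failure() { :; }
upload_results() { :; }

BEFORE_INSTALL=""
INSTALL="install_deps"
BEFORE_SCRIPT="prepare_env"
SCRIPT="configure_step build_step test_step"
AFTER_FAILURE="report_failure"
AFTER_SCRIPT="upload_results"

job_status=0
job_log=$(run_job) || job_status=$?
printf '%s\njob status: %d\n' "$job_log" "$job_status"
```

With the failing build_step, the log stops before test_step, then shows after_failure followed by after_script. This is the situation the comment warns about: after_script commands still run and need to cope with a failed job.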

ssbarnea commented Apr 1, 2015

This is a clear bug, and even if it is the expected behaviour, the documentation page http://docs.travis-ci.com/user/build-configuration/ must be updated, as it does not say so.

I would vote for fail_fast as being enabled by default anyway.

Because of this bug, I now have to create a super long line of commands concatenated with &&.

amadornimbis commented Jun 24, 2015

I agree that a fail_fast option something like what @sun described would be very nice to have.

Member

joshk commented Jul 25, 2015

I'm closing this issue for the time being as it has become stale.

We may consider this in the future.

@joshk joshk closed this Jul 25, 2015

pjrobertson added a commit to quicksilver/Quicksilver that referenced this issue Sep 23, 2015

vadz added a commit to vadz/wxWidgets that referenced this issue Jan 30, 2016

Combine all Travis CI commands into a single one
Don't build if configure failed and don't build tests if building the library
failed and so on: contrary to the expectations, Travis continues to execute
the rest of the commands even if a previous one had failed, so chain them all
explicitly together using "&&" to make sure we fail as soon as possible.

See travis-ci/travis-ci#1066

vadz added a commit to wxWidgets/wxWidgets that referenced this issue Jan 30, 2016

Combine all Travis CI commands into a single one
Don't build if configure failed and don't build tests if building the library
failed and so on: contrary to the expectations, Travis continues to execute
the rest of the commands even if a previous one had failed, so chain them all
explicitly together using "&&" to make sure we fail as soon as possible.

See travis-ci/travis-ci#1066

szpak commented Sep 30, 2018

Being aware of this issue I use || travis_terminate 1 in my projects, but recently I encountered a situation which (probably) can't be handled that way.

I refactored:

if [ "$TRAVIS_SECURE_ENV_VARS" == "true" ]; then

to:

if [ "$TRAVIS_SECURE_ENV_VARS" == "true" && "$TRAVIS_OS_NAME" == "linux" ]; then

and it seemed to work fine up until the commit that triggered the release process, which failed. All thanks to the fact that && in a bash test requires [[ ... && ... ]]. Without that, the whole statement fails silently (an error is printed, but the build moves on) and the code inside is not executed (in my case, messing up the release).

It would be useful to have it supported in some better way in Travis.

mxcl commented Oct 2, 2018

This isn't just a matter of “fail fast”, I had builds failing that were being reported as green for months because of this. Not all scripts have later steps that depend on the success or failure of previous steps, some of those previous steps matter and don't cause the last script line to fail.

asmeurer commented Oct 2, 2018

@szpak even set -e doesn't handle that case, because it ignores failures as part of an if test, and indeed, if [ "$TRAVIS_SECURE_ENV_VARS" == "true" && "$TRAVIS_OS_NAME" == "linux" ]; then echo; fi; has zero exit status. So even if this feature were implemented it probably wouldn't catch your case. (I don't know why bash has this behavior. I'm sure it makes sense to someone, but very little of bash makes sense to me).
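The behavior described here can be checked locally. The sketch below runs both variants in a separate bash process with set -e enabled; the two Travis variables are set by hand, since nothing here runs on Travis.

```shell
# Variables that Travis would normally provide are set by hand for this demonstration.
setup='set -e; TRAVIS_SECURE_ENV_VARS=true; TRAVIS_OS_NAME=linux'

# Single brackets: the shell splits the condition at && into two commands,
# and the first one fails because it has no closing bracket. The condition is
# therefore false, the body is skipped, and the if statement itself still
# returns status 0. set -e does not react to failures inside an if condition.
single_status=0
single_output=$(bash -c "$setup"'
if [ "$TRAVIS_SECURE_ENV_VARS" == "true" && "$TRAVIS_OS_NAME" == "linux" ]; then
  echo "release step runs"
fi
echo "script continues"' 2>&1) || single_status=$?

# Double brackets: && is part of the conditional expression, as intended.
double_status=0
double_output=$(bash -c "$setup"'
if [[ "$TRAVIS_SECURE_ENV_VARS" == "true" && "$TRAVIS_OS_NAME" == "linux" ]]; then
  echo "release step runs"
fi
echo "script continues"' 2>&1) || double_status=$?

printf 'single brackets (status %d):\n%s\n\n' "$single_status" "$single_output"
printf 'double brackets (status %d):\n%s\n' "$double_status" "$double_output"
```

Both variants exit with status 0, but only the double-bracket variant runs the release step; the single-bracket variant just prints an error from [ and carries on.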

RalfJung commented Oct 8, 2018

> This isn't just a matter of “fail fast”, I had builds failing that were being reported as green for months because of this. Not all scripts have later steps that depend on the success or failure of previous steps, some of those previous steps matter and don't cause the last script line to fail.

Wait, it doesn't just keep going after a failed command, it entirely ignores the failure?

Seems like an, uhm, interesting design choice for a service intended to increase reliability.

cesarizu commented Oct 8, 2018

In case of any failure, it should mark the build as failed in any case. Then it should either present a list of the failing commands at the end or stop the build immediately.
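The first option (run everything, mark the build as failed, and list the failing commands at the end) can be sketched in plain shell. The step names below are hypothetical, and this is an illustration of the idea rather than anything Travis provides.

```shell
# Hypothetical steps; lint_step and docs_step fail, unit_step passes.
lint_step() { return 1; }
unit_step() { return 0; }
docs_step() { return 3; }

failed_steps=""
for step in lint_step unit_step docs_step; do
  # Run every step, but remember each one that exits non-zero.
  if ! "$step"; then
    failed_steps="$failed_steps $step"
  fi
done

# The build is marked as failed if any step failed, not just the last one.
if [ -n "$failed_steps" ]; then
  build_status=1
  echo "failed steps:$failed_steps"
else
  build_status=0
  echo "all steps passed"
fi
```

A real script would finish with exit "$build_status" so that the CI service sees the failure.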

szpak commented Oct 8, 2018

@asmeurer You are right. I checked it locally, and even though the error ./foo.sh: line 5: [: missing `]' is printed, the exit status is 0. Nice...

mnonnenmacher added a commit to heremaps/oss-review-toolkit that referenced this issue Oct 10, 2018

Make sure Travis fails if one of the called scripts fails
Add `set -e` and `set +e` as documented here:
travis-ci/travis-ci#1066 (comment)

Otherwise Travis will continue even if one of the called scripts exits with
1.

sschuberth added a commit to heremaps/oss-review-toolkit that referenced this issue Oct 10, 2018

Make sure Travis fails if one of the called scripts fails
Add `set -e` and `set +e` as documented here:
travis-ci/travis-ci#1066 (comment)

Otherwise Travis will continue even if one of the called scripts exits with
1.

stedolan added a commit to ocamllabs/ocaml-multicore that referenced this issue Nov 16, 2018

darkmattercoder commented Jan 3, 2019

Is there an official statement on why this is not considered a bug?

I am still taking the first thousand steps in the world of CI/CD, but from what I know of the GitLab CI that I use at my job, I definitely expect a build to fail when one of its entries fails, unless I have explicitly specified otherwise.

remram44 commented Jan 4, 2019

You can use GitLab CI with GitHub projects now, so no need to tweak your configs to navigate the unmaintained Travis' bugs: https://about.gitlab.com/2018/03/22/gitlab-10-6-released/#introducing-gitlab-cicd-for-github

dwijnand added a commit to dwijnand/cargo that referenced this issue Jan 10, 2019

worc commented Jan 11, 2019

this bug also causes a waste of time if you run any kind of docker commands before, say, integration tests. if the pull/tag images step fails, what's the point of trying to run the integration tests? fail fast is clearly the right option then.

aldanor commented Jan 21, 2019

@arr2036 opened this Issue on Apr 23, 2013

...

I'm closing this issue for the time being as it has become stale.

We may consider this in the future.

@joshk closed this on Jul 25, 2015

...

$ date
Mon 21 Jan 09:40:52 GMT 2019

@joshk Sorry to remind but... I think it's safe to say the future is now.

MSF-Jarvis added a commit to MSF-Jarvis/viscerion that referenced this issue Feb 5, 2019

Travis: Change script stage to fail fast
Since travis-ci/travis-ci#1066 continues to be
neglected, the only way to make failure of one step cause the other
to fail is for them to rely on each other.

Signed-off-by: Harsh Shandilya <msfjarvis@gmail.com>

dwijnand added a commit to dwijnand/sbt-extras that referenced this issue Feb 6, 2019

IvanBoyko commented Feb 20, 2019

I can't believe this has been going on for almost 6 years...
I'm new to Travis CI, but the more I learn about it, the more I want to run away to better systems like GitLab CI, Circle CI, AppVeyor, Concourse.

tcuje commented Mar 11, 2019

6 Years, much discussion, no solution, trivial problem, wow!

Alex-Bogdanov commented Mar 18, 2019

I would like to give a standing ovation to the Travis team for their German quickness, productivity and responsiveness to end users! You are the best performers in the software engineering world! 6 years: Open -> Closed -> Open -> Closed -> ... Do you have a while loop running for that in your core?
