'fig up' should return with non-zero exit-code if some container "failed" #683

Closed
matleh opened this Issue Nov 28, 2014 · 24 comments

matleh commented Nov 28, 2014

fig run <container> does already return the exit-code of the container (see #197) but fig up returns with 0 no matter what the exit-code of the started containers is.

It would be helpful to be able to detect whether one of the containers returned a non-zero exit code, and to have fig return a non-zero exit code in that case, too.

Also, fig up ends with exit code 1 if one of the containers could not be started at all, and also if one presses Ctrl-C once, but it ends with exit code 0 if one presses Ctrl-C twice. It would be more in line with "standard Unix behavior" if fig exited with -2 in the case of a single Ctrl-C (SIGINT) and -9 in the case of a double Ctrl-C (SIGKILL).

matleh commented Nov 28, 2014

To make it reproducible:

docker run -it --rm busybox "false"
echo $?

gives 1, as it should.

With this fig.yml:

# fig.yml
foo:
  image: busybox
  command: "false"

I can do fig run foo; echo $? and it will give me 1, too.
But

fig up
echo $?

gives 0, even though fig outputs something like test_foo_1 exited with code 1.

sylvinus commented Nov 30, 2014

+1

zjstrive commented Dec 1, 2014

+1

Contributor

dnephin commented Dec 2, 2014

I think this is expected behaviour. fig run is running a single container, so it's clear what is considered an "error" (the exit status of the container).

But fig up may run many containers, and it tails logs. I would say that fig up should only return an error status code if it fails to communicate with the docker daemon at some point. If one of many containers exits with a non-zero exit code, I don't really see that as a fig up error.

If you need this behaviour you should be able to use fig run, or docker inspect a specific container to get the exit status code.

matleh commented Dec 2, 2014

@dnephin I agree that, since fig up runs several containers coequally, it is not so clear what should be considered an error when one of the containers returns a non-zero exit code.
However, since fig up is used for orchestrating containers, I still think that the majority of use cases are based on the expectation that containers exit successfully, and that there should be an easy way for the user of fig to detect whether that expectation holds true.

Since we do not know which container failed, it does not make sense to return the exact exit code of the failed container. I could imagine having different exit codes of fig up signal different conditions, such as 1 meaning one of the containers failed, 2 meaning communication between fig and the docker daemon failed, -x meaning fig was interrupted by signal x, and so on. Whoever needs more information about which container failed and what the exact exit code was can then, in the known case of an error, make use of docker inspect.

But if the user needs to fiddle with docker inspect after every fig up just to know whether everything went fine, it takes away much of the ease of use that fig set out to provide in the first place.

My specific use case is this: I run functional tests with fig, with one container for the app under test, one or more additional containers for the database etc., and one container for the testrunner. It would be great to have an easy way to see whether the tests passed or not.
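
The exit-code scheme proposed in this comment could be sketched roughly as follows. This is only an illustration of the idea, not how fig behaves: `summarize_exit_codes` and `signal_exit_status` are hypothetical helpers. Note that a shell exit status is limited to the range 0-255, so a negative value such as -2 cannot be returned directly; shells conventionally report 128 plus the signal number instead (130 for SIGINT).

```shell
#!/bin/sh
# Hypothetical helper: collapse the exit codes of several containers into
# one status, following the scheme proposed above.
#   0 = every container exited with 0
#   1 = at least one container exited with a non-zero code
# Usage: summarize_exit_codes CODE [CODE...]
summarize_exit_codes() {
    for code in "$@"; do
        if [ "$code" -ne 0 ]; then
            return 1
        fi
    done
    return 0
}

# Hypothetical helper: conventional shell status for "terminated by
# signal N" (128 + N), since a negative status cannot be represented.
signal_exit_status() {
    echo $((128 + $1))
}
```

For example, `summarize_exit_codes 0 3 0` returns 1, and `signal_exit_status 2` prints 130.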

Contributor

dnephin commented Dec 2, 2014

@matleh that makes a lot of sense. I use fig for exactly the same thing! Sometimes I have tests run on the host, and sometimes it's from a testrunner container. I've used fig run for this so that I get the response code.

Is there a reason that fig run testrunner isn't working for this case?

Edit: If it's an issue with not all containers being linked up, I believe this should work:

fig up -d
fig run testrunner

matleh commented Dec 2, 2014

I like to see the colored output from all the containers - helps me to understand what is going on.

And fig run and fig up differ in how they handle linked containers, as far as my tests and my understanding goes. Could it be that fig run does not clean up (that is stop) linked containers after it is done? And it seems it just uses existing containers without stopping and recreating them as fig up does, which makes me worry that testruns might not be fully isolated. And it does not stop linked containers on SIGINT ...

matleh commented Dec 2, 2014

@dnephin I just now read your edit: wouldn't that run testrunner twice, once as part of fig up -d and then again explicitly with fig run testrunner?

Besides that, it feels more like a workaround to use both fig up and fig run just to get hold of the return code when fig up actually does everything I need, everything except the return code.

But I have the feeling that I might be missing something important about fig run ...

Contributor

dnephin commented Dec 5, 2014

@matleh yes, that's true, it does run twice. I set the command in the container to be something like bash so that it just exits right away. The actual commands I run are something like this:

fig pull
fig build
fig up -d
fig run tester /code/test.sh
fig stop
fig rm --force <any data volume containers>

If you're concerned about the tester running twice, you could also change the fig up -d to fig up -d app which should only start the things you need.

I think in general if you need a response code, or you want to run something automated, fig run is the thing to use. fig up is more for interactive use. At least that's how I see it. I think there is benefit in keeping these things distinct.
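
The sequence above could be wrapped so that the containers are always stopped and the test runner's exit code is what the caller sees. This is a minimal sketch under assumptions: `run_tests`, the `FIG` variable and the setup-failure status 2 are made up for illustration, while the `tester` service and `/code/test.sh` are taken from the commands above. Removing data volume containers is left out because it depends on the setup.

```shell
#!/bin/sh
# FIG is an assumption: it lets you swap in another binary (or a stub).
FIG="${FIG:-fig}"

# Hypothetical wrapper around the pull/build/up/run/stop sequence.
run_tests() {
    # Status 2 for "setup failed" is an arbitrary choice for this sketch.
    "$FIG" pull && "$FIG" build && "$FIG" up -d || return 2
    "$FIG" run tester /code/test.sh
    status=$?          # remember the test result before cleaning up
    "$FIG" stop        # always stop the containers, pass or fail
    return "$status"
}
```

A CI job could then call `run_tests` as its last step, so that the job's status is the status of `/code/test.sh`.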

matleh commented Dec 5, 2014

@dnephin thank you for the insight into your way of doing things. I am sure that this works great. But I still think that it is viable to do things in different ways, as different people and different scenarios have different priorities, expectations and so on. So where fig run is a perfect match for one person or scenario, fig up might be better suited for another.

All I ask is for fig up to return some meaningful exit code. fig up is used for orchestrating different services, so when that orchestrator ends, one of the most fundamental and important things to know is, to my understanding, why it ended. Was it because of some kind of error, or was it a "normal" finish? The assumption that fig up is only used for long-running services that should never finish, and that it is always some kind of error if they do (as I understand it, this is the assumption underlying the current implementation), strikes me as a little short-sighted.

To me fig up means: do whatever that fig-file was designed to do, while fig run means: with the environment defined in this fig-file, do some other administrative/development/whatever task.
In both cases, I am interested in the outcome.

dvapelnik commented Jan 9, 2015

I think that fig must return more meaningful exit codes.
I'm writing a bash wrapper around fig that deploys a database dump from a file, among other things, and I want to check each step so that I can roll back my actions and avoid corrupting my data (the database dump, for example).
So, for example, my steps are:

  1. Start up the containers (I check the exit code and go to the next step if it equals zero)
  2. Do something with the database and DNS
  3. Stop my containers
  4. Remove my containers

In this workflow everything looks good, but suppose a careless user accidentally runs these steps:

  1. Start up the containers (I check the exit code and go to the next step if it equals zero)
  2. Do something with the database and DNS
  3. Start up the containers again with fig up -d (I check the exit code, and it is zero even when using the --no-recreate command-line option)
  4. My wrapper deploys the database from the dump file again, and the actual data in the database is lost

Can I check whether the containers are running? I can run fig ps and grep its output, but I think that is the wrong way.

I think this workflow situation is wrong, because fig up should only start up containers. I'll run fig stop, fig rm and fig up again if I want to recreate my containers; I think that is the correct workflow with fig. I expect a non-zero exit code from fig up when the containers are already running, but I see an unexpected zero exit code.

smecsia commented Feb 9, 2015

I've found a workaround for catching the exit code in fig. It is not a very clean or nice solution, but at least it works for me.

exec 5>&1
log=$(sudo fig up | tee /dev/fd/5)
exitcode=$(echo $log | grep 'exited with code' | sed 's/^.*exited with code \([0-9]\+\).*$/\1/g')
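
A variant of the same log-scraping idea, written as a filter, might look like the sketch below. It assumes the log lines have the form "<name> exited with code <N>", as in the fig output quoted earlier in this thread; `max_exit_code` is a hypothetical name. Unlike the snippet above, it considers every matching line and reports the highest code, and it avoids the GNU-specific `\+` in sed.

```shell
#!/bin/sh
# Hypothetical filter: read fig log output on stdin and print the highest
# code found in lines like "<name> exited with code <N>" (0 if none match).
max_exit_code() {
    sed -n 's/.*exited with code \([0-9][0-9]*\).*/\1/p' |
        awk 'BEGIN { max = 0 } $1 + 0 > max { max = $1 + 0 } END { print max }'
}
```

For example, `fig up 2>&1 | tee /dev/stderr | max_exit_code` would still show the logs while printing the highest code on stdout (assuming `/dev/stderr` is available, as it is on Linux).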

ghost commented Feb 26, 2015

FIG_SERVICE=test
DOCKER_CONTAINER=`fig -p xxx ps -q $FIG_SERVICE | cut -c -12`
docker ps -a --filter 'exited=0' | grep $DOCKER_CONTAINER
EXIT_CODE=$?

Or, to tally the exit codes of all services whose names start with test:

EXIT_CODE=`awk '/^test/{gsub(/:\$/,"");print}' fig.yml | xargs fig -p xxx ps 2>/dev/null | grep Exit | sed 's/.*Exit //g' | awk '{n=n+\$1}END{print n}'`

johanhaleby commented Mar 6, 2015

+1

ebuchman commented Mar 9, 2015

Just got tripped up by this as well. I'm using fig up for continuous integration on CircleCI. My fig.yml simply calls a test.sh file. So my solution (courtesy of @xcthulhu) was to add a touch /tmp/success at the end of test.sh and to check for it with test -e /tmp/success at the end of the circle.yml. Worked beautifully, and much prettier than the solutions above if your setup is similar.
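
The sentinel-file idea could be sketched like this. `run_with_marker` is a hypothetical helper, and the sketch assumes the marker path is visible to whichever step checks it later (for instance through a mounted volume), which the comment above does not spell out.

```shell
#!/bin/sh
# Hypothetical helper: run a command and create a marker file only if it
# succeeds, so that a later step can check for the marker with `test -e`.
# Usage: run_with_marker MARKER COMMAND [ARGS...]
run_with_marker() {
    marker=$1
    shift
    rm -f "$marker"             # drop any stale marker from an earlier run
    "$@" && touch "$marker"     # touch is only reached on exit code 0
}
```

Removing a stale marker first matters when the same workspace is reused between runs; otherwise an old /tmp/success could mask a new failure.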

Contributor

dnephin commented Mar 9, 2015

Why not use fig run tester instead? You'll get the exit code you expect.

MrMMorris commented Mar 10, 2015

@ebuchman I am in the exact same situation. I will try your workaround.

But this definitely needs to be handled properly in compose

@dnephin that is another way to do it, but I feel it is just another workaround for what should work as expected. I use separate test containers that have all my test framework dependencies so I can just run them 'attached' to my app containers and get test results.

Using fig run means I have to start my test container with something like /bin/bash, make sure it's attached to my app container (volumes_from), fig run against the test container and then deal with stopping and removing the test container. It's a lot more work when I would rather just make sure I am 'attached' to my app container.

MrMMorris commented Mar 10, 2015

@dnephin it seems your suggestion doesn't work? Am I doing something wrong?

docker-compose.yml:

test:
  image: node:latest
  command: sleep 100
$ docker-compose up -d
Creating core_test_1...

$ docker-compose ps
   Name        Command    State   Ports
---------------------------------------
core_test_1   sleep 100   Up

$ docker-compose run test npm test
> tv-core@1.0.0 pretest /app
> jshint .

< LOTS OF LINT ERRORS>

> ERR! Test failed.  See above for more details.

$ echo $!
0

Contributor

dnephin commented Mar 10, 2015

> I feel it is just another workaround for what should work as expected

I'm not really convinced of that. I would much rather see docker-compose up go the way of #741. The exit status of docker-compose up only reflects "docker-compose was able to start containers". At that point up has done its work, and what you're really seeing is the logs. If a container fails at any point after that, it is not related to the original docker-compose up you ran. Having docker-compose up attempt to reflect the many different possible exit scenarios is just not realistic.

On the other hand, docker-compose run has the behaviour of running one specific container (with its dependencies), so it's clear that when that container exits the exit status of docker-compose run would match that single container.

> I use separate test containers that have all my test framework dependencies so I can just run them 'attached' to my app containers and get test results.

I don't understand what you're trying to do here. Could you link to a sample config? (edit: I see the config now, will look)

I can say that my primary (and almost exclusive) use-case for docker-compose is CI, and I have never encountered any issue with this.

docker-compose run tester my_tests

Contributor

dnephin commented Mar 10, 2015

That should be echo $?

This works:

docker-compose.yml

tester:
    image: busybox
    command: sleep 100
$ docker-compose run tester cat /file/does/not/exist
cat: can't open '/file/does/not/exist': No such file or directory
$ echo $?
1

MrMMorris commented Mar 10, 2015

Yep, that works. Not sure where I got $! from...

Your workaround is actually fine for me. I understand the complications of handling exit codes from processes in a container that don't relate to fig up. I also think that this would be something most people expect, and it kind of sucks for a container to fail immediately after starting and not know about it....

Either way I am fine with just using fig run for now.

spenthil commented Oct 22, 2015

Copy-paste example using docker inspect:

docker-compose ps -q | xargs docker inspect -f '{{ .State.ExitCode }}' | grep -v '^0$' | wc -l | tr -d ' '

  1. get the container IDs
  2. get the last run's exit code for each container ID
  3. keep only the non-0 exit codes (matching the whole line, so that codes such as 10 are not dropped)
  4. count the non-0 exit codes
  5. trim out white space

Returns how many non-0 exit codes were returned. Would be 0 if everything exited with code 0.
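
To make a CI step fail directly instead of printing a count, the last stages of the pipeline could be replaced with a small filter. A minimal sketch, where `check_exit_codes` is a hypothetical name and the input is assumed to be one exit code per line, as produced by the docker inspect stage above:

```shell
#!/bin/sh
# Hypothetical filter: read one exit code per line on stdin and succeed
# only if all of them are 0. Blank lines are ignored.
check_exit_codes() {
    failures=$(awk 'NF && $1 + 0 != 0 { n++ } END { print n + 0 }')
    [ "$failures" -eq 0 ]
}
```

Usage would then be along the lines of `docker-compose ps -q | xargs docker inspect -f '{{ .State.ExitCode }}' | check_exit_codes || exit 1`.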

Contributor

sanmai-NL commented Jun 1, 2016

Using the up subcommand to start a container with entry point /bin/sh and command exit 40 prints to the console:
my_container exited with code 0

Contributor

dnephin commented Apr 20, 2017

Fixed in #4397 (--exit-code-from).
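
As a rough sketch of how the option might be used for the test-runner case discussed in this thread: `up_with_exit_code` and the `COMPOSE` variable are hypothetical, and the service name is whatever your compose file calls the test container (`tester` in the examples above).

```shell
#!/bin/sh
# COMPOSE is an assumption: it lets you point at another binary (or a stub).
COMPOSE="${COMPOSE:-docker-compose}"

# Hypothetical wrapper: start the whole stack and report the exit code of
# one service.
# Usage: up_with_exit_code SERVICE
up_with_exit_code() {
    service=$1
    # --exit-code-from makes `up` return the named service's exit code; it
    # also implies --abort-on-container-exit, so the other containers are
    # stopped once any container exits.
    "$COMPOSE" up --exit-code-from "$service"
}
```

With this, `up_with_exit_code tester; echo $?` should print the test container's exit code while still showing the combined log output from all containers.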

dnephin closed this Apr 20, 2017
