Preserve exit code when docker-compose exists with an error #800

sorenlouv · 2020-03-24T12:38:22Z

Invoking apm-integration testing doesn't preserve the exit code from docker-compose.

Demonstration

scripts/compose.py start master

While running the above command, I kill docker which results in the following error:

ERROR: dial unix docker.raw.sock: connect: connection refused

However, the exit code is 0 indicating that everything went fine.

The text was updated successfully, but these errors were encountered:

kuisathaverat · 2020-03-24T13:00:23Z

we run the docker-compose in detached mode, so this is the expected behavior the process is launched in background, when we saw this abrupt fail?

https://github.com/elastic/apm-integration-testing/blob/master/scripts/modules/cli.py#L629-L634

kuisathaverat · 2020-03-24T14:14:28Z

I have remembered how we manage this on the test, we use docker-compose-wait to wait for all the containers up and ready if docker-compose-wait failed the make target fails.

https://github.com/elastic/apm-integration-testing/blob/master/Makefile#L116

sorenlouv · 2020-03-24T15:14:43Z

we run the docker-compose in detached mode, so this is the expected behavior the process is launched in background, when we saw this abrupt fail?

Is this a question for me?
I don't think this is expected behaviour from a consumer point-of-view. When any subprocess of compose.py fails, that should be propagated up, so the consumer can handle it. Right now it's being swallowed.

kuisathaverat · 2020-03-25T11:46:54Z

@sqren I meant, this is the expected behavior of docker-compose when you launch docker-compose up -d you are running it in the background so you do not care about the return code. If you run it in foreground it takes the terminal to show you the logs, so you can not make anything else in the terminal. Because compose.py is used to generate the docker-compose file and launch the containers, we delegate the responsibility of check everything is OK to docker-compose-wait and then we can run the tests. Everything is orchestrated by make docker-test-agent-go for example.

What it is the use case where you would run compose.py start and you need the return code in the same terminal? remember that if you launch docker-compose up in the foreground and everything goes OK the terminal is taken by docker-comose, you cannot run anything else in that terminal after that.

sorenlouv · 2020-03-25T17:39:23Z

this is the expected behavior of docker-compose when you launch docker-compose up -d you are running it in the background so you do not care about the return code

I don't think this is accurate. docker-compose will run in the foreground until images have been downloaded and containers built. It'll then exit as soon containers are up. See this for instance:

In the above case docker compose returns exit code "1" because an error happened while downloading the image. If it had succeeded it would exit with code 0 and continue running the container(s) in the background

scripts/compose.py also runs the build step in the foreground but returns a 0 exit code even though it failed to download the image:

So to be clear: I'm not talking about the exit code if any of the docker containers exit unexpectedly. I'm talking about the exit code for the docker-compose process itself.

What it is the use case where you would run compose.py start and you need the return code in the same terminal?

I'm running ./scripts/compose.py as part of a bigger bash program. I want to exit the bash program if ./scripts/compose.py fails. The exit code is the easiest way to achieve this.

kuisathaverat · 2020-03-25T18:19:44Z

@sqren
I think I did not be clear, we launch a subprocess from Python to avoid block the terminal
https://github.com/elastic/apm-integration-testing/blob/master/scripts/modules/cli.py#L404, this is a fork of the current process, change this behavior impacts how ITs works.

I'm running ./scripts/compose.py as part of a bigger bash program. I want to exit the bash program if ./scripts/compose.py fails. The exit code is the easiest way to achieve this.

if the do not launch a subprocess this terminal blocks for the docker-compose process, so you can not continue with the execution of the script. see the following script.

#!/usr/bin/env bash

docker-compose up
echo "*****************************************************"
echo "*****************************************************"
echo "*****************************************************"
echo "*****************************************************"
echo "*****************************************************"
echo "I'll never be executed until docker-compose is killed"
echo "*****************************************************"
echo "*****************************************************"
echo "*****************************************************"
echo "*****************************************************"

On the other screen capture, you post

in this case compose.py is no longer running, it ends with code 0, the code that you have in the command "echo $?", the forked process docker-compose is detached of the terminal and it reports its exit code nowhere.

You should use the same approach we use for the ITs call docker-compose-wait command, it is a Python command that waits for all Docker container in the docker-compose file

@victor is working on the CI part of the execution of your script, we see that the script has a bunch of logic to orchestrate the execution of the containers, that logic is easy to manage on the docker-compose side. When we have the final solution for the CI we will review the script to see if you are interested in removing some parts and relay on the compose.py

sorenlouv · 2020-03-25T19:25:07Z

if the do not launch a subprocess this terminal blocks for the docker-compose process, so you can not continue with the execution of the script. see the following script.

In the example you give you are not running in detached mode. If you run in detached mode docker-compose will exit with the appropriate exit code once the containers are up, and you'll then see the message "I'll never be executed until docker-compose is killed".

I think I did not be clear, we launch a subprocess from Python to avoid block the terminal
https://github.com/elastic/apm-integration-testing/blob/master/scripts/modules/cli.py#L404, this is a fork of the current process, change this behavior impacts how ITs works.

That's clear and I think it's fine to run as a subprocess.

the forked process docker-compose is detached of the terminal and it reports its exit code nowhere.

And this is what I'm arguing to improve. I'm by no means an experienced python dev so take the following with a grain of salt. From the python docs it looks like it's possible to get the return code from the subprocess like so:

    def run_docker_compose_process(self, docker_compose_cmd):
        try:
            return subprocess.call(docker_compose_cmd)
            
        except OSError as err:
            print('ERROR: Docker Compose might be missing. See below for further details.\n')
            raise OSError(err)

    def start_handler(self):
        returncode = run_docker_compose_process(...)
        sys.exit(returncode)

WDYT?

kuisathaverat · 2020-03-25T21:35:20Z

In the example you give you are not running in detached mode. If you run in detached mode docker-compose will exit when the containers are up with the appropriate exit code, and you'll see the message "I'll never be executed until docker-compose is killed".

I know this is an example of what happens when you launch compose in a regular way, or you are waiting for the exit code of a process in background, the terminal is blocked waiting for the process.

And this is what I'm arguing to improve. I'm by no means experienced in python so take below with a grain of salt, but from the python docs it looks like it's possible to get the return code from the sub process:

@sqren it was possible if compose.py garb the exit code. compose.py generates the docker-compose file, launch the docker-compose -d command and exits, compose.py does not wait for docker-compose -d exit code. So there is no return code at leats the docker-compose fails to start (for example if you do not have docker or docker-compose installed) if the process is launched correctly, and docker-compose fail pulling and image or building an image.
The following code could work as you expected.

    def run_docker_compose_process(self, docker_compose_cmd):
        try:
            returncode = subprocess.call(docker_compose_cmd)
        except OSError as err:
            print('ERROR: Docker Compose might be missing. See below for further details.\n')
            raise OSError(err)
       if returncode != 0:
         sys.exit(returncode)

sorenlouv · 2020-03-25T22:00:41Z

compose.py does not wait for docker-compose -d exit code

Hmmm.. I might be misunderstanding you, but subprocess.call returns the returncode of the docker-compose subprocess, which is exactly what I'm interested in.

I made a proof of concept here: #804

kuisathaverat · 2020-03-25T22:09:02Z

see #803 this use check_call to wrap all docker-compose calls if any fails it will return 1 as exit code, otherwise return 0.

sorenlouv · 2020-03-25T22:23:01Z

That's great! 👍

sorenlouv mentioned this issue Mar 24, 2020

[APM] E2E: Zero config for running e2e locally elastic/kibana#59152

Merged

2 tasks

kuisathaverat added the automation label Mar 24, 2020

kuisathaverat mentioned this issue Mar 25, 2020

feat: return the exit code of docker compose in case of failure #803

Merged

sorenlouv mentioned this issue Mar 25, 2020

Preserve returncode (proof of concept) #804

Closed

kuisathaverat closed this as completed in #803 Mar 25, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preserve exit code when docker-compose exists with an error #800

Preserve exit code when docker-compose exists with an error #800

sorenlouv commented Mar 24, 2020

kuisathaverat commented Mar 24, 2020 •

edited

kuisathaverat commented Mar 24, 2020

sorenlouv commented Mar 24, 2020

kuisathaverat commented Mar 25, 2020

sorenlouv commented Mar 25, 2020 •

edited

kuisathaverat commented Mar 25, 2020 •

edited

sorenlouv commented Mar 25, 2020 •

edited

kuisathaverat commented Mar 25, 2020

sorenlouv commented Mar 25, 2020

kuisathaverat commented Mar 25, 2020 •

edited

sorenlouv commented Mar 25, 2020

Preserve exit code when docker-compose exists with an error #800

Preserve exit code when docker-compose exists with an error #800

Comments

sorenlouv commented Mar 24, 2020

kuisathaverat commented Mar 24, 2020 • edited

kuisathaverat commented Mar 24, 2020

sorenlouv commented Mar 24, 2020

kuisathaverat commented Mar 25, 2020

sorenlouv commented Mar 25, 2020 • edited

kuisathaverat commented Mar 25, 2020 • edited

sorenlouv commented Mar 25, 2020 • edited

kuisathaverat commented Mar 25, 2020

sorenlouv commented Mar 25, 2020

kuisathaverat commented Mar 25, 2020 • edited

sorenlouv commented Mar 25, 2020

kuisathaverat commented Mar 24, 2020 •

edited

sorenlouv commented Mar 25, 2020 •

edited

kuisathaverat commented Mar 25, 2020 •

edited

sorenlouv commented Mar 25, 2020 •

edited

kuisathaverat commented Mar 25, 2020 •

edited