[Carry 22719] healthcheck feature #23218

Merged
crosbymichael merged 2 commits into docker:master from thaJeztah:carry-22719-healthcheck-feature on Jun 2, 2016

@thaJeztah
Member
thaJeztah commented Jun 2, 2016

carry #22719

closes #22719
closes #21143
closes #21142

talex5 and others added some commits Apr 18, 2016
@talex5 @thaJeztah talex5 Add support for user-defined healthchecks
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.

When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check first runs one **interval** after the container is started,
and then again one **interval** after each previous check completes.
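
Because the next check is scheduled from the moment the previous one completes, checks drift later than a fixed-rate timer would. A minimal sketch of that arithmetic, assuming times are measured in seconds from container start (the function name `check_start_times` and the sample durations are hypothetical, not part of Docker):

```python
def check_start_times(interval, durations):
    """Return when each check starts, in seconds after container start.

    `durations` holds how long each successive check takes to run.
    Illustrative model only; this is not Docker's scheduler.
    """
    starts = []
    now = 0.0
    for duration in durations:
        now += interval      # wait one interval before running the check
        starts.append(now)
        now += duration      # the check runs to completion
    return starts


# With a 30s interval and checks taking 2s, 5s and 1s:
print(check_start_times(30, [2, 5, 1]))  # [30.0, 62.0, 97.0]
```

The second check starts at 62s rather than 60s because the first one took 2s to finish.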

If a single run of the check takes longer than the **timeout**, the check
is considered to have failed.

It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.
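
The status transitions described so far can be sketched as a small state machine. This is an illustrative model only, assuming a hypothetical `HealthTracker` class; it is not the daemon's implementation:

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Hypothetical model of a container's health status (not Docker's code)."""
    retries: int = 1          # consecutive failures needed to become unhealthy
    status: str = "starting"  # the status is initially "starting"
    failing_streak: int = 0

    def record(self, passed):
        if passed:
            # Any passing check makes the container healthy,
            # whatever state it was previously in.
            self.failing_streak = 0
            self.status = "healthy"
        else:
            self.failing_streak += 1
            if self.failing_streak >= self.retries:
                self.status = "unhealthy"
        return self.status


tracker = HealthTracker(retries=3)
results = (False, False, True, False, False, False)
statuses = [tracker.record(passed) for passed in results]
print(statuses)
# ['starting', 'starting', 'healthy', 'healthy', 'healthy', 'unhealthy']
```

Two early failures do not change the status because they fall short of `retries`; the single pass resets the streak, so three further failures are needed before the container is reported `unhealthy`.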

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.
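
To illustrate the "last one wins" rule, here is a rough sketch that scans Dockerfile text and reports which `HEALTHCHECK` line would take effect. The helper `effective_healthcheck` and the `busybox` base image are assumptions made for the example; the scan is deliberately naive (it ignores line continuations and comments) and is not how the builder parses Dockerfiles:

```python
def effective_healthcheck(dockerfile_text):
    """Return the last HEALTHCHECK line in the text, or None if there is none."""
    last = None
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        # Dockerfile instruction names are case-insensitive.
        if stripped.upper().startswith("HEALTHCHECK "):
            last = stripped
    return last


example = """\
FROM busybox
HEALTHCHECK CMD /bin/check-running
HEALTHCHECK --interval=5m CMD /bin/check-running
"""
print(effective_healthcheck(example))
# HEALTHCHECK --interval=5m CMD /bin/check-running
```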

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container.
The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly

If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.

For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).
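
The effect of the size limit can be sketched as follows. The helper `stored_output` is hypothetical, and replacing a multi-byte character that is cut at the boundary is an assumption made here, not documented behaviour:

```python
MAX_OUTPUT_BYTES = 4096  # only the first 4096 bytes are stored currently


def stored_output(raw, limit=MAX_OUTPUT_BYTES):
    """Return the part of a probe's output that would be kept (sketch only)."""
    # Assumption: a UTF-8 sequence split by the cut is replaced, not dropped.
    return raw[:limit].decode("utf-8", errors="replace")


print(len(stored_output(b"x" * 5000)))  # 4096
```

A probe that prints a one-line reason for failure stays well within the limit, which keeps the `docker inspect` output readable.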

When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
b6c7bec
@thaJeztah thaJeztah Bump engine-api to fa04f66c7871183dd53a5ec666479f49b452743d
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
76d8b0d
@thaJeztah thaJeztah added this to the 1.12.0 milestone Jun 2, 2016
@crosbymichael crosbymichael merged commit ce255f7 into docker:master Jun 2, 2016

8 of 9 checks passed

gccgo Jenkins build Docker-PRs-gccgo 6212 is running
docker/dco-signed All commits signed
documentation success
experimental Jenkins build Docker-PRs-experimental 19350 has succeeded
janky Jenkins build Docker-PRs 28156 has succeeded
userns Jenkins build Docker-PRs-userns 10319 has succeeded
vendor Jenkins build Docker-PRs-vendor 1019 has succeeded
win2lin Jenkins build Docker-PRs-Win2Lin 26685 has succeeded
windowsTP5 Jenkins build Docker-PRs-WoW-TP5 2495 has succeeded
@crosbymichael crosbymichael deleted the thaJeztah:carry-22719-healthcheck-feature branch Jun 2, 2016
@icecrime
Member
icecrime commented Jun 3, 2016

Thanks @thaJeztah @crosbymichael! Congratulations @talex5 🎉

@thaJeztah
Member

pinging our awesome duo @albers and @sdurrheimer (for the new flags)

@konobi
konobi commented Jun 3, 2016

Doesn't the remote API allow you to access resources on the remote side? That would allow the client to check for file existence, e.g. "READY_ON /tmp/container_is_up_and_read"?

For healthchecking on a more ongoing basis, I think something like containerpilot is going to be the better solution going forward.
