Add support for user-defined healthchecks #22719
This PR adds support for user-defined health-check probes for Docker containers. It adds a
When a container has a healthcheck specified, it has a health status in addition to its normal status. This status is initially
The options that can appear before
The health check will first run interval seconds after the container is started, and then again interval seconds after each previous check completes.
If a single run of the check takes longer than timeout seconds then the check is considered to have failed.
If the health state is
It takes retries consecutive failures of the health check for the container to be considered
For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds:
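One way to write the probe side of such a check is a small Go program that fetches the main page with a three-second limit and reports the result through its exit code. This is only a sketch: the `check` helper and the `http://localhost/` URL are placeholders, and it assumes the simple convention raised in the discussion (exit 0 for OK, non-zero for failure). The five-minute interval would be configured on the Docker side and is not shown here.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

// check (hypothetical helper) fetches url and returns nil only if a 2xx
// response arrives within timeout.
func check(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return fmt.Errorf("could not fetch %s: %v", url, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unexpected status %s from %s", resp.Status, url)
	}
	return nil
}

func main() {
	// http://localhost/ is a placeholder for the site's main page.
	if err := check("http://localhost/", 3*time.Second); err != nil {
		// Print a short reason, then signal failure via the exit code.
		fmt.Println(err)
		os.Exit(1)
	}
}
```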
The changes to "docker run" are described here:
The health status is also displayed in the
Description for the changelog: Add support for user-defined healthchecks
May 13, 2016
There doesn't seem to be anything documented on the protocol between docker and the probe.
Do we want this to be a simple 0 == OK, non-zero == bad, or something more robust, à la Nagios-style checks (OK, warning, critical, plus status messages for reporting to the user)?
That's true in Go but it's very language specific.
What I meant by "you can't have a byte there" is, at the end, that field is going to be a string no matter what. Whether we base64 encode or not is another question.
How do we currently encode output for other commands (e.g. logs)?
I'm not sure what you mean about having both. Would we include the same data twice in each message, or somehow know where to send which data?
Another possibility: if a probe wants to generate binary data then it base64 encodes it itself. That way, probes that only want text don't need to do anything and the messages are human-readable. Probes that return binary data would need something at the client end to interpret it anyway, so decoding it wouldn't be much extra work. The data on the wire would be the same as if we'd used
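To make that suggestion concrete, here is a minimal Go sketch of both ends: a probe that base64-encodes its binary result before printing it, and a client that decodes it again. The function names are made up for illustration, and nothing here is part of the PR.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeProbeOutput (hypothetical) is what a probe with binary results
// would call before writing to stdout.
func encodeProbeOutput(raw []byte) string {
	return base64.StdEncoding.EncodeToString(raw)
}

// decodeProbeOutput (hypothetical) is the matching step on the client side.
func decodeProbeOutput(text string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(text)
}

func main() {
	raw := []byte{0x00, 0xff, 0x10, 0x80}
	text := encodeProbeOutput(raw)
	fmt.Println(text) // printable text, safe to carry in a string field

	back, err := decodeProbeOutput(text)
	fmt.Println(back, err)
}
```

Probes that only emit text would skip both steps, so their output stays human-readable.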
@talex5 Yes, have both. One that is "human-readable", and hopefully properly sanitized, and one that is the raw, unprocessed output.
This will meet both the use cases of sending binary data and debugging problems with health check output during development. It is unreasonable to make users go through so much work just to get unprocessed command output.
Let's make sure this feature isn't limited to our imaginations but, instead, enables others'.
I do not see the purpose of supporting more than just simple output from a healthcheck, and certainly not storing both binary and human-readable....
I think we should support only something very simple for a human to use as debug info, or potentially remove support for capturing output from the healthcheck from this PR and implement separately.
For the output, in this PR we should keep it as is. The size has already been increased to 4096, which is much better than what it was before. Most commands that do health checks should be encouraged to write a simple reason why they are reporting as they did: "I returned unhealthy BECAUSE I could not connect to the database." We don't want to encourage checks to do a yolo operation and puke up some stacktrace to decipher.
println("could not connect to database"); return 1;
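For illustration only, capping the captured output at 4096 could look something like the Go sketch below. The `limitedBuffer` type and the `maxOutputLen` name are assumptions made for this example, not a description of what the PR's code does.

```go
package main

import (
	"bytes"
	"fmt"
)

// maxOutputLen is the size quoted above; the name is an assumption.
const maxOutputLen = 4096

// limitedBuffer (hypothetical) keeps the first maxOutputLen bytes written
// to it and silently drops the rest.
type limitedBuffer struct {
	buf       bytes.Buffer
	truncated bool
}

// Write always reports the full length, so the probe writing to it never
// sees a short write even when part of the data is discarded.
func (b *limitedBuffer) Write(p []byte) (int, error) {
	room := maxOutputLen - b.buf.Len()
	if len(p) > room {
		b.buf.Write(p[:room])
		b.truncated = true
	} else {
		b.buf.Write(p)
	}
	return len(p), nil
}

// String returns whatever was kept.
func (b *limitedBuffer) String() string { return b.buf.String() }

func main() {
	var out limitedBuffer
	fmt.Fprintln(&out, "could not connect to database")
	fmt.Print(out.String())
}
```

A one-line reason like the one above fits easily; a long stack trace would simply be cut off at the limit.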
In the future, once this has some users and they want a more structured response than a limited string, it would be effortless to add an unbounded byte array to the output for this. It is something that is safe to defer until we gain feedback.
For the three states, simple health checks don't have to use the starting state if they don't want to. A typical developer can just return 1 or 0. If the app has a long startup or some other complexity on boot, it can take advantage of this state, and we don't have to make the guesses we would otherwise be making. Let's not write software that guesses things; let the things that know the state tell us. You are not forced to use this state if you don't want to.
After looking at the code again, I think all these things are already implemented.