registry/health: adding healthcheck package #230
// Represents the possible server states based on the currently recorded
// healthchecks.
const (
StatusOK = "StatusOK"
These string values can just be "ok", "warning" and "error".
Let's move this package into
)

// Status represents a named status check and it's current status.
type Status struct {
Was this structure going to have a "meta" field, as proposed by #133? Or, did we find that information should be a part of expvar?
ea601f3 to af88da8
@stevvooe @NathanMcCauley would love another review.
cc @icecrime @aluzzardi @mavenugo @endophage because this kind of stuff has a broader interest.
8e00f96 to b9cd974
})
}

// HTTPChecker does a HEAD request and verifies if the HTTTP status
nit: s/HTTTP/HTTP/
Got a few comments, but overall I think it's cool 👍

DownHandler(recorder, req)

assert.Equal(t, recorder.Code, 404, "Code should be 404")
if record.Code != 404 {
// report error
}
LGTM! Nice work on this one.
// overwrites to a specific check status.
func Register(name string, check Checker) {
mutex.RLock()
defer mutex.RUnlock()
You have to use mutex.Lock & mutex.Unlock as there is a write access 5 lines below.
Good find. Fixed.
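For reference, the locking rule raised above can be sketched as follows. This is a minimal stand-in, not the code under review: the registry map and the Registered helper are hypothetical, and only the choice between Lock and RLock is the point.

```go
package main

import (
	"fmt"
	"sync"
)

// Checker is a stand-in for the interface under review.
type Checker interface {
	Check() error
}

// CheckFunc adapts a plain function to Checker.
type CheckFunc func() error

func (f CheckFunc) Check() error { return f() }

var (
	mutex    sync.RWMutex
	registry = make(map[string]Checker) // hypothetical storage
)

// Register writes to the map, so it must hold the exclusive lock.
func Register(name string, check Checker) {
	mutex.Lock()
	defer mutex.Unlock()
	registry[name] = check
}

// Registered only reads the map, so the shared read lock is enough.
func Registered(name string) bool {
	mutex.RLock()
	defer mutex.RUnlock()
	_, ok := registry[name]
	return ok
}

func main() {
	Register("always_ok", CheckFunc(func() error { return nil }))
	fmt.Println(Registered("always_ok"), Registered("missing"))
}
```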
ca057e4 to ceebf44
StatusHandler(recorder, req)

if recorder.Code != 503 {
t.Errorf("Did not get a 500.")
503 =)
LGTM. Might be nice to be able to configure the checkers but that can be future work.
...
health:
  checkers:
    filechecker: [/path/one/, /path/two/]
    httpchecker: [http://foo.com, http://bar.com]
...
@endophage white-belt changes?
ceebf44 to 9573bfd
Sounds good to me
9573bfd to fd8531c
Added a expvar style handler for the debug http server to allow health checks (/debug/health). Signed-off-by: Diogo Monica <diogo@docker.com>
fd8531c to 5370f2c
registry/health: adding healthcheck package
@aluzzardi Please! Let us know if you need help!
Summary
Package health provides a generic health checking framework. The health package works expvar style: by importing the package, the debug server gets a /debug/health endpoint that returns the current status of the application. If there are no errors, /debug/health will return an HTTP 200 status, together with an empty JSON reply {}. If there are any checks with errors, the JSON reply will include all the failed checks, and the response will have an HTTP 500 status.
A Check can be run either synchronously or asynchronously. We recommend that most checks are registered as asynchronous checks, so a call to the /debug/health endpoint always returns immediately. This pattern is particularly useful for checks that verify upstream connectivity or database status, since they might take a long time to return or time out.
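The endpoint behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's handler: the checks map, failedChecks, statusHandler, failing and probe are hypothetical names, and the failure code follows the 500 given in this summary (the test discussed in review checks for 503).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// failureStatus follows the summary text (500). The test discussed in
// review checks for 503, so treat the exact code as an assumption.
const failureStatus = http.StatusInternalServerError

// checks is a hypothetical registry: name -> function returning nil when healthy.
var checks = map[string]func() error{}

// failing builds a check that always fails with msg.
func failing(msg string) func() error {
	return func() error { return errors.New(msg) }
}

// failedChecks runs every check and collects the error messages.
func failedChecks() map[string]string {
	failed := make(map[string]string)
	for name, check := range checks {
		if err := check(); err != nil {
			failed[name] = err.Error()
		}
	}
	return failed
}

// statusHandler replies 200 and {} when nothing failed, otherwise
// failureStatus and a JSON object listing the failed checks.
func statusHandler(w http.ResponseWriter, r *http.Request) {
	failed := failedChecks()
	w.Header().Set("Content-Type", "application/json")
	if len(failed) > 0 {
		w.WriteHeader(failureStatus)
	}
	json.NewEncoder(w).Encode(failed)
}

// probe calls the handler in-process and returns the status code and body.
func probe() (int, string) {
	rec := httptest.NewRecorder()
	statusHandler(rec, httptest.NewRequest("GET", "/debug/health", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(probe()) // no checks registered yet
	checks["db"] = failing("connection refused")
	fmt.Println(probe()) // one failing check
}
```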
Installing
To install health, just import it in your application:
import "github.com/docker/distribution/health"
You can also (optionally) import health/api, which adds two convenience endpoints: /debug/health/down and /debug/health/up. These endpoints add "manual" checks that allow the service to quickly be brought in and out of rotation.
import _ "github.com/docker/distribution/registry/health/api"
After importing these packages into your main application, you can start registering checks.
Registering Checks
The recommended way of registering checks is using a periodic Check. PeriodicChecks run on a certain schedule and asynchronously update the status of the check. This allows CheckStatus() to return without blocking on an expensive check.
A trivial example of a check that runs every 5 seconds and shuts down our server if the current minute is even could be added as follows:
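A rough sketch of the idea, using only the standard library: a background goroutine re-runs the check on a ticker and stores the result, so reading the status never waits on the check itself. The periodic type, startPeriodic, minuteEvenCheck and defaultPeriod are hypothetical names for illustration; the package's own registration functions and their signatures are not reproduced here. Running the check once before the first tick is a choice made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// defaultPeriod matches the 5 second interval mentioned in the text.
const defaultPeriod = 5 * time.Second

// periodic holds the most recent result of a check that runs in the background.
type periodic struct {
	mu  sync.RWMutex
	err error
}

// Check returns the last recorded result without running the check itself.
func (p *periodic) Check() error {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.err
}

// startPeriodic runs check once right away, then every period, until stop is closed.
func startPeriodic(check func() error, period time.Duration, stop <-chan struct{}) *periodic {
	p := &periodic{err: check()}
	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				err := check()
				p.mu.Lock()
				p.err = err
				p.mu.Unlock()
			case <-stop:
				return
			}
		}
	}()
	return p
}

// minuteEvenCheck fails when the given minute is even.
func minuteEvenCheck(minute int) error {
	if minute%2 == 0 {
		return errors.New("current minute is even")
	}
	return nil
}

// currentMinuteEvenCheck applies minuteEvenCheck to the wall clock.
func currentMinuteEvenCheck() error {
	return minuteEvenCheck(time.Now().Minute())
}

func main() {
	stop := make(chan struct{})
	defer close(stop)
	p := startPeriodic(currentMinuteEvenCheck, defaultPeriod, stop)
	fmt.Println("current status:", p.Check())
}
```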
Alternatively, you can also make use of RegisterPeriodicThresholdFunc to implement the exact same check, but add a threshold of failures after which the check will be unhealthy. This is particularly useful for flaky Checks, ensuring some stability of the service when handling them.
The lowest-level way to interact with the health package is calling Register directly. Register allows you to pass in an arbitrary string and something that implements Checker and runs your check. If your method returns a nil error, the check is considered healthy; otherwise it will make the health check endpoint /debug/health start returning a 500 and list the specific check that failed.
Assuming you wish to register a method called currentMinuteEvenCheck() error, you could do that as follows:
CheckFunc is a convenience type that implements Checker.
Another way of registering a check could be by using an anonymous function and the convenience method RegisterFunc. An example that makes the status endpoint always return an error:
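The two registration styles described above might look like the following self-contained sketch. The names Checker, CheckFunc, Register, RegisterFunc and CheckStatus come from the text, but the bodies here are stand-ins written for illustration (including the overwrite-on-duplicate behaviour and the check names), not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Checker is anything that can report its own health.
type Checker interface {
	Check() error
}

// CheckFunc lets a plain func() error satisfy Checker.
type CheckFunc func() error

func (cf CheckFunc) Check() error { return cf() }

var (
	mu         sync.RWMutex
	registered = map[string]Checker{} // stand-in registry
)

// errAlwaysFails is the error used by the always-failing example check.
var errAlwaysFails = errors.New("this check always fails")

// Register stores check under name, replacing any previous entry.
func Register(name string, check Checker) {
	mu.Lock()
	defer mu.Unlock()
	registered[name] = check
}

// RegisterFunc is the convenience form for bare functions.
func RegisterFunc(name string, check func() error) {
	Register(name, CheckFunc(check))
}

// CheckStatus runs every check and returns the failures by name.
func CheckStatus() map[string]string {
	mu.RLock()
	defer mu.RUnlock()
	failed := make(map[string]string)
	for name, c := range registered {
		if err := c.Check(); err != nil {
			failed[name] = err.Error()
		}
	}
	return failed
}

// currentMinuteEvenCheck fails while the current minute is even.
func currentMinuteEvenCheck() error {
	if time.Now().Minute()%2 == 0 {
		return errors.New("current minute is even")
	}
	return nil
}

func main() {
	// Lowest level: wrap the named function in CheckFunc.
	Register("minute_even", CheckFunc(currentMinuteEvenCheck))
	// Convenience form with an anonymous function that always fails.
	RegisterFunc("always_failing", func() error { return errAlwaysFails })
	fmt.Println(CheckStatus())
}
```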
Examples
You could also use the health checker mechanism to ensure your application only comes up if certain conditions are met, or to allow the developer to take the service out of rotation immediately. An example that checks database connectivity and immediately takes the server out of rotation on error:
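One way to picture this is a small holder for the latest error that the connection code updates and a registered check reads back. The sketch below assumes hypothetical names throughout (statusUpdater, connect, checkDatabase), and connect is a placeholder rather than a real database call.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// statusUpdater stores an error that other code can set at any time.
// A nil error means healthy.
type statusUpdater struct {
	mu  sync.Mutex
	err error
}

// Update records the latest status.
func (u *statusUpdater) Update(err error) {
	u.mu.Lock()
	defer u.mu.Unlock()
	u.err = err
}

// Check returns the stored status; it is what a health check would call.
func (u *statusUpdater) Check() error {
	u.mu.Lock()
	defer u.mu.Unlock()
	return u.err
}

// connect is a placeholder for a real database connection attempt.
func connect(dsn string) error {
	if dsn == "" {
		return errors.New("empty data source name")
	}
	return nil
}

// checkDatabase records the outcome of a connection attempt in u.
func checkDatabase(u *statusUpdater, dsn string) {
	if err := connect(dsn); err != nil {
		u.Update(fmt.Errorf("error connecting to the database: %w", err))
		return
	}
	u.Update(nil)
}

func main() {
	u := &statusUpdater{}
	checkDatabase(u, "") // placeholder failure takes the service out of rotation
	fmt.Println(u.Check())
}
```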
You can also use the predefined Checkers that come included with the health package. First, import the checks:
import "github.com/docker/distribution/health/checks"
After that you can make use of any of the provided checks. An example of using a FileChecker to take the application out of rotation if a certain file exists can be done as follows:
After registering the check, it is trivial to take an application out of rotation from the console:
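As an illustration of the file-based pattern, here is a minimal sketch that fails a check while a marker file exists. fileChecker and demo are hypothetical names, and the marker lives in a temporary directory chosen for this example; the package's own FileChecker is not reproduced. With a check like this registered, creating the marker file from a shell (for example with touch) would flip the check to failing, and removing the file would restore it.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// fileChecker returns a check that fails while a file exists at path.
func fileChecker(path string) func() error {
	return func() error {
		_, err := os.Stat(path)
		switch {
		case err == nil:
			return errors.New("file exists: " + path)
		case os.IsNotExist(err):
			return nil // no marker file, service stays in rotation
		default:
			return err // stat failed for another reason
		}
	}
}

// demo runs the check before and after creating the marker file.
func demo() (before, after error) {
	dir, err := os.MkdirTemp("", "health")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	marker := filepath.Join(dir, "disable")
	check := fileChecker(marker)

	before = check()
	if err := os.WriteFile(marker, nil, 0o644); err != nil {
		panic(err)
	}
	after = check()
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println("before:", before)
	fmt.Println("after:", after)
}
```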
You could also test the connectivity to a downstream service by using an HTTPChecker, but ensure that you only mark the test unhealthy if there are a minimum of two failures in a row:
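The combination of an HTTP probe and a failure threshold can be sketched as below. httpChecker, withThreshold and demo are hypothetical helpers written for this illustration, the probe treats anything other than 200 as a failure by assumption, and the test server stands in for the downstream service. The threshold wrapper is not safe for concurrent use; it is meant to be called from a single periodic loop.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// errUnhealthy marks a reply with an unexpected status code.
var errUnhealthy = errors.New("unexpected HTTP status")

// httpChecker returns a check that sends a HEAD request to url and
// fails unless the reply status is 200.
func httpChecker(url string) func() error {
	return func() error {
		resp, err := http.Head(url)
		if err != nil {
			return err
		}
		resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("%w: %s", errUnhealthy, resp.Status)
		}
		return nil
	}
}

// withThreshold reports a failure only once check has failed threshold
// times in a row; any success resets the counter.
func withThreshold(check func() error, threshold int) func() error {
	failures := 0
	return func() error {
		err := check()
		if err == nil {
			failures = 0
			return nil
		}
		failures++
		if failures >= threshold {
			return err
		}
		return nil
	}
}

// demo probes a local test server that starts healthy and then degrades.
func demo() []error {
	var unhealthy int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.LoadInt32(&unhealthy) == 1 {
			w.WriteHeader(http.StatusServiceUnavailable)
		}
	}))
	defer srv.Close()

	check := withThreshold(httpChecker(srv.URL), 2)
	results := []error{check()} // server healthy
	atomic.StoreInt32(&unhealthy, 1)
	results = append(results, check()) // first failure, tolerated
	results = append(results, check()) // second failure in a row, reported
	return results
}

func main() {
	for i, err := range demo() {
		fmt.Println(i, err)
	}
}
```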