This document defines a protocol that gives system components a compatible wire format, agreed-upon semantics, and defined forms of interaction for determining the "liveness" of computing nodes in a larger system.
Note that the force of these words is modified by the requirement level of the document in which they are used.
-
MUST This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification.
-
MUST NOT This phrase, or the phrase "SHALL NOT", mean that the definition is an absolute prohibition of the specification.
-
SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
-
SHOULD NOT This phrase, or the phrase "NOT RECOMMENDED" mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.
-
MAY This word, or the adjective "OPTIONAL", mean that an item is truly discretionary.
-
MUST be compatible with well-known cloud platforms (e.g. http://kubernetes.io/docs/user-guide/liveness/)
-
MUST be appropriate for machine-to-machine communication
-
SHOULD give enough information for a human administrator
| Term | Description |
| --- | --- |
| Producer | The service/application that is checked |
| Consumer | The probing end, usually a machine, that needs to verify the liveness of a Producer |
| Health Check Procedure | The code executed to determine the liveness of a Producer |
| Producer Outcome | The overall outcome, determined by considering all health check procedure results |
| Health Check Procedure Result | The result of a single check |
-
Consumer invokes the health check of a Producer through any of the supported protocols
-
Producer enforces security constraints on the invocation (e.g. authentication)
-
Producer executes a set of Health check procedures (could be a set with one element)
-
Producer determines the overall outcome (Producer outcome)
-
The outcome is mapped to the outermost protocol (e.g. HTTP status codes)
-
The payload is written to the response stream
-
The consumer reads the response
-
The consumer determines the overall outcome
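The steps above can be sketched from the Producer's side as follows. This is a minimal Python sketch, not part of the specification: the names `run_health_check`, `disk_check`, `db_check`, and the `(name, is_up, data)` tuple shape are assumptions made for illustration. The status codes and payload fields follow the response table and sample payloads given elsewhere in this document.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical shape of a procedure result: (name, is_up, optional data).
CheckResult = Tuple[str, bool, Dict[str, object]]
Procedure = Callable[[], CheckResult]


def run_health_check(procedures: List[Procedure]) -> Tuple[int, str]:
    """Execute every procedure and return (HTTP status, JSON payload)."""
    checks = []
    all_up = True
    try:
        for procedure in procedures:
            name, is_up, data = procedure()
            entry = {"name": name, "state": "UP" if is_up else "DOWN"}
            if data:
                # Optional additional information holder.
                entry["data"] = data
            checks.append(entry)
            all_up = all_up and is_up
    except Exception:
        # A failing procedure leaves the outcome undetermined: 500, no payload.
        return 500, ""
    # With no procedures installed, all_up is still True, so the outcome is UP.
    payload = {"outcome": "UP" if all_up else "DOWN", "checks": checks}
    return (200 if all_up else 503), json.dumps(payload)


def disk_check() -> CheckResult:
    """Illustrative procedure that reports UP with extra context."""
    return ("disk", True, {"disk.free.space": "120mb"})


def db_check() -> CheckResult:
    """Illustrative procedure that reports DOWN."""
    return ("database", False, {})


if __name__ == "__main__":
    status, body = run_health_check([disk_check, db_check])
    print(status, body)
```

Every procedure runs even after one reports DOWN, so the payload always lists the state of each check rather than stopping at the first failure.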
This section describes the specifics of the HTTP protocol usage.
How are the health checks accessed and invoked? This specification makes no assumptions about this, except for the wire format and protocol.
Health checks (innermost) can and should be mapped to the actual invocation protocol (outermost). This section describes some guidelines and rules for these mappings.
-
Producers MAY support a variety of protocols but the information items in the response payload MUST remain the same.
-
Producers SHOULD define a well-known default context for performing checks
-
Each response SHOULD integrate with the outermost protocol whenever it makes sense (e.g. using HTTP status codes to signal the overall state)
-
Inner protocol information items MUST NOT be replaced by outer protocol information items; they are kept redundantly alongside them.
-
The inner protocol response MUST be self-contained, that is, it carries all information needed to reason about the producer outcome
-
Producers MUST provide an HTTP endpoint that follows the REST interface specifications described in Appendix A
-
The primary information MUST be boolean, because it needs to be consumed by other machines. Any state between available and unavailable would not be meaningful and would increase the complexity of processing that information on the consumer side.
-
The response information MAY contain an additional information holder
-
Consumers MAY process the additional information holder or simply decide to ignore it
-
The response information MUST contain the boolean state of each check
-
The response information MUST contain the name of each check
-
Producers MUST support a JSON-encoded payload with simple UP/DOWN states
-
Producers MAY support an additional information holder with key/value pairs to provide further context (e.g. disk.free.space=120mb).
-
The JSON response payload MUST be compatible with the one described in Appendix B
-
The JSON response MUST contain the "name" entry specifying the name of the check, to support protocols that support external identifiers (e.g. URI)
-
The JSON response MUST contain the "state" entry specifying the state as a String: "UP" or "DOWN"
-
The JSON response MAY support an additional information holder to carry key/value pairs that provide additional context
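The payload rules above could be checked on the receiving side roughly as follows. This is a hedged Python sketch: the function name `payload_problems` and its error messages are hypothetical, and the `checks` and `data` field names are taken from the sample payloads in this document rather than from the rules themselves.

```python
import json
from typing import List

# The only two states the payload rules allow.
VALID_STATES = ("UP", "DOWN")


def payload_problems(raw: str) -> List[str]:
    """Return rule violations found in a payload; an empty list means none."""
    try:
        payload = json.loads(raw)
    except ValueError:
        return ["payload is not valid JSON"]
    if not isinstance(payload, dict):
        return ["payload is not a JSON object"]
    checks = payload.get("checks")
    if not isinstance(checks, list):
        return ["missing 'checks' array"]

    problems = []
    for index, check in enumerate(checks):
        if not isinstance(check, dict):
            problems.append(f"check {index}: not an object")
            continue
        # Each check MUST carry its name.
        if not isinstance(check.get("name"), str):
            problems.append(f"check {index}: missing 'name'")
        # Each check MUST carry its state as the string UP or DOWN.
        if check.get("state") not in VALID_STATES:
            problems.append(f"check {index}: 'state' must be UP or DOWN")
        # The additional information holder is optional, but must be key/value.
        if "data" in check and not isinstance(check["data"], dict):
            problems.append(f"check {index}: 'data' must be an object")
    return problems


if __name__ == "__main__":
    sample = '{"outcome": "UP", "checks": [{"name": "myCheck", "state": "UP"}]}'
    print(payload_problems(sample))
```

Returning a list of problems instead of raising on the first one lets a human administrator see every violation in a single pass.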
-
A producer MUST support custom, application level health check procedures
-
A producer SHOULD support reasonable out-of-the-box procedures
-
A producer without health check procedures installed MUST return a positive overall outcome (i.e. HTTP 200)
When multiple procedures are installed, all procedures MUST be executed and the overall outcome needs to be determined.
-
Consumers MUST support a logical conjunction policy to determine the outcome
-
Consumers MUST use the logical conjunction policy by default to determine the outcome
-
Consumers MAY support custom policies to determine the outcome
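The policy rules above can be illustrated with a short Python sketch. The names `conjunction`, `ignoring`, and `determine_outcome` are hypothetical, and the "ignore some checks" policy is only one possible example of a custom policy, not one this specification defines.

```python
from typing import Callable, Dict, Iterable, List

Check = Dict[str, object]
Policy = Callable[[List[Check]], bool]


def conjunction(checks: List[Check]) -> bool:
    """Default policy: the outcome is UP only if every check is UP."""
    return all(check["state"] == "UP" for check in checks)


def ignoring(names: Iterable[str]) -> Policy:
    """Hypothetical custom policy: conjunction over all checks except `names`."""
    skipped = set(names)

    def policy(checks: List[Check]) -> bool:
        return conjunction([c for c in checks if c["name"] not in skipped])

    return policy


def determine_outcome(checks: List[Check], policy: Policy = conjunction) -> str:
    """Apply a policy to the check results; conjunction is the default."""
    return "UP" if policy(checks) else "DOWN"


if __name__ == "__main__":
    checks = [
        {"name": "firstCheck", "state": "DOWN"},
        {"name": "secondCheck", "state": "UP"},
    ]
    print(determine_outcome(checks))
    print(determine_outcome(checks, ignoring(["firstCheck"])))
```

Because the conjunction of an empty set of checks is true, a response with no checks evaluates to UP under the default policy, which matches the rule for producers without installed procedures.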
This section covers secure access to health check information.
-
A producer MAY support security on all health check invocations (e.g. authentication)
-
A producer MUST NOT enforce security by default; it SHOULD be an opt-in feature (e.g. enabled by a configuration change)
The following table gives valid health check responses:
| Request | HTTP Status | JSON Payload | State | Comment |
| --- | --- | --- | --- | --- |
| /health | 200 | Yes | UP | Check with payload. See With procedures installed into the runtime. |
| /health | 200 | Yes | UP | Check without procedures installed. See Without procedures installed into the runtime. |
| /health | 503 | Yes | DOWN | Check failed |
| /health | 500 | No | Undetermined | Request processing failed (e.g. error in procedure) |
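A Consumer could read the table above roughly as in the following Python sketch. The function name `interpret_response` is hypothetical. Treating a mismatch between the HTTP status and the payload's own `outcome` as undetermined is an assumption of this sketch, motivated by the rule that inner protocol items are kept redundantly; the specification does not state how a Consumer should resolve such a conflict.

```python
import json
from typing import Optional

# HTTP status codes that carry a determined state, per the table above.
EXPECTED_OUTCOME = {200: "UP", 503: "DOWN"}


def interpret_response(status: int, body: Optional[str]) -> str:
    """Map an HTTP status and payload to UP, DOWN, or UNDETERMINED."""
    expected = EXPECTED_OUTCOME.get(status)
    if expected is None:
        # For example 500: request processing failed, no payload expected.
        return "UNDETERMINED"
    try:
        outcome = json.loads(body).get("outcome")
    except (TypeError, ValueError, AttributeError):
        # Missing, malformed, or non-object payload.
        return "UNDETERMINED"
    # Assumption: outer status and inner outcome must agree to be trusted.
    return expected if outcome == expected else "UNDETERMINED"


if __name__ == "__main__":
    print(interpret_response(200, '{"outcome": "UP", "checks": []}'))
    print(interpret_response(503, '{"outcome": "DOWN", "checks": []}'))
    print(interpret_response(500, None))
```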
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "outcome": {
      "type": "string"
    },
    "checks": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string"
          },
          "state": {
            "type": "string"
          },
          "data": {
            "type": "object",
            "properties": {
              "key": {
                "type": "string"
              },
              "value": {
                "type": ["string", "boolean", "integer"]
              }
            }
          }
        },
        "required": [
          "name",
          "state"
        ]
      }
    }
  },
  "required": [
    "outcome",
    "checks"
  ]
}
(See http://jsonschema.net/#/)
Status 200
{
  "outcome": "UP",
  "checks": [
    {
      "name": "myCheck",
      "state": "UP",
      "data": {
        "key": "value",
        "foo": "bar"
      }
    }
  ]
}
Status 503
{
  "outcome": "DOWN",
  "checks": [
    {
      "name": "firstCheck",
      "state": "DOWN",
      "data": {
        "key": "value",
        "foo": "bar"
      }
    },
    {
      "name": "secondCheck",
      "state": "UP"
    }
  ]
}