Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

status code definition #4

Closed
pmhsfelix opened this issue Jan 16, 2018 · 8 comments
Closed

status code definition #4

pmhsfelix opened this issue Jan 16, 2018 · 8 comments

Comments

@pmhsfelix
Copy link

I don't fully agree with the status codes definition:

For “pass” and “warn” statuses HTTP response code in the 2xx - 3xx range MUST be used. for “fail” status HTTP response code in the 4xx - 5xx range MUST be used. In case of the “warn” status, additional information SHOULD be provided, utilizing optional fields of the response.

  • It seems strange to use a 4xx when the request is correct (e.g. well-formed and properly authenticated) and the health resource does exist.
  • A 5xx should be reserved for when the health resource itself is not operating correctly.
  • My initial reaction would be to always use a 200 when the status response correctly represents the state of the system, even if that state is fail. I know that it is common practice to use a 5xx status to represent a failure system status, however that information should be on the resource representation, via this media type, and not on the response status.
@dret
Copy link
Contributor

dret commented Jan 17, 2018 via email

@inadarei
Copy link
Owner

inadarei commented Jan 17, 2018

Great point and good catch!

What if we added "code" property to the spec object and it would be HTTP status code of the service. This way you could add more detail the the status than just "pass", "fail", "warn" but do it in a standard way?

Does that feel useful?

@dret
Copy link
Contributor

dret commented Jan 18, 2018 via email

@inadarei
Copy link
Owner

@dret wouldn't that watchdog link have the same issue, however? If it responds with the HTTP code it would have to be http code of the watchdog URI endpoint and cannot be the code of the service health itself.

Or am I totally confused here?

@dret
Copy link
Contributor

dret commented Jan 20, 2018 via email

@inadarei
Copy link
Owner

inadarei commented Jan 29, 2018

So, I have thought a lot about this (since this is significantly related to: #8)

The original design was correct. /health is not a "separate resource" and this is not a regular API design exercise. There's very significant, established precedent here. As @danielbcorreia noted in #8 a lot of infrastructural tooling (load-balancers, Consul, etcd) expect the health check endpoint to represent the health of the component.

Basically, if http://api.example.com/authz is a security microservice and is exposes /authz/health endpoint the expectation is clearly that when /authz/health return 500, the entire microservice is down. This is how things already work and ignoring it would be ignoring the reality.

The health endpoint doesn't have its own "health". It is only a conduit of the component, simply because nobody cares about the health endpoint other than it indicating the health of what it is a conduit for.

I understand this is a bit odd from URI/HTTP perspective, but if we started inventing an alternate reality with wathcdogs etc. we would be documenting a world that we wish existed, instead of the world that actually exists. Which will make this spec not practical.

@inadarei
Copy link
Owner

inadarei commented Jan 29, 2018

PR with the proposed wording: 68cc850

@dret
Copy link
Contributor

dret commented Mar 25, 2018

i guess if this is what people have been doing that's kind of ok-ish. still not really great design, but agreed that ignoring deployed reality also has merit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants