Actuator Health Endpoint returns 503 when app is working but Spring Cloud Vault points to standby #112

sworisbreathing · 2017-05-18T06:47:07Z

Starting in Vault 0.6.2, requests to a standby Vault server are automatically forwarded to the active node and thus complete successfully (except, apparently, for requests to the system backend).

However, VaultHealthIndicator will report OUT_OF_SERVICE in such a scenario, meaning if you are using Spring Boot Actuator endpoints to provide status information (for example, using /management/health for a status check in a load balancer), the actuator endpoint will report an HTTP 503 status even though the application is successfully communicating with Vault and able to service requests.

A fallback approach could query the /sys/leader endpoint and, if the response payload contains ha_enabled: true, send a follow-up request to the reported leader_address to check the leader's health.
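To make the fallback idea concrete, here is a minimal sketch of the decision step only. It assumes the /sys/leader payload has already been parsed into its ha_enabled and leader_address fields; the class and method names are hypothetical and not part of Spring Cloud Vault, and the /v1 prefix is the standard Vault HTTP API prefix.

```java
import java.net.URI;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Spring Cloud Vault. */
final class LeaderFallback {

    private LeaderFallback() {
    }

    /**
     * Decides where to send the follow-up health request.
     *
     * @param haEnabled     value of ha_enabled from the /sys/leader response
     * @param leaderAddress value of leader_address from the same response
     * @return the leader's health URI, or empty if no fallback applies
     */
    static Optional<URI> leaderHealthUri(boolean haEnabled, String leaderAddress) {
        // Without HA, or without a known leader, there is nothing to fall back to.
        if (!haEnabled || leaderAddress == null || leaderAddress.isEmpty()) {
            return Optional.empty();
        }
        // Resolve the health path against the advertised leader address.
        return Optional.of(URI.create(leaderAddress).resolve("/v1/sys/health"));
    }
}
```

The caller would then issue the health request against the returned URI and report the leader's status instead of the standby's.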

mp911de · 2017-05-18T10:34:46Z

Thanks for the ticket. From your report, I read that you're using Vault HA with direct server communication (no load balancer in between). I'm not sure adding cluster-awareness right now is a good path to follow. True cluster-awareness would mean that cluster state reflects down to the client so the endpoint gets updated and sends requests to the active host. I created spring-projects/spring-vault#98 to track Vault cluster efforts.

The vaultHealthIndicator bean is created conditionally, so you can provide your own instance to customize health check behavior (see VaultHealthIndicator for an implementation example).

sworisbreathing · 2017-05-18T23:54:14Z

@mp911de thanks for the tip. I'll definitely try a custom VaultHealthIndicator that passes the standbyok request parameter.
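Roughly, the Vault-facing part of such a check could look like the sketch below. It only builds the health URL with standbyok set and interprets the documented /sys/health status codes; the class name and the returned labels are made up for illustration, and wiring this into an actual Spring Boot HealthIndicator is left out.

```java
import java.net.URI;

/** Hypothetical helper for illustration; not part of Spring Cloud Vault. */
final class StandbyOkHealthProbe {

    private StandbyOkHealthProbe() {
    }

    /** Builds the health URI with standbyok set, so a standby answers like an active node. */
    static URI healthUri(URI vaultBase) {
        return vaultBase.resolve("/v1/sys/health?standbyok=true");
    }

    /** Maps the HTTP status codes of /sys/health to an illustrative label. */
    static String interpret(int httpStatus) {
        switch (httpStatus) {
            case 200:
                return "UP"; // initialized, unsealed, active (or standby when standbyok is set)
            case 429:
                return "STANDBY"; // unsealed standby, only seen without standbyok
            case 501:
                return "NOT_INITIALIZED";
            case 503:
                return "SEALED";
            default:
                return "UNKNOWN";
        }
    }
}
```

With standbyok set, a healthy standby node returns 200, so a load balancer status check built on this would no longer see it as unavailable.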

mp911de · 2017-05-19T20:07:34Z

Maybe there is even a less-invasive change possible for now (until we get to cluster support).

Starting with Vault 0.6.2, a standby node isn't an issue anymore. The health response gives us all required details to decide whether we're communicating with an instance that forwards requests or not.

The health check could adapt to the version: responses without a version number are pre-0.6.1, a version number starting with "Vault v" indicates 0.6.1, and any other version number indicates 0.6.2 or higher. For a standby node, the check can report out of service for versions before 0.6.2 and healthy for versions 0.6.2 and higher.
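As a rough sketch of that rule (hypothetical names, plain Java without the Spring Boot Health type, and assuming the version string from the health response is passed in as-is or as null when absent):

```java
/** Hypothetical helper for illustration; not part of Spring Cloud Vault. */
final class StandbyHealthPolicy {

    enum VaultGeneration { PRE_0_6_1, V0_6_1, V0_6_2_OR_LATER }

    enum Outcome { UP, OUT_OF_SERVICE }

    private StandbyHealthPolicy() {
    }

    /** Classifies the Vault release from the version field of the health response. */
    static VaultGeneration classify(String version) {
        if (version == null || version.trim().isEmpty()) {
            return VaultGeneration.PRE_0_6_1; // no version reported
        }
        if (version.startsWith("Vault v")) {
            return VaultGeneration.V0_6_1; // 0.6.1 prefixes the number with "Vault v"
        }
        return VaultGeneration.V0_6_2_OR_LATER; // any other version string
    }

    /** Health outcome for a standby node: only 0.6.2 and higher forward requests. */
    static Outcome forStandby(String version) {
        return classify(version) == VaultGeneration.V0_6_2_OR_LATER
                ? Outcome.UP
                : Outcome.OUT_OF_SERVICE;
    }
}
```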

Does this make sense?

sworisbreathing · 2017-05-21T23:31:13Z

Prior to 0.6.2, Vault's default behavior was to return a redirect to the leader for any operations sent to a standby node (except for some or maybe all of the /sys/... endpoints).

Assuming the underlying client library is set to automatically follow redirects (which I believe is the default behavior in both OkHttp and HttpClient), I'm not sure it would have been an issue on older Vault releases either.

Also, in 0.6.2 onwards you can configure Vault to use the older redirect behavior. Though I can't imagine why someone would want to do this, it would probably be difficult for a client to know ahead of time which is the case. In any case, it doesn't really matter - if the underlying http library follows redirects, then requests sent to a standby node should still succeed from an application perspective.

mp911de · 2017-05-22T13:12:12Z

I'm inclined to change standby state to OK. The status message already reports standby state and applications will continue working in a healthy state.

/cc @singram

We now accept Vault standby nodes as available. Requests to standby nodes are redirected by Vault to the master node. Communication with a standby node allows using Vault without functional restrictions. Related pull request: gh-113. Fixes gh-112.

mp911de · 2017-05-27T21:33:30Z

Changed Vault standby node health check result to Health.up().

hellohelloye · 2020-04-01T12:46:40Z

I hit the same issue while implementing a custom health check: the app works fine, but the /actuator/health endpoint returns 503.

https://docs.spring.io/spring-boot/docs/current/reference/html/production-ready-features.html#writing-custom-healthindicators

Did you find the solution?

mp911de · 2020-04-01T14:29:47Z

Is this still an issue after upgrading? If so, please file a new ticket.

mp911de added the for: team-attention label May 18, 2017

This was referenced May 19, 2017

Allow health check to pass standbyok parameter. spring-projects/spring-vault#100

Closed

Allow vault standby nodes to pass health checks. #113

Closed

mp911de added status: waiting-for-feedback and removed for: team-attention labels May 19, 2017

mp911de removed the status: waiting-for-feedback label May 22, 2017

mp911de added this to the 1.0.2 milestone May 22, 2017

mp911de added the type: enhancement label May 27, 2017

mp911de closed this as completed in 656587d May 27, 2017
