New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Prepared Queries to return Critical Nodes #4983
Comments
Hey @jjathman, thanks for the details. Can you describe whether you would consume this via DNS or HTTP API (or both). If we add it we'd need to make it work for both which leads to fun dealing with DNS response truncation correctly when you have "order" to take into account. For example, we con't control which "order" DNS clients choose to consume results in. |
Hi @banks, We would use this via the HTTP API only. Would it be possible to make this an HTTP only option? I can appreciate the difficulties in having this option via DNS. If no one has asked for the feature with DNS maybe it isn't necessary? We have internally debated using services in a Error status as a final option, but it seems like it is the best thing to try if there are no other options available. |
@jjathman if it's only for working around unexpected incidents then there is already a way to do something similar with prepared queries, with manual intervention. We added field to prepared queries for ignoring certain checks . For example some people have found that if they have some bug or incident where one of their service health checks is returning an error, they can just update the prepared query to ignore that check temporarily so instances are still served until the check is fixed. Would that be sufficient for your case? It's different than needing all services via prepared query for some other reason like trying to sync the whole set with a loadbalancer that does it's own health probes granted, but interested to know if you have specific needs that fall outside of that? |
Your request is actually more-or-less the same thing as I guess the real question is: do you really want everything all the time, or do you only want critical if there are no (or fewer than some threshold) healthy instances? I could imagine for example adding a "min_healthy_threshold" which you might set to say 40% and only if there are fewer than 40% f the registered instances in a healthy state would it start returning critical. This is similar to how other load balances implement safety valves on circuit breakers. |
Regarding your first question, yes I think your idea of using the Trying to make this work I expected making a prepared query like this would do what I expected, but it doesn't seem to be returning services in a critical status.
I thought this would ignore the health check of the application. I must be doing something wrong. For your second question, in this specific use case I think we would only want/use critical services if they were the only option. However, I think there is a more general use case where I would want to see everything and deal with the results at the client level. When I originally started looking at prepared queries I assumed it would be possible to retrieve any service instances that I can see via the /health endpoint, but maybe that wasn't the intention of using a prepared query. I would be in favor of your |
Nevermind my question about why it wasn't working for me, we are on too old of a version for the ignore checks. I'll upgrade us now and test it again. |
Hey there, Feel free to check out the community forum as well! |
This feature would be extremely nice to have, because vault creates checks with unique names on each node, so IgnoreCheckIDs does not work for me, because it only matches exactly |
This is exactly what I'm looking for, essentially an automated way to set |
It seems that
To match critical services only. |
Feature Description
Allow for a new parameter of prepared query which will allow the query to return failing nodes.
Use Case(s)
We are looking at changing our usage of Consul to move from the
/health/service
endpoint to using prepared queries. Occasionally, we would like to use a service in acritical
status as the final fallback if there are no available services in a passing or warning state. Right now the service endpoint will return those nodes, we would like to be able to replicate that behavior with a prepared query.Ideally, the failing nodes would only be returned as the last ones.
The text was updated successfully, but these errors were encountered: