Readiness Probe #355
@OperationalDev, are you using the status conditions of the AuthConfigs? (Not saying we couldn't use your suggestion to enhance the health check endpoint, BTW!)
@guicassolato I am not using them. Can you give me some guidance as to how I might use them in this scenario to ensure at least 1 pod is always available?
@OperationalDev, it's a per-CR check and it won't give you the readiness state for the entire deployment, nor of any arbitrary pod for that matter. Rather, it gives you the state of a particular CR in the index, from the perspective of the leader replica, i.e. according to the one pod that won the dispute amongst all 3 replicas to become the leader Authorino replica and is therefore responsible for updating the status sub-resource of the AuthConfigs.

Unfortunately, we haven't yet implemented any synchronization between pods for the status update; that's why the value in the status sub-resource reflects only what the leader knows. On the other hand, it is relatively safe to assume that what the leader knows is either identical to, or no more than a couple of milliseconds different (ahead or behind) from, what the other replicas know.

Checking the status sub-resource of a particular AuthConfig is straightforward. For example, given the AuthConfig from this example applied to the cluster:

```sh
kubectl get authconfig/talker-api-protection -o jsonpath='{.status.summary.ready}'
```

Output:

```
true
```

In the example above, the AuthConfig lists only one host name, i.e. `talker-api-authorino.127.0.0.1.nip.io`. You can also inspect the status conditions:

```sh
kubectl get authconfig/talker-api-protection -o jsonpath='{.status.conditions}'
```

Output:

```json
[{"lastTransitionTime":"2022-10-17T09:37:36Z","reason":"HostsLinked","status":"True","type":"Available"},{"lastTransitionTime":"2022-10-17T09:37:36Z","reason":"Reconciled","status":"True","type":"Ready"}]
```

...and/or the list of hosts ready:

```sh
kubectl get authconfig/talker-api-protection -o jsonpath='{.status.summary.hostsReady}'
```

Output:

```json
["talker-api-authorino.127.0.0.1.nip.io"]
```
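If you want to script this check, e.g. in a deployment pipeline, a minimal sketch using standard `kubectl wait` (not from the thread, just one way to do it) could block until the `Ready` condition turns true:

```sh
# Wait (up to 60s) for the Ready condition of the AuthConfig.
# Note this reflects the leader replica's view only, per the caveat above.
kubectl wait authconfig/talker-api-protection --for=condition=Ready --timeout=60s
```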
@guicassolato It's not clear to me how I would use the status of the AuthConfigs to know which of the Authorino pods are ready to serve requests.
@OperationalDev, sorry if I wasn't clear before. The status block in the AuthConfigs won't help you with which Authorino pods are ready. Instead, it can only tell you whether the AuthConfig is ready in a particular Authorino pod, specifically the leader one. My point from before is that the difference between being ready in the leader pod and being ready in any other pod should be no more than a couple of milliseconds. This is not ideal, I know! But hopefully it's enough to mitigate those 403s a little bit.
Ok ok, sorry I misunderstood. Thank you for clarifying.
While not completely overlapping, it might be nice to consider this in the context of this issue: Kuadrant/kuadrant-operator#96
Implements health and readiness probe endpoints for the controllers, reporting in particular the aggregated state of the AuthConfigs.

New endpoints:
- `/healthy`: health probe (ping)
- `/readyz`: aggregated readiness probe (only the AuthConfig reconciler currently reporting)
- `/readyz/authconfigs`: aggregated status of the AuthConfigs

The default binding network address is `:8081`. It can be changed using the newly introduced command-line flag `--health-probe-addr`.

The endpoints return either `200` ("ok") or `500` when 1+ probes fail. The query string parameters `verbose=true` and `exclude=authconfigs` are supported, respectively, to provide more verbose responses and to exclude a particular probe ("authconfigs" in the example provided).

Closes #355
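For consumers of these endpoints, a sketch of how they might be wired into the Authorino Deployment's pod template (the container name and the use of `/healthy` for liveness are assumptions; the port matches the default `--health-probe-addr` of `:8081`):

```yaml
# Illustrative pod template excerpt, assuming the default probe address :8081.
spec:
  containers:
    - name: authorino  # hypothetical container name
      readinessProbe:
        httpGet:
          path: /readyz   # aggregated readiness, incl. the AuthConfig reconciler
          port: 8081
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /healthy  # health (ping) probe
          port: 8081
```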
Just wanted to say thanks for the quick turnaround on this. Working as expected.
Challenge
When restarting Authorino deployments (with 3 replicas and a PodDisruptionBudget), requests are denied with a 403 for a few brief seconds.
Solution
Have a healthz endpoint that can be used as a readiness probe, reporting ready once all of the auth configs have been loaded.
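For reference, the PDB mentioned in the challenge might look like the sketch below (the label selector is an assumption). Note that a PDB only limits voluntary disruptions; it's the readiness probe that keeps traffic away from pods that haven't loaded their AuthConfigs yet.

```yaml
# Illustrative PodDisruptionBudget keeping at least one Authorino replica
# available during voluntary disruptions (label selector is an assumption).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: authorino
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: authorino
```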