Health Checks Failing - Readiness & Liveness For Production Azure AKS Cluster #107257

Closed
zohebs341 opened this issue Dec 30, 2021 · 6 comments
Labels
kind/support Categorizes issue or PR as a support question. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@zohebs341

Hi All,

We are running Azure AKS version 1.20.7 with about 10 nodes. Vault, the database, and Istio are all up and running.

However, the health checks (liveness and readiness probes) are still failing with:

context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Readiness probe failed: HTTP probe failed with statuscode: 500

Any suggestions or guesses?

Could this be caused by load on the API server? Our application pods run Istio and Vault as init containers.

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Dec 30, 2021
@k8s-ci-robot
Contributor

@zohebs341: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Dec 30, 2021
@zohebs341
Author

In general, if there are DB connectivity issues or Vault is having issues, the health check fails, which is expected.

But in my case Vault, Istio, and the DB are all up, and the liveness and readiness probes are still failing.

@zohebs341
Author

@swetharepakula Please take a look when you have time.

Adding more information:

I am using a CPU-based Horizontal Pod Autoscaler, which continuously creates and deletes pods based on CPU load. I have also noticed "failed to update endpoints" errors in the events.

@yxxhero
Member

yxxhero commented Jan 3, 2022

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 3, 2022
@ehashman
Member

ehashman commented Jan 5, 2022

Kubernetes does not use issues on this repo for support requests. If you have a question on how to use Kubernetes or to debug a specific issue, please visit our forums.

/kind support
/close

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Jan 5, 2022
@k8s-ci-robot
Contributor

@ehashman: Closing this issue.

In response to this:

Kubernetes does not use issues on this repo for support requests. If you have a question on how to use Kubernetes or to debug a specific issue, please visit our forums.

/kind support
/close

