wait container | Failed to establish pod watch ... dial tcp i/o timeout #4993
Comments
We are experiencing this same issue in our Argo Workflow deployment.
It's likely that your cluster/apiserver is super unstable. There's an environment variable …
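The comment above is truncated, so the variable is not actually named here. As a hedged sketch: recent Argo Workflows releases document a controller environment variable `TRANSIENT_ERROR_PATTERN`, a regular expression of error messages the controller should treat as transient and retry; whether this is the variable the comment meant, and the pattern shown below, are assumptions rather than a confirmed fix for this issue.

```shell
# Sketch only: assumes the truncated comment refers to the documented
# workflow-controller variable TRANSIENT_ERROR_PATTERN. Errors matching
# the regular expression are treated as transient and retried.
# Adjust the namespace, deployment name, and pattern to your install.
kubectl set env deployment/workflow-controller -n argo \
  TRANSIENT_ERROR_PATTERN='dial tcp .* i/o timeout'
```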
@terrytangyuan Thanks for your answers!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We are experiencing the same problem on our cluster. I cannot confirm the observations of @RysBen, however. Our Kubernetes is an AWS EKS cluster, which I assume follows best practices, so I cannot say much about customizations regarding Kubernetes API agents...
Summary
Hi all,
We submit hundreds of workflows at specific times of the day. The status of some steps in these workflows becomes "Error/Failed", and the MESSAGE is the one shown in the title: "Failed to establish pod watch ... dial tcp i/o timeout".
At first, I thought it was caused by overloading the cluster. After observing it for a while, I found that the problem always occurs in the wait container, while the main containers are normal. Why is that? And is there any way to solve this problem?
Any suggestion would be appreciated.
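A minimal diagnostic sketch for the symptom described above (pod and namespace names are placeholders): these commands check whether the failure is really confined to the `wait` (executor) sidecar and surface any apiserver-side errors around the same time.

```shell
# Argo workflow pods run a "main" container plus a "wait" executor sidecar;
# compare their logs to confirm only the wait container is failing.
kubectl logs <workflow-pod-name> -c wait -n <namespace>
kubectl logs <workflow-pod-name> -c main -n <namespace>

# Look for cluster/apiserver-side events (e.g. timeouts, evictions)
# around the time the workflows were submitted.
kubectl get events -n <namespace> --sort-by=.lastTimestamp
```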
More Info
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.