What happened:
A Job is created with a single init container and a single main container. The pod restart policy is set to "Never". If the Job fails, the pod is sometimes deleted. This happens intermittently, not on every failure.
Most importantly, we did not observe this issue in Kubernetes 1.12.9-gke.15, but we are observing it now in 1.14.6-gke.1 - we do not have a Kubernetes 1.13 cluster.
What you expected to happen:
The pod should remain for as long as the Job object exists, or until the pod is explicitly deleted.
How to reproduce it (as minimally and precisely as possible):
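No manifest was attached to this report. As a rough sketch of the setup described under "What happened", the script below builds a Job with one init container and one main container and `restartPolicy: Never`. The Job name `failing-job`, the `busybox` image, the container commands, and `backoffLimit: 0` are all assumptions made for illustration, not values taken from the affected cluster.

```python
import json


def build_job_manifest(name="failing-job", image="busybox"):
    """Return a batch/v1 Job as a plain dict.

    The name, image, and commands are hypothetical placeholders.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Assumption: no retries, so a single failed pod is left behind.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    # Matches the report: the pod is never restarted.
                    "restartPolicy": "Never",
                    "initContainers": [
                        {
                            "name": "init",
                            "image": image,
                            "command": ["sh", "-c", "echo init done"],
                        }
                    ],
                    "containers": [
                        {
                            "name": "main",
                            "image": image,
                            # Exit non-zero so the Job fails.
                            "command": ["sh", "-c", "exit 1"],
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(build_job_manifest(), indent=2))
```

The output can be piped to `kubectl apply -f -`, and the failed pod can then be watched with `kubectl get pods -l job-name=failing-job --watch` to see whether it disappears.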
k8s-ci-robot added the sig/node label and removed the needs-sig label on Oct 16, 2019.
I've figured this out: it is caused by node pool autoscaling in GKE. After about 15 minutes the underlying node that was hosting the pod goes away, and Kubernetes removes everything that was attached to that node, including the failed pod.
Anything else we need to know?:
I have a sneaking suspicion that this may be related to the following issue (#79398) / PR (#79451). Hopefully I'm not completely off base here.
Environment: GKE