App container is started before initContainers could complete on node reboot #120247
/sig node

NOTE: This occurs without enabling the new sidecar feature gate. Glancing through the code that was added for the sidecar feature, I wouldn't be surprised if something got broken in the normal path. Here is the PR that introduced these changes:
There have been some discussions about this and we realized that it is practically difficult to guard all new code paths.

/assign
Does it sound like a regression? If so, the fix should be backported to v1.28.x.
I cannot reproduce this behavior at head; the init containers correctly run before the app container.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  labels:
    app.kubernetes.io/name: MyApp
spec:
  restartPolicy: Never
  containers:
  - name: myapp
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && date && exit 0']
  initContainers:
  - name: init
    image: busybox:1.28
    command: ['sh', '-c', "echo starting && date ; sleep 15; echo finishing starting after 15 sec && date"]
```

```console
$ kubectl logs myapp -c init
starting
Wed Aug 30 10:22:02 UTC 2023
finishing starting after 15 sec
Wed Aug 30 10:22:17 UTC 2023
$ kubectl logs myapp -c myapp
The app is running!
Wed Aug 30 10:22:18 UTC 2023
```
@gjkim42 @shivamerla can we try to find the root cause first?
@aojea I am hitting this with my pod. On node reboot, we can see the app container started before the init containers could finish.
I reproduced it with your scenario.

/priority important-soon
cc @bobbypage @smarterclayton for visibility to kubelet correctness
I guess this happens when the pod sandbox is changed. We can reproduce it easily by forcing the sandbox to be recreated (for example, a node reboot).
I'll propose a PR containing the e2e scenario and the fix for it.
I'll cover this as well.
Once we have a solid reproducer, check whether it reproduces with 1.27 or not, so we know if this was a regression in 1.28 or just a long-standing issue.
Confirmed that this is NOT reproduced in 1.27. It is a regression.
kubernetes/pkg/kubelet/kuberuntime/kuberuntime_container.go Lines 886 to 890 in 3cf3702
This happens because the kubelet uses the old container statuses to check whether the pod has been initialized before. Before 1.28, the kubelet considered the pod initialized if any regular container was running. I think the latter is theoretically right, but currently we cannot know whether the container statuses are from the current pod sandbox or an old one. The fix would be to make this check aware of which sandbox the statuses came from.
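To make the two checks concrete, here is a minimal, self-contained Go sketch of the behavior described above. The types and helper names (`hasAnyRegularContainerRunning`, `hasAnyRegularContainerCreated`) are simplified stand-ins for illustration, not the actual code in kuberuntime_container.go.

```go
package main

import "fmt"

// Simplified stand-ins for the kubelet's runtime-status types.
type ContainerState int

const (
	ContainerStateRunning ContainerState = iota
	ContainerStateExited
)

type ContainerStatus struct {
	Name      string
	State     ContainerState
	SandboxID string // sandbox the status was recorded against
}

type PodStatus struct {
	CurrentSandboxID  string
	ContainerStatuses []ContainerStatus
}

// Pre-1.28-style check (illustrative): the pod counts as initialized only if a
// regular container is running right now. After a node reboot nothing is
// running yet, so the init containers run first.
func hasAnyRegularContainerRunning(ps PodStatus, regular map[string]bool) bool {
	for _, cs := range ps.ContainerStatuses {
		if regular[cs.Name] && cs.State == ContainerStateRunning {
			return true
		}
	}
	return false
}

// 1.28-style check (illustrative): the pod counts as initialized if any
// regular container status exists at all, even one left over from a previous
// sandbox. After a reboot the stale status survives, so the app container is
// started in parallel with the init containers -- the bug in this issue.
func hasAnyRegularContainerCreated(ps PodStatus, regular map[string]bool) bool {
	for _, cs := range ps.ContainerStatuses {
		if regular[cs.Name] {
			return true
		}
	}
	return false
}

func main() {
	// Status left over from before the reboot: the app container ran in the
	// old sandbox; nothing has started in the new sandbox yet.
	ps := PodStatus{
		CurrentSandboxID: "sandbox-2",
		ContainerStatuses: []ContainerStatus{
			{Name: "myapp", State: ContainerStateExited, SandboxID: "sandbox-1"},
		},
	}
	regular := map[string]bool{"myapp": true}

	fmt.Println("running-based check says initialized:", hasAnyRegularContainerRunning(ps, regular)) // false
	fmt.Println("status-based check says initialized: ", hasAnyRegularContainerCreated(ps, regular)) // true
}
```

Either way, telling the two cases apart cleanly requires knowing whether a status belongs to the current sandbox, which is exactly the information the comment above says is missing.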
That looks to have changed as part of the sidecar work - there were a lot of changes to pkg/kubelet/kuberuntime/kuberuntime_container.go in https://github.com/kubernetes/kubernetes/pull/116429/files#diff-d86ffc24d8e4c9deafd3e39e8f4ee3546ee46a31dcea8ea6526443b3f4bffa28 that are not obviously feature gated... were all of those supposed to be no-op refactors?
cc @thockin for visibility to kubelet regressions in 1.28 related to sidecar refactors
A starting point that would build confidence would be to restore the kubelet logic here to the 1.27 state before the sidecar merge. The refactored / new methods could be kept in a different file and only called when the feature gate was enabled. That would restore 1.28 to match 1.27 behavior by default as quickly as possible.
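As a rough sketch of what that guard could look like, assuming the kubelet's usual feature-gate pattern (in the real code this would be utilfeature.DefaultFeatureGate.Enabled(features.SidecarContainers)); the helpers below are placeholders, not actual kubelet functions:

```go
package main

import "fmt"

// featureGateEnabled stands in for the kubelet's
// utilfeature.DefaultFeatureGate.Enabled(features.SidecarContainers) check.
func featureGateEnabled(name string) bool {
	gates := map[string]bool{"SidecarContainers": false} // alpha gate, off by default
	return gates[name]
}

// computeInitActions shows the shape of the guard: the refactored,
// sidecar-aware path runs only when the gate is on; otherwise the 1.27-era
// logic runs unchanged. Both helpers are placeholders for illustration.
func computeInitActions() {
	if featureGateEnabled("SidecarContainers") {
		sidecarAwareInitActions()
		return
	}
	legacyInitActions()
}

func sidecarAwareInitActions() { fmt.Println("new sidecar-aware path") }
func legacyInitActions()       { fmt.Println("1.27 behavior by default") }

func main() {
	computeInitActions() // prints "1.27 behavior by default"
}
```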
/cc |
/cc |
What happened?
On node reboot, we see the app container is started before the initContainers could complete; essentially both the app and init containers are running in parallel. We use initContainers to ensure dependent drivers are loaded by another pod before the app container starts, but because the app container starts in parallel, they hit a failure. This was not seen with k8s 1.27 and is only happening with 1.28. Also, this happens only on node reboot, not during a fresh install of our application (gpu-operator).
Above, we can see that even though the init container `mofed-validation` is still running, the main container `nvidia-driver-ctr` has started. Also, the pod status shows 2/2 main containers ready while the initContainer is still running.

Another issue we are seeing is with pod `nvidia-operator-validator-ktcxj`, which has 4 init containers: only 1 is run (`driver-validation`) and the rest are skipped, while the app container again started in parallel. The start/finish timestamps of these initContainers indicate that clearly below.

What did you expect to happen?
initContainers should complete in order before app containers start on node reboot. Also, all initContainers should run, rather than be skipped, before the app container starts.
How can we reproduce it (as minimally and precisely as possible)?
Create a pod with multiple initContainers (which wait for about 2-3 minutes) and an app container. On node reboot, the app container is started in parallel with the initContainers.
Anything else we need to know?
No response
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
```console
root@co-node-8:~# ctr version
Client:
  Version:  v1.7.2
  Revision: 0cae528dd6cb557f7201036e9f43420650207b58
  Go version: go1.20.4

Server:
  Version:  v1.7.2
  Revision: 0cae528dd6cb557f7201036e9f43420650207b58
  UUID: f120b710-d067-4968-bb0a-6532b6918467
```
Related plugins (CNI, CSI, ...) and versions (if applicable)