Automated cherry pick of #116995: kubelet: Ensure pods that have not started track a #117369
A pod that cannot be started yet (due to static pod fullname exclusion when UIDs are reused) must be accounted for in the pod worker, since it is considered to have been admitted and will eventually start. Due to a bug, we accidentally cleared pendingUpdate for pods that cannot start yet. As a result, we can't report the right metric to users in kubelet_working_pods, and in theory we might fail to start the pod in the future (although we have not observed that so far in tests that should catch such an error).

Describe, implement, and test the invariant that, on every path where startPodSync returns, either activeUpdate or pendingUpdate is set on the status, but never both, and that both are nil only when the pod can never start.

This bug was detected by a "programmer error" assertion we added on metrics that were not being reported, which suggests we should be more aggressive about using log assertions and automating their detection in tests.
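The invariant described above can be sketched as a small standalone check. This is only an illustration, not the kubelet's code: the `podUpdate` and `podSyncStatus` types, the `canEverStart` field, and the `checkStartInvariant` helper are hypothetical names introduced here, and only `activeUpdate` and `pendingUpdate` come from the description.

```go
package main

import "fmt"

// podUpdate is a hypothetical stand-in for whatever update payload the pod
// worker tracks; the real kubelet type is not reproduced here.
type podUpdate struct {
	reason string
}

// podSyncStatus is a simplified, hypothetical model of the per-pod status.
// Only activeUpdate and pendingUpdate are named in the PR description;
// canEverStart is an assumption added for this sketch.
type podSyncStatus struct {
	activeUpdate  *podUpdate // update currently being acted on
	pendingUpdate *podUpdate // update held until the pod is allowed to start
	canEverStart  bool       // false once the pod can never start
}

// checkStartInvariant reports a violation of the rule that exactly one of
// activeUpdate or pendingUpdate is set after a start attempt, and that both
// are nil only when the pod can never start.
func checkStartInvariant(s podSyncStatus) error {
	switch {
	case s.activeUpdate != nil && s.pendingUpdate != nil:
		return fmt.Errorf("both activeUpdate and pendingUpdate are set")
	case s.activeUpdate == nil && s.pendingUpdate == nil && s.canEverStart:
		return fmt.Errorf("no update is tracked, but the pod can still start")
	}
	return nil
}
```

In this model, a pod blocked by static pod fullname exclusion would carry only `pendingUpdate`, and the bug described above corresponds to the second case: the pending update was cleared even though the pod could still start.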
Opening this for backport because we can't eliminate the possibility that this causes a severe issue; we just have no evidence yet that it does.
+1, I agree we should backport this to v1.27. At a minimum it will fix the unexpected metrics and possibly other issues we don't know about yet. /lgtm
LGTM label has been added. Git tree hash: 509356f08734ebdd90f8683d463191f3034259bd
/cc @kubernetes/release-managers for cherrypick approvals
/triage accepted
/kind bug
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: saschagrunert, smarterclayton The full list of commands accepted by this bot can be found here. The pull request process is described here
Cherry pick of #116995 on release-1.27.
#116995: kubelet: Ensure pods that have not started track a
For details on the cherry pick process, see the cherry pick requests page.