manager#syncPod should cover reconciliation #89155
/cc @sjenning @rphillips |
/retest |
/priority important-soon |
6c1850e to dfb0e62
The current formulation is a little smarter than the first patch @byxorna put in his cluster. The checkNeedsUpdate flag only needs to be false when m.needsReconcile() says so. |
/test pull-kubernetes-e2e-gce |
/hold |
@smarterclayton I know you are also looking at similar areas. Let's have a broader discussion in the sig-node meeting about the areas we are investigating for kubelet reliability. Right now a number of the changes are hard to follow coherently without a broader story. |
/assign |
7d975a9 to 3354ac5
/cc @Random-Liu |
/remove-lifecycle stale |
Please don't fight the bot on stale PRs.
There are no accompanying test changes for this PR so we can't be sure it actually addresses the issue. We also recently refactored the PLEG so this change may no longer be necessary. Without demonstrated impact I am inclined to close this PR.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: tedyu The full list of commands accepted by this bot can be found here.
I would suggest giving this PR some time to see whether the problem still occurs in the latest release. |
Which old releases did the PLEG refactoring go into? What about the releases without the refactoring? Shouldn't those releases be covered by this fix? |
I took a look at recent check-ins to pkg/kubelet/status/status_manager.go. As far as I can tell, the logic touched by this PR remains unchanged. @ehashman |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
This PR has been on hold for about 2 weeks. Can you please see how to unblock it by addressing the review comments, or else close it out? (We are trying to declutter our open list of PRs.) |
/lifecycle stale |
Is there a proposal doc or reference link that can be followed? |
/remove-lifecycle stale |
/lifecycle stale |
/remove-lifecycle stale |
/lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close |
@k8s-triage-robot: Closed this PR. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Please see related discussion (around issue #80968): #88255 (comment)
@yujuhong's question led me to think about whether reconciliation may miss some cases.
For the following code in manager#syncBatch:
Please note that syncedUID and uid may carry different values.
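As a rough illustration of how the two values can diverge, here is a minimal, self-contained sketch. It is not the kubelet's code: the `UID` type, the `resolveSyncedUID` helper, and the `podToMirror` map are hypothetical, and the sketch assumes that the difference arises when a static pod is tracked under the UID of its mirror pod.

```go
package main

import "fmt"

// UID is a hypothetical stand-in for the kubelet's pod UID types.
type UID string

// resolveSyncedUID models, in simplified form, how a batch sync could pick the
// key used for version bookkeeping. Assumption: a static pod is tracked under
// its mirror pod's UID, while any other pod is tracked under its own UID.
func resolveSyncedUID(uid UID, podToMirror map[UID]UID) (UID, bool) {
	mirror, isStatic := podToMirror[uid]
	if !isStatic {
		return uid, true // regular pod: syncedUID == uid
	}
	if mirror == "" {
		return "", false // static pod without a mirror pod: nothing to sync
	}
	return mirror, true // static pod with a mirror pod: syncedUID != uid
}

func main() {
	podToMirror := map[UID]UID{"static-1": "mirror-1", "static-2": ""}
	for _, uid := range []UID{"regular-1", "static-1", "static-2"} {
		synced, ok := resolveSyncedUID(uid, podToMirror)
		fmt.Printf("uid=%q syncedUID=%q ok=%v\n", uid, synced, ok)
	}
}
```

In this model only the `static-1` case produces a `syncedUID` that differs from `uid`, which is the situation the rest of this description is concerned with.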
With this additional log,
Here is one example:
Please note that, in current code, if m.needsUpdate returns false, syncPod() would return early.
However, this may not work with the reconciliation code path (the UID used to check presence in apiStatusVersions in syncPod may be different from the one whose entry is cleared in syncBatch).
This PR allows manager#syncPod to perform reconciliation. Without this change, reconciliation may be skipped (when syncedUID and uid differ).
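The scenario can be reproduced in a toy model. The sketch below is an illustration under stated assumptions, not the actual status manager: the `manager` struct, the `reconcile` helper, and the exact `syncPod` signature are hypothetical, and it assumes that an up-to-date entry for `uid` is already present in `apiStatusVersions`. The `checkNeedsUpdate` parameter follows the flag name mentioned in the review discussion; how the PR actually threads it through is not shown here.

```go
package main

import "fmt"

// UID is a hypothetical stand-in for the kubelet's pod UID types.
type UID string

// manager is a toy model of the status manager's version bookkeeping.
type manager struct {
	apiStatusVersions map[UID]uint64 // last version recorded per UID
}

// needsUpdate reports whether no version is recorded for uid, or the recorded
// version is older than the one being synced.
func (m *manager) needsUpdate(uid UID, version uint64) bool {
	latest, ok := m.apiStatusVersions[uid]
	return !ok || latest < version
}

// syncPod returns true if it got past the early return and "wrote" the status.
// With checkNeedsUpdate=false the early return is bypassed (assumed shape of the fix).
func (m *manager) syncPod(uid UID, version uint64, checkNeedsUpdate bool) bool {
	if checkNeedsUpdate && !m.needsUpdate(uid, version) {
		return false // status considered up to date; skip
	}
	m.apiStatusVersions[uid] = version
	return true
}

// reconcile models the reconciliation branch of the batch sync: the entry for
// syncedUID is cleared, but syncPod is then called with uid.
func (m *manager) reconcile(uid, syncedUID UID, version uint64, checkNeedsUpdate bool) bool {
	delete(m.apiStatusVersions, syncedUID)
	return m.syncPod(uid, version, checkNeedsUpdate)
}

func newManager() *manager {
	return &manager{apiStatusVersions: map[UID]uint64{"pod-uid": 3, "synced-uid": 3}}
}

func main() {
	fmt.Println("early return on, UIDs differ:", newManager().reconcile("pod-uid", "synced-uid", 3, true))
	fmt.Println("early return on, UIDs equal: ", newManager().reconcile("pod-uid", "pod-uid", 3, true))
	fmt.Println("early return off, UIDs differ:", newManager().reconcile("pod-uid", "synced-uid", 3, false))
}
```

In the first call the cleared key and the checked key differ, so `syncPod` returns early and the reconciliation is skipped; in the other two calls it proceeds.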
See #88255 (comment) from @byxorna
Note: another potential fix is to remove uid from apiStatusVersions in manager#syncBatch but this is not as robust as the current fix.
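For comparison, the alternative mentioned in the note could look roughly like the following sketch. The `clearForReconcile` helper and the free-standing `needsUpdate` function are hypothetical simplifications, not kubelet APIs; the idea is only that clearing both keys makes a later up-to-date check on `uid` report that an update is needed.

```go
package main

import "fmt"

// UID is a hypothetical stand-in for the kubelet's pod UID types.
type UID string

// needsUpdate reports whether no version is recorded for uid, or the recorded
// version is older than the one being synced.
func needsUpdate(versions map[UID]uint64, uid UID, version uint64) bool {
	latest, ok := versions[uid]
	return !ok || latest < version
}

// clearForReconcile sketches the alternative fix: remove the entries for both
// syncedUID and uid so that a subsequent check keyed on either one passes.
func clearForReconcile(versions map[UID]uint64, uid, syncedUID UID) {
	delete(versions, syncedUID)
	delete(versions, uid)
}

func main() {
	versions := map[UID]uint64{"pod-uid": 3, "synced-uid": 3}
	fmt.Println("before:", needsUpdate(versions, "pod-uid", 3))
	clearForReconcile(versions, "pod-uid", "synced-uid")
	fmt.Println("after: ", needsUpdate(versions, "pod-uid", 3))
}
```

This variant depends on every caller clearing the right keys before `syncPod` runs, which is one way to read the remark that it is less robust than handling reconciliation inside `syncPod` itself.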
Which issue(s) this PR fixes:
Fixes #
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: