Fix UpdateSnapshot when Node is partially removed #95130
/priority critical-urgent
/assign @Huang-Wei
Sorry I missed this in the previous PR :(
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: alculquicondor
1c34cab to 4a99524
@@ -256,7 +256,7 @@ func (cache *schedulerCache) UpdateSnapshot(nodeSnapshot *Snapshot) error {
 		nodeSnapshot.generation = cache.headNode.info.Generation
 	}
 
-	if len(nodeSnapshot.nodeInfoMap) > len(cache.nodes) {
+	if len(nodeSnapshot.nodeInfoMap) > cache.nodeTree.numNodes {
Can you add a comment explaining that deleted nodes are removed from the tree but kept in the cache until their pods are deleted, and that this is why we check here against the number of nodes in the tree rather than the cache?
Also, document that the snapshot (both the map and the lists) only includes nodes that had not been deleted at the time the snapshot was taken, so nodeinfo.Node() is guaranteed not to be nil.
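The invariant requested in the two comments above can be illustrated with a small standalone sketch. This is not the scheduler's code: `cachedNode`, `pruneSnapshot`, and `liveNodes` are hypothetical names, and plain maps stand in for the cache and the snapshot. It assumes a partially removed node is a cache entry whose node pointer is nil.

```go
package main

import "fmt"

// cachedNode is a hypothetical cache entry. node is nil once the Node object
// has been deleted while its pods are still recorded.
type cachedNode struct {
	node *string
}

func entry(name string) *cachedNode { return &cachedNode{node: &name} }

// pruneSnapshot drops snapshot entries whose node is missing from the cache
// or only partially present (nil node). liveNodes plays the role of the
// number of nodes in the tree.
func pruneSnapshot(snapshot, cache map[string]*cachedNode, liveNodes int) {
	toDelete := len(snapshot) - liveNodes
	for name := range snapshot {
		if toDelete <= 0 {
			break
		}
		if n, ok := cache[name]; !ok || n.node == nil {
			delete(snapshot, name) // deleting during range is safe in Go
			toDelete--
		}
	}
}

func main() {
	// "a" is live, "b" was deleted but still has pods, "c" is gone entirely.
	cache := map[string]*cachedNode{"a": entry("a"), "b": {node: nil}}
	snapshot := map[string]*cachedNode{"a": entry("a"), "b": entry("b"), "c": entry("c")}

	pruneSnapshot(snapshot, cache, 1)

	for name, n := range snapshot {
		fmt.Println(name, "node is nil:", n.node == nil)
	}
}
```

After pruning, only "a" remains in the snapshot, so every remaining entry has a non-nil node.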
Done. I amended.
Change-Id: I5b459e9ea67020183c87d1ce0a2380efb8cc3e05
4a99524 to d6f09f7
/hold
Thanks @alculquicondor! /unhold
/retest Review the full test history for this PR.
…f-#95130-upstream-release-1.18 Automated cherry pick of #95130: Fix UpdateSnapshot when Node is partially removed
…f-#95130-upstream-release-1.19 Automated cherry pick of #95130: Fix UpdateSnapshot when Node is partially removed
What type of PR is this?
/kind bug
What this PR does / why we need it:
Fix Snapshot update when a Node is deleted before its Pods.
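To make the failure mode concrete, here is a minimal sketch of why comparing against the cache size misses a Node that was deleted before its Pods. It is a toy model, not the scheduler implementation: `toyCache`, `toyNode`, `treeSize`, and the two `staleBy...` helpers are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// toyNode is a hypothetical cache entry. hasNode becomes false once the Node
// object is deleted, even if pods are still recorded on it.
type toyNode struct {
	hasNode bool
	pods    int
}

// toyCache models only the two sizes that the check can compare against.
type toyCache struct {
	nodes    map[string]*toyNode // entries are kept until their pods are gone
	treeSize int                 // number of nodes currently in the tree
}

func (c *toyCache) addNode(name string, pods int) {
	c.nodes[name] = &toyNode{hasNode: true, pods: pods}
	c.treeSize++
}

// removeNode drops the node from the tree right away, but keeps the cache
// entry while pods remain on it.
func (c *toyCache) removeNode(name string) {
	n, ok := c.nodes[name]
	if !ok || !n.hasNode {
		return
	}
	n.hasNode = false
	c.treeSize--
	if n.pods == 0 {
		delete(c.nodes, name)
	}
}

// staleByCacheSize mirrors the comparison before the fix.
func staleByCacheSize(snapshot map[string]bool, c *toyCache) bool {
	return len(snapshot) > len(c.nodes)
}

// staleByTreeSize mirrors the comparison after the fix.
func staleByTreeSize(snapshot map[string]bool, c *toyCache) bool {
	return len(snapshot) > c.treeSize
}

func main() {
	c := &toyCache{nodes: map[string]*toyNode{}}
	c.addNode("node-a", 1)
	snapshot := map[string]bool{"node-a": true} // taken while node-a existed

	c.removeNode("node-a") // Node deleted first; its pod is still cached

	fmt.Println("cache-size check detects stale entry:", staleByCacheSize(snapshot, c))
	fmt.Println("tree-size check detects stale entry:", staleByTreeSize(snapshot, c))
}
```

In this model the cache-size comparison reports false (1 > 1), so the deleted node would linger in the snapshot, while the tree-size comparison reports true (1 > 0) and would trigger cleanup.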
Which issue(s) this PR fixes:
Fixes #95124
Special notes for your reviewer:
This is an old bug that was reintroduced when #93938 reverted some behavior.
Does this PR introduce a user-facing change?: