New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refresh Timestamp when pod is not present in the three sub-queues #97302
Conversation
The pInfo.Timestamp is refreshed but the sort in activeQ or podBackoffQ is not be updated when pod is already present in the backoff or active queue. AddUnschedulableIfNotPresent() return error if pod is already present in the backoff or active queue, and there is no re-add. So refresh pInfo.Timestamp when the pod is not present in the three sub-queues, otherwise need to update the order of the pod in the active or backoff queue, for example p.activeQ.Update(pInfo)
Welcome @jindezgm! |
@jindezgm: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Hi @jindezgm. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/ok-to-test
if _, exists, _ := p.activeQ.Get(pInfo); exists { | ||
return fmt.Errorf("pod: %v is already present in the active queue", nsNameForPod(pod)) | ||
} | ||
if _, exists, _ := p.podBackoffQ.Get(pInfo); exists { | ||
return fmt.Errorf("pod %v is already present in the backoff queue", nsNameForPod(pod)) | ||
} | ||
|
||
// Refresh the timestamp since the pod is re-added. | ||
pInfo.Timestamp = p.clock.Now() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but I cannot see what's the real impact, the pods in the activeQ will still be popped up with the original order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the Pod in activeQ have the same priority, they will be FIFO according to the enqueue time.
If the enqueue time is refreshed, the order in the activeQ should be updated.
So I don’t think should refresh the enqueue time
/kind cleanup if this does have a negative impact, pls change this label to "bug" instead. |
please squash, otherwise lgtm. /release-note-none |
The change looks reasonable, but did you find any problems in real clusters? If so, do you think you can expose the problem with a unit test? It would also dictate whether we should backport this. |
+1, it would be great to see a UT for this if possible. |
I did not find the problem in the real cluster, only analyzed the code logic. func TestPriorityQueue_AddUnschedulableIfNotPresent_IfPresent(t *testing.T) {
less := newDefaultQueueSort()
// Easy to test in nanoseconds
q := NewPriorityQueue(less, WithPodInitialBackoffDuration(time.Nanosecond))
totalNum := 2
for i := 0; i < totalNum; i++ {
// Same priority, and the sorting depends on the time to add the queue
priority := int32(0)
p := v1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("pod%d", i),
Namespace: fmt.Sprintf("ns%d", i),
UID: types.UID(fmt.Sprintf("upns%d", i)),
},
Spec: v1.PodSpec{
Priority: &priority,
},
}
q.Add(&p)
}
// Pop and Unschedulable
p1, _ := q.Pop()
c1 := q.SchedulingCycle()
q.AddUnschedulableIfNotPresent(p1, c1)
// Pop and Unschedulable again
p2, _ := q.Pop()
c2 := q.SchedulingCycle()
q.AddUnschedulableIfNotPresent(p2, c2)
// Wait for more than the pods backoff time
time.Sleep(time.Microsecond)
// Move all to active
q.MoveAllToActiveOrBackoffQueue("test")
// Will the same Pod be added to the unschedulable queue twice?
q.AddUnschedulableIfNotPresent(p1, c1)
p1, _ = q.Pop()
p2, _ = q.Pop()
if less(p2, p1) {
t.Error("The priority is low but it is poped first")
}
// Easy to test delay in seconds
q.podInitialBackoffDuration = time.Second
// Pods backoff 1 second.
p1.Attempts, p2.Attempts = 1, 1
q.AddUnschedulableIfNotPresent(p1, q.SchedulingCycle())
q.AddUnschedulableIfNotPresent(p2, q.SchedulingCycle())
q.MoveAllToActiveOrBackoffQueue("test")
// 500 milliseconds have passed
time.Sleep(500 * time.Millisecond)
// Will the same Pod be added to the unschedulable queue twice?
q.AddUnschedulableIfNotPresent(p1, q.SchedulingCycle())
// flush backoff completed pods.
stopc := make(chan struct{})
defer close(stopc)
go wait.Until(q.flushBackoffQCompleted, time.Millisecond*100, stopc)
// The backoff of the POD is 1 second, 500ms have passed, and the time for the pod to pop is about 500ms
now := time.Now()
q.Pop()
if duration := time.Now().Sub(now); duration > time.Millisecond*700 {
t.Errorf("Pop pod too long %dms > 700ms", duration.Milliseconds())
}
} |
A Pod would already be present if an Update happens while a Pod is being scheduled. But I suspect this doesn't lead to anything major. /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, jindezgm The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The pInfo.Timestamp is refreshed but the sort in activeQ or podBackoffQ is not be updated when pod is already present in the backoff or active queue.
AddUnschedulableIfNotPresent() return error if pod is already present in the backoff or active queue, and there is no re-add.
So refresh pInfo.Timestamp when the pod is not present in the three sub-queues, otherwise need to update the order of the pod in the active or backoff queue, for example p.activeQ.Update(pInfo)
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: