ensure scheduler preemptor behaves in an efficient/correct path #70898
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Special notes for your reviewer:
The most significant change introduced in this PR is: when popping a pod, the scheduler no longer updates its internal nominatedMap cache immediately. Instead, it does not invalidate the cache entry until the pod is bound.
Why? Because of a very rare case: a high-priority pod comes in and is unschedulable (scheduling fails) (1). It then gets a chance to try "preemption" (2) and preempts low-priority pods (3) to make room.
During phase (1), the function Error() (1.1) puts the pod back into unschedulableQ, which is where the nominatedMap cache is re-updated. The key point is that this function is asynchronous (it runs in a goroutine). In other words, after (3) finishes, a backfill pod for the preempted low-priority pod (suppose it's managed by a deployment/replicaset) can be spawned and enter the scheduling cycle before (1.1) runs. At that moment, the scheduler has no record of the nominated pod (the cache hasn't been re-updated yet), so the backfill pod is scheduled and enters the running state, but it will definitely be preempted again. This can happen again and again. It is not endless, but it wastes resources on unnecessary scheduling and preemption.
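To make the window concrete, here is a minimal sketch of the idea, not the scheduler's actual code. The type `nominatedCache` and its methods are hypothetical names invented for illustration, and the model is simplified: it only tracks which pods are nominated for a node, and compares dropping the entry on pop with keeping it until bind.

```go
package main

import "sync"

// nominatedCache is a hypothetical stand-in for the scheduler's nominatedMap:
// node name -> set of pod names nominated to run on that node.
type nominatedCache struct {
	mu     sync.Mutex
	byNode map[string]map[string]struct{}
}

func newNominatedCache() *nominatedCache {
	return &nominatedCache{byNode: make(map[string]map[string]struct{})}
}

// Nominate records that pod is expected to land on node after preemption.
func (c *nominatedCache) Nominate(pod, node string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.byNode[node] == nil {
		c.byNode[node] = make(map[string]struct{})
	}
	c.byNode[node][pod] = struct{}{}
}

// Pop models taking a pod off the queue for a scheduling attempt.
// eagerDelete=true mimics dropping the entry at pop time, which opens a
// window until an asynchronous re-add; false keeps the entry in place.
func (c *nominatedCache) Pop(pod, node string, eagerDelete bool) {
	if eagerDelete {
		c.remove(pod, node)
	}
}

// Bind models a successful binding: only now is the nomination dropped.
func (c *nominatedCache) Bind(pod, node string) {
	c.remove(pod, node)
}

func (c *nominatedCache) remove(pod, node string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.byNode[node], pod) // deleting from a nil map is a no-op
	if len(c.byNode[node]) == 0 {
		delete(c.byNode, node)
	}
}

// HasNominated reports whether any pod is currently nominated for node.
// A backfill pod being scheduled would consult this to avoid taking the
// room that preemption just freed.
func (c *nominatedCache) HasNominated(node string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.byNode[node]) > 0
}
```

With `eagerDelete` set to true, a call to `HasNominated` made between `Pop` and the later re-add returns false, which is the gap a backfill pod can slip through. With it set to false, the nomination stays visible until `Bind`.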
Along with this PR, I wrote an e2e test to simulate the issue.
Does this PR introduce a user-facing change?:
changed the title from "[WIP] ensure scheduler preemptor behaves in an efficient/correct path" to "ensure scheduler preemptor behaves in an efficient/correct path" on Nov 12, 2018
@bsalamat I've updated the logic of deleting nominatedPod from cache to
Regarding the test, I'm still trying to build an integration test, but I've had no luck reproducing the issue so far.
@bsalamat yes, I can reproduce it both manually and using the e2e test I wrote.
This SGTM. I will update this PR to remove the e2e test and address the comment to "expose scheduling queue in factory.Config, instead of a function".
[APPROVALNOTIFIER] This PR is APPROVED