scheduler: fix flaky test TestPreemptionRaces #77990
What type of PR is this?
What this PR does / why we need it:
In some cases, an Update event with no "NominatedNode" present is received right
If we go updating (delete and add) it, it actually un-reserves the node since
In this case, while the reservation is missing, other low-priority pods get a chance to take the space that was reserved for the nominated pod.
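To make the race above concrete, here is a minimal sketch of an update path that keeps an in-memory reservation when the Update event carries no nominated node. It is not the scheduler's actual code: `Pod`, `nominatedPods`, and the methods `add`, `remove`, and `update` are hypothetical names invented for illustration, and the real queue tracks more state than these two maps.

```go
package main

// Pod is a hypothetical, stripped-down stand-in for a pod object.
type Pod struct {
	UID           string
	NominatedNode string // empty when the object carries no nominated node
}

// nominatedPods is a hypothetical in-memory index of node reservations.
type nominatedPods struct {
	byNode map[string][]string // node name -> UIDs of pods nominated to it
	byPod  map[string]string   // pod UID -> node name reserved for it
}

func newNominatedPods() *nominatedPods {
	return &nominatedPods{
		byNode: map[string][]string{},
		byPod:  map[string]string{},
	}
}

// add records a reservation. An explicit node wins; otherwise the node
// named on the pod object is used. With neither, nothing is reserved.
func (n *nominatedPods) add(p Pod, node string) {
	n.remove(p) // drop any stale entry first
	if node == "" {
		node = p.NominatedNode
	}
	if node == "" {
		return
	}
	n.byPod[p.UID] = node
	n.byNode[node] = append(n.byNode[node], p.UID)
}

// remove drops the reservation held by p, if any.
func (n *nominatedPods) remove(p Pod) {
	node, ok := n.byPod[p.UID]
	if !ok {
		return
	}
	uids := n.byNode[node]
	for i, uid := range uids {
		if uid == p.UID {
			n.byNode[node] = append(uids[:i], uids[i+1:]...)
			break
		}
	}
	if len(n.byNode[node]) == 0 {
		delete(n.byNode, node)
	}
	delete(n.byPod, p.UID)
}

// update replaces oldPod with newPod. A plain delete-and-add would lose a
// reservation that exists only in memory, so when neither object names a
// node, the node currently reserved for the pod is carried over.
func (n *nominatedPods) update(oldPod, newPod Pod) {
	node := ""
	if oldPod.NominatedNode == "" && newPod.NominatedNode == "" {
		node = n.byPod[oldPod.UID] // empty string if nothing was reserved
	}
	n.remove(oldPod)
	n.add(newPod, node)
}
```

With this shape, reserving `node-a` for a pod and then calling `update` with two pod objects that both lack a nominated node leaves the pod mapped to `node-a`, so lower-priority pods do not see the space as free in between. An update whose new object names a different node still moves the reservation.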
Which issue(s) this PR fixes:
Special notes for your reviewer:
The flake is reproducible in my environment, and the analysis and solution above are based on the real execution path. That said, we can brainstorm a better solution that follows the same rationale.
BTW: this is actually more of a bug than a flaky test, and the integration test did an impressive job of revealing it :)
Does this PR introduce a user-facing change?:
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Huang-Wei
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
May 18, 2019
This is in the 1.15.0-alpha3 changelog as:
I believe I may have run into this issue with 1.13.7 on GKE. I was testing pod pre-emption based on priority, and to do so I made the following deployment:
i.e. binds 37 copies of
I then deployed a daemonset (
However, the actual daemonset pod
Is this evidence that perhaps this fix should be backported to 1.13/1.14 as well?
Disregard the above... I figured out my issue was totally separate.
The issue was that I had my
What I think was happening was that:
I fixed my test setup by using a node affinity on the desired node instead of directly setting
So, maybe this change should be backported, maybe not, but definitely not for this reason :)