Remove TTL for scheduler cache to resolve the race condition when Cac… #110925
Conversation
@kapiljain1989: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi @kapiljain1989. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
/retest-required
Hi @kerthcet, this is my first PR, and I want to know: should I remove or disable the expire test case k8s.io/kubernetes/pkg/scheduler: TestSchedulerNoPhantomPodAfterExpire, since we are planning to remove the cache expiration logic?
I don't think we're going to remove the code about the TTL entirely.
If so, there's no need to remove the expiring tests. But we should first come to a consensus. cc @alculquicondor @ahg-g @Huang-Wei
/retest-required
/test pull-kubernetes-verify |
/retest-required |
I'm actually more inclined to remove the TTL code entirely.
We already increased the timeout to 15 minutes and nothing wrong happened.
To be safe, we could wait one more release before doing the complete removal. Then we can submit this PR in this release, as it is easy to revert.
@alculquicondor @denkensk @sanposhiho Please review, and if you think it is OK, please approve the PR. Thank you.
Please squash
/retest-required
looks good, but please squash
+1 to keeping the logic and just setting the value to 0 for now; we can remove the logic after at least two releases.
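For readers following along, here is a minimal Go sketch of what "keep the logic, just set the value to 0" means. The names (entry, newEntry, cleanupExpired) are inventions for illustration, not the scheduler cache's actual identifiers: the cleanup path stays in place, but a zero TTL produces a nil deadline, so assumed pods are never evicted.

```go
package main

import (
	"fmt"
	"time"
)

// entry loosely mirrors the cache's per-pod state; the shape is assumed.
type entry struct {
	pod      string
	deadline *time.Time // nil means "never expire"
}

// newEntry stamps a deadline only for a positive TTL. With ttl == 0 the
// deadline stays nil, so the cleanup loop below never evicts the entry,
// while the expiration machinery itself is kept for an easy revert.
func newEntry(pod string, ttl time.Duration, now time.Time) *entry {
	e := &entry{pod: pod}
	if ttl > 0 {
		dl := now.Add(ttl)
		e.deadline = &dl
	}
	return e
}

// cleanupExpired drops only entries whose deadline has passed.
func cleanupExpired(entries map[string]*entry, now time.Time) {
	for key, e := range entries {
		if e.deadline != nil && now.After(*e.deadline) {
			delete(entries, key)
		}
	}
}

func main() {
	now := time.Now()
	entries := map[string]*entry{
		"pod1": newEntry("pod1", 0, now), // TTL 0: assumed pod never expires
	}
	cleanupExpired(entries, now.Add(time.Hour))
	fmt.Println(len(entries)) // 1: still cached an hour later
}
```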
Force-pushed from b72531b to 4706dda
@alculquicondor Done
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: alculquicondor, kapiljain1989. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
I think we still have one place unsettled: when we set the cache's TTL to zero, the deadline in podState should be nil. I patched the fix here: #110954
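A hedged sketch of the gap this follow-up describes, using hypothetical names rather than the scheduler's real code: with ttl == 0 the deadline must be left nil, because stamping now.Add(0) instead would make the entry look expired immediately.

```go
package main

import (
	"fmt"
	"time"
)

// podState loosely mirrors the scheduler cache's bookkeeping for an
// assumed pod; the shape and names are assumptions for illustration.
type podState struct {
	deadline *time.Time // nil means the assumed pod never expires
}

// finishBinding is a hypothetical simplification of the step in question:
// with a positive TTL it stamps a deadline; with a zero TTL it leaves the
// deadline nil so the cleanup loop never evicts the entry. Writing
// now.Add(0) here would mark the entry as expiring at once, which is the
// gap #110954 closes.
func finishBinding(ps *podState, ttl time.Duration, now time.Time) {
	if ttl > 0 {
		dl := now.Add(ttl)
		ps.deadline = &dl
		return
	}
	ps.deadline = nil
}

func main() {
	ps := &podState{}
	finishBinding(ps, 0, time.Now())
	fmt.Println(ps.deadline == nil) // true: entry is kept indefinitely
}
```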
What type of PR is this?
/kind bug
What this PR does / why we need it:
In this PR we are removing the cache timeout because of the race condition below.
A race condition happens in the following case (see the sketch after this list):
1. pod1 is assigned to a node; the scheduler cache is updated with the assignment, and a bind operation is issued to the apiserver.
2. If the apiserver is under huge pressure, the bind takes more than 30s, and the scheduler expires the cached pod-to-node assignment.
3. The bind eventually succeeds, but because the apiserver is under huge pressure, the pod update with the node name takes a long time to propagate to the scheduler.
4. Because the pod update took a long time to propagate and the cache entry expired, the scheduler is not aware that the assignment actually happened, so it has no problem assigning a second pod to the same node, one that would not fit if the scheduler knew the first pod was eventually assigned there.
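To make the failure mode concrete, here is a minimal Go simulation of that timeline. The map-based cache and all names are inventions for illustration, not the scheduler's real data structures:

```go
package main

import "fmt"

func main() {
	const nodeCapacity = 1
	// Step 1: pod1 is assumed on node1 and a (slow) bind is issued.
	assumed := map[string]string{"pod1": "node1"}

	// Step 2: the bind exceeds the 30s TTL, so the cached assignment expires.
	delete(assumed, "pod1")

	// Step 3: the bind actually succeeded, but the confirming pod update
	// has not reached the scheduler yet, so the cache shows node1 as empty.
	used := 0
	for _, node := range assumed {
		if node == "node1" {
			used++
		}
	}

	// Step 4: pod2 appears to fit on node1 even though pod1 is already
	// bound there; the node ends up overcommitted.
	if used < nodeCapacity {
		assumed["pod2"] = "node1"
	}
	fmt.Println(assumed) // map[pod2:node1], while pod1 also lands on node1
}
```

With the TTL removed (or set to 0), the assumed entry in step 2 is never evicted, so step 4's fit check still sees pod1 on node1 and the second pod is not placed there.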
Which issue(s) this PR fixes:
Fixes #106361
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: