Throttle pod eviction when past evictions led to the same node being chosen by the scheduler #424
Comments
I think this would be related to our long-term goal of incorporating the actual scheduler framework into the descheduler to better inform eviction decisions (see related issues like #261, #238). That work is making some progress, but is currently at the step of breaking the scheduler framework out of core Kubernetes (see upstream issue kubernetes/kubernetes#89930). Like you mention, there are a lot of cases where an evicted pod will end up on the same node. The descheduler is pretty naïve and optimistic, so trying to cover all of these now would amount to rewriting existing scheduler logic. We could potentially add some sort of "throttling" (maybe an annotation like …
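For concreteness, here is a minimal sketch of what such an annotation-based throttle might look like. The annotation keys, the helper name, and the threshold parameter are all assumptions for illustration; none of them exist in the descheduler today.

```go
// Hypothetical sketch of the annotation-based throttle suggested above.
package evictions

import (
	"strconv"

	v1 "k8s.io/api/core/v1"
)

const (
	// Annotations the descheduler could write on eviction and read back
	// on a later run; both keys are made up for illustration.
	evictionCountAnnotation   = "descheduler.alpha.kubernetes.io/eviction-count"
	lastEvictedNodeAnnotation = "descheduler.alpha.kubernetes.io/last-evicted-node"
)

// shouldThrottleEviction reports whether the pod has bounced back to the
// node it was last evicted from at least maxSameNodeEvictions times, in
// which case the descheduler would skip it instead of evicting it again.
func shouldThrottleEviction(pod *v1.Pod, maxSameNodeEvictions int) bool {
	count, err := strconv.Atoi(pod.Annotations[evictionCountAnnotation])
	if err != nil {
		// No (or malformed) counter yet: never throttle.
		return false
	}
	cameBackToSameNode := pod.Annotations[lastEvictedNodeAnnotation] == pod.Spec.NodeName
	return cameBackToSameNode && count >= maxSameNodeEvictions
}
```

One open question with this approach: an evicted pod is deleted and re-created by its controller, so a counter stored on the pod itself would not survive eviction as written; in practice it would likely have to live on the owning workload or in descheduler-side state.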
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
/remove-lifecycle stale
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
Rotten issues close after 30d of inactivity. Send feedback to sig-contributor-experience at kubernetes/community.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Is your feature request related to a problem? Please describe.
Currently, several missing features in the descheduler cause pods to be re-scheduled onto the same nodes again and again. Two such examples are #335 and #335 (comment). The descheduler currently never gives up.
Describe the solution you'd like
Pod eviction should be throttled when a configurable number of past evictions has led to re-scheduling onto the same node. This should be configurable via options (default values) and via per-pod annotations.
I assume that there are more cases where pods are re-scheduled onto the same node. I already encountered two such cases on the first run of the descheduler. I also assume that it will take some time until all these cases are fixed, so I'd suggest implementing this feature first so that there is some sensible default/fallback behavior for such cases.
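To illustrate the options-plus-annotation scheme requested above, here is one possible way to resolve the threshold: a cluster-wide default from the descheduler's options, overridable per pod via an annotation. The annotation key, function name, and semantics are assumptions for illustration, not existing descheduler configuration.

```go
// Hypothetical resolution of the throttle threshold: per-pod annotation
// wins over the configured cluster-wide default.
package evictions

import (
	"strconv"

	v1 "k8s.io/api/core/v1"
)

// Made-up annotation key for a per-pod override of the default threshold.
const maxSameNodeEvictionsAnnotation = "descheduler.alpha.kubernetes.io/max-same-node-evictions"

// maxSameNodeEvictionsFor returns the per-pod annotation value when present
// and valid, and the configured default otherwise.
func maxSameNodeEvictionsFor(pod *v1.Pod, defaultMax int) int {
	if raw, ok := pod.Annotations[maxSameNodeEvictionsAnnotation]; ok {
		if n, err := strconv.Atoi(raw); err == nil && n >= 0 {
			return n
		}
	}
	return defaultMax
}
```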