scheduler: provide an option to not shuffle internal node index when using percentageOfNodesToScore #95709
@Huang-Wei: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label.
/cc
Starting from index 0 will be expensive in the long run, because the first nodes will be full but still evaluated in every cycle. I am wondering if node order should be a plugin; for bin-packing, the nodes would be ordered by how full they are. If we don't want to go that far, then instead of resetting the index every time, maybe start from the last selected node.
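For illustration, here is a minimal sketch of that ordering idea — iterating nodes from most- to least-allocated so that a fixed starting index naturally packs workloads. Nothing below is a real scheduler framework interface; all names are made up:

```go
// Hypothetical sketch of "node order as a plugin" for bin-packing.
package main

import (
	"fmt"
	"sort"
)

type node struct {
	name         string
	allocatedPct float64 // fraction of capacity already requested
}

// binPackingOrder sorts nodes fullest-first; a scheduler that always starts
// iterating at index 0 would then try to fill existing nodes before new ones.
func binPackingOrder(nodes []node) {
	sort.Slice(nodes, func(i, j int) bool {
		return nodes[i].allocatedPct > nodes[j].allocatedPct
	})
}

func main() {
	nodes := []node{{"node-a", 0.20}, {"node-b", 0.85}, {"node-c", 0.55}}
	binPackingOrder(nodes)
	for _, n := range nodes {
		fmt.Printf("%s %.0f%%\n", n.name, n.allocatedPct*100)
	}
	// Prints node-b, node-c, node-a — fullest first.
}
```

The downside noted above still applies: fully packed nodes stay at the front of the order and keep paying Filter-phase cost every cycle unless they are skipped some other way.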
At the risk of stating the obvious, would setting PercentageOfNodesToScore to 100% solve this for the user? Performance will take a hit, of course, but it might be worth benchmarking for their use-case.
I propose making this more generic. Instead of an option to just disable shuffling, we can create an enum option for the starting index: the current shuffled start, a fixed start at index 0, and possibly a third mode that resumes from the last position.
The third option need not be added as a part of this issue, but making it an enum will make it more extensible in the future.
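To illustrate, a hedged sketch of what such an enum might look like in component config — every type and value name below is hypothetical, not an agreed-upon API:

```go
// Hypothetical sketch only: none of these names exist in the real
// KubeSchedulerConfiguration API.
package config

// StartingNodeIndexPolicy controls where node iteration starts each
// scheduling cycle when percentageOfNodesToScore is below 100.
type StartingNodeIndexPolicy string

const (
	// RotatedStart is today's behavior: the start index advances each cycle.
	RotatedStart StartingNodeIndexPolicy = "Rotated"
	// FixedStart always begins at index 0, favoring bin-packing.
	FixedStart StartingNodeIndexPolicy = "Fixed"
	// LastSelectedStart (a possible future value) resumes from the node
	// selected in the previous cycle.
	LastSelectedStart StartingNodeIndexPolicy = "LastSelected"
)

// Profile is a trimmed stand-in for a scheduler profile carrying the option.
type Profile struct {
	SchedulerName            string
	PercentageOfNodesToScore int32
	StartingNodeIndex        StartingNodeIndexPolicy
}
```

Defaulting to the rotated value would keep existing behavior for any profile that doesn't set the field.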
Wouldn't those nodes be filtered out in the filter phase? If they cannot fit the pod, I don't see why they'd be a part of the scoring phase.
They would, but we are still paying the overhead of evaluating them in the filter phase.
It's a fair point. But even with the current logic, we cannot ensure every time that the searching scope (starting at a shuffled index) behaves consistently. Another aspect, as I mentioned earlier, is to make this option profile-level, so it won't impact the global (or other profiles') behavior. This is similar to the discussion we had on profile-level percentageOfNodesToScore.
I'm not sure it's practical enough, as (1) the top-ordered nodes may fail at the Filter phase due to capacity, and (2) we need to figure out the number of nodes to order.
This makes sense to me. An enum can be more extensible than a bool.
Using a non-100% PercentageOfNodesToScore is the prerequisite for this issue. If the user goes with 100%, they always get consistent behavior, and hence they don't need this proposal at all.
It sounds like moving percentageOfNodesToScore to the profile is the best first step. Then we can re-evaluate if more is needed.
Agree.
@Huang-Wei do you have any updates from your users? |
@alculquicondor I can touch base with them again, but I think their requirement is clear here. Do you plan to get it implemented along with v1beta2?
I was not planning to, as per #95709 (comment).
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten

Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
I got a requirement from users running serverless workloads, who care a lot about minimizing the number of running machines to reduce cost.
They've started to use the `NodeResourcesMostAllocated` Score plugin (just as the AutoScaler did) to replace the default `NodeResourcesLeastAllocated` plugin; however, it doesn't seem to be good enough. The problem is that when `percentageOfNodesToScore` is used, the scheduler internally maintains a node index that shuffles the starting search position every time it starts Scoring. Suppose we have 500 nodes that passed the Filtering phase, and due to `percentageOfNodesToScore` we only Score 100 of them. When scheduling Pod1, the search scope is [node1, node2, ..., node100]; when scheduling Pod2, the scope switches to [node101, node102, ..., node200]. This sort of balances the workloads across the cluster, which disobeys the user's desire to pack workloads onto as few machines as possible.

So I'm proposing to provide an option to disable shuffling of the internal node index. If we reach consensus, the implementation may need to correlate with #93270, as we may consider making it a profile-level parameter.
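To make the rotation concrete, here is a heavily simplified Go sketch of the behavior described above — the real logic lives in the generic scheduler, and the field and function names here are condensed stand-ins, not the actual implementation:

```go
// Simplified sketch of the scheduler's rotating start index.
package main

import "fmt"

type scheduler struct {
	nextStartNodeIndex int // where the next scheduling cycle starts searching
}

// nodesToEvaluate returns the window of node indices considered for one pod
// and rotates the start index for the next pod.
func (s *scheduler) nodesToEvaluate(totalNodes, numToFind int) []int {
	window := make([]int, 0, numToFind)
	for i := 0; i < numToFind; i++ {
		window = append(window, (s.nextStartNodeIndex+i)%totalNodes)
	}
	s.nextStartNodeIndex = (s.nextStartNodeIndex + numToFind) % totalNodes
	return window
}

func main() {
	s := &scheduler{}
	// 500 feasible nodes; percentageOfNodesToScore trims evaluation to 100.
	w1 := s.nodesToEvaluate(500, 100) // Pod1 sees nodes [0..99]
	w2 := s.nodesToEvaluate(500, 100) // Pod2 sees nodes [100..199]
	fmt.Println(w1[0], w1[99], w2[0], w2[99]) // 0 99 100 199
	// Even with a bin-packing score plugin, Pod2 never sees Pod1's window,
	// so workloads end up spread across the cluster.
}
```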
/sig scheduling
/kind feature