Add a feasible-nodes cache for pods of the same type in the same workload #124949
Comments
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the appropriate triage label.
/sig node
Which component do you want to implement this functionality in? Is it the scheduler? Also, if podAffinity is present, the distribution of pods in the deployment will be affected.
Even though pods are of the same type in the same workload, if podAffinity is set, their feasible node lists can differ.
/sig scheduling
I think I understand some of what you're trying to do. However, maintaining this cache will also increase the burden on the scheduler and bring additional complexity to the framework.
All pods in a deployment have the same predicate conditions, including podAffinity, right?
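For what it's worth, one way to address the podAffinity concern raised above is to key the cache on a hash of the scheduling-relevant pod spec fields rather than on the owning workload alone, so pods whose affinity terms differ never share an entry. Note this only prevents cross-contamination between different specs; when podAffinity is present, placing one pod can still change feasibility for the next, so such entries would also need invalidation. A minimal sketch, with all names hypothetical and not part of the actual scheduler framework API:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// PodSchedulingSpec is a hypothetical, flattened subset of a pod spec that
// predicates depend on. Real predicates also consider resource requests,
// tolerations, nodeSelector, topology spread constraints, etc.
type PodSchedulingSpec struct {
	OwnerKey     string // e.g. "default/my-deployment"
	NodeSelector string // flattened selector, for the sketch
	PodAffinity  string // flattened affinity terms
}

// equivalenceKey hashes the scheduling-relevant fields, so two pods share a
// cache entry only if every field the predicates read is identical.
func equivalenceKey(s PodSchedulingSpec) string {
	h := sha256.Sum256([]byte(s.OwnerKey + "\x00" + s.NodeSelector + "\x00" + s.PodAffinity))
	return fmt.Sprintf("%x", h[:8])
}

func main() {
	a := PodSchedulingSpec{OwnerKey: "default/web"}
	b := PodSchedulingSpec{OwnerKey: "default/web", PodAffinity: "anti: app=web"}
	// Different affinity terms yield different cache keys.
	fmt.Println(equivalenceKey(a) == equivalenceKey(b))
}
```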
Yes, it is heavy work. I am just raising this question so we can discuss it.
Some other schedulers based on Kubernetes have this feature.
Can you paste a reference here? :)
Does this feature look like
Such as the godel-scheduler described in ByteDance's recent paper.
What would you like to be added?
A cached list of feasible nodes for a workload, e.g. a Deployment.
Why is this needed?
Predicate evaluation is a time-consuming step, especially in a large cluster.
Pods of the same type in a workload share the same scheduling constraints, e.g. a Deployment's pods or a Job's replicas.
So the first pod's feasible node list is also valid for the other pods in the workload.
Choosing a node from the cached list saves far more time than re-running the predicates for the next pod.
The same applies to caching the list of nodes that failed predicates.
The benefit may be even greater when scheduling a batch of pods under a gang-scheduling or co-scheduling policy.
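To make the proposal concrete, here is a minimal sketch of such a cache: a map from a workload key to the node list that passed predicates for that workload's first pod, with an invalidation hook for when cluster state or the pod template changes. All names are hypothetical illustrations, not the kube-scheduler framework API:

```go
package main

import (
	"fmt"
	"sync"
)

// feasibleNodesCache maps a workload key (e.g. "default/my-deployment") to
// the nodes that passed predicates for the first pod of that workload.
type feasibleNodesCache struct {
	mu    sync.RWMutex
	nodes map[string][]string
}

func newFeasibleNodesCache() *feasibleNodesCache {
	return &feasibleNodesCache{nodes: make(map[string][]string)}
}

// Get returns the cached feasible nodes for a workload, if present.
func (c *feasibleNodesCache) Get(workloadKey string) ([]string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	n, ok := c.nodes[workloadKey]
	return n, ok
}

// Put stores the feasible nodes computed for the first pod of a workload.
func (c *feasibleNodesCache) Put(workloadKey string, feasible []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.nodes[workloadKey] = feasible
}

// Invalidate drops a workload's entry, e.g. when a node's allocatable
// resources change or the workload's pod template is updated.
func (c *feasibleNodesCache) Invalidate(workloadKey string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.nodes, workloadKey)
}

func main() {
	c := newFeasibleNodesCache()
	key := "default/my-deployment"

	// First pod: run the full predicate pass, then cache the result.
	if _, ok := c.Get(key); !ok {
		feasible := []string{"node-a", "node-b"} // stand-in for a real predicate pass
		c.Put(key, feasible)
	}

	// Subsequent identical pods: reuse the cached list instead of
	// re-running predicates against every node in the cluster.
	if nodes, ok := c.Get(key); ok {
		fmt.Println(nodes)
	}
}
```

A real implementation would also need to shrink an entry as cached nodes fill up, which is part of the maintenance burden discussed in the comments above.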