
Discussion about the behavior of replica scheduling weight preference #730

Closed
Garrybest opened this issue Sep 15, 2021 · 11 comments · Fixed by #841
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@Garrybest
Member

Garrybest commented Sep 15, 2021

What would you like to be added:
Hi fellows.

Karmada now provides two kinds of replica division preference when the replica scheduling type is Divided. If the preference is Aggregated, the scheduler divides replicas as compactly as possible according to each member cluster's idle resources. However, when the preference is set to Weighted, the scheduler does not consider the clusters' current idle resources at all; the weight preference here is just a static weight.
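For reference, the relevant placement fields look roughly like the sketch below. This is a simplified rendering of the policy/v1alpha1 types referenced in this thread, trimmed for illustration; the field comments are mine.

```go
// ReplicaSchedulingStrategy is a trimmed-down sketch of the replica scheduling
// settings discussed in this issue; the real API carries more fields.
type ReplicaSchedulingStrategy struct {
	// ReplicaSchedulingType is Duplicated or Divided.
	ReplicaSchedulingType string
	// ReplicaDivisionPreference applies when the type is Divided:
	// Aggregated divides replicas by the clusters' idle resources as compactly
	// as possible, Weighted divides them by the static weights below.
	ReplicaDivisionPreference string
	// WeightPreference holds the static weight list. This issue discusses what
	// the scheduler should do when it is nil.
	WeightPreference *ClusterPreferences
}

// ClusterPreferences currently only carries static weights.
type ClusterPreferences struct {
	StaticWeightList []StaticClusterWeight
}

// StaticClusterWeight binds a weight to a set of target clusters (trimmed).
type StaticClusterWeight struct {
	TargetCluster ClusterAffinity
	Weight        int64
}

// ClusterAffinity is trimmed to the cluster-name selector only.
type ClusterAffinity struct {
	ClusterNames []string
}
```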

I'm thinking about taking dynamic weight behavior into consideration when the preference is set to Weighted. If the field WeightPreference is nil, the scheduler could divide the replicas by each member cluster's idle resources, or by each member cluster's maximum available replicas, as described in #580.

More suggestions about the behavior of the replica scheduling weight preference are welcome. What do you think?

Why is this needed:
Dynamic weight behavior based on each member cluster's maximum available replicas helps balance the cluster load. Imagine a Deployment with 12 replicas that needs to be propagated to 3 clusters:

Cluster A max available replicas: 6
Cluster B max available replicas: 12
Cluster C max available replicas: 18

So we could divide the replicas in the ratio 6:12:18, i.e. 2 to cluster A, 4 to cluster B and 6 to cluster C. It is obvious that this division strategy benefits cluster load balancing.
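A minimal Go sketch of this proportional split, assuming a largest-remainder style of rounding (the scheduler's actual rounding may differ); `clusterAvailability` and `divideByAvailability` are hypothetical names used only for illustration:

```go
package main

import (
	"fmt"
	"sort"
)

// clusterAvailability is a hypothetical helper type for this sketch, not part
// of the Karmada API.
type clusterAvailability struct {
	Name      string
	Available int64 // maximum available replicas reported for the cluster
}

// divideByAvailability splits total replicas in proportion to each cluster's
// available replicas, handing leftover replicas to the clusters with the
// largest fractional remainders.
func divideByAvailability(total int64, clusters []clusterAvailability) map[string]int64 {
	var sum int64
	for _, c := range clusters {
		sum += c.Available
	}
	assigned := make(map[string]int64, len(clusters))
	if sum == 0 {
		return assigned
	}

	type remainder struct {
		name string
		frac int64 // numerator of the fractional part; larger gets extra replicas first
	}
	var used int64
	rems := make([]remainder, 0, len(clusters))
	for _, c := range clusters {
		share := total * c.Available / sum // floor of the proportional share
		assigned[c.Name] = share
		used += share
		rems = append(rems, remainder{name: c.Name, frac: total*c.Available - share*sum})
	}

	// Distribute the remaining replicas, largest fractional part first.
	sort.Slice(rems, func(i, j int) bool { return rems[i].frac > rems[j].frac })
	for i := int64(0); i < total-used; i++ {
		assigned[rems[i].name]++
	}
	return assigned
}

func main() {
	fmt.Println(divideByAvailability(12, []clusterAvailability{
		{Name: "A", Available: 6},
		{Name: "B", Available: 12},
		{Name: "C", Available: 18},
	})) // map[A:2 B:4 C:6]
}
```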

@Garrybest added the kind/feature label Sep 15, 2021
@gf457832386
Contributor

Maybe more dimensions need to be considered. Is it necessary to anticipate future tasks and make a more rational distribution?
If anyone has run into a real usage problem, please share it so we can discuss it together!

@Garrybest
Member Author

Hi Fei, glad to see you again. @gf457832386

Anticipate future tasks? Sounds interesting, but I don't know how that would be done. Could you please elaborate?

@gf457832386
Contributor

gf457832386 commented Sep 23, 2021

Our current allocation is a greedy strategy that achieves the current optimum. However, perhaps some situations could be predicted from historical resource usage data to make overall resource utilization more reasonable. For example, right now we divide a task according to the currently available resources, but taking upcoming tasks into account in advance might improve resource utilization and save costs.

@gf457832386
Contributor

Another question: do Karmada works ever need to wait for resources because there are too many works? If resources are always sufficient, the impact of this problem is much smaller.

@Garrybest
Member Author

Our current allocation is a greedy strategy that achieves the current optimum. However, perhaps some situations could be predicted from historical resource usage data to make overall resource utilization more reasonable. For example, right now we divide a task according to the currently available resources, but taking upcoming tasks into account in advance might improve resource utilization and save costs.

Good thinking. But I'm afraid the prediction would not be reliable. Kubernetes also does not rely on any replica prediction when scheduling. A descheduler would be a better fit, since the scheduler only cares about the currently optimal scheduling result while the descheduler makes adjustments at an interval.

@Garrybest
Member Author

Another question: do Karmada works ever need to wait for resources because there are too many works? If resources are always sufficient, the impact of this problem is much smaller.

Sorry, I don't follow. Do you mean workloads waiting for replica assignment when a cluster has insufficient resources? If so, the current behavior is like kube-scheduler: it does nothing but record a failure condition when resources are lacking.

@Garrybest
Member Author

This discussion seems to be blocked; any more ideas? @RainbowMango @gf457832386

@RainbowMango
Member

I'll be back to this discussion soon. :)

@gf457832386
Contributor

I think we have to consider the proportion of surplus resources. Then we can calculate a resource balance score according to a resource-balance formula and select the allocation that best satisfies the balance condition.
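One possible formulation, assumed here purely for illustration and borrowed from the balanced-allocation idea in kube-scheduler, is to score an allocation higher when the CPU and memory usage fractions it would leave on a cluster are close to each other; this is not something decided in this thread.

```go
// balanceScore is a hypothetical resource-balance score in [0, 1]: the closer
// the projected CPU and memory usage fractions are to each other, the higher
// the score. Borrowed from kube-scheduler's balanced-allocation idea; not part
// of Karmada.
func balanceScore(cpuFraction, memFraction float64) float64 {
	diff := cpuFraction - memFraction
	if diff < 0 {
		diff = -diff
	}
	return 1 - diff
}
```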

@RainbowMango
Member

Dynamic weight behavior based on each member cluster's maximum available replicas helps balance the cluster load.

I guess this is the use case and value we want to provide. That makes sense to me.

Not an objection, but I'd like to know why you didn't extend the API in ClusterPreferences. (Maybe I missed the info from the meeting.)

@Garrybest
Member Author

Well, I found this behavior is the opposite of Aggregated, although it is a kind of dynamic weight in some ways. I don't want to make the weight semantics confusing and complicated, so a Dispersive division preference is added instead.
