-
Notifications
You must be signed in to change notification settings - Fork 883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add KEP-521: Cluster Accurate Replica Estimator #580
Conversation
/assign @kevin-wangzefeng I'm very busy this week, and today I'm out of the office. |
@RainbowMango: GitHub didn't allow me to request PR reviews from the following users: algebra2k. Note that only karmada-io members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Based on this prefilter result, when assigning replicas, the Karmada Scheduler could try to calculate cluster max available replicas by starting gRPC requests concurrently to the Cluster Accurate Replica Estimator. At last, the Cluster Accurate Replica Estimator will soon return how many available replicas that the cluster could produce. Then the Karmada Scheduler assgin replicas into different clusters in terms of the estimation result. | ||
|
||
Furthermore, replica estimation can be considered as a new scheduler plugin. | ||
We could implement this by modifying function calClusterAvailableReplicas to a interface. The previous estimation method, based on `ResourceSummary` in `Cluster.Status`, is able to be a default normal estimation approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could implement this by modifying function calClusterAvailableReplicas to a interface.
Do you mean we will have two implementations of this interface?
- The default one still calculate replicas via
ResourceSummary
- The accurate one calculate replicas via the
Cluster Accurate Replica Estimator
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a user, how to config to select the two implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we will have two implementations of this interface?
Right.
As a user, how to config to select the two implementations?
Now we could just add a switch to determine whether Cluster Accurate Replica Estimator
is applied. ResourceSummary
interface could be a default one that does not support disabling. In the future, when scheduler profile is added, a user could custimize the config by using a profile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about document this in the proposal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me take a change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks awesome to me.
Based on this prefilter result, when assigning replicas, the Karmada Scheduler could try to calculate cluster max available replicas by starting gRPC requests concurrently to the Cluster Accurate Replica Estimator. At last, the Cluster Accurate Replica Estimator will soon return how many available replicas that the cluster could produce. Then the Karmada Scheduler assgin replicas into different clusters in terms of the estimation result. | ||
|
||
Furthermore, replica estimation can be considered as a new scheduler plugin. | ||
We could implement this by modifying function calClusterAvailableReplicas to a interface. The previous estimation method, based on `ResourceSummary` in `Cluster.Status`, is able to be a default normal estimation approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about document this in the proposal?
Furthermore, replica estimation can be considered as a new scheduler plugin. | ||
We could implement this by modifying function calClusterAvailableReplicas to a interface. The previous estimation method, based on `ResourceSummary` in `Cluster.Status`, is able to be a default normal estimation approach. | ||
|
||
### Karmada Cluster Accurate Replica Estimator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How many clusters serve for each Cluster Accurate Replica Estimator
? I mean if there are a huge number of clusters, does it also mean a lot of Cluster Accurate Replica Estimator
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now each Cluster Accurate Replica Estimator
just serves for one cluster, just like karmada-agent
. If Cluster Accurate Replica Estimator
servers for multiple clusters, each replica will consume more resource because they have more pod and node informers. I thought one-for-one serving does not have obvious disadvantages and could make the module interaction easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I didn't mean it's a flaw, just needs to be annotated.
@irfanurrehman Are you interest in this proposal? |
thanks @RainbowMango for pointing me to this. I cannot commit to this immediately. I can take a look at this over my weekend. Meanwhile, what is the kind of urgency/priority you have attached with this proposal? |
My bad. I meant to invite you to review this proposal since you are a senior expert in the federation area.
There is no rush. You can post your comments at any time and I appreciate it. |
// ReplicaRequirements represents the requirements required by each replica. | ||
// +optional | ||
ReplicaRequirements *ReplicaRequirements `json:"replicaRequirements,omitempty"` | ||
|
||
// Replicas represents the replica number of the referencing resource. | ||
// +optional | ||
Replicas int32 `json:"replicas,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to add this fields in ResourceBingding.spec, but keep ObjectReference focusing on indicating the relationship between binding and resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Now we got ReplicaResourceRequirements
in ObjectReference
, which represents the resources required by each replica. Should we move ReplicaResourceRequirements
into ResourceBindingSpec
first in next PR and then modify it to ReplicaRequirements
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we move ReplicaResourceRequirements into ResourceBindingSpec first in next PR
What do you mean in next PR
? I think we can do it before this proposal.
Both ReplicaResourceRequirements
and Replicas
should be moved out, right?
karmada/pkg/apis/work/v1alpha1/binding_types.go
Lines 59 to 65 in 4c2c189
// ReplicaResourceRequirements represents the resources required by each replica. | |
// +optional | |
ReplicaResourceRequirements corev1.ResourceList `json:"resourcePerReplicas,omitempty"` | |
// Replicas represents the replica number of the referencing resource. | |
// +optional | |
Replicas int32 `json:"replicas,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean in next PR? I think we can do it before this proposal.
My bad. I mean before this proposal.
Both ReplicaResourceRequirements and Replicas should be moved out, right?
Yes.
Just a question, the ReplicaRequirements represents the requirements required by each replica. But the request and replica phase can be overridden by overidepolicy which happen in binding controller. So the replica and request may be incorrect when used by karmada schedule. How to solve this problem? |
Well, it's a good question but I haven't figured out a solution. I'm afraid that the problem already occurs when using |
I don't think the issue should be covered by this proposal, though I agree this is a good question. |
feb9013
to
9180b96
Compare
Signed-off-by: Garrybest <garrybest@foxmail.com>
As discussed at today's meeting, we are going to merge this proposal first. |
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Garrybest garrybest@foxmail.com
What type of PR is this?
/kind design
What this PR does / why we need it:
Add a new proposal, KEP-521: Cluster Accurate Replica Estimator.
Which issue(s) this PR fixes:
Fixes #521
Special notes for your reviewer:
Does this PR introduce a user-facing change?: