[feature] support rescheduling when deleting a cluster #1383
Conversation
/cc @dddddai
/cc @mrlihanbo
notes: this PR does not consider the spread constraints.
Please resolve the conflict. @huone1
Issue #1411 is also fixed in this PR.
@@ -48,7 +51,7 @@ func divideReplicasByResource(
 	} else if assignedReplicas < spec.Replicas {
 		// We need to enlarge the replicas in terms of the previous result (if exists).
 		// First scheduling is considered as a special kind of scaling up.
-		newTargetClusters, err := scaleUpScheduleByReplicaDivisionPreference(clusters, spec, preference)
+		newTargetClusters, err := scaleUpScheduleByReplicaDivisionPreference(clusters, spec, preference, scheduledClusters, assignedReplicas)
 		if err != nil {
 			return nil, fmt.Errorf("failed to scaleUp: %v", err)
 		}
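For context, the branch being modified chooses between scaling up and scaling down by comparing the already-assigned replicas with the desired count. A minimal, self-contained sketch of that decision (the types and helper names below are simplified stand-ins for `workv1alpha2.TargetCluster` and `util.GetSumOfReplicas`, not the actual karmada code):

```go
package main

import "fmt"

// TargetCluster is a simplified stand-in for workv1alpha2.TargetCluster.
type TargetCluster struct {
	Name     string
	Replicas int32
}

// sumOfReplicas mimics util.GetSumOfReplicas.
func sumOfReplicas(clusters []TargetCluster) int32 {
	var sum int32
	for _, c := range clusters {
		sum += c.Replicas
	}
	return sum
}

// decideScalingDirection sketches the branch in divideReplicasByResource:
// compare the replicas already assigned with the desired spec.Replicas.
func decideScalingDirection(scheduled []TargetCluster, desired int32) string {
	assigned := sumOfReplicas(scheduled)
	switch {
	case assigned > desired:
		return "scaleDown"
	case assigned < desired:
		// First scheduling (assigned == 0) is a special kind of scaling up.
		return "scaleUp"
	default:
		return "keep"
	}
}

func main() {
	scheduled := []TargetCluster{{Name: "A", Replicas: 2}, {Name: "B", Replicas: 1}}
	fmt.Println(decideScalingDirection(scheduled, 5)) // prints "scaleUp"
}
```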
What if assignedReplicas == spec.Replicas, but len(scheduledClusters) < len(spec.Clusters)?
I didn't think of a similar rescheduling scenario.
Could you please give an example? It doesn't seem possible for that to happen in one event.
Let's say we have 2 clusters [A, B], both with allocatable resources 4C8G, and a deployment that requests 2C4G with 3 replicas. The propagationPolicy is:

placement:
  clusterAffinity:
    clusterNames:
    - A
    - B
  replicaScheduling:
    replicaDivisionPreference: Weighted
    replicaSchedulingType: Divided
    weightPreference:
      dynamicWeight: AvailableReplicas

So the deployment was placed in both A and B; let's assume A has 2 replicas and B has 1.
- update propagationPolicy placement.clusterAffinity.clusterNames=[A]; now there are not enough resources for this deployment to be rescheduled, so nothing happens;
- update deployment.spec.replicas=2; with this PR, nothing happens again.
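The resource arithmetic behind this example can be checked with a small helper. This is an illustrative sketch, not the scheduler's actual resource estimator (CPU in millicores, memory in MiB are assumed units):

```go
package main

import "fmt"

// maxReplicas computes how many replicas of a given request fit into a
// cluster's allocatable resources. The binding dimension is whichever
// resource runs out first.
func maxReplicas(allocCPU, allocMem, reqCPU, reqMem int64) int64 {
	byCPU := allocCPU / reqCPU
	byMem := allocMem / reqMem
	if byCPU < byMem {
		return byCPU
	}
	return byMem
}

func main() {
	// Cluster A: 4C8G allocatable; each replica requests 2C4G.
	fmt.Println(maxReplicas(4000, 8192, 2000, 4096)) // prints 2
	// With clusterNames restricted to [A], the 3 desired replicas cannot
	// fit, so the reschedule makes no progress and nothing happens.
}
```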
You are right; it should return the scheduledClusters as the result.
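A sketch of that fix, with simplified assumed types (this is not the actual karmada code): when the assigned replicas already equal the desired count, return the still-ready scheduled clusters directly, so entries for removed clusters are dropped without rebalancing.

```go
package main

import "fmt"

// TargetCluster is a simplified stand-in for workv1alpha2.TargetCluster.
type TargetCluster struct {
	Name     string
	Replicas int32
}

// reconcileWhenEqual sketches the fix discussed above: if the sum of
// assigned replicas equals the desired count, simply return the ready
// scheduledClusters; otherwise return nil and fall through to the
// scale-up / scale-down paths.
func reconcileWhenEqual(scheduledClusters []TargetCluster, desired int32) []TargetCluster {
	var assigned int32
	for _, c := range scheduledClusters {
		assigned += c.Replicas
	}
	if assigned == desired {
		return scheduledClusters
	}
	return nil
}

func main() {
	scheduled := []TargetCluster{{Name: "A", Replicas: 2}, {Name: "B", Replicas: 1}}
	fmt.Println(reconcileWhenEqual(scheduled, 3)) // prints [{A 2} {B 1}]
}
```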
Signed-off-by: huone1 <huwanxing@huawei.com>
OK, it is done.
cc @dddddai @Garrybest
/approve
Leave LGTM to @Garrybest or @dddddai
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: RainbowMango. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Just one concern, would leave lgtm to @Garrybest
deltaLen := len(spec.Clusters) - len(reservedClusters)
if len(candidateClusters) < deltaLen {
	// for ReplicaSchedulingTypeDivided, we will try to migrate replicas to the other healthy clusters
	if placement.ReplicaScheduling == nil || placement.ReplicaScheduling.ReplicaSchedulingType == policyv1alpha1.ReplicaSchedulingTypeDuplicated {
		klog.Warningf("ignore reschedule binding as insufficient available cluster")
		return ScheduleResult{}, nil
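The guard above can be read as a small decision function. This is an illustrative paraphrase; the names and plain-int parameters are assumptions, not the real signature (the `duplicated` flag stands in for "ReplicaScheduling is nil or Duplicated"):

```go
package main

import "fmt"

// shouldAbortReschedule paraphrases the guard above: if there are fewer
// healthy candidate clusters than removed ones, a Duplicated-scheduled
// binding cannot be fully repaired, so rescheduling is skipped; a Divided
// binding can still migrate replicas to the remaining healthy clusters.
func shouldAbortReschedule(specClusters, reservedClusters, candidateClusters int, duplicated bool) bool {
	deltaLen := specClusters - reservedClusters
	if candidateClusters < deltaLen {
		return duplicated
	}
	return false
}

func main() {
	// 2 clusters were scheduled, 1 is still reserved, no healthy candidates remain.
	fmt.Println(shouldAbortReschedule(2, 1, 0, true))  // Duplicated: prints true (abort)
	fmt.Println(shouldAbortReschedule(2, 1, 0, false)) // Divided: prints false (migrate)
}
```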
Let's say we have 2 clusters [A, B] (with Failover enabled):
- an rb was scheduled to [A, B] (Duplicated)
- cluster A becomes not ready
- the original behavior: still scheduled to [A, B]; the current behavior: scheduled to [B]
I'm not sure if we want to keep the original behavior; if not, then lgtm.
@mrlihanbo Please see if this change is reasonable.
/assign
// Step 1: Get previous total sum of replicas.
assignedReplicas := util.GetSumOfReplicas(spec.Clusters)
// Step 1: Find the ready clusters that have old replicas
scheduledClusters := findOutScheduledCluster(spec.Clusters, clusters)
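The filtering helper can be sketched as follows; the signature and types are simplified assumptions for illustration, not the actual karmada implementation:

```go
package main

import "fmt"

// TargetCluster is a simplified stand-in for workv1alpha2.TargetCluster.
type TargetCluster struct {
	Name     string
	Replicas int32
}

// findOutScheduledCluster sketches the helper above: keep only the
// previously scheduled clusters that are still among the ready candidates.
// Replicas on deleted/unjoined clusters drop out of the result, so they
// get rescheduled as a special kind of scaling up.
func findOutScheduledCluster(scheduled []TargetCluster, ready []string) []TargetCluster {
	readySet := make(map[string]bool, len(ready))
	for _, name := range ready {
		readySet[name] = true
	}
	var result []TargetCluster
	for _, c := range scheduled {
		if readySet[c.Name] {
			result = append(result, c)
		}
	}
	return result
}

func main() {
	scheduled := []TargetCluster{{Name: "A", Replicas: 2}, {Name: "B", Replicas: 1}}
	// Cluster B has been deleted/unjoined; only A remains ready.
	fmt.Println(findOutScheduledCluster(scheduled, []string{"A"})) // prints [{A 2}]
}
```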
What if we just delete the unjoined clusters from spec.Clusters? Then the replicas in the unjoined clusters will be considered as a special kind of scaling up.
Seems we don't need more changes, am I right?
It is a special kind of scaling up.
/lgtm
I think there is a bug here. Imagine:
We should not erase the scheduling replicas of the not-ready cluster in the RB because, when Failover is disabled, rescheduling for a not-ready cluster is not triggered.
I'm afraid not. When scaling up, rescheduling will be triggered.
What is wrong with it?
Why is it dangerous? What I'm thinking about is how to postpone the deletion operation until the desired replicas are all in the available state. That guarantees there are always sufficient replicas running at any time.
Agree with @Garrybest.
Deleting a cluster is different from a cluster changing from healthy to unhealthy. It is reasonable to migrate the replicas to other clusters when deleting a cluster.
Yeah, deleting a cluster is another story.
Let me describe an example.
Thanks @Garrybest for the details. Can you help file an issue to track this?
Sure.
Signed-off-by: huone1 <huwanxing@huawei.com>
What type of PR is this?
/kind feature
What this PR does / why we need it:
support rescheduling when deleting a cluster
Which issue(s) this PR fixes:
Fixes #829
Fixes #1411
Special notes for your reviewer:
This PR does not consider the spread constraints; they will be addressed in the scheduler refactoring.
Does this PR introduce a user-facing change?: