fix a corner case that re-schedule be skipped in case of the cluster becomes not fit. #2912
Thanks~
Codecov Report
@@ Coverage Diff @@
## master #2912 +/- ##
==========================================
+ Coverage 37.87% 38.52% +0.65%
==========================================
Files 201 205 +4
Lines 18494 18815 +321
==========================================
+ Hits 7004 7249 +245
- Misses 11081 11137 +56
- Partials 409 429 +20
/retitle fix a corner case that re-schedule be skipped in case of the cluster becomes not fit.
cc @chaunceyjiang, please take a look. I've heard that this issue has other context.
Oh, I missed this PR.
Yes, refer to #2261.
Force-pushed: 452595a to 24bc5a0
Just some typos.
Other parts LGTM.
Force-pushed: 95beefb to ba18160
/lgtm
pkg/scheduler/scheduler.go
Outdated
@@ -476,13 +478,14 @@ func (s *Scheduler) scheduleResourceBinding(resourceBinding *workv1alpha2.Resour
 	}

 	scheduleResult, err := s.Algorithm.Schedule(context.TODO(), &placement, &resourceBinding.Spec, &core.ScheduleAlgorithmOption{EnableEmptyWorkloadPropagation: s.enableEmptyWorkloadPropagation})
-	if err != nil {
+	if err != nil && reflect.TypeOf(err) != reflect.TypeOf(&framework.FitError{}) {
I'm afraid we have a problem here. When a cluster has been not ready for a while, a NoSchedule taint is added to it. Suppose a workload was already running on this cluster and then happens to scale up (triggered by a user or by the hpa-controller). No cluster fits because of the taint. I guess the previous replicas will be evicted here, because the fit error is now skipped and the result will be patched to the RB.
According to
if bindingSpec.TargetContains(cluster.Name) {
Oh, I see. I didn't realize the taint plugin was updated here. Thanks.
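The exchange above can be illustrated with a minimal sketch. It assumes, based only on the quoted `TargetContains` line, that the taint filter lets a cluster through when the binding already targets it, so existing replicas are not evicted by a new NoSchedule taint. The `Taint`, `Cluster`, and `BindingSpec` types and the `filterByTaint` function are hypothetical simplifications, not Karmada's actual API, and tolerations are ignored for brevity.

```go
package main

import "fmt"

// Taint, Cluster, and BindingSpec are hypothetical, simplified stand-ins
// for the real Karmada types.
type Taint struct {
	Key    string
	Effect string
}

type Cluster struct {
	Name   string
	Taints []Taint
}

type BindingSpec struct {
	// Clusters lists the names of clusters the workload is already scheduled to.
	Clusters []string
}

// TargetContains reports whether the binding already targets the named cluster.
func (b *BindingSpec) TargetContains(name string) bool {
	for _, c := range b.Clusters {
		if c == name {
			return true
		}
	}
	return false
}

// filterByTaint reports whether the cluster passes the taint filter.
// Assumption: a cluster that is already a scheduling target is accepted
// even if it carries a NoSchedule taint.
func filterByTaint(spec *BindingSpec, cluster Cluster) bool {
	if spec.TargetContains(cluster.Name) {
		return true
	}
	for _, t := range cluster.Taints {
		if t.Effect == "NoSchedule" {
			return false
		}
	}
	return true
}

func main() {
	spec := &BindingSpec{Clusters: []string{"member1"}}
	notReady := []Taint{{Key: "not-ready", Effect: "NoSchedule"}}

	fmt.Println(filterByTaint(spec, Cluster{Name: "member1", Taints: notReady})) // true: already a target
	fmt.Println(filterByTaint(spec, Cluster{Name: "member2", Taints: notReady})) // false: tainted, not a target
	fmt.Println(filterByTaint(spec, Cluster{Name: "member3"}))                   // true: no taint
}
```

Under this assumption, the scale-up scenario does not end in a fit error for the tainted cluster that already hosts the workload, which is why the previous replicas would stay in place.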
So now the FitError will be used exclusively for the situation where no cluster fits, right?
Yes. The FitError information will then be shown in the event and in the RB's status, the same behavior as in k8s.
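As a rough illustration of what such a message could look like, here is a sketch of an error type that summarizes why each cluster was rejected, in the style of the Kubernetes scheduler's "0/N nodes are available" message. The `FitError` struct, its fields, and the reason strings below are hypothetical; the real `framework.FitError` may be shaped differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// FitError is a hypothetical stand-in for framework.FitError. It records,
// per cluster, the reason the cluster was filtered out.
type FitError struct {
	NumAllClusters int
	Reasons        map[string]string // cluster name -> rejection reason
}

// Error groups identical reasons and reports how many clusters hit each one.
// Parts are sorted so the message is deterministic.
func (e *FitError) Error() string {
	counts := map[string]int{}
	for _, reason := range e.Reasons {
		counts[reason]++
	}
	parts := make([]string, 0, len(counts))
	for reason, n := range counts {
		parts = append(parts, fmt.Sprintf("%d %s", n, reason))
	}
	sort.Strings(parts)
	return fmt.Sprintf("0/%d clusters are available: %s.", e.NumAllClusters, strings.Join(parts, ", "))
}

func main() {
	err := &FitError{
		NumAllClusters: 3,
		Reasons: map[string]string{
			"member1": "cluster(s) had untolerated taint",
			"member2": "cluster(s) had untolerated taint",
			"member3": "cluster(s) did not match the placement cluster affinity",
		},
	}
	// The same string could be attached to an event and to a status condition.
	fmt.Println(err)
}
```

Keeping the summary in the error's own `Error` method means the event and the status can share one source for the text.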
/lgtm
cc @RainbowMango for final checking
@XiShanYongYe-Chang Have you confirmed the E2E part?
Yes, I've confirmed it.
pkg/scheduler/scheduler.go
Outdated
@@ -476,13 +478,14 @@ func (s *Scheduler) scheduleResourceBinding(resourceBinding *workv1alpha2.Resour
 	}

 	scheduleResult, err := s.Algorithm.Schedule(context.TODO(), &placement, &resourceBinding.Spec, &core.ScheduleAlgorithmOption{EnableEmptyWorkloadPropagation: s.enableEmptyWorkloadPropagation})
-	if err != nil {
+	if err != nil && reflect.TypeOf(err) != reflect.TypeOf(&framework.FitError{}) {
-	if err != nil && reflect.TypeOf(err) != reflect.TypeOf(&framework.FitError{}) {
+	var noClusterFit *framework.FitError
+	// in case of no cluster fit, can not return but continue to patch(cleanup) the result.
+	if err != nil && !errors.As(err, &noClusterFit) {
Does this make it a little clearer?
fixed
…nges Signed-off-by: jwcesign <jiangwei115@huawei.com>
Have you figured out the root cause of the failing tests? I didn't see any change from the force-push.
It's my mistake, should be
/lgtm
/approve
Please cherry-pick this patch to release branches.
And we are going to tag a new release after that.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: RainbowMango
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
…-upstream-release-1.4 Automated cherry pick of #2912: fix a corner case that re-schedule be skipped in case of the cluster becomes not fit
…-upstream-release-1.3 Automated cherry pick of #2912: fix a corner case that re-schedule be skipped in case of the cluster becomes not fit
Signed-off-by: jwcesign <jiangwei115@huawei.com>
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #2906
Special notes for your reviewer:
none
Does this PR introduce a user-facing change?: