-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
register unschedulable plugin for those plugins that PreFilter's PreFilterResult filter out some nodes #122251
base: master
Are you sure you want to change the base?
Conversation
Please note that we're already in Test Freeze for the Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Sun Dec 10 10:17:03 UTC 2023. |
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Hi @olderTaoist. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
ad02730
to
9b41295
Compare
0bf09c0
to
5ecdfec
Compare
7a9b583
to
bea0b63
Compare
/assign I'll review this in this weekend hopefully, during the new year holiday at the latest. |
I have read the relevant |
when all nodes are filtered by PreFilter that returns PreFilterResult with allNodes() false, kubernetes/pkg/scheduler/framework/runtime/framework.go Lines 733 to 740 in 6938c29
The original logic will update |
No need to update those NodeStatus, nodes passed prefilter will get their status updated during filter stage. And we'll need to find a better way than iterating all nodes and set status in a map. But that is out of this pr's scope. |
pkg/scheduler/framework/plugins/defaultpreemption/default_preemption_test.go
Outdated
Show resolved
Hide resolved
e0db92a
to
8a8c040
Compare
pkg/scheduler/framework/interface.go
Outdated
@@ -588,7 +588,7 @@ type Framework interface { | |||
// cycle is aborted. | |||
// It also returns a PreFilterResult, which may influence what or how many nodes to | |||
// evaluate downstream. | |||
RunPreFilterPlugins(ctx context.Context, state *CycleState, pod *v1.Pod) (*PreFilterResult, *Status) | |||
RunPreFilterPlugins(ctx context.Context, state *CycleState, pod *v1.Pod, diagnosis *Diagnosis) (*PreFilterResult, *Status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another option would be to add UnschedulablePlugins to the PreFilterResult and have the caller of RunPrefilterPlugins use that result to update diagnosis. It avoids changing the interface, and makes the data flow clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can do that.
Me myself prefer the current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an advantage to the current implementation that I'm not seeing which justifies having side effects? My concern is that readers/maintainers of this code will have to read the implementation of RunPreFilterPlugins
to see why Diagnosis
is passed, and which fields in it are read/mutated. Passing this type also makes it easier in the future for other fields in Diagnosis
to be modified without careful consideration, increasing complexity (suppose NodeToStatusMap
is also modified inside of RunPreFilterPlugins
, rather than just by findNodesThatFitPod).
While the data flow/mutability is my primary concern, this PR also becomes smaller (8 -> 4 files), if we include result in PreFilterResult
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @gabesaba's rationale.
Please update the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already updated, please review again @AxeZhan @gabesaba @alculquicondor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just return UnschedulablePlugins as a return value from RunPreFilterPlugins
, instead of a new field in PreFilterResult
?
PreFilterResult
is a data structure we also use between the framework and plugins, that is, supposed to be exposed to plugin developers, while UnschedulablePlugins
would not be a field for plugins developers. It looks better not to expose such field in PreFilterResult
, if we haven't got a special reason doing so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gabesaba has raised a similar point. #122251 (comment)
We can do this as a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be done in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Only few nits.
pkg/scheduler/framework/plugins/defaultpreemption/default_preemption_test.go
Outdated
Show resolved
Hide resolved
8a8c040
to
8a1b092
Compare
8a1b092
to
87035b6
Compare
87035b6
to
605d4b1
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: olderTaoist The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
605d4b1
to
fe6b4b5
Compare
pkg/scheduler/framework/interface.go
Outdated
@@ -740,6 +743,7 @@ func (p *PreFilterResult) Merge(in *PreFilterResult) *PreFilterResult { | |||
} | |||
|
|||
r.NodeNames = p.NodeNames.Intersection(in.NodeNames) | |||
r.UnschedulablePlugins = p.UnschedulablePlugins.Clone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do we handle case when in
has UnschedulablePlugins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in
will be no UnschedulablePlugins
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in will be no UnschedulablePlugins
In current implementation in this pr, yes. But I also prefer a union
here for future usage.
@@ -726,10 +726,14 @@ func (f *frameworkImpl) RunPreFilterPlugins(ctx context.Context, state *framewor | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to record unschedulable plugins in the branches above - lines 715 and 720? or only when the plugin filters out some nodes (as title of PR indicates)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous implementation that unschedulable plugins was also added in lines 713 and 714
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand now. We set the plugin name, and later we add it via this call?
But if more than one plugin returns Unschedulable (or that, +a plugin returns UnschedulableAndUnresolvable), we may miss some plugins, right?
result = result.Merge(r) | ||
if !r.AllNodes() { | ||
if result.UnschedulablePlugins == nil { | ||
result.UnschedulablePlugins = sets.New[string]() | ||
} | ||
result.UnschedulablePlugins.Insert(pl.Name()) | ||
pluginsWithNodes = append(pluginsWithNodes, pl.Name()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result = result.Merge(r) | |
if !r.AllNodes() { | |
if result.UnschedulablePlugins == nil { | |
result.UnschedulablePlugins = sets.New[string]() | |
} | |
result.UnschedulablePlugins.Insert(pl.Name()) | |
pluginsWithNodes = append(pluginsWithNodes, pl.Name()) | |
} | |
if !r.AllNodes() { | |
r.UnschedulablePlugins = sets.New(pl.Name()) | |
pluginsWithNodes = append(pluginsWithNodes, pl.Name()) | |
} | |
result = result.Merge(r) |
or, if we want fewer allocs, create set outside of the loop and assign to result before final return. (requires small refactor, to break outside of loop on line 744, rather than return)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything wrong with me writing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation works, but there is room for simplification (fewer branches) and less code. After all, we just want to do a set union of unschedulable plugin names, not merge some complex data structure.
create set outside of the loop and assign to result before final return
I think this is the simplest option, as we don't have to worry about the merge logic (which, in the general case, becomes complex if we want to properly handle all combinations of nil/non-nil)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, your new version is even better than what I had in mind :)
@@ -718,6 +718,9 @@ type PreFilterResult struct { | |||
// The set of nodes that should be considered downstream; if nil then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps outside of the scope of this PR (something we can fix as followup), but a question for @AxeZhan, @alculquicondor: Does it make sense to share this result type between PreFilter
and RunPreFilterPlugins
? As now, there is a field which users of the framework should never need to set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please elaborate? I don't quite get it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RunPreFilterPlugins
acts on all plugins, while PreFilter
runs a single plugin, but they both return the same PreFilterResult
type. My concerns are:
- With the addition to
PreFilterResult
I suggested, the fieldUnschedulablePlugins
should only be used byRunPreFilterPlugins
. AsPreFilter
shouldn't modify this field, this may be confusing in the interface, and we may have to defensively code against the case when it was modified. - We assume we can fold or "merge" these
PreFilterResults
into a singlePreFilterResult
. This may not always be true, and it is already a little clumsy. All we want to do is a set intersection onNodeNames
, and a set union onUnschedulablePlugins
. We don't need a complex merge function for that, where we have to safely handle one or both sides being nil. We just need two sets inRunPreFilterPlugins
.
I don't think this should block this PR, but waited to raise it as something we might clean up after, and therefore wanted to get your feedback. If you agree with this, I'm happy to make these changes after this PR merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only a few nits left. otherwise it looks good. thanks for addressing the comments!
if preRes != nil && preRes.UnschedulablePlugins != nil { | ||
diagnosis.UnschedulablePlugins = preRes.UnschedulablePlugins.Clone() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if preRes != nil && preRes.UnschedulablePlugins != nil { | |
diagnosis.UnschedulablePlugins = preRes.UnschedulablePlugins.Clone() | |
} | |
if preRes != nil { | |
diagnosis.UnschedulablePlugins = preRes.UnschedulablePlugins | |
} |
nit: since we don't use this field elsewhere, we may omit copy (and propagating nil value is fine)
result.UnschedulablePlugins = sets.New[string]() | ||
result.UnschedulablePlugins.Insert(pluginsWithNodes...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result.UnschedulablePlugins = sets.New[string]() | |
result.UnschedulablePlugins.Insert(pluginsWithNodes...) | |
result.UnschedulablePlugins = sets.New(pluginsWithNodes...) |
does this compile?
@@ -718,6 +718,9 @@ type PreFilterResult struct { | |||
// The set of nodes that should be considered downstream; if nil then | |||
// all nodes are eligible. | |||
NodeNames sets.Set[string] | |||
|
|||
// UnschedulablePlugins are plugins that returns Unschedulable or UnschedulableAndUnresolvable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// UnschedulablePlugins are plugins that returns Unschedulable or UnschedulableAndUnresolvable. | |
// UnschedulablePlugins are plugins which filter out nodes. |
if plugin returns UnschedulableAndUnresolvable
or Unschedulable
, we don't add to this set. How about this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should update this comment too, as we will now include plugins which return success, but filtered some nodes.
kubernetes/pkg/scheduler/framework/types.go
Lines 345 to 346 in b498eb9
// UnschedulablePlugins are plugins that returns Unschedulable or UnschedulableAndUnresolvable. | |
UnschedulablePlugins sets.Set[string] |
@olderTaoist Also, please fill in the release note, e.g.,
(Maybe we can say more, but no need to explain very deeply.) |
What type of PR is this?
/kind bug
What this PR does / why we need it:
unschedulable plugin isn't registered for those plugins, when some
PreFilter
filter out some Nodes. In this case, some changes in the cluster may change the result from thosePreFilter
and may make this Pod schedulable.Which issue(s) this PR fixes:
Fixes #122018
Special notes for your reviewer:
@sanposhiho
Does this PR introduce a user-facing change?