Short circuit volume checker if the pod is not requesting any volumes #73652
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bsalamat
@@ -1643,6 +1643,11 @@ func NewVolumeBindingPredicate(binder *volumebinder.VolumeBinder) FitPredicate {
}

func (c *VolumeBindingChecker) predicate(pod *v1.Pod, meta PredicateMetadata, nodeInfo *schedulernodeinfo.NodeInfo) (bool, []PredicateFailureReason, error) {
	// If pod does not request any volumes, we don't need to do anything.
	if len(pod.Spec.Volumes) == 0 {
This might help the benchmark but I'm not sure if it will help in reality because there is an admission controller that injects a secret volume into every pod.
This predicate should be a no-op if the pod does not use any PVCs. @cofyc, can you help check whether there is anything we can fix in FindPodVolumes to return early?
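To illustrate the concern above, here is a minimal sketch of why a `len(pod.Spec.Volumes) == 0` check rarely fires once a secret volume has been injected, while a PVC-based check still does. The types are simplified stand-ins for the Kubernetes API types (not the real `k8s.io/api` definitions), and `hasNoVolumes`, `usesPVC`, and the volume name are hypothetical names used only for this example.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes API types, so the sketch
// compiles without k8s.io/api. Only the fields used here are modeled.
type SecretVolumeSource struct{ SecretName string }
type PersistentVolumeClaimVolumeSource struct{ ClaimName string }

type Volume struct {
	Name                  string
	Secret                *SecretVolumeSource
	PersistentVolumeClaim *PersistentVolumeClaimVolumeSource
}

type PodSpec struct{ Volumes []Volume }
type Pod struct{ Spec PodSpec }

// hasNoVolumes mirrors the check in the first revision of the diff.
func hasNoVolumes(pod *Pod) bool {
	return len(pod.Spec.Volumes) == 0
}

// usesPVC reports whether any volume references a PersistentVolumeClaim.
func usesPVC(pod *Pod) bool {
	for _, v := range pod.Spec.Volumes {
		if v.PersistentVolumeClaim != nil {
			return true
		}
	}
	return false
}

func main() {
	// A pod whose only volume is an injected secret (hypothetical name).
	pod := &Pod{Spec: PodSpec{Volumes: []Volume{
		{Name: "injected-token", Secret: &SecretVolumeSource{SecretName: "injected-token"}},
	}}}

	fmt.Println(hasNoVolumes(pod)) // false: the volume-count short circuit does not fire
	fmt.Println(usesPVC(pod))      // false: there is still nothing for volume binding to do
}
```

Under these assumptions, only the PVC-based check lets the predicate skip work for a typical pod.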
FindPodVolumes will return early if the pod has no PVCs; see
kubernetes/pkg/controller/volume/persistentvolume/scheduler_binder.go
Lines 154 to 178 in 5323aed
// FindPodVolumes caches the matching PVs and PVCs to provision per node in podBindingCache.
// This method intentionally takes in a *v1.Node object instead of using volumebinder.nodeInformer.
// That's necessary because some operations will need to pass in to the predicate fake node objects.
func (b *volumeBinder) FindPodVolumes(pod *v1.Pod, node *v1.Node) (unboundVolumesSatisfied, boundVolumesSatisfied bool, err error) {
	podName := getPodName(pod)
	// Warning: Below log needs high verbosity as it can be printed several times (#60933).
	klog.V(5).Infof("FindPodVolumes for pod %q, node %q", podName, node.Name)
	// Initialize to true for pods that don't have volumes
	unboundVolumesSatisfied = true
	boundVolumesSatisfied = true
	start := time.Now()
	defer func() {
		VolumeSchedulingStageLatency.WithLabelValues("predicate").Observe(time.Since(start).Seconds())
		if err != nil {
			VolumeSchedulingStageFailed.WithLabelValues("predicate").Inc()
		}
	}()
	if !podHasClaims(pod) {
		// Fast path
		return unboundVolumesSatisfied, boundVolumesSatisfied, nil
	}
How about skipping the VolumeSchedulingStageLatency logic and the klog.V call in FindPodVolumes too? This would make the predicate a real no-op if the pod does not use any PVCs.
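A rough sketch of that suggestion: move the claims check to the top so that the fast path does no logging, no clock read, and no metric update. This is not the actual `FindPodVolumes` code. `Pod` is a simplified stand-in, and `observations` with `observeLatency` are hypothetical substitutes for the Prometheus histogram so the example stays self-contained.

```go
package main

import (
	"fmt"
	"time"
)

// Pod is a simplified stand-in; ClaimNames lists the PVCs the pod uses.
type Pod struct{ ClaimNames []string }

func podHasClaims(pod *Pod) bool { return len(pod.ClaimNames) > 0 }

// observations counts recorded latency samples. It stands in for a
// Prometheus histogram in this sketch.
var observations int

func observeLatency(d time.Duration) { observations++ }

// findPodVolumes checks for claims before doing anything else, so pods
// without PVCs skip the clock read and the metric update entirely.
func findPodVolumes(pod *Pod) (unboundSatisfied, boundSatisfied bool, err error) {
	if !podHasClaims(pod) {
		// Fast path: nothing to log, time, or record.
		return true, true, nil
	}

	start := time.Now()
	defer func() { observeLatency(time.Since(start)) }()

	// The real volume matching is elided; this sketch assumes it succeeds.
	return true, true, nil
}

func main() {
	findPodVolumes(&Pod{})
	fmt.Println(observations) // 0: the fast path recorded nothing

	findPodVolumes(&Pod{ClaimNames: []string{"data"}})
	fmt.Println(observations) // 1: only the pod with a claim was timed
}
```

The trade-off in this arrangement is that latency metrics no longer include pods without claims, which changes what the histogram measures.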
Thanks, @msau42 and @cofyc!
We definitely need to fix this issue. The current predicate is unacceptably slow for pods that do not request volumes.
FindPodVolumes gets the pod name, reads the time (which often needs a system call), and increments Prometheus metrics even when a pod does not need any volume.
The predicate itself (which calls FindPodVolumes) also allocates an array for failure reasons and processes the output of FindPodVolumes for all pods, whether they request a volume or not. The predicate itself must have a check to bypass all this and return immediately when a pod does not need a volume. I changed the PR to check for PVCs and return immediately if there are none.
PTAL.
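For readers following along, here is a minimal sketch of the shape described above: the predicate checks for PVCs first and returns before calling the binder or allocating a failure-reason slice. The types are simplified stand-ins rather than the scheduler's real ones, the body of `podHasPVCs` is an assumption (only its name appears in the diff), and the reason strings are hypothetical.

```go
package main

import "fmt"

// Volume is a simplified stand-in; an empty ClaimName means the volume
// is not backed by a PersistentVolumeClaim.
type Volume struct{ ClaimName string }
type Pod struct{ Volumes []Volume }

// binder stands in for the volume binder and counts how often it is called.
type binder struct{ calls int }

func (b *binder) FindPodVolumes(pod *Pod, node string) (unbound, bound bool, err error) {
	b.calls++
	return true, true, nil // matching elided; assume everything is satisfied
}

// podHasPVCs is an assumed implementation of the helper named in the diff.
func podHasPVCs(pod *Pod) bool {
	for _, v := range pod.Volumes {
		if v.ClaimName != "" {
			return true
		}
	}
	return false
}

type checker struct{ binder *binder }

func (c *checker) predicate(pod *Pod, node string) (bool, []string, error) {
	// Short circuit: no PVCs means no binder call and no allocation.
	if !podHasPVCs(pod) {
		return true, nil, nil
	}

	unbound, bound, err := c.binder.FindPodVolumes(pod, node)
	if err != nil {
		return false, nil, err
	}

	var reasons []string
	if !unbound {
		reasons = append(reasons, "unbound volumes not satisfied")
	}
	if !bound {
		reasons = append(reasons, "bound volumes not satisfied")
	}
	return len(reasons) == 0, reasons, nil
}

func main() {
	b := &binder{}
	c := &checker{binder: b}

	c.predicate(&Pod{Volumes: []Volume{{}}}, "node-1") // secret-like volume, no PVC
	fmt.Println(b.calls)                               // 0

	c.predicate(&Pod{Volumes: []Volume{{ClaimName: "data"}}}, "node-1")
	fmt.Println(b.calls) // 1
}
```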
0011e57 to eb59bc6
LGTM
Thanks, @cofyc!
func (c *VolumeBindingChecker) predicate(pod *v1.Pod, meta PredicateMetadata, nodeInfo *schedulernodeinfo.NodeInfo) (bool, []PredicateFailureReason, error) {
	// If pod does not request any PVC, we don't need to do anything.
	if !podHasPVCs(pod) {
I think adding a log statement at level 5 or above here would help in debugging.
/lgtm
What type of PR is this?
/kind cleanup
This is an optimization.
What this PR does / why we need it:
The volume checker predicate function takes about 5% of the scheduler's execution time in our benchmarks, even though the pods in those tests do not request any volumes. This PR short-circuits the logic for pods that do not need any volumes.
Before:
After:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
/sig scheduling
/assign @msau42 @cofyc