-
Notifications
You must be signed in to change notification settings - Fork 38.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
volumebinding: scheduler queueing hints - StorageClass #124958
volumebinding: scheduler queueing hints - StorageClass #124958
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: bells17 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/cc @sanposhiho @utam0k |
/retest |
/kind feature |
651ccab
to
e4fd29b
Compare
/retest |
837ce6f
to
c9d523a
Compare
e3ed38e
to
ee91dca
Compare
ee91dca
to
c4507f6
Compare
/cc @carlory |
/cc @jsafrane @xing-yang |
if err != nil { | ||
return framework.Queue, err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's forbidden to update the volume binding mode of a storage class. If a storage class's binding mode isn't VolumeBindingWaitForFirstConsumer, when it is created or updated, the action itself doesn't make the pod become schedulable directly. If then some PVCs is bound by PVs due to the storage class (re-)created, it should be reflected in the PVC update event. so in this function, we should only take care of special storage classes with the VolumeBindingWaitForFirstConsumer binding mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we only consider VolumeBindingWaitForFirstConsumer, this PR will depend on the other hint func which handle PVC events.
So let's merge another firstly once that PR is ready to be merged. @sanposhiho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your suggestion. In that case, would you mind reviewing the following PR first, which adds QHint for PVC?
#124959
if err != nil { | ||
if pinfo.isEphemeral && apierrors.IsNotFound(err) { | ||
err = fmt.Errorf("waiting for ephemeral volume controller to create the persistentvolumeclaim %q", pinfo.pvcName) | ||
return framework.Queue, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why return an error? If so, the pod will put into BackOffQ. It doesn't make pod become schedulable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your comment. After considering your feedback, I think it might be better to make the following changes:
for _, vol := range pod.Spec.Volumes {
var pvc *v1.PersistentVolumeClaim
switch {
case vol.PersistentVolumeClaim != nil:
pvcName := vol.PersistentVolumeClaim.ClaimName
pvc, err = pl.PVCLister.PersistentVolumeClaims(pod.Namespace).Get(pvcName)
if err != nil {
return framework.Queue, err
}
case vol.Ephemeral != nil:
pvc = &v1.PersistentVolumeClaim{
ObjectMeta: vol.Ephemeral.VolumeClaimTemplate.ObjectMeta,
Spec: vol.Ephemeral.VolumeClaimTemplate.Spec,
}
default:
continue
}
if pvc.Spec.VolumeName != "" {
// Skipping the check for CSIStorageCapacity as the PVC is configured
// to be bound to an existing PV.
continue
}
className := volume.GetPersistentVolumeClaimClass(pvc)
if className == newSC.Name {
if oldSC == nil {
logger.V(4).Info("StorageClass was created")
return framework.Queue, nil
}
if !apiequality.Semantic.DeepEqual(newSC.AllowedTopologies, oldSC.AllowedTopologies) {
logger.V(4).Info("StorageClass was created or updated, and changed Provisioner", "AllowedTopologies", newSC.AllowedTopologies)
return framework.Queue, nil
}
}
}
What do you think about this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried committing the code content mentioned in the above comment.
7cc8646
to
537975d
Compare
…e's PVC Template when a Pod is using an Ephemeral Volume
537975d
to
dbbaa06
Compare
@@ -123,6 +127,60 @@ func (pl *VolumeBinding) EventsToRegister() []framework.ClusterEventWithHint { | |||
return events | |||
} | |||
|
|||
func (pl *VolumeBinding) isSchedulableAfterStorageClassChange(logger klog.Logger, pod *v1.Pod, oldObj, newObj interface{}) (framework.QueueingHint, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to #124961 (comment); if we can consider StorageClass update is not that frequent, it's not worthy for a fine but time-consuming filtering (using PVCLister.PersistentVolumeClaims().Get()
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a first step, we should make it minimum, at least until we establish some observability around QHint and could ensure this level of fine filtering doesn't impact a large cluster negatively (#124566).
5acaacf
to
843435f
Compare
@sanposhiho Thank you for your review. I have made the necessary changes, so please take another look. |
/close |
@bells17: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
kube-scheduler implements scheduling hints for the VolumeBinding plugin.
The scheduling hints allow the scheduler to determine whether to retry or skip scheduling a Pod based on the changes made to the StorageClass resource referenced by the plugin.
Which issue(s) this PR fixes:
Part of #118893
KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/4247-queueinghint/README.md
Base PR: #124939
Special notes for your reviewer:
Fields Impacting QueueingHintFn
PersistentVolume (PV) is not included in this table because it can undergo extensive changes when a conversion is performed by csi-translation-lib.
ref(ja): https://zenn.dev/bells17/scraps/65bd6891012bdc
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: