volumebinding: scheduler queueing hints - CSIStorageCapacity #124961

bells17 · 2024-05-20T04:48:24Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

kube-scheduler implements scheduling hints for the VolumeBinding plugin.
The scheduling hints allow the scheduler to determine whether to retry or skip scheduling a Pod based on the changes made to the CSIStorageCapacity resource referenced by the plugin.

Which issue(s) this PR fixes:

Part of #118893
KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/4247-queueinghint/README.md
Base PR: #124939

Special notes for your reviewer:

Fields Impacting QueueingHintFn

PersistentVolume (PV) is not included in this table because it can undergo extensive changes when a conversion is performed by csi-translation-lib.

| Resource | Field | Referenced in PreFilter+Filter? | Admission Overwrite Prevention Config | Need to Check Changes in QHint? |
| --- | --- | --- | --- | --- |
| CSIStorageCapacity | .metadata.labels | x | x | x |
| CSIStorageCapacity | .metadata.annotations | x | x | x |
| CSIStorageCapacity | .nodeTopology | o | o | x |
| CSIStorageCapacity | .storageClassName | o | o | x |
| CSIStorageCapacity | .capacity | o | x | o |
| CSIStorageCapacity | .maximumVolumeSize | o | x | o |
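
As a rough illustration of the table's last column, the sketch below compares only the two fields marked as needing a check (`.capacity` and `.maximumVolumeSize`) between the old and new object. The `capacityInfo` struct, the `hintForCapacityUpdate` function, and the local `QueueingHint` type are hypothetical stand-ins, not the real `storagev1.CSIStorageCapacity` or `framework.QueueingHint` types, and "queue on any change" is an assumption made for illustration rather than the logic in this PR.

```go
package main

import "fmt"

// QueueingHint is a local stand-in for framework.QueueingHint (assumption).
type QueueingHint int

const (
	QueueSkip QueueingHint = iota
	Queue
)

// capacityInfo is a hypothetical, simplified view of a CSIStorageCapacity object.
// Quantities are plain byte counts; a nil pointer means the field is unset.
type capacityInfo struct {
	NodeTopology      string
	StorageClassName  string
	Capacity          *int64
	MaximumVolumeSize *int64
}

// sameQuantity reports whether two optional quantities are equal,
// treating "both unset" as equal and "one unset" as different.
func sameQuantity(a, b *int64) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// hintForCapacityUpdate returns Queue only when one of the two fields the
// table marks as "need to check" differs between the old and new object.
func hintForCapacityUpdate(oldCap, newCap capacityInfo) QueueingHint {
	if !sameQuantity(oldCap.Capacity, newCap.Capacity) ||
		!sameQuantity(oldCap.MaximumVolumeSize, newCap.MaximumVolumeSize) {
		return Queue
	}
	return QueueSkip
}

func main() {
	gib := func(n int64) *int64 { v := n << 30; return &v }
	before := capacityInfo{StorageClassName: "fast", Capacity: gib(10)}
	after := capacityInfo{StorageClassName: "fast", Capacity: gib(20)}
	fmt.Println(hintForCapacityUpdate(before, after) == Queue)
	fmt.Println(hintForCapacityUpdate(before, before) == QueueSkip)
}
```

Labels and annotations are deliberately ignored here because the table marks them as not referenced in PreFilter or Filter.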

ref(ja): https://zenn.dev/bells17/scraps/65bd6891012bdc

Does this PR introduce a user-facing change?

kube-scheduler implements scheduling hints for the VolumeBinding plugin.
The scheduling hints allow the scheduler to retry scheduling a Pod that was previously rejected by the VolumeBinding plugin only if a new resource referenced by the plugin was created or an existing resource referenced by the plugin was updated.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

k8s-ci-robot · 2024-05-20T04:48:29Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bells17
Once this PR has been reviewed and has the lgtm label, please assign thockin for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

pkg/scheduler/framework/plugins/volumebinding/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2024-05-20T04:48:32Z

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

bells17 · 2024-05-20T05:34:05Z

/cc @sanposhiho @utam0k

sanposhiho · 2024-05-20T06:58:53Z

/kind feature

linux-foundation-easycla · 2024-05-21T04:31:49Z

The committers listed above are authorized under a signed CLA.

✅ login: bells17 / name: bells17 (411d7ae, 2e8edf8, 74f80bf, 5187bf1, c05b069, 362a702, 21ad053, e1e7313, 8a942f3)

sanposhiho · 2024-05-26T06:48:59Z

pkg/scheduler/framework/plugins/volumebinding/volume_binding.go

 	}
 	return events
 }

+func (pl *VolumeBinding) isSchedulableAfterCSIStorageCapacityChange(logger klog.Logger, pod *v1.Pod, oldObj, newObj interface{}) (framework.QueueingHint, error) {


How often is CSIStorageCapacity updated in an actual world?

If it's infrequent in general, can we just return Queue when CSIStorageCapacity is created/updated and a Pod has specific .spec.volumes?

e.g.,

func (pl *VolumeBinding) isSchedulableAfterCSIStorageCapacityChange(logger klog.Logger, pod *v1.Pod, oldObj, newObj interface{}) (framework.QueueingHint, error) {
	// ...
	for _, vol := range pod.Spec.Volumes {
		if vol.PersistentVolumeClaim != nil || vol.Ephemeral != nil {
			// This Pod might have become unschedulable due to CSIStorageCapacity in a past scheduling cycle.
			return framework.Queue, nil
		}
	}
	return framework.QueueSkip, nil
}

Thank you for your comment.

How often is CSIStorageCapacity updated in an actual world?

CSIStorageCapacity is updated by the external-provisioner of each CSI Driver. The relevant code can be found here:
https://github.com/kubernetes-csi/external-provisioner/blob/v5.0.0/pkg/capacity/capacity.go#L592-L669

There are several events that trigger the update, but with the default settings, each CSIStorageCapacity for a single CSI Driver is updated at least every minute:
https://github.com/kubernetes-csi/external-provisioner/blob/v5.0.0/pkg/capacity/capacity.go#L512-L526
https://github.com/kubernetes-csi/external-provisioner/blob/v5.0.0/cmd/csi-provisioner/csi-provisioner.go#L111

The number of CSIStorageCapacity objects depends on the NodeTopology configuration of the target CSI Driver.
For example, with network volumes such as those provided by the gcp-compute-persistent-disk-csi-driver, objects can be created per zone:
https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/v999.1.0/pkg/gce-pd-csi-driver/node.go#L377-L392
With node-local LVM storage, as in topolvm, a CSIStorageCapacity object is created per node:
https://github.com/topolvm/topolvm/blob/main/internal/driver/node.go#L557-L566

If it's infrequent in general, can we just return Queue when CSIStorageCapacity is created/updated and a Pod has specific .spec.volumes?

Yes, that approach alone can help skip queueing Pods unrelated to CSIStorageCapacity, reducing unnecessary overhead.
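
For reference, a minimal self-contained sketch of the suggested check is shown here. The `pod` and `volume` structs and the local `QueueingHint` type are hypothetical stand-ins for `v1.Pod`, `v1.Volume`, and `framework.QueueingHint`, so this only models the decision and is not the plugin code itself.

```go
package main

import "fmt"

// QueueingHint is a local stand-in for framework.QueueingHint (assumption).
type QueueingHint int

const (
	QueueSkip QueueingHint = iota
	Queue
)

// volume is a hypothetical, reduced form of v1.Volume: it records only whether
// the volume is backed by a PVC or is a generic ephemeral volume.
type volume struct {
	Name                  string
	PersistentVolumeClaim *string // claim name; nil if not PVC-backed
	Ephemeral             bool
}

// pod is a hypothetical, reduced form of v1.Pod.
type pod struct {
	Name    string
	Volumes []volume
}

// hintAfterCapacityEvent returns Queue if the Pod uses at least one PVC-backed
// or generic ephemeral volume, and QueueSkip otherwise.
func hintAfterCapacityEvent(p pod) QueueingHint {
	for _, vol := range p.Volumes {
		if vol.PersistentVolumeClaim != nil || vol.Ephemeral {
			return Queue
		}
	}
	return QueueSkip
}

func main() {
	claim := "data"
	withPVC := pod{Name: "db", Volumes: []volume{{Name: "data", PersistentVolumeClaim: &claim}}}
	withoutVolumes := pod{Name: "web"}
	fmt.Println(hintAfterCapacityEvent(withPVC) == Queue)
	fmt.Println(hintAfterCapacityEvent(withoutVolumes) == QueueSkip)
}
```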

bells17 · 2024-05-26T16:00:23Z

@sanposhiho @toVersus Thank you for your review. I have made the necessary changes, so please take another look.

toVersus

Only some nits, other than that LGTM.

sanposhiho · 2024-05-27T08:01:07Z

pkg/scheduler/framework/plugins/volumebinding/volume_binding.go

+	for _, vol := range pod.Spec.Volumes {
+		if vol.PersistentVolumeClaim != nil || vol.Ephemeral != nil {
+			// This Pod might have got unschedulable due to CSIStorageCapacity in a past scheduling cycle.
+			logger.V(5).Info("CSIStorageCapacity was created or updated, but the pod doesn't mount any persistent volumes and generic ephemeral storage volumes. It doesn't make this pod schedulable.")


It makes me wonder: in the first place, could a Pod without any PVC or ephemeral volume get unschedulable by VolumeBinding plugin?
If No, we likely end up having no QHint for CSIStorageCapacity (for now, at least until we build some observability like I said).

I'm sorry, I made a mistake in correcting the log message.
(This has been fixed)

It makes me wonder: in the first place, could a Pod without any PVC or ephemeral volume get unschedulable by VolumeBinding plugin?

I think the answer to this is No.

If No, we likely end up having no QHint for CSIStorageCapacity (for now, at least until we build some observability like I said).

Regarding this point, I think that if we don't return QueueSkip in the QHint, scheduling will be attempted due to CSIStorageCapacity changes even in cases where the Pod doesn't have a Volume. Would this not cause any issues in your opinion?

Regarding this point, I think that if we don't return QueueSkip in the QHint, scheduling will be attempted due to CSIStorageCapacity changes even in cases where the Pod doesn't have a Volume. Would this not cause any issues in your opinion?

Only the QHints of plugins that rejected a Pod are run. For example, if a Pod was rejected by the resource fit plugin and the PodAffinity plugin in a previous scheduling cycle, only the QHints of those plugins are run for that Pod.

So, if the answer to my first question ("could a Pod without any PVC or ephemeral volume get unschedulable by VolumeBinding plugin?") is No, filtering for Pods with volumes in the QHint serves no purpose, since only Pods with volumes would reach this QHint in the first place.
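
The dispatch rule described above can be modelled roughly as follows. This is a simplified sketch, not the scheduling queue's actual code: `shouldRequeue`, `hintFn`, and the plugin-name keys are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// QueueingHint is a local stand-in for framework.QueueingHint (assumption).
type QueueingHint int

const (
	QueueSkip QueueingHint = iota
	Queue
)

// hintFn is a hypothetical, reduced hint signature that sees only the Pod name.
type hintFn func(podName string) QueueingHint

// shouldRequeue consults only the hints of plugins that rejected the Pod in a
// previous scheduling cycle. Hints registered by other plugins are never called.
func shouldRequeue(podName string, rejectedBy map[string]bool, hints map[string]hintFn) bool {
	for plugin, fn := range hints {
		if !rejectedBy[plugin] {
			continue // this plugin did not reject the Pod, so its hint is skipped
		}
		if fn(podName) == Queue {
			return true
		}
	}
	return false
}

func main() {
	hints := map[string]hintFn{
		"VolumeBinding":    func(string) QueueingHint { return Queue },
		"NodeResourcesFit": func(string) QueueingHint { return QueueSkip },
	}
	// A Pod rejected only by NodeResourcesFit never reaches the VolumeBinding hint.
	fmt.Println(shouldRequeue("web", map[string]bool{"NodeResourcesFit": true}, hints))
	// A Pod rejected by VolumeBinding does.
	fmt.Println(shouldRequeue("db", map[string]bool{"VolumeBinding": true}, hints))
}
```

Under this model, a Pod without volumes would not appear in the VolumeBinding plugin's rejected set, which is why the extra volume filter inside the hint adds nothing.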

Thank you for your input. I understand.
I'll proceed to close the unnecessary PRs.

toVersus · 2024-05-30T03:21:12Z

/lgtm

k8s-ci-robot · 2024-05-30T03:21:16Z

@toVersus: changing LGTM is restricted to collaborators

In response to this:

/lgtm

bells17 · 2024-05-31T07:55:04Z

/close
#124961 (comment)

k8s-ci-robot · 2024-05-31T07:55:10Z

@bells17: Closed this PR.

In response to this:

/close
#124961 (comment)

k8s-ci-robot requested review from humblec and saikat-royc May 20, 2024 04:48

k8s-ci-robot requested review from sanposhiho and utam0k May 20, 2024 05:34

bells17 mentioned this pull request May 20, 2024

[Umbrella] Implement QueueingHintFn in in-tree plugins #118893

Open

12 tasks

k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. and removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels May 20, 2024

volumebinding: scheduler queueing hints - CSIStorageCapacity

411d7ae

bells17 force-pushed the qhint-volume-binding-csistoragecapacity branch from 20b3c2b to 411d7ae Compare May 20, 2024 12:44

toVersus reviewed May 20, 2024

Fixed points mentioned in the review

74f80bf

bells17 force-pushed the qhint-volume-binding-csistoragecapacity branch from bedc76c to 74f80bf Compare May 20, 2024 16:17

toVersus reviewed May 20, 2024

k8s-ci-robot removed the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 21, 2024

k8s-ci-robot requested review from jsafrane and xing-yang May 22, 2024 08:56

Update

2e8edf8

toVersus reviewed May 25, 2024

toVersus reviewed May 25, 2024

sanposhiho reviewed May 26, 2024

sanposhiho reviewed May 26, 2024

sanposhiho mentioned this pull request May 26, 2024

volumebinding: scheduler queueing hints - StorageClass #124958

Closed

bells17 force-pushed the qhint-volume-binding-csistoragecapacity branch 2 times, most recently from 5346abf to be636e8 Compare May 26, 2024 13:19

Fixed points mentioned in the review

362a702

bells17 force-pushed the qhint-volume-binding-csistoragecapacity branch from be636e8 to 362a702 Compare May 26, 2024 13:38

bells17 requested review from toVersus and sanposhiho May 26, 2024 13:48

toVersus reviewed May 27, 2024

sanposhiho reviewed May 27, 2024

Fixed points mentioned in the review

8a942f3

bells17 force-pushed the qhint-volume-binding-csistoragecapacity branch from 82b1102 to 8a942f3 Compare May 29, 2024 14:58

bells17 requested review from sanposhiho and toVersus May 30, 2024 02:21

toVersus reviewed May 30, 2024

Add a testcase

21ad053

bells17 requested a review from toVersus May 30, 2024 04:11

k8s-ci-robot closed this May 31, 2024

bells17 mentioned this pull request May 31, 2024

volumebinding: scheduler queueing hints - PersistentVolume #124960

Closed
