volumezone: scheduler queueing hints: node #124998

Closed
wants to merge 3 commits into from


@Gekko0114 Gekko0114 commented May 20, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

Added scheduler queueing hints to volume zone plugin.
This PR covers node.

I divided #119373 into the following 4 PRs:
#125000
#125001
#124996
#124998

Which issue(s) this PR fixes:

Part of #118893

Special notes for your reviewer:

Thank you so much for your review!

Does this PR introduce a user-facing change?

kube-scheduler implements scheduling hints for the VolumeZone plugin.
The scheduling hints allow the scheduler to retry scheduling a Pod
that was previously rejected by the VolumeZone plugin only if the addition/update of a Node,
addition/update of a PV, addition/update of a PVC, or addition of an SC matches the Pod's topology settings.
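
A minimal sketch of the node case described in this note, not the code in this PR. It assumes the topology labels of the Pod's PV are already known and passed in as `pvLabels`, and it uses exact label matching. The `QueueingHint` type, `nodeMatchesPV`, and `hintAfterNodeChange` are hypothetical local stand-ins, not the scheduler framework's API.

```go
package main

// QueueingHint is a local stand-in for the scheduler framework's hint value
// (assumption: only the two outcomes below matter for this illustration).
type QueueingHint int

const (
	QueueSkip QueueingHint = iota // the event cannot help; leave the Pod where it is
	Queue                         // the event may help; retry scheduling the Pod
)

// topologyKeys lists the well-known zone and region labels compared here.
var topologyKeys = []string{
	"topology.kubernetes.io/zone",
	"topology.kubernetes.io/region",
}

// nodeMatchesPV reports whether the node carries the same value as the PV
// for every topology label the PV sets. Labels the PV lacks are ignored.
func nodeMatchesPV(nodeLabels, pvLabels map[string]string) bool {
	for _, key := range topologyKeys {
		want, ok := pvLabels[key]
		if !ok {
			continue
		}
		if nodeLabels[key] != want {
			return false
		}
	}
	return true
}

// hintAfterNodeChange decides whether a node add/update could make the Pod
// schedulable. A nil oldNodeLabels signals a node add event (assumption).
func hintAfterNodeChange(oldNodeLabels, newNodeLabels, pvLabels map[string]string) QueueingHint {
	if !nodeMatchesPV(newNodeLabels, pvLabels) {
		return QueueSkip // the node still does not fit the PV's topology
	}
	if oldNodeLabels != nil && nodeMatchesPV(oldNodeLabels, pvLabels) {
		return QueueSkip // the node already matched before the update
	}
	return Queue
}
```

Under these assumptions, a node whose zone label changes from a non-matching zone to the PV's zone yields `Queue`, while an update that leaves the topology labels untouched yields `QueueSkip`.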

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 20, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label May 20, 2024
@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 20, 2024
@Gekko0114
Member Author

/assign @sanposhiho

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Gekko0114
Once this PR has been reviewed and has the lgtm label, please ask for approval from sanposhiho. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

@Gekko0114: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-integration | a3af1e3 | link | true | `/test pull-kubernetes-integration` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@Gekko0114
Member Author

/assign @msau42
/assign @xing-yang

// isSchedulableAfterNodeChange is invoked whenever a node is added or updated.
// It checks whether the change of Node has made a previously unschedulable pod schedulable.
// If the node label related to pod's pv topology is changed, it could make the pod schedulable.
func (pl *VolumeZone) isSchedulableAfterNodeChange(logger klog.Logger, pod *v1.Pod, oldObj, newObj interface{}) (framework.QueueingHint, error) {
Member

ditto 🙏

As we discussed in DM, can you refactor this QHint so that it doesn't fetch any other kind of resource? Then, if you think there's nothing we can do for StorageClass, please just update the comment in EventsToRegister. 🙏
https://github.com/kubernetes/kubernetes/pull/124996/files#r1621867140

Member Author

This PR doesn't depend on any other resources.
Node QHint only depends on nodes and the pod.

Member Author

@Gekko0114 Gekko0114 May 31, 2024

Even though pvcLister is used within getPVbyPod, getPVbyPod is not a new function and I don't modify this function in this PR.
Therefore I thought using getPVbyPod in isSchedulableAfterNodeChange is fine.

Member

Whether getPVbyPod is a new function introduced in this PR or not doesn't really matter.

If isSchedulableAfterNodeChange calls getPVbyPod and getPVbyPod calls pl.pvcLister.PersistentVolumeClaims().Get(), then isSchedulableAfterNodeChange depends on PVC, doesn't it?
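
The transitive dependency can be illustrated with a small sketch. Everything here is a hypothetical stand-in: `pvNamesForPod` plays the role of getPVbyPod, `nodeChangeMayHelp` plays the role of the node hint, and `countingPVCGetter` is a fake lister that counts its reads. It is not the plugin's actual code.

```go
package main

// pvcGetter is a hypothetical stand-in for a PVC lister: it resolves a claim
// to the name of the PV it is bound to.
type pvcGetter interface {
	Get(namespace, name string) (volumeName string, ok bool)
}

// countingPVCGetter is a fake that records how often PVCs are read.
type countingPVCGetter struct {
	claims map[string]string // "namespace/name" -> bound PV name
	calls  int
}

func (c *countingPVCGetter) Get(namespace, name string) (string, bool) {
	c.calls++
	pv, ok := c.claims[namespace+"/"+name]
	return pv, ok
}

// pvNamesForPod stands in for getPVbyPod: it needs PVCs to find the Pod's PVs.
func pvNamesForPod(pvcs pvcGetter, namespace string, claimNames []string) []string {
	var pvs []string
	for _, claim := range claimNames {
		if pv, ok := pvcs.Get(namespace, claim); ok {
			pvs = append(pvs, pv)
		}
	}
	return pvs
}

// nodeChangeMayHelp stands in for a node hint. It never touches PVCs directly,
// but because it calls pvNamesForPod it still reads them.
func nodeChangeMayHelp(pvcs pvcGetter, namespace string, claimNames []string) bool {
	return len(pvNamesForPod(pvcs, namespace, claimNames)) > 0
}
```

Calling `nodeChangeMayHelp` with a `countingPVCGetter` increments `calls`, so the hint depends on PVC state even though the PVC read sits one call deeper.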

Member Author

@Gekko0114 Gekko0114 May 31, 2024

You are right.
I thought the point is whether the QHint PR increases the complexity of the code.
This PR doesn't introduce getPVbyPod.
Therefore I thought it is OK.
Please let me know if I misunderstood.

Member Author

> Whether getPVbyPod is a new function introduced in this PR or not doesn't really matter.

Sorry, I've missed this message. I will fix it. Thanks!

Member Author

I don't think we can implement isSchedulableAfterNodeChange without getPVbyPod.
Therefore, I will close this PR.

Member

Yeah, the point is whether the QHint would go through a complicated implementation, that is, fetching another kind of resource, rather than how complex the PR is.

@Gekko0114 Gekko0114 closed this Jun 1, 2024
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.