Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the permit extension point in scheduler. #77559

Merged
merged 1 commit into from May 13, 2019

Conversation

@ahg-g
Copy link
Member

commented May 7, 2019

Ref #77372

Looking for early feedback, tests and example plugins to be added.

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind feature

What this PR does / why we need it:
Implements the permit extension point of the new scheduler framework

Which issue(s) this PR fixes:

Fixes #77372

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Implement Permit extension point of the scheduling framework.
@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 7, 2019

Hi @ahg-g. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested review from bsalamat and wgliang May 7, 2019

@bsalamat
Copy link
Member

left a comment

Thanks, @ahg-g for sending this PR quickly.

Show resolved Hide resolved pkg/scheduler/framework/v1alpha1/interface.go Outdated
@@ -30,11 +31,18 @@ import (
type framework struct {
registry Registry
nodeInfoSnapshot *cache.NodeInfoSnapshot
waitingPods []*WaitingPod

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 7, 2019

Member

This should probably be a map from a pod UID to the pod pointer. A map allows faster look-up. For example, a plugin might want to store UIDs of some pods that it monitors and approve them once a certain condition is met. The plugin can use to UIDs to find the pods faster in the map.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 8, 2019

Author Member

I changed this to a map. The plugin can call WaitingPods to get a pointer to the map, and can either iterate over it or use UID to lookup specific pods.

pod: pod,
s: make(chan *Status),
}
f.waitingPods = append(f.waitingPods, w)

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 7, 2019

Member

IIUC, when all premit plugins return "Success", we still add the pod to waitingPods. This does not seem correct to me.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 8, 2019

Author Member

Oops, fixed!

return w.pod
}

func (w *WaitingPod) SetStatus(status *Status) {

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 7, 2019

Member

I am not sure about SetStatus. I am more inclined towards providing an Approve and a Reject function. Those functions must also be exposed by the framework.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 8, 2019

Author Member

I changed SetStatus to Approve and Reject, but kept them in the WaitingPod struct. The way I see this being used is that the caller will get a pointer to the waitingPods map, iterate over it or lookup a specific WaitingPod use a UID, and then call Approve or Reject on the WaitingPod.

f.waitingPods = append(f.waitingPods, w)
if statusCode == Wait {
timer := time.NewTimer(timeout)
defer timer.Stop()

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 7, 2019

Member

Do we need this?

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 8, 2019

Author Member

No, removed. What was missing is removing the WaitingPod from the map, which I added.

if forgetErr := sched.Cache().ForgetPod(assumedPod); forgetErr != nil {
klog.Errorf("scheduler cache ForgetPod failed: %v", forgetErr)
}
sched.recordSchedulingFailure(assumedPod, permitStatus.AsError(), reason, permitStatus.Message())

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 7, 2019

Member

We must ensure that Unreserve plugins are called here. The PR for Unreserve plugins is not merged yet.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 8, 2019

Author Member

Ack, once that one is submitted, I will rebase and call Unreserve.

}

const (
// Specifies the maximum timeout a permit plugin can return.
maxTimeout time.Duration = 5 * time.Minute

This comment has been minimized.

Copy link
@wgliang

wgliang May 7, 2019

Member

Whether the size of this time decision needs to be based or configurable?

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 8, 2019

Author Member

Do we have a way to pass configurations to the framework? If not and we need one, then I think it is better to do it in a separate PR.

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

Yes, there will be configuration for plugins. It is under construction: #77501.
Since plugins can have their own configuration, I think we should have a long default timeout here and let each plugin shorten the timeout when it is appropriate. Based on some observations in the past, 5 minutes is short. We should probably change this to 15 minutes.

@@ -105,6 +113,62 @@ func (f *framework) RunReservePlugins(
return nil
}

func (f *framework) RunPermitPlugins(
pc *PluginContext, pod *v1.Pod, nodeName string) *Status {
timeout := maxTimeout

This comment has been minimized.

Copy link
@wgliang

wgliang May 7, 2019

Member

Why re-declare a new variable here instead of using maxTimeout directly?

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 8, 2019

Author Member

timeout is used to track the minimum duration returned by any plugin. I added a comment to the interface documentation to clarify the semantics.

@ahg-g ahg-g changed the title [WIP] Implement the permit extension point in scheduler. Implement the permit extension point in scheduler. May 9, 2019

@bsalamat
Copy link
Member

left a comment

Thanks, @ahg-g! Looks good. Only a few minor comments.

}

const (
// Specifies the maximum timeout a permit plugin can return.
maxTimeout time.Duration = 5 * time.Minute

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

Yes, there will be configuration for plugins. It is under construction: #77501.
Since plugins can have their own configuration, I think we should have a long default timeout here and let each plugin shorten the timeout when it is appropriate. Based on some observations in the past, 5 minutes is short. We should probably change this to 15 minutes.

// RunPermitPlugins runs the set of configured permit plugins. If any of these
// plugins returns a status other than "Success" or "Wait", it does not continue
// running the remaining plugins and returns an error. Otherwise, if any of the
// plugins returns "Wait", then this function will block for the timeout period

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

It would be good to clarify that if any plugin returns "wait", this function will continue running the remaining plugins.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 10, 2019

Author Member

Done.

@@ -139,4 +175,13 @@ type FrameworkHandle interface {
// a pod finishes "Reserve" point. There is no guarantee that the information
// remains unchanged in the binding phase of scheduling.
NodeInfoSnapshot() *internalcache.NodeInfoSnapshot

// IterateOverWaitingPods acquires a read lock and iterates over the WaitingPods map.
IterateOverWaitingPods(callback func(types.UID, WaitingPod))

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

Why does callback take a UID? UID can be retrieved from the pod object if needed.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 10, 2019

Author Member

Done.

}

// Add a new WaitingPod to the map.
func (m *waitingPodsMap) Add(uid types.UID, wp WaitingPod) {

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

I think we should get only a WaitingPod. UID can be retrieved from a WaitingPod. Receiving only a WaitingPod has another benefit. The UID will never mismatch the pod.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 10, 2019

Author Member

Done.

}

// Add a new WaitingPod to the map.
func (m *waitingPodsMap) Add(uid types.UID, wp WaitingPod) {

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

I think we should make "Add" and "Remove" private. Only the framework should be able to add or remove waiting pods. Also, if these functions are called by a "callback" function passed to "Iterate", they will cause a deadlock.


}

// IterateOverWaitingPods acquires a read lock and iterates over the WaitingPods map.

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

s/IterateOverWaitingPods/Iterate/

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 10, 2019

Author Member

Done.

}

// GetWaitingPodsCount returns the number of waiting pods.
func (f *framework) GetWaitingPodsCount() int {

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

Let's remove this function and added when needed.

This comment has been minimized.

Copy link
@ahg-g

ahg-g May 10, 2019

Author Member

This is actually used in the integration test.

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 10, 2019

Member

If it is only for testing purposes, I guess you can easily use IterateOverWaitingPods and pass a callback function that counts the pods.
The reason that I insist on removing this function or other functions that we don't have an obvious use for is that these functions will be a part of the framework API and will be hard to change or remove in the future. Once it is added, it will stay!

@bsalamat

This comment has been minimized.

Copy link
Member

commented May 9, 2019

Please add a release note.

/ok-to-test

@ahg-g

This comment has been minimized.

Copy link
Member Author

commented May 10, 2019

/retest

@bsalamat
Copy link
Member

left a comment

Thanks! Almost there. Please go ahead and squash commits. Also, run hack/update-bazel.sh in your working directory and commit changes after running the command.

}

// GetWaitingPodsCount returns the number of waiting pods.
func (f *framework) GetWaitingPodsCount() int {

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 10, 2019

Member

If it is only for testing purposes, I guess you can easily use IterateOverWaitingPods and pass a callback function that counts the pods.
The reason that I insist on removing this function or other functions that we don't have an obvious use for is that these functions will be a part of the framework API and will be hard to change or remove in the future. Once it is added, it will stay!

@ahg-g ahg-g force-pushed the ahg-g:permit-extension-point branch 2 times, most recently from bfe2e58 to 16359db May 10, 2019

@ahg-g ahg-g force-pushed the ahg-g:permit-extension-point branch from 16359db to c79ad8c May 10, 2019

@k8s-ci-robot k8s-ci-robot added size/L and removed size/XL labels May 10, 2019

@ahg-g ahg-g force-pushed the ahg-g:permit-extension-point branch from c79ad8c to 06dcbf2 May 10, 2019

@ahg-g

This comment has been minimized.

Copy link
Member Author

commented May 10, 2019

/test pull-kubernetes-verify

@ahg-g ahg-g force-pushed the ahg-g:permit-extension-point branch from 06dcbf2 to 40bdc23 May 10, 2019

@ahg-g ahg-g force-pushed the ahg-g:permit-extension-point branch from 40bdc23 to 98de316 May 10, 2019

@k8s-ci-robot k8s-ci-robot added size/XL and removed size/L labels May 10, 2019

@ahg-g

This comment has been minimized.

Copy link
Member Author

commented May 10, 2019

@bsalamat done.

@bsalamat
Copy link
Member

left a comment

/lgtm
/approve

Thanks, @ahg-g!

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 13, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, bsalamat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit c54b664 into kubernetes:master May 13, 2019

20 checks passed

cla/linuxfoundation ahg-g authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details

@ahg-g ahg-g referenced this pull request May 28, 2019

Closed

REQUEST: New membership for ahg-g #848

6 of 6 tasks complete

@draveness draveness referenced this pull request Jun 3, 2019

Open

Scheduling Framework #624

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.