Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement "queue-sort" extension point for scheduling framework #77529

Conversation

@draveness
Copy link
Member

commented May 7, 2019

What type of PR is this?

/kind feature
/priority important-soon
/sig scheduling

What this PR does / why we need it:

Implement "queue-sort" extension point for scheduling framework

Which issue(s) this PR fixes:

Fixes #77524

KEP: kubernetes/enhancements#624

Does this PR introduce a user-facing change?:

Support "queue-sort" extension point for scheduling framework
@draveness

This comment has been minimized.

Copy link
Member Author

commented May 7, 2019

/assign @bsalamat @misterikkit

This is a preliminary implementation for scheduling framework. I'll add more unit tests after the design is good to go :)

@draveness draveness force-pushed the draveness:feature/add-queuesort-extension-point branch 3 times, most recently from 2df3136 to 57f14f5 May 7, 2019

type QueueSortPlugin interface {
Plugin
// Less are used to sort pods in the scheduling queue.
Less(*internalqueue.PodInfo, *internalqueue.PodInfo) bool

This comment has been minimized.

Copy link
@wgliang

wgliang May 7, 2019

Member

Is it better to name it Sort?

This comment has been minimized.

Copy link
@draveness

draveness May 7, 2019

Author Member

I followed the KEP definition:

These plugins are used to sort pods in the scheduling queue. A queue sort plugin essentially will provide a "less(pod1, pod2)" function. Only one queue sort plugin may be enabled at a time.

And the sort package in golang also prefers Less in this scenario, e.g. func Slice(slice interface{}, less func(i, j int) bool)

This comment has been minimized.

Copy link
@wgliang

wgliang May 7, 2019

Member

All right, understand.

type QueueSortPlugin interface {
Plugin
// Less are used to sort pods in the scheduling queue.
Less(*internalqueue.PodInfo, *internalqueue.PodInfo) bool

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 8, 2019

Member

Plugins will not be able to use internalqueue.PodInfo if they import scheduler's code. We should probably move PodInfo outside of the internal.

cc/ @misterikkit

This comment has been minimized.

Copy link
@draveness

draveness May 9, 2019

Author Member

I could move the PodInfo into this file if it is appropriate.

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 9, 2019

Member

It feels a bit odd to move PodInfo here, but since we want plugins to use it, I guess we don't have any other option.

This comment has been minimized.

Copy link
@onursatici

onursatici May 10, 2019

Contributor

out of the scope of this pr, but @bsalamat does the same logic apply for NodeInfoSnapshot used here, which is also under internal?

This comment has been minimized.

Copy link
@draveness

draveness May 11, 2019

Author Member

It feels a bit odd to move PodInfo here, but since we want plugins to use it, I guess we don't have any other option.

Moved LessFunc and PodInfo to the framework interface. PTAL

@draveness draveness changed the title [WIP] feat: implement "queue-sort" extension point for scheduling framework feat: implement "queue-sort" extension point for scheduling framework May 11, 2019

@draveness

This comment has been minimized.

Copy link
Member Author

commented May 12, 2019

/retest

@draveness draveness force-pushed the draveness:feature/add-queuesort-extension-point branch from de393e4 to 39b37cb May 12, 2019

@draveness

This comment has been minimized.

Copy link
Member Author

commented May 12, 2019

/retest

Show resolved Hide resolved pkg/scheduler/internal/queue/scheduling_queue.go Outdated
Show resolved Hide resolved pkg/scheduler/framework/v1alpha1/framework.go Outdated
Show resolved Hide resolved pkg/scheduler/internal/queue/scheduling_queue_test.go Outdated
}

// only active one at a time.
return f.queueSortPlugins[0].Less

This comment has been minimized.

Copy link
@bsalamat

bsalamat May 14, 2019

Member

An alternative to this approach is to support multiple plugins for QueueSort. We could rename Less function to Cmp and let it return -1, 0, or 1. A returned 0 means that the values are equal. In that case, we can call the next plugin Cmp until one returns non-zero. For now, I think we can stay with the current implementation.

Show resolved Hide resolved pkg/scheduler/framework/v1alpha1/framework.go Outdated

@draveness draveness force-pushed the draveness:feature/add-queuesort-extension-point branch 2 times, most recently from d9ae14d to 58ceeda May 15, 2019

@draveness draveness force-pushed the draveness:feature/add-queuesort-extension-point branch from 58ceeda to d60bccc May 15, 2019

@draveness

This comment has been minimized.

Copy link
Member Author

commented May 15, 2019

/test pull-kubernetes-dependencies

@bsalamat
Copy link
Member

left a comment

/lgtm
/approve

Thanks, @draveness!

@k8s-ci-robot k8s-ci-robot added the lgtm label May 15, 2019

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 15, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, draveness

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@draveness

This comment has been minimized.

Copy link
Member Author

commented May 16, 2019

/retest

@k8s-ci-robot k8s-ci-robot merged commit 796ecb9 into kubernetes:master May 16, 2019

16 of 20 checks passed

pull-kubernetes-e2e-gce Job triggered.
Details
pull-kubernetes-e2e-gce-100-performance Job triggered.
Details
pull-kubernetes-integration Job triggered.
Details
pull-kubernetes-kubemark-e2e-gce-big Job triggered.
Details
cla/linuxfoundation draveness authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details

@draveness draveness deleted the draveness:feature/add-queuesort-extension-point branch May 16, 2019

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 16, 2019

@draveness: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-gce d60bccc link /test pull-kubernetes-e2e-gce

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@@ -69,6 +70,10 @@ func NewFramework(r Registry, _ *runtime.Unknown) (Framework, error) {
// TODO: For now, we assume any plugins that implements an extension
// point wants to be called at that extension point. We should change this
// later and add these plugins based on the configuration.
if qsp, ok := p.(QueueSortPlugin); ok {
f.queueSortPlugins = append(f.queueSortPlugins, qsp)

This comment has been minimized.

Copy link
@tedyu

tedyu May 16, 2019

Contributor

For the current implementation, when len(f.queueSortPlugins) reaches 1, we don't need to append more plugin.

@apelisse

This comment has been minimized.

Copy link
Member

commented May 16, 2019

Not sure if this is this PR, but I've noticed this flake:

INFO: From Testing //pkg/scheduler/internal/queue:go_default_test:
==================== Test output for //pkg/scheduler/internal/queue:go_default_test:
==================
WARNING: DATA RACE
Write at 0x00c000334750 by goroutine 29:
  runtime.mapdelete_faststr()
      GOROOT/src/runtime/map_faststr.go:297 +0x0
  k8s.io/kubernetes/pkg/scheduler/util.(*heapData).Pop()
      pkg/scheduler/util/heap.go:113 +0x203
  container/heap.Pop()
      GOROOT/src/container/heap/heap.go:64 +0xb0
  k8s.io/kubernetes/pkg/scheduler/util.(*Heap).Pop()
      pkg/scheduler/util/heap.go:200 +0x5b
  k8s.io/kubernetes/pkg/scheduler/internal/queue.(*PriorityQueue).flushBackoffQCompleted()
      pkg/scheduler/internal/queue/scheduling_queue.go:356 +0x490
  k8s.io/kubernetes/pkg/scheduler/internal/queue.(*PriorityQueue).flushBackoffQCompleted-fm()
      pkg/scheduler/internal/queue/scheduling_queue.go:334 +0x41
  k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1()
      staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x61
  k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil()
      staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0x108
  k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until()
      staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x5a
 
Previous read at 0x00c000334750 by goroutine 28:
  runtime.mapaccess2_faststr()
      GOROOT/src/runtime/map_faststr.go:107 +0x0
  k8s.io/kubernetes/pkg/scheduler/util.(*Heap).Get()
      pkg/scheduler/util/heap.go:221 +0x11d
  k8s.io/kubernetes/pkg/scheduler/internal/queue.TestPriorityQueue_AddUnschedulableIfNotPresent_Backoff()
      pkg/scheduler/internal/queue/scheduling_queue_test.go:327 +0xa57
  testing.tRunner()
      GOROOT/src/testing/testing.go:865 +0x163
 
Goroutine 29 (running) created at:
  k8s.io/kubernetes/pkg/scheduler/internal/queue.(*PriorityQueue).run()
      pkg/scheduler/internal/queue/scheduling_queue.go:200 +0xd0
  k8s.io/kubernetes/pkg/scheduler/internal/queue.NewPriorityQueueWithClock()
      pkg/scheduler/internal/queue/scheduling_queue.go:193 +0x9af
  k8s.io/kubernetes/pkg/scheduler/internal/queue.TestPriorityQueue_AddUnschedulableIfNotPresent_Backoff()
      pkg/scheduler/internal/queue/scheduling_queue.go:164 +0x70
  testing.tRunner()
      GOROOT/src/testing/testing.go:865 +0x163
 
Goroutine 28 (finished) created at:
  testing.(*T).Run()
      GOROOT/src/testing/testing.go:916 +0x65a
  testing.runTests.func1()
      GOROOT/src/testing/testing.go:1157 +0xa8
  testing.tRunner()
      GOROOT/src/testing/testing.go:865 +0x163
  testing.runTests()
      GOROOT/src/testing/testing.go:1155 +0x523
  testing.(*M).Run()
      GOROOT/src/testing/testing.go:1072 +0x2eb
  main.main()
      bazel-out/k8-fastbuild/bin/pkg/scheduler/internal/queue/linux_amd64_race_stripped/go_default_test%/testmain.go:124 +0x2e1
==================
--- FAIL: TestRecentlyTriedPodsGoBack (1.00s)
    testing.go:809: race detected during execution of test
FAIL
================================================================================
==================== Test output for //pkg/scheduler/internal/queue:go_default_test:
==================
WARNING: DATA RACE
Write at 0x00c00019ddd0 by goroutine 30:
  runtime.mapdelete_faststr()
      GOROOT/src/runtime/map_faststr.go:297 +0x0
  k8s.io/kubernetes/pkg/scheduler/util.(*heapData).Pop()
      pkg/scheduler/util/heap.go:113 +0x203
  container/heap.Pop()
      GOROOT/src/container/heap/heap.go:64 +0xb0
  k8s.io/kubernetes/pkg/scheduler/util.(*Heap).Pop()
      pkg/scheduler/util/heap.go:200 +0x5b
  k8s.io/kubernetes/pkg/scheduler/internal/queue.(*PriorityQueue).flushBackoffQCompleted()
      pkg/scheduler/internal/queue/scheduling_queue.go:356 +0x490
  k8s.io/kubernetes/pkg/scheduler/internal/queue.(*PriorityQueue).flushBackoffQCompleted-fm()
      pkg/scheduler/internal/queue/scheduling_queue.go:334 +0x41
  k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1()
      staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x61
  k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil()
      staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0x108
  k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until()
      staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x5a
 
Previous read at 0x00c00019ddd0 by goroutine 29:
  runtime.mapaccess2_faststr()
      GOROOT/src/runtime/map_faststr.go:107 +0x0
  k8s.io/kubernetes/pkg/scheduler/util.(*Heap).Get()
      pkg/scheduler/util/heap.go:221 +0x11d
  k8s.io/kubernetes/pkg/scheduler/internal/queue.TestPriorityQueue_AddUnschedulableIfNotPresent_Backoff()
      pkg/scheduler/internal/queue/scheduling_queue_test.go:327 +0xa57
  testing.tRunner()
      GOROOT/src/testing/testing.go:865 +0x163
 
Goroutine 30 (running) created at:
  k8s.io/kubernetes/pkg/scheduler/internal/queue.(*PriorityQueue).run()
      pkg/scheduler/internal/queue/scheduling_queue.go:200 +0xd0
  k8s.io/kubernetes/pkg/scheduler/internal/queue.NewPriorityQueueWithClock()
      pkg/scheduler/internal/queue/scheduling_queue.go:193 +0x9af
  k8s.io/kubernetes/pkg/scheduler/internal/queue.TestPriorityQueue_AddUnschedulableIfNotPresent_Backoff()
      pkg/scheduler/internal/queue/scheduling_queue.go:164 +0x70
  testing.tRunner()
      GOROOT/src/testing/testing.go:865 +0x163
 
Goroutine 29 (finished) created at:
  testing.(*T).Run()
      GOROOT/src/testing/testing.go:916 +0x65a
  testing.runTests.func1()
      GOROOT/src/testing/testing.go:1157 +0xa8
  testing.tRunner()
      GOROOT/src/testing/testing.go:865 +0x163
  testing.runTests()
      GOROOT/src/testing/testing.go:1155 +0x523
  testing.(*M).Run()
      GOROOT/src/testing/testing.go:1072 +0x2eb
  main.main()
      bazel-out/k8-fastbuild/bin/pkg/scheduler/internal/queue/linux_amd64_race_stripped/go_default_test%/testmain.go:124 +0x2e1
==================
--- FAIL: TestRecentlyTriedPodsGoBack (1.00s)
    testing.go:809: race detected during execution of test
FAIL
@apelisse

This comment has been minimized.

Copy link
Member

commented May 16, 2019

Interestingly, I haven't been able to reproduce this flake (still I think it should be looked at it looks pretty bad). I managed to trigger another one though:

==================== Test output for //pkg/scheduler/internal/queue:go_default_test (run 857 of 1000):
--- FAIL: TestPendingPodsMetric (0.00s)
    --- FAIL: TestPendingPodsMetric/add_pods_to_unschedulableQ_and_then_move_all_to_activeQ (0.00s)
        scheduling_queue_test.go:1309: ActivePods: Expected 50, got 59
        scheduling_queue_test.go:1317: BackoffPods: Expected 0, got -8
FAIL
@draveness

This comment has been minimized.

Copy link
Member Author

commented May 16, 2019

Interestingly, I haven't been able to reproduce this flake (still I think it should be looked at it looks pretty bad). I managed to trigger another one though:

==================== Test output for //pkg/scheduler/internal/queue:go_default_test (run 857 of 1000):
--- FAIL: TestPendingPodsMetric (0.00s)
    --- FAIL: TestPendingPodsMetric/add_pods_to_unschedulableQ_and_then_move_all_to_activeQ (0.00s)
        scheduling_queue_test.go:1309: ActivePods: Expected 50, got 59
        scheduling_queue_test.go:1317: BackoffPods: Expected 0, got -8
FAIL

It seems like there is no difference if we pass a nil framework into the initialiser which both cases indeed passed nil. So we can not get into the if branch.

	comp := activeQComp
	if fwk != nil {
		if queueSortFunc := fwk.QueueSortFunc(); queueSortFunc != nil {
			comp = func(podInfo1, podInfo2 interface{}) bool {
				pInfo1 := podInfo1.(*framework.PodInfo)
				pInfo2 := podInfo2.(*framework.PodInfo)

				return queueSortFunc(pInfo1, pInfo2)
			}
		}
	}

Could you give me some inputs on how to trigger the failure of TestPendingPodsMetric?

@apelisse

This comment has been minimized.

Copy link
Member

commented May 17, 2019

bazel test --runs_per_test=1000 //pkg/scheduler/internal/queue:go_default_test

That should trigger the failure pretty quickly

@draveness

This comment has been minimized.

Copy link
Member Author

commented May 18, 2019

bazel test --runs_per_test=1000 //pkg/scheduler/internal/queue:go_default_test

That should trigger the failure pretty quickly

There's some problem with bazel on my laptop. I ran the tests 100 times with go test and did not trigger the failure.

$ go test ./pkg/scheduler/internal/queue/... -count=100
ok  	k8s.io/kubernetes/pkg/scheduler/internal/queue	191.594s
@tedyu

This comment has been minimized.

Copy link
Contributor

commented May 18, 2019

Using bazel, it is easy to reproduce:

https://pastebin.com/0Q2M8dHk

@draveness

This comment has been minimized.

Copy link
Member Author

commented May 18, 2019

Using bazel, it is easy to reproduce:

https://pastebin.com/0Q2M8dHk

I revert the commit, and the test fails the same. I'll open a flaky test issue. #78064

@apelisse

This comment has been minimized.

Copy link
Member

commented May 20, 2019

Thanks for tracking this! Have you opened an issue for the race condition or managed to investigate it?

@draveness

This comment has been minimized.

Copy link
Member Author

commented May 21, 2019

Thanks for tracking this! Have you opened an issue for the race condition or managed to investigate it?

No, I didn't open an issue for the race condition problem. Do you have links to failing CI jobs?

@apelisse

This comment has been minimized.

Copy link
Member

commented May 21, 2019

The one described in #77529 (comment). A race condition could also be responsible for the other error.

@tedyu

This comment has been minimized.

Copy link
Contributor

commented May 22, 2019

Tried the following:

diff --git a/pkg/scheduler/internal/queue/scheduling_queue_test.go b/pkg/scheduler/internal/queue/scheduling_queue_test.go
index 0252014e83..e96c06d7aa 100644
--- a/pkg/scheduler/internal/queue/scheduling_queue_test.go
+++ b/pkg/scheduler/internal/queue/scheduling_queue_test.go
@@ -323,11 +323,13 @@ func TestPriorityQueue_AddUnschedulableIfNotPresent_Backoff(t *testing.T) {

        // Since there was a move request at the same cycle as "oldCycle", these pods
        // should be in the backoff queue.
+       q.lock.RLock()
        for i := 1; i < totalNum; i++ {
                if _, exists, _ := q.podBackoffQ.Get(newPodInfoNoTimestamp(&expectedPods[i])); !exists {
                        t.Errorf("Expected %v to be added to podBackoffQ.", expectedPods[i].Name)
                }
        }
+       q.lock.RUnlock()
 }

 func TestPriorityQueue_Pop(t *testing.T) {

The test still fails with bazel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.