
Add Bind extension point to the scheduling framework #79313

Merged
merged 2 commits into kubernetes:master from chenchun:fix-flaky on Jul 16, 2019

Conversation

@chenchun (Contributor) commented Jun 23, 2019

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #79297
And re-cherry-picks #78513

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Add Bind extension point to the scheduling framework.
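For readers skimming this thread, here is a rough sketch of what the Bind extension point could look like in the v1alpha1 framework API. The authoritative definition lives in pkg/scheduler/framework/v1alpha1/interface.go (reviewed below); Plugin, PluginContext, and Status are existing framework types, and the comments only summarize behavior discussed later in this thread.

package v1alpha1

import v1 "k8s.io/api/core/v1"

// Sketch only; see pkg/scheduler/framework/v1alpha1/interface.go for the real definition.
// BindPlugin is implemented by "bind" plugins, which bind a pod to a node once
// the scheduler has chosen one.
type BindPlugin interface {
	Plugin
	// Bind binds the pod to nodeName. As the discussion below shows, a
	// non-success status makes the scheduler run the Unreserve plugins and
	// retry the pod instead of running the Postbind plugins.
	Bind(pc *PluginContext, p *v1.Pod, nodeName string) *Status
}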
@k8s-ci-robot (Contributor) commented Jun 23, 2019

Hi @chenchun. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@chenchun (Contributor, Author) commented Jun 23, 2019

/assign @bsalamat
cc/ @draveness

@k8s-ci-robot requested review from k82cn and resouer on Jun 23, 2019
@chenchun force-pushed the chenchun:fix-flaky branch from ac17eab to 738b3e6 on Jun 23, 2019
A Member left a review comment:

/ok-to-test
/priority important-soon
/kind feature

Could you explain why the tests were flaky and how this PR fixes them? It's a little hard for me to compare the diff against the previous PR.

@chenchun (Contributor, Author) commented Jun 24, 2019

I added a second commit, 738b3e6; please check that one.

t.Errorf("test #%v: Didn't expected the postbind plugin to be called %d times.", i, ptbdPlugin.numPostbindCalled)
}
if err = wait.Poll(10*time.Millisecond, 30*time.Second, func() (done bool, err error) {
// The scheduler may have tried to schedule the pod several times, so we can only assert numUnreserveCalled > 0

@draveness (Member) commented Jun 25, 2019

Why did the scheduler try to schedule the pod several times? Could you explain the reason for retrying?

@chenchun (Contributor, Author) commented Jun 25, 2019

You can see the log line "About to try and schedule pod bind-plugin " repeated many times in https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/79293/pull-kubernetes-integration/1142426772172705796/

I0622 14:00:13.249848  110366 scheduling_queue.go:815] About to try and schedule pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod
I0622 14:00:13.249863  110366 scheduler.go:470] Attempting to schedule pod: bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod
I0622 14:00:13.250201  110366 scheduler_binder.go:256] AssumePodVolumes for pod "bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod", node "test-node-1"
I0622 14:00:13.250216  110366 scheduler_binder.go:266] AssumePodVolumes for pod "bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod", node "test-node-1": all PVCs bound and nothing to do
E0622 14:00:13.250264  110366 framework.go:274] bind plugin bind-plugin-1 failed to bind pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: failed to bind
I0622 14:00:13.250281  110366 scheduler.go:438] Failed to bind pod: bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod
E0622 14:00:13.250298  110366 factory.go:678] Error scheduling bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: scheduler RunBindPlugins failed for pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: code 1, err <nil>; retrying
I0622 14:00:13.250328  110366 factory.go:736] Updating pod condition for bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod to (PodScheduled==False, Reason=SchedulerError)
E0622 14:00:13.250343  110366 scheduler.go:596] error binding pod: scheduler RunBindPlugins failed for pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: code 1, err <nil>
I0622 14:00:13.252746  110366 wrap.go:47] GET /api/v1/namespaces/bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/pods/test-pod: (1.76251ms) 200 [scheduler.test/v0.0.0 (linux/amd64) kubernetes/$Format 127.0.0.1:37400]
I0622 14:00:13.253554  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:14.152063  110366 wrap.go:47] GET /api/v1/namespaces/default: (1.615006ms) 200 [scheduler.test/v0.0.0 (linux/amd64) kubernetes/$Format 127.0.0.1:37400]
I0622 14:00:14.153778  110366 wrap.go:47] GET /api/v1/namespaces/default/services/kubernetes: (1.326618ms) 200 [scheduler.test/v0.0.0 (linux/amd64) kubernetes/$Format 127.0.0.1:37400]
I0622 14:00:14.155321  110366 wrap.go:47] GET /api/v1/namespaces/default/endpoints/kubernetes: (1.176727ms) 200 [scheduler.test/v0.0.0 (linux/amd64) kubernetes/$Format 127.0.0.1:37400]
I0622 14:00:14.246811  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:14.246866  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:14.246880  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:14.246954  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:14.249982  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:14.250109  110366 scheduling_queue.go:815] About to try and schedule pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod
I0622 14:00:14.250123  110366 scheduler.go:470] Attempting to schedule pod: bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod
I0622 14:00:14.250385  110366 scheduler_binder.go:256] AssumePodVolumes for pod "bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod", node "test-node-0"
I0622 14:00:14.250412  110366 scheduler_binder.go:266] AssumePodVolumes for pod "bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod", node "test-node-0": all PVCs bound and nothing to do
E0622 14:00:14.250465  110366 framework.go:274] bind plugin bind-plugin-1 failed to bind pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: failed to bind
I0622 14:00:14.250484  110366 scheduler.go:438] Failed to bind pod: bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod
E0622 14:00:14.250502  110366 factory.go:678] Error scheduling bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: scheduler RunBindPlugins failed for pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: code 1, err <nil>; retrying
I0622 14:00:14.250542  110366 factory.go:736] Updating pod condition for bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod to (PodScheduled==False, Reason=SchedulerError)
E0622 14:00:14.250557  110366 scheduler.go:596] error binding pod: scheduler RunBindPlugins failed for pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: code 1, err <nil>
I0622 14:00:14.252858  110366 wrap.go:47] GET /api/v1/namespaces/bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/pods/test-pod: (1.679811ms) 200 [scheduler.test/v0.0.0 (linux/amd64) kubernetes/$Format 127.0.0.1:37400]
I0622 14:00:14.253692  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:15.247011  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:15.247064  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:15.247078  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync
I0622 14:00:15.247118  110366 reflector.go:243] k8s.io/client-go/informers/factory.go:133: forcing resync

@ahg-g (Member) commented Jun 26, 2019

The reason the scheduler tries to schedule the pod multiple times is that the scheduler is running on a different thread, and could have tried to schedule the pod again before another poll is triggered.

Another problem we have in this test file in general is that we rely on shared state across tests (the counters), and each test relies on the others to reset that state. We should refactor this whole file to ensure the state is reset before each test runs.

@draveness (Member) commented Jun 26, 2019

I opened PR #79424 to add reset methods to all test plugins and reset them after each test, which could mitigate the problem you mentioned in the second part.
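As a rough illustration of what is being described here (not the exact code in #79424), a counter-based test plugin with a reset method could look like this; the names unreservePlugin and TestResetBetweenCases are hypothetical, while the real plugins live in test/integration/scheduler/framework_test.go.

package framework_test

import "testing"

// Hypothetical counter-based test plugin with shared state (a call counter).
type unreservePlugin struct {
	numUnreserveCalled int
}

// reset clears the shared counter so one test case's state doesn't leak into the next.
func (up *unreservePlugin) reset() {
	up.numUnreserveCalled = 0
}

func TestResetBetweenCases(t *testing.T) {
	up := &unreservePlugin{}
	for _, name := range []string{"case-a", "case-b"} {
		t.Run(name, func(t *testing.T) {
			// ... exercise the scheduler with the plugin registered ...
		})
		up.reset() // enforce a clean slate before the next case runs
	}
}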

@bsalamat (Member) commented Jun 26, 2019

Even if the scheduler retries scheduling a pod and determines that the pod is unschedulable, it shouldn't call "Unreserve" plugins. Unreserve plugins are called only if a pod passes the "reserve" point and is later rejected and does not get bound. The problem is probably the shared state that @ahg-g referred to.

@chenchun (Contributor, Author) commented Jul 1, 2019

Even if the scheduler retries scheduling a pod and determines that the pod is unschedulable, it shouldn't call "Unreserve" plugins. Unreserve plugins are called only if a pod passes the "reserve" point and is later rejected and does not get bound.

Why is that? I think the "Unreserve" plugin would be called each time the bind plugin returns an error; see https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/scheduler.go#L580-L593
And my previous comment shows the log line framework.go:274] bind plugin bind-plugin-1 failed to bind pod bind-plugin97b74d23-14d8-4525-9072-7fbfa7cbacad/test-pod: failed to bind appearing multiple times, which indicates the bind plugin was indeed called multiple times.

Still, I deleted my second commit to see if simply rebasing the revert commit against master could fix my flaky test.

@bsalamat (Member) commented Jul 1, 2019

Why is that? I think the "Unreserve" plugin would be called each time the bind plugin returns an error; see https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/scheduler.go#L580-L593

That's one of the places where the "unreserve" plugins are called. There are other places too:

fwk.RunUnreservePlugins(pluginContext, assumedPod, scheduleResult.SuggestedHost)

fwk.RunUnreservePlugins(pluginContext, assumedPod, scheduleResult.SuggestedHost)

That said, I see your point about bind being called multiple times. In that case, we should expect unreserve to be called the same number of times as bind is called. Due to race conditions it could sometimes be called 'bind - 1' times if we check after bind and before unreserve.

@chenchun force-pushed the chenchun:fix-flaky branch from 738b3e6 to 48ec7ba on Jul 1, 2019
t.Errorf("test #%v: Didn't expected the postbind plugin to be called %d times.", i, ptbdPlugin.numPostbindCalled)
}
if err = wait.Poll(10*time.Millisecond, 30*time.Second, func() (done bool, err error) {
return unresPlugin.numUnreserveCalled == 1, nil

@bsalamat (Member) commented Jul 1, 2019

Doesn't this have the same problem that existed in the past? If the pod is not bound and is retried, the scheduler may try binding it again and numUnreserveCalled could be larger than 1.
I think a more accurate check is numUnreserveCalled == numBindCalled || numUnreserveCalled == numBindCalled - 1.

@chenchun (Contributor, Author) commented Jul 2, 2019

I think a more accurate check is numUnreserveCalled == numBindCalled || numUnreserveCalled == numBindCalled - 1.

I don't agree with this. The check isn't atomic at all: RunUnreservePlugins may run after the numUnreserveCalled == numBindCalled check but before the numUnreserveCalled == numBindCalled - 1 check.

@chenchun (Contributor, Author) commented Jul 2, 2019

I added a commit that uses a channel to check the invocation counts, PTAL.
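For readers following along, here is a minimal sketch of the channel-and-events idea, assuming an event struct like the pluginInvokeEvent shown in the snippets below; the bindPlugin wiring is illustrative rather than the exact code in the commit. The point is that the test reads events in order from a channel instead of comparing two racily updated counters.

package framework_test

// pluginInvokeEvent mirrors the struct used in the snippets below: the plugin
// name plus a snapshot of its call counter at the time of the call.
type pluginInvokeEvent struct {
	pluginName string
	val        int
}

// Hypothetical bind plugin that reports each invocation on a channel so the
// test goroutine can assert on the exact sequence of calls.
type bindPlugin struct {
	numBindCalled         int
	pluginInvokeEventChan chan pluginInvokeEvent
}

func (bp *bindPlugin) Name() string { return "bind-plugin-1" }

// recordInvocation shows what Bind would do on each call: bump the counter and
// send a snapshot of it to the test goroutine.
func (bp *bindPlugin) recordInvocation() {
	bp.numBindCalled++
	if bp.pluginInvokeEventChan != nil {
		bp.pluginInvokeEventChan <- pluginInvokeEvent{pluginName: bp.Name(), val: bp.numBindCalled}
	}
}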

A Member left a review comment:

Is this related to #79639? I'm not sure.

@chenchun force-pushed the chenchun:fix-flaky branch 3 times, most recently from b75849a to fe3f0f0, on Jul 2, 2019
@bsalamat changed the title from "Fix flaky test TestBindPlugin" to "Add Bind extension point to the scheduling framework" on Jul 4, 2019
pkg/scheduler/framework/v1alpha1/interface.go (review thread: outdated, resolved)
pkg/scheduler/scheduler.go (review thread: outdated, resolved)
// bind plugin fails to bind the pod
{
bindPluginStatuses: []*framework.Status{framework.NewStatus(framework.Error, "failed to bind"), framework.NewStatus(framework.Success, "")},
expectInvokeEvents: []pluginInvokeEvent{{pluginName: bindPlugin1.Name(), val: 1}, {pluginName: unreservePlugin.Name(), val: 1}},

@bsalamat (Member) commented Jul 4, 2019

How does this address the problem of the scheduler retrying the pod?

@chenchun (Contributor, Author) commented Jul 4, 2019

Actually, I'm not trying to stop the scheduler from retrying the pod. I'm using a channel and events to verify that if the bind plugin fails to bind a pod, the unreserve plugin is called instead of the postbind plugin.

I wrote the following program to show that checking the relation between numUnreserveCalled and numBindCalled is racy when binding fails. Running the program prints x as 1, 0, and even -1, which shows it is not reliable to check the value of numBindCalled - numUnreserveCalled.

package main

import (
	"fmt"
	"time"
)

func main() {
	var numBindCalled, numUnreserveCalled int
	go func() {
		// the scheduling goroutine: bumps both counters, racing with the reader below
		for {
			time.Sleep(time.Millisecond)
			numBindCalled++
			numUnreserveCalled++
		}
	}()
	var x int
	for {
		// the test goroutine: reads the two counters non-atomically
		time.Sleep(time.Millisecond)
		x = numBindCalled - numUnreserveCalled
		fmt.Printf("x %d\n", x)
	}
}
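(For what it's worth, the unsynchronized counter access in this demo is a data race in its own right; running it under the Go race detector, e.g. go run -race, would flag it directly.)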

So I went with the channel-and-events approach. I changed this line to

expectInvokeEvents: []pluginInvokeEvent{{pluginName: bindPlugin1.Name(), val: 1}, {pluginName: unreservePlugin.Name(), val: 1}, {pluginName: bindPlugin1.Name(), val: 2}, {pluginName: unreservePlugin.Name(), val: 2}},

to make it clearer that the scheduler binds again without calling the postbind plugin.

By the way, I think the TestUnreservePlugin case https://github.com/kubernetes/kubernetes/blob/master/test/integration/scheduler/framework_test.go#L592-L594 is flaky too.

@bsalamat (Member) commented Jul 8, 2019

I am not questioning the use of events and channels. I think that part makes sense. The problem is that you expect bind and unreserve to be called only once. If the scheduler retries the pod, aren't those plugins called more than once?

@chenchun (Contributor, Author) commented Jul 9, 2019

The problem is that you expect bind and unreserve to be called only once

My previous test case expected this, and I deleted it in the second commit of this PR.

-                       if err = wait.Poll(10*time.Millisecond, 30*time.Second, func() (done bool, err error) {
-                               return unresPlugin.numUnreserveCalled == 1, nil
-                       }); err != nil {
-                               t.Errorf("test #%v: Expected the unreserve plugin to be called once, was called %d times.", i, unresPlugin.numUnreserveCalled)

I now expect it to be called more than once, and I changed the "bind plugin fails to bind the pod" case to expect a second scheduling cycle. The case now expects 4 invoke events: 2 from bindPlugin1 and 2 from unreservePlugin.

		// bind plugin fails to bind the pod
		{
			bindPluginStatuses: []*framework.Status{framework.NewStatus(framework.Error, "failed to bind"), framework.NewStatus(framework.Success, "")},
			expectInvokeEvents: []pluginInvokeEvent{{pluginName: bindPlugin1.Name(), val: 1}, {pluginName: unreservePlugin.Name(), val: 1}, {pluginName: bindPlugin1.Name(), val: 2}, {pluginName: unreservePlugin.Name(), val: 2}},
		},
@chenchun force-pushed the chenchun:fix-flaky branch 3 times, most recently from 01fa254 to fa0e53d, on Jul 4, 2019
@chenchun (Contributor, Author) commented Jul 4, 2019

/retest

pkg/scheduler/scheduler.go (review thread: outdated, resolved)
// bind plugin fails to bind the pod
{
bindPluginStatuses: []*framework.Status{framework.NewStatus(framework.Error, "failed to bind"), framework.NewStatus(framework.Success, "")},
expectInvokeEvents: []pluginInvokeEvent{{pluginName: bindPlugin1.Name(), val: 1}, {pluginName: unreservePlugin.Name(), val: 1}, {pluginName: bindPlugin1.Name(), val: 2}, {pluginName: unreservePlugin.Name(), val: 2}},

@bsalamat (Member) commented Jul 10, 2019

We still expect these plugins to be invoked once or twice. Don't these cause similar flakes?

@chenchun (Contributor, Author) commented Jul 11, 2019

Yes, if the scheduler goroutine is so starved of CPU cycles that it can't schedule the pod twice within 30 seconds, then this test case will fail. But I expect that to happen rarely, right?

@chenchun (Contributor, Author) commented Jul 11, 2019

But in order to fully resolve the race condition, should I add a func like initTestSchedulerWithOptions that doesn't call scheduler.Run(), and export the scheduler.scheduleOne func so that we can call it in tests?

@bsalamat (Member) commented Jul 11, 2019

Yes, if the scheduler goroutine is so starved of CPU cycles that it can't schedule the pod twice within 30 seconds, then this test case will fail. But I expect that to happen rarely, right?

My worry is the reverse of this scenario. The scheduler may retry more than a couple of times before you check. So, it could retry 3 times before the test gets a chance to check the value.

But in order to fully resolve the race condition, should I add a func like initTestSchedulerWithOptions that doesn't call scheduler.Run(), and export the scheduler.scheduleOne func so that we can call it in tests?

No, let's go with checking that numBindCalls >= numUnreserveCalls and numUnreserveCalls >= 1.
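For illustration, the check suggested here could be written roughly as below, reusing the unresPlugin counter and the wait.Poll pattern from the snippets above; the bindPlugin.numBindCalled name is an assumption rather than the exact field in the test. Note that the merged PR ultimately keeps the channel-based event check instead, partly because reading the two counters from the test goroutine is itself unsynchronized.

// Illustrative fragment only, in the style of the surrounding test code.
if err = wait.Poll(10*time.Millisecond, 30*time.Second, func() (done bool, err error) {
	return unresPlugin.numUnreserveCalled >= 1 &&
		bindPlugin.numBindCalled >= unresPlugin.numUnreserveCalled, nil
}); err != nil {
	t.Errorf("test #%v: expected numBindCalled >= numUnreserveCalled >= 1, got bind=%d unreserve=%d",
		i, bindPlugin.numBindCalled, unresPlugin.numUnreserveCalled)
}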

@chenchun (Contributor, Author) commented Jul 12, 2019

This is the event-checking part I added in the second commit. If the scheduler retries more than twice, it won't fail either, right?

		for j := range test.expectInvokeEvents {
			expectEvent := test.expectInvokeEvents[j]
			select {
			case event := <-pluginInvokeEventChan:
				if event.pluginName != expectEvent.pluginName {
					t.Errorf("test #%v: Expect invoke event %d from plugin %s instead of %s", i, j, expectEvent.pluginName, event.pluginName)
				}
				if event.val != expectEvent.val {
					t.Errorf("test #%v: Expect val of invoke event %d to be %d instead of %d", i, j, expectEvent.val, event.val)
				}
			case <-time.After(time.Second * 30):
				t.Errorf("test #%v: Waiting for invoke event %d timeout.", i, j)
			}
		}

@bsalamat (Member) commented Jul 12, 2019

In the last test, isn't expectInvokeEvents[3].val 2? This is the number of times we expect the unreserve plugin to be called. If it is called more than 2 times, won't that cause if event.val != expectEvent.val to be triggered? Am I missing something?

@chenchun (Contributor, Author) commented Jul 15, 2019

No, it won't. expectInvokeEvents[3].val is the snapshot value of numUnreserveCalled at the moment the scheduler calls the unreserve plugin; it is not a pointer to numUnreserveCalled.

@bsalamat (Member) commented Jul 15, 2019

Thanks for explaining. I was aware that the code was sending those numUnreserveCalled values to a channel and comparing them; what I missed was that your code ignores additional values in the channel. So, if the unreserve plugin is called 5 times, you only care about the first two invocations that you have specified in expectInvokeEvents.

This is fine for this test, where we expect the plugins to be called "at least" a certain number of times. It wouldn't work in other tests where a fixed number of invocations is expected.
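Not part of this PR, but to illustrate the distinction being drawn here: a test that needs an exact invocation count would also have to fail on extra events, for example by checking that nothing more arrives on the channel after the expected events. The fragment below reuses the pluginInvokeEventChan name from the snippet above; the one-second grace period is an arbitrary choice.

// Illustrative fragment only: after consuming the expected events, reject extras.
select {
case extra := <-pluginInvokeEventChan:
	t.Errorf("test #%v: unexpected extra invoke event from plugin %s (val %d)",
		i, extra.pluginName, extra.val)
case <-time.After(time.Second):
	// no extra events arrived; the exact-count expectation holds
}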

@chenchun force-pushed the chenchun:fix-flaky branch from fa0e53d to 69151d2 on Jul 15, 2019
@chenchun force-pushed the chenchun:fix-flaky branch from 69151d2 to e04f03d on Jul 15, 2019
A Member left a review comment:

/lgtm
/approve

Thanks, @chenchun!

@k8s-ci-robot added the lgtm label on Jul 15, 2019
@k8s-ci-robot (Contributor) commented Jul 15, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, chenchun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit fc5ef37 into kubernetes:master on Jul 16, 2019
23 checks passed
cla/linuxfoundation: chenchun authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-conformance-image-test: Skipped.
pull-kubernetes-cross: Skipped.
pull-kubernetes-dependencies: Job succeeded.
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-csi-serial: Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gce-iscsi: Skipped.
pull-kubernetes-e2e-gce-iscsi-serial: Skipped.
pull-kubernetes-e2e-gce-storage-slow: Skipped.
pull-kubernetes-godeps: Skipped.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped.
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-node-e2e-containerd: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
pull-publishing-bot-validate: Skipped.
tide: In merge pool.
@ahg-g (Member) commented Jul 18, 2019

Seems to be flaky again: #80310

I think the bind integration test is overly complicated and should be simplified; I opened an issue for it as well: #80227

@chenchun deleted the chenchun:fix-flaky branch on Jul 24, 2019