
fix(daemon): create more expections when skipping pods #74856

Conversation

@draveness (Member) commented Mar 3, 2019

What type of PR is this?

/kind bug

What this PR does / why we need it:

The DaemonSetsController mishandles pod creation expectations when an error happens in the 2nd (or a later) batch of the slow-start loop: it decrements expectations for more pods than were actually skipped.

https://github.com/draveness/kubernetes/blame/master/pkg/controller/daemon/daemon_controller.go#L1027-L1084

	batchSize := integer.IntMin(createDiff, controller.SlowStartInitialBatchSize)
	for pos := 0; createDiff > pos; batchSize, pos = integer.IntMin(2*batchSize, createDiff-(pos+batchSize)), pos+batchSize {
		errorCount := len(errCh)
		createWait.Add(batchSize)
		for i := pos; i < pos+batchSize; i++ {
			go func(ix int) {
				defer createWait.Done()
				var err error
				// ...
			}(i)
		}
		createWait.Wait()
		// any skipped pods that we never attempted to start shouldn't be expected.
		skippedPods := createDiff - batchSize
		if errorCount < len(errCh) && skippedPods > 0 {
			klog.V(2).Infof("Slow-start failure. Skipping creation of %d pods, decrementing expectations for set %q/%q", skippedPods, ds.Namespace, ds.Name)
			for i := 0; i < skippedPods; i++ {
				dsc.expectations.CreationObserved(dsKey)
			}
			// The skipped pods will be retried later. The next controller resync will
			// retry the slow start process.
			break
		}
	}

The expression skippedPods := createDiff - batchSize does not account for the pods already created in earlier batches (the first pos pods), so it over-counts the skipped pods. In job_controller, by contrast, diff is decremented after each batch's creation, so the same expression yields the correct count there; a worked example follows the job_controller snippet below.

https://github.com/draveness/kubernetes/blame/master/pkg/controller/job/job_controller.go#L765-L809

for batchSize := int32(integer.IntMin(int(diff), controller.SlowStartInitialBatchSize)); diff > 0; batchSize = integer.Int32Min(2*batchSize, diff) {
	errorCount := len(errCh)
	wait.Add(int(batchSize))
	for i := int32(0); i < batchSize; i++ {
		go func() {
			defer wait.Done()
			// ...
		}()
	}
	wait.Wait()
	// any skipped pods that we never attempted to start shouldn't be expected.
	skippedPods := diff - batchSize
	if errorCount < len(errCh) && skippedPods > 0 {
		klog.V(2).Infof("Slow-start failure. Skipping creation of %d pods, decrementing expectations for job %q/%q", skippedPods, job.Namespace, job.Name)
		active -= skippedPods
		for i := int32(0); i < skippedPods; i++ {
			// Decrement the expected number of creates because the informer won't observe this pod
			jm.expectations.CreationObserved(jobKey)
		}
		// The skipped pods will be retried later. The next controller resync will
		// retry the slow start process.
		break
	}
	diff -= batchSize
}
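
To make the bug in the daemon controller's expression concrete, here is a small self-contained example; the numbers are made up for illustration, and the "fixed" expression is a sketch of the intended accounting rather than a line quoted from this PR's diff:

package main

import "fmt"

func main() {
	createDiff := 100 // pods the controller still needs to create
	pos := 25         // pods already attempted in earlier batches
	batchSize := 50   // size of the batch in which the error occurred

	// Expression used by the daemon controller before this PR: it also counts
	// the pos pods from earlier batches as "skipped".
	buggy := createDiff - batchSize

	// Sketch of the corrected accounting: only pods never attempted are skipped.
	fixed := createDiff - (pos + batchSize)

	fmt.Println(buggy, fixed) // 50 25: the old value decrements 25 expectations too many
}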

Does this PR introduce a user-facing change?:

Remove extra pod creation expectations when daemonset fails to create pods in batches.
@k8s-ci-robot (Contributor) commented Mar 3, 2019

Hi @draveness. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@draveness changed the title from "fix(daemon): create more expections when skippingy pods" to "fix(daemon): create more expections when skipping pods" on Mar 3, 2019

@k8s-ci-robot requested review from lukaszo and mikedanese on Mar 3, 2019

@draveness (Member Author) commented Mar 3, 2019

/assign @k82cn

@k82cn (Member) commented Mar 3, 2019

Can you help add an e2e test for this case?

@draveness force-pushed the draveness:fix/daemonset-controller-slow-batch-creation branch from 46f21a1 to 1dad9c4 on Mar 3, 2019

@k8s-ci-robot added size/M and removed size/XS labels on Mar 3, 2019

@draveness force-pushed the draveness:fix/daemonset-controller-slow-batch-creation branch 2 times, most recently from 97ec9d0 to c45fd23 on Mar 3, 2019

@draveness (Member Author) commented Mar 3, 2019

Hi @k82cn, I used a unit test to verify the pod creation expectations instead; it works the same way but is simpler. Is that OK?

@draveness force-pushed the draveness:fix/daemonset-controller-slow-batch-creation branch 3 times, most recently from e9e2cd4 to 7e1d314 on Mar 3, 2019

@draveness (Member Author) commented Mar 14, 2019

Hi @k82cn, it's been a while. Could you give me some advice on this? Many thanks.

@dims (Member) commented Mar 26, 2019

/ok-to-test

@draveness force-pushed the draveness:fix/daemonset-controller-slow-batch-creation branch from 80607e2 to 8e76760 on Apr 6, 2019

@draveness (Member Author) commented Apr 6, 2019

@draveness force-pushed the draveness:fix/daemonset-controller-slow-batch-creation branch 2 times, most recently from ea35b7a to d8e45d3 on Apr 7, 2019

@draveness (Member Author) commented Apr 7, 2019

The expectations for the controller work just like the remainingCreations counter @lavalamp mentioned above. When pod creation succeeds without errors, a creation expectation is only "sent" (observed via CreationObserved) when the DaemonSet controller receives the pod-creation notification from the informer:

func (dsc *DaemonSetsController) addPod(obj interface{}) {
	pod := obj.(*v1.Pod)

	// ...
	if controllerRef := metav1.GetControllerOf(pod); controllerRef != nil {
		ds := dsc.resolveControllerRef(pod.Namespace, controllerRef)
		if ds == nil {
			return
		}

		dsKey, err := controller.KeyFunc(ds)
		if err != nil {
			return
		}
		dsc.expectations.CreationObserved(dsKey)
		dsc.enqueueDaemonSet(ds)
		return
	}

	// ...
}
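
As background, here is a minimal sketch of the expectations lifecycle referenced above, using controller.NewControllerExpectations from k8s.io/kubernetes/pkg/controller; the "default/foo" key and the count of 3 are made-up values for illustration:

exp := controller.NewControllerExpectations()

// The sync path records how many creations it expects before the next resync.
if err := exp.ExpectCreations("default/foo", 3); err != nil {
	// handle the error in real code
}
fmt.Println(exp.SatisfiedExpectations("default/foo")) // false: 3 creations still outstanding

// Each observed creation (informer addPod, a per-pod create error, or a
// "skipped pod" decrement in the slow-start loop) lowers the counter by one.
exp.CreationObserved("default/foo")
exp.CreationObserved("default/foo")
exp.CreationObserved("default/foo")

fmt.Println(exp.SatisfiedExpectations("default/foo")) // true: the controller can act on the next sync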

But FakePodControl does not actually create pods, so we send the creation expectations from inside it, which means the DaemonSet's expectations are already satisfied right after manage is called:

func (dsc *DaemonSetsController) syncDaemonSet(key string) error {
	if !dsc.expectations.SatisfiedExpectations(dsKey) {
		return dsc.updateDaemonSetStatus(ds, hash, false) // originally returned from here.
	}

	err = dsc.manage(ds, hash)
	if err != nil {
		return err
	}

	if dsc.expectations.SatisfiedExpectations(dsKey) {
		switch ds.Spec.UpdateStrategy.Type {
		case apps.OnDeleteDaemonSetStrategyType:
		case apps.RollingUpdateDaemonSetStrategyType:
			err = dsc.rollingUpdate(ds, hash) // !! with rolling update strategy, update daemonset status and send an event.
		}
		if err != nil {
			return err
		}
	}

	err = dsc.cleanupHistory(ds, old)
	if err != nil {
		return fmt.Errorf("failed to clean up revisions of DaemonSet: %v", err)
	}

	return dsc.updateDaemonSetStatus(ds, hash, true) // send an event.
}
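
For illustration, here is a rough sketch of what "sending creation expectations in fake pod control" could look like. The fakePodControl type, the CreatePodsOnNode hook, and the expectations field are assumptions based on the daemon controller test helpers, not lines copied from this PR's diff:

// Hypothetical sketch: the fake pod control stands in for the informer and
// observes the creation synchronously, so SatisfiedExpectations(dsKey) is
// already true by the time manage returns in the unit test.
func (f *fakePodControl) CreatePodsOnNode(nodeName, namespace string, template *v1.PodTemplateSpec, object runtime.Object, controllerRef *metav1.OwnerReference) error {
	if err := f.FakePodControl.CreatePodsOnNode(nodeName, namespace, template, object, controllerRef); err != nil {
		return err
	}
	// ... build the pod from the template and add it to the test's pod store ...
	dsKey, err := controller.KeyFunc(object)
	if err != nil {
		return fmt.Errorf("could not get key for object %#v: %v", object, err)
	}
	f.expectations.CreationObserved(dsKey) // what addPod would do in a real cluster
	return nil
}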

So I changed the two test cases TestSufficientCapacityWithTerminatedPodsDaemonLaunchesPod and TestSufficientCapacityNodeDaemonLaunchesPod as below:

func TestSufficientCapacityWithTerminatedPodsDaemonLaunchesPod(t *testing.T) {
	defer utilfeaturetesting.SetFeatureGateDuringTest(t, utilfeature.DefaultFeatureGate, features.ScheduleDaemonSetPods, false)()
	{
		strategy := newOnDeleteStrategy()
		podSpec := resourcePodSpec("too-much-mem", "75M", "75m")
		ds := newDaemonSet("foo")
		ds.Spec.UpdateStrategy = *strategy
		ds.Spec.Template.Spec = podSpec
		manager, podControl, _, err := newTestController(ds)
		if err != nil {
			t.Fatalf("error creating DaemonSets controller: %v", err)
		}
		node := newNode("too-much-mem", nil)
		node.Status.Allocatable = allocatableResources("100M", "200m")
		manager.nodeStore.Add(node)
		manager.podStore.Add(&v1.Pod{
			Spec:   podSpec,
			Status: v1.PodStatus{Phase: v1.PodSucceeded},
		})
		manager.dsStore.Add(ds)
		syncAndValidateDaemonSets(t, manager, ds, podControl, 1, 0, 1)
	}

	{
		strategy := newRollbackStrategy()
		podSpec := resourcePodSpec("too-much-mem", "75M", "75m")
		ds := newDaemonSet("foo")
		ds.Spec.UpdateStrategy = *strategy
		ds.Spec.Template.Spec = podSpec
		manager, podControl, _, err := newTestController(ds)
		if err != nil {
			t.Fatalf("error creating DaemonSets controller: %v", err)
		}
		node := newNode("too-much-mem", nil)
		node.Status.Allocatable = allocatableResources("100M", "200m")
		manager.nodeStore.Add(node)
		manager.podStore.Add(&v1.Pod{
			Spec:   podSpec,
			Status: v1.PodStatus{Phase: v1.PodSucceeded},
		})
		manager.dsStore.Add(ds)
		syncAndValidateDaemonSets(t, manager, ds, podControl, 1, 0, 2)
	}
}
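
(Assuming the final argument of syncAndValidateDaemonSets is the expected number of events, the first block expects one event while the second expects two; the extra event comes from rollingUpdate being reached once the expectations are satisfied, as discussed in the follow-up comments below.)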

I worked on this for quite a while to get it done, and I have to say it's really hard to test and verify. I'm happy to change it if someone gives me more context and advice. 🤣

@draveness (Member Author) commented Apr 7, 2019

/hold cancel

@draveness force-pushed the draveness:fix/daemonset-controller-slow-batch-creation branch 2 times, most recently from e339259 to 7a02eb3 on Apr 7, 2019

@draveness (Member Author) commented Apr 7, 2019

/test pull-kubernetes-kubemark-e2e-gce-big

@draveness (Member Author) commented Apr 18, 2019

Kindly ping @k82cn @janetkuo @mikedanese for review and approval.

@mikedanese (Member) commented May 3, 2019

Agree that this is way too hard to understand, but the fix looks correct. I have one question about why we are seeing two events in those tests now when we weren't before.

@draveness (Member Author) commented May 3, 2019

Agree that this is way too hard to understand, but the fix looks correct. I have one question about why we are seeing two events in those tests now when we weren't before.

I think I explained the two events in the previous comment: when the DaemonSet's update strategy is RollingUpdate, syncDaemonSet checks whether the DaemonSet satisfies its expectations:

  • Without sending creation expectations in fake pod control, the controller only observes a creation expectation when it receives the pod-creation notification from the informer, which never happens in the unit test.
  • With sending creation expectations in fake pod control, fakePodControl issues the creation expectations synchronously, so the controller satisfies its expectations immediately after syncing.

When the expectations are satisfied and the update strategy is RollingUpdate, the logic goes into the rollingUpdate function and sends another event. I know it's a complicated fix, and it took me quite a long time to figure out. You can see the previous comments for more information.

BTW, the previous fix required no changes to other unit tests, but it involved some math that is hard to understand. You can find it at 0d37ef6 if you are interested in the previous approach.

@mikedanese (Member) commented May 3, 2019

Can you address my comments?

@draveness force-pushed the draveness:fix/daemonset-controller-slow-batch-creation branch from 7a02eb3 to 5f8dfdc on May 4, 2019

@draveness (Member Author) commented May 4, 2019

Hi @mikedanese, just resolved the comments, PTAL.

@mikedanese (Member) commented May 4, 2019

/lgtm
/approve

I see SlowStart got copied and pasted a few times. I filed #77436 to clean this up.

@k8s-ci-robot added the lgtm label on May 4, 2019

@k8s-ci-robot (Contributor) commented May 4, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: draveness, mikedanese

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit d5245b9 into kubernetes:master on May 4, 2019

20 checks passed

cla/linuxfoundation draveness authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
pull-kubernetes-e2e-gce Job succeeded.
pull-kubernetes-e2e-gce-100-performance Job succeeded.
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
pull-kubernetes-typecheck Job succeeded.
pull-kubernetes-verify Job succeeded.
pull-publishing-bot-validate Skipped.
tide In merge pool.

@draveness deleted the draveness:fix/daemonset-controller-slow-batch-creation branch on May 4, 2019
