
Conversation

@zhouya0
Contributor

@zhouya0 zhouya0 commented Sep 22, 2020

What type of PR is this?
/kind flake

What this PR does / why we need it:
As the error log shows:

Sep 20 14:59:38.242: INFO: Noticed Pod "taint-eviction-b1" gets evicted.
Sep 20 15:00:26.106: FAIL: Failed to evict all Pods. 1 pod(s) is not evicted.

I think the time we wait for pod eviction is simply not long enough, since:

// 1. Run two pods both with toleration; one with tolerationSeconds=5, the other with 25
pod1 := createPodForTaintsTest(true, additionalWaitPerDeleteSeconds, podGroup+"1", podGroup, ns)
pod2 := createPodForTaintsTest(true, 5*additionalWaitPerDeleteSeconds, podGroup+"2", podGroup, ns)

We had better wait for 5*additionalWaitPerDeleteSeconds, right?
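
To illustrate the idea, here is a minimal sketch of the wait pattern (waitForEvictions and observedDeletions are made-up names for illustration, not the actual test code):

package main

import (
	"fmt"
	"time"
)

// Sketch of the wait pattern only. The point is that the timeout has to cover
// the longest toleration, i.e. 5*additionalWaitPerDeleteSeconds, otherwise the
// test gives up before the second pod can be evicted.
func waitForEvictions(observedDeletions <-chan string, expected int, timeout time.Duration) error {
	deadline := time.NewTimer(timeout)
	defer deadline.Stop()
	for evicted := 0; evicted < expected; evicted++ {
		select {
		case <-deadline.C:
			return fmt.Errorf("failed to evict all pods: %d pod(s) not evicted", expected-evicted)
		case name := <-observedDeletions:
			fmt.Printf("noticed pod %q got evicted\n", name)
		}
	}
	return nil
}

func main() {
	ch := make(chan string, 2)
	go func() {
		time.Sleep(5 * time.Second) // pod1 tolerates the taint for 5s
		ch <- "taint-eviction-b1"
		time.Sleep(20 * time.Second) // pod2 tolerates it for 25s
		ch <- "taint-eviction-b2"
	}()
	// Waiting only ~5s would miss pod2; 5*additionalWaitPerDeleteSeconds (25s)
	// plus some slack for the kubelet covers both evictions.
	if err := waitForEvictions(ch, 2, 35*time.Second); err != nil {
		fmt.Println(err)
	}
}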

Which issue(s) this PR fixes:

Fixes #94931

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

None

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/flake Categorizes issue or PR as related to a flaky test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/test sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 22, 2020
@oomichi
Member

oomichi commented Sep 22, 2020

The 5*additionalWaitPerDeleteSeconds in

pod2 := createPodForTaintsTest(true, 5*additionalWaitPerDeleteSeconds, podGroup+"2", podGroup, ns)

is passed through to

Tolerations: []v1.Toleration{{Key: "kubernetes.io/e2e-evict-taint-key", Value: "evictTaintVal", Effect: v1.TaintEffectNoExecute, TolerationSeconds: &ts}},

and tolerationSeconds is the field that dictates how long the pod stays bound to the node after the taint is added, according to https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/.
So this change seems straightforward.
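
For concreteness, a minimal sketch of what ends up in the pod spec (assuming additionalWaitPerDeleteSeconds = 5, per the test comment above; tolerationFor is just an illustrative helper, not test code):

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// Sketch only: pod1 gets TolerationSeconds = 5 and pod2 gets 25, so after the
// NoExecute taint is applied at t0, the taint manager evicts pod1 around t0+5s
// and pod2 around t0+25s. A wait sized for a single additionalWaitPerDeleteSeconds
// misses pod2.
func tolerationFor(seconds int64) v1.Toleration {
	return v1.Toleration{
		Key:               "kubernetes.io/e2e-evict-taint-key",
		Value:             "evictTaintVal",
		Effect:            v1.TaintEffectNoExecute,
		TolerationSeconds: &seconds,
	}
}

func main() {
	fmt.Printf("pod1: %+v\n", tolerationFor(5))
	fmt.Printf("pod2: %+v\n", tolerationFor(25))
}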

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 22, 2020
@zhouya0
Contributor Author

zhouya0 commented Sep 23, 2020

/cc @ScrapCodes @hasheddan

@k8s-ci-robot
Contributor

@zhouya0: GitHub didn't allow me to request PR reviews from the following users: ScrapCodes.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @ScrapCodes @hasheddan

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zhouya0
Contributor Author

zhouya0 commented Sep 24, 2020

/cc @liggitt

@zhouya0
Contributor Author

zhouya0 commented Sep 28, 2020

@oomichi Can you approve this, or let me know who could review this PR?

@michaelgugino
Contributor

I'm not sure this is the right fix. Looking at the logs:

taint-eviction-b2  bootstrap-e2e-minion-group-bn5b  Pending  30s

It seems the second pod never successfully got scheduled to the node. We need to wait for the pods to become ready before applying the taint.
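
For illustration, something along these lines could gate the taint on pod readiness (a sketch only; waitForPodReady is a hypothetical helper, not the e2e framework's API):

package taintstest

import (
	"context"
	"time"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// waitForPodReady is a hypothetical helper: it polls until the pod's Ready
// condition is True, which is stricter than just checking Phase == Running,
// before the test goes on to taint the node.
func waitForPodReady(ctx context.Context, cs kubernetes.Interface, ns, name string, timeout time.Duration) error {
	return wait.PollImmediate(2*time.Second, timeout, func() (bool, error) {
		pod, err := cs.CoreV1().Pods(ns).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		for _, cond := range pod.Status.Conditions {
			if cond.Type == v1.PodReady {
				return cond.Status == v1.ConditionTrue, nil
			}
		}
		return false, nil
	})
}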

@ScrapCodes
Contributor

Isn't it enough to check that the pod is in the Running state before applying the taint? The test already does this:

// Wait for pods to be running state before eviction happens
framework.ExpectNoError(e2epod.WaitForPodRunningInNamespace(cs, pod1))
framework.ExpectNoError(e2epod.WaitForPodRunningInNamespace(cs, pod2))
framework.Logf("Pod2 is running on %v. Tainting Node", nodeName)

// 2. Taint the nodes running those pods with a no-execute taint
ginkgo.By("Trying to apply a taint on the Node")
testTaint := getTestTaint()
e2enode.AddOrUpdateTaintOnNode(cs, nodeName, testTaint)
framework.ExpectNodeHasTaint(cs, nodeName, &testTaint)
defer e2enode.RemoveTaintOffNode(cs, nodeName, testTaint)

What is more surprising is how a pod can be in the Pending phase after it has passed the check that it is Running. I could be missing something here, but I'm happy to be corrected.

@michaelgugino
Contributor

We can see both pods go Running around the same time in the kubelet log (I've included just the b2 log; b1 is right below this line):

I0920 14:59:10.205944    1233 status_manager.go:572] Status for pod "taint-eviction-b2_taint-multiple-pods-7572(919d65d9-5ea5-49c4-887a-242f9d66c575)" updated successfully: (2, {Phase:Running Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-09-20 14:59:08 +0000 UTC Reason: Message:} {Type:Ready Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-09-20 14:59:10 +0000 UTC Reason: Message:} {Type:ContainersReady Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-09-20 14:59:10 +0000 UTC Reason: Message:} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2020-09-20 14:59:08 +0000 UTC Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:10.138.0.3 PodIP:10.64.2.172 PodIPs:[{IP:10.64.2.172}] StartTime:2020-09-20 14:59:08 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[{Name:pause State:{Waiting:nil Running:&ContainerStateRunning{StartedAt:2020-09-20 14:59:09 +0000 UTC,} Terminated:nil} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:true RestartCount:0 Image:k8s.gcr.io/pause:3.2 ImageID:k8s.gcr.io/pause@sha256:927d98197ec1141a368550822d18fa1c60bdae27b78b0c004f705f548c07814f ContainerID:containerd://c497305368e0e359d8f1bfec1543f899dfe61e994306f35488df4f63f3a4990a Started:0xc001ba9aac}] QOSClass:BestEffort EphemeralContainerStatuses:[]})

So the wait-for-Running check is working as intended.

Later, after both pods receive the DELETE, the kubelet patches them back to Pending.

The gap between the DELETE calls for pods b1 and b2 is about 20 seconds on the kubelet, which matches the 25s − 5s difference in their tolerationSeconds:

I0920 14:59:16.049133 1233 kubelet.go:1848] SyncLoop (DELETE, "api"): "taint-eviction-b1_taint-multiple-pods-7572(9b05492a-5393-42f0-9323-7d264a63b8c1)"

I0920 14:59:36.050043 1233 kubelet.go:1848] SyncLoop (DELETE, "api"): "taint-eviction-b2_taint-multiple-pods-7572(919d65d9-5ea5-49c4-887a-242f9d66c575)"

There are some errors around missing paths and other innocuous things in the kubelet log. It seems that the pods go Pending after they are marked for deletion, but the deletion event is delayed quite a bit until all containers are deleted. I'm concerned that if we adjust the timing, we might be masking something else.

I think the Pending transition is another issue and is possibly documented elsewhere. We probably need sig-node to look at the kubelet logs.

@RobertKielty
Member

RobertKielty commented Nov 17, 2020

@hasheddan @zhouya0 can we get together and seek out someone from @kubernetes/sig-node-bugs to help us out?

We need a root-cause analysis to understand why the pods remain in the Pending state for as long as they do; see @michaelgugino's commentary in the comment above.

Posted an assist request here https://kubernetes.slack.com/archives/C0BP8PW9G/p1605598026147600

Many thanks

@ScrapCodes
Contributor


@michaelgugino Sorry for the very late response, but can you help us chase down that other issue (the Pending transition) you pointed out above?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pacoxu, zhouya0
To complete the pull request process, please assign oomichi after the PR has been reviewed.
You can assign the PR to them by writing /assign @oomichi in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@SergeyKanzhelev
Member

Should we merge #94958 instead? It has a comment and a bigger timeout.

@dchen1107
Copy link
Member

@zhouya0 Can you reply to what @SergeyKanzhelev requested above?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 12, 2021
@k8s-ci-robot
Contributor

@zhouya0: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zhouya0
Contributor Author

zhouya0 commented Jan 13, 2021

@zhouya0 Can you reply to what @SergeyKanzhelev requested above?

Seems someone has fixed this.
#96866
/close

@k8s-ci-robot
Contributor

@zhouya0: Closed this PR.

In response to this:

@zhouya0 Can you reply to what @SergeyKanzhelev requested above?

Seems someone has fixed this.
#96866
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


Labels

area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/flake Categorizes issue or PR as related to a flaky test. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.


Development

Successfully merging this pull request may close these issues.

[Flaky Test][k8s.io] [sig-node] NoExecuteTaintManager Multiple Pods [Serial] evicts pods with minTolerationSeconds [Disruptive] [Conformance]

9 participants