
Flaky in-tree local volume plugin tests #74229

Closed
cofyc opened this issue Feb 19, 2019 · 6 comments · Fixed by #74335
Labels: kind/bug · kind/failing-test · sig/storage

Comments

cofyc (Member) commented Feb 19, 2019

Which jobs are failing:

https://gubernator.k8s.io/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/34994

Which test(s) are failing:

[sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: block] [Testpattern: Pre-provisioned PV (default fs)] volumes should be mountable

Since when has it been failing:

Testgrid link:

https://testgrid.k8s.io/sig-storage-kubernetes#gce

Reason for failure:

The volume of the previous pod is not unmounted before the second pod tries to mount its volume.

Anything else we need to know:

@cofyc cofyc added the kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. label Feb 19, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 19, 2019
cofyc (Member, Author) commented Feb 19, 2019

/sig storage
/assign

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 19, 2019
cofyc (Member, Author) commented Feb 19, 2019

The problem is that when the client deletes the first pod, it is deleted immediately.

I0219 00:28:14.843692       1 wrap.go:47] DELETE /api/v1/namespaces/volumes-8868/pods/in-tree-injector-rms5: (93.019997ms) 200 [e2e.test/v1.14.0 (linux/amd64) kubernetes/197941a -- [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: block] [Testpattern: Pre-provisioned PV (default fs)] volumes should be mountable 35.232.4.199:49772]
I0219 00:28:14.854369       1 graph_populator.go:167] deletePod volumes-8868/in-tree-injector-rms5 for node bootstrap-e2e-minion-group-j6dk

This is because when a pod's status phase is Succeeded, the pod object is deleted immediately. However, the pod is set to Succeeded before its volumes are unmounted and detached.

# Pod `in-tree-injector-rms5` is set Phase=Succeeded
I0219 00:28:14.475494    1587 status_manager.go:506] Status for pod "in-tree-injector-rms5_volumes-8868(3d92bacb-33dd-11e9-9d1f-42010a8a0002)" updated successfully: (2, {Phase:Succeeded Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-02-19 00:28:12 +0000 UTC Reason:PodCompleted Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-02-19 00:28:12 +0000 UTC Reason:PodCompleted Message:} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-02-19 00:28:12 +0000 UTC Reason:PodCompleted Message:} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2019-02-19 00:28:12 +0000 UTC Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:10.138.0.3 PodIP:10.64.2.252 StartTime:2019-02-19 00:28:12 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[{Name:in-tree-injector State:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:0,Signal:0,Reason:Completed,Message:,StartedAt:2019-02-19 00:28:13 +0000 UTC,FinishedAt:2019-02-19 00:28:13 +0000 UTC,ContainerID:docker://cfb092d2f4c59901c91401a762b3541209ee3a69b182258a6d22d9bafbee4118,}} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:busybox:1.29 ImageID:docker-pullable://busybox@sha256:e004c2cc521c95383aebb1fb5893719aa7a8eae2e7a71f316a4410784edb00a9 ContainerID:docker://cfb092d2f4c59901c91401a762b3541209ee3a69b182258a6d22d9bafbee4118}] QOSClass:BestEffort})
# Pod `in-tree-injector-rms5` volumes are going to be cleaned
I0219 00:28:15.579385    1587 desired_state_of_world_populator.go:265] Removing volume from desired state for volume "local-59fjw" (UniqueName: "kubernetes.io/local-volume/local-59fjw") pod "in-tree-injector-rms5" (UID: "3d92bacb-33dd-11e9-9d1f-42010a8a0002")
I0219 00:28:15.579594    1587 desired_state_of_world_populator.go:265] Removing volume from desired state for volume "default-token-7svtn" (UniqueName: "kubernetes.io/secret/3d92bacb-33dd-11e9-9d1f-42010a8a0002-default-token-7svtn") pod "in-tree-injector-rms5" (UID: "3d92bacb-33dd-11e9-9d1f-42010a8a0002")

The in-tree-injector pod object may be deleted from the apiserver before its volumes are unmounted, so there may be a conflict when setting fsGroup for the second pod `in-tree-client`.

If we want to wait for the pod to be deleted by the kubelet instead of the apiserver, we need to make it a long-running pod. Alternatively, we can make the two pods use the same fsGroup, as you said.

cc @msau42 What do you think?

msau42 (Member) commented Feb 19, 2019

cc @dashpole is this expected? I thought kubelet is supposed to delete the pod object only after the volumes are unmounted. This also impacts the StorageProtection feature, which assumes that if the Pod object is deleted, then the volume has been unmounted. For a storage backend like nfs, it may result in unmount hanging. cc @kubernetes/sig-storage-bugs

@cofyc to fix the flakiness, the simplest thing may be to make the two pods use the same fsgroup. I don't see a strong use case for two pods using different fsgroups.

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 19, 2019
dashpole (Contributor) commented

> I thought kubelet is supposed to delete the pod object only after the volumes are unmounted.

That is only during graceful deletion. If a pod is Succeeded, do we begin unmounting volumes? Or do we wait until the deletion timestamp is set? Would it be good enough to start unmounting when the pod is failed/succeeded? If we want to be 100% sure this isn't a problem, we would need to always do graceful deletion, even when the pod is failed/succeeded.

msau42 (Member) commented Feb 19, 2019

Yes, we do start unmounting when the pod has succeeded or failed.

The API sets the grace period to 0 when the pod has succeeded or failed. Maybe that needs to be removed if we want to use Pod deletion as a signal that volumes have been unmounted.
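
Roughly, the shortcut looks like this (a simplified, hypothetical sketch for illustration only, not the actual apiserver pod strategy code): when the pod has already reached a terminal phase, the requested grace period is overridden to zero, so the object disappears from the apiserver even though the kubelet may still be unmounting its volumes.

```go
package example

import (
	v1 "k8s.io/api/core/v1"
)

// effectiveGracePeriod is a hypothetical, simplified sketch of the behavior
// described above (not the real apiserver strategy code): pods that have
// already succeeded or failed are deleted immediately, regardless of the
// grace period the client requested.
func effectiveGracePeriod(pod *v1.Pod, requestedSeconds int64) int64 {
	if pod.Status.Phase == v1.PodSucceeded || pod.Status.Phase == v1.PodFailed {
		// The kubelet may still be unmounting this pod's volumes here, which
		// is exactly the window the flaky test falls into.
		return 0
	}
	return requestedSeconds
}
```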

cofyc (Member, Author) commented Feb 21, 2019

Always doing graceful deletion, even when the pod is failed/succeeded, is simple; then we could assume that normally, when a pod object is deleted (except forcefully), its volumes have already been unmounted by the kubelet. But this behavior was added long ago, and it helps improve performance when deleting jobs in batches.

For this flaky test, what I need is to make sure the volumes are unmounted before the second pod mounts them. However, local volumes are not reported in VolumesInUse or other fields of the node status, so I can't verify this by checking the node status either.

Possible workarounds:

  • make the first pod long-running (too hacky)
  • wait for a while before starting the second pod (a short wait may still be flaky; a long wait increases test time)
  • use the same fsGroup for both pods (does not test that fsGroup can be changed)

I'm going to use the same fsGroup for both pods until we have a reliable way to detect whether volumes have been unmounted.
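
Roughly, the fix looks like this in the test pod spec (an illustrative sketch with made-up helper and volume names, not the actual e2e test code): both the injector and the client pod set the same fsGroup in their PodSecurityContext, so the second mount does not depend on the first pod's mount having been fully cleaned up before its group ownership can be applied.

```go
package example

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// sharedFSGroupPod is an illustrative sketch (not the actual e2e test code):
// both the injector and the client pod are created with the same fsGroup, so
// mounting the volume in the second pod does not need to change its group
// ownership while the first pod's mount may still be getting cleaned up.
func sharedFSGroupPod(name, pvcName string, fsGroup int64) *v1.Pod {
	return &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: v1.PodSpec{
			SecurityContext: &v1.PodSecurityContext{
				FSGroup: &fsGroup, // same value for both pods
			},
			Containers: []v1.Container{{
				Name:  "busybox",
				Image: "busybox:1.29",
				VolumeMounts: []v1.VolumeMount{{
					Name:      "test-volume",
					MountPath: "/mnt/test",
				}},
			}},
			Volumes: []v1.Volume{{
				Name: "test-volume",
				VolumeSource: v1.VolumeSource{
					PersistentVolumeClaim: &v1.PersistentVolumeClaimVolumeSource{
						ClaimName: pvcName,
					},
				},
			}},
			RestartPolicy: v1.RestartPolicyNever,
		},
	}
}
```

With both pods created with the same fsGroup value, the test no longer relies on the first pod's unmount having completed before the second pod applies its fsGroup.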
