When volume is not marked in-use, do not backoff #106853

Merged

Conversation

gnufied (Member) commented Dec 7, 2021

We unnecessarily trigger exponential backoff when a volume is not yet marked in-use. Instead, we can wait for the volume to be marked as in-use before triggering the operation_executor. This should reduce the time needed to mount already-attached volumes.

/sig storage
/kind bug

cc @jsafrane @jingxu97

Allow attached volumes to be mounted more quickly by skipping exponential backoff when checking for reported-in-use volumes

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. sig/storage Categorizes an issue or PR as relevant to SIG Storage. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Dec 7, 2021
@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Dec 7, 2021
gnufied (Member Author) commented Dec 7, 2021

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 7, 2021
gnufied (Member Author) commented Dec 7, 2021

/assign @jingxu97

jsafrane (Member) commented Dec 8, 2021

/approve

Maybe a little context: right now, when a pod lands on a node, kubelet does two things in parallel:

  1. It updates node.status.volumesInUse every 10 seconds.
  2. The VolumeManager calls VerifyControllerAttachedVolume, an operation with exponential backoff that checks whether the volume is in node.status.volumesInUse. By the time step 1 writes the node status, the backoff of VerifyControllerAttachedVolume can already be at 5-10 seconds, which delays the volume mount.

With this PR, the exponential backoff only starts once the VolumeManager knows the volume is already in node.status.volumesInUse and node.status.volumesAttached (which might not be the latest value, so a final check by getting the API object directly is still needed), speeding up pod startup by a few seconds.

@jingxu97, PTAL
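For illustration only (not part of the PR): a minimal sketch of how quickly an exponential backoff schedule grows past kubelet's 10-second volumesInUse update interval. The 500 ms initial delay and factor of 2 below are assumed values, not the operation executor's actual configuration:

```go
package main

import (
	"fmt"
	"time"
)

// Prints an assumed backoff schedule. After a few failed attempts the next
// retry is already several seconds away, so a volume that becomes "in use"
// right after a failed check can wait noticeably long for the next retry.
func main() {
	delay := 500 * time.Millisecond // assumed initial backoff
	elapsed := time.Duration(0)
	for attempt := 1; attempt <= 5; attempt++ {
		elapsed += delay
		fmt.Printf("attempt %d: %v elapsed, next retry in %v\n", attempt, elapsed, delay*2)
		delay *= 2 // exponential growth: 0.5s, 1s, 2s, 4s, 8s, ...
	}
}
```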

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 8, 2021
@gnufied gnufied force-pushed the disable-exp-backoff-volume-not-inuse branch 2 times, most recently from f7d760a to 90bb32f Compare December 9, 2021 12:56
gnufied (Member Author) commented Dec 9, 2021

@jsafrane @jingxu97 I went ahead and implemented a similar mechanism to avoid exponential backoff while checking node.Status.VolumesAttached too. I am not yet sure whether we should split the PR in two, but please review carefully anyway; we can decide on that later.

gnufied (Member Author) commented Dec 9, 2021

/retest

@ehashman ehashman added this to Triage in SIG Node PR Triage Dec 11, 2021
jingxu97 (Contributor) commented Dec 16, 2021

In the desired state of world, the volumeToMount struct already has the ReportedInUse information. Whenever the node status is updated, kubelet marks volumeToMount.ReportedInUse in the desired state via desiredStateOfWorld.MarkVolumesReportedInUse. So the reconciler can check this directly, with no need to check the node status?

Does ReportedInUse also mean the volume is already attached to the node? I thought we need to check both: node.Status.VolumesAttached is updated by the attach-detach controller, whereas node.Status.VolumesInUse is updated by kubelet.

In the current logic in VerifyControllerAttachedVolumeFunc, we check both ReportedInUse and VolumesAttached. To avoid the ReportedInUse check triggering backoff, we can move the ReportedInUse check into the reconciler, before calling VerifyControllerAttachedVolumeFunc. Only after ReportedInUse is set do we go ahead with the previous logic of checking VolumesAttached.
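A minimal, self-contained sketch of that pre-check idea (illustrative only: the type and helper names below are simplified stand-ins, not the PR's actual desired-state-of-world code):

```go
package main

import "fmt"

// volumeToMount models only the fields relevant here; the real struct lives
// in the kubelet volume manager's desired state of world.
type volumeToMount struct {
	volumeName         string
	pluginIsAttachable bool
	reportedInUse      bool
}

// readyToVerifyAttach is a hypothetical helper: an attachable volume that is
// not yet reported in node.status.volumesInUse is simply skipped on this
// reconciler pass instead of being handed to the operation executor, so no
// exponential backoff is recorded for it.
func readyToVerifyAttach(v volumeToMount) bool {
	if v.pluginIsAttachable && !v.reportedInUse {
		return false // wait for the next sync; kubelet will mark it in use soon
	}
	return true
}

func main() {
	v := volumeToMount{volumeName: "pv-1", pluginIsAttachable: true}
	fmt.Println(readyToVerifyAttach(v)) // false: not yet in volumesInUse
	v.reportedInUse = true
	fmt.Println(readyToVerifyAttach(v)) // true: proceed to VerifyControllerAttachedVolume
}
```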

gnufied (Member Author) commented Dec 16, 2021

> In the current logic in VerifyControllerAttachedVolumeFunc, we check both ReportedInUse and VolumesAttached. To avoid the ReportedInUse check triggering backoff, we can move the ReportedInUse check into the reconciler before calling VerifyControllerAttachedVolumeFunc.

Yes, I already did that. See https://github.com/kubernetes/kubernetes/pull/106853/files#diff-e9392bf9a117fa5eda2756ca13783db044a18ad4585b9b1fad776f660dc13068R206

gnufied (Member Author) commented Dec 17, 2021

@jingxu97 if the PR looks alright, can you please lgtm?

assert.NoError(t, volumetesting.VerifyWaitForAttachCallCount(
    0 /* expectedWaitForAttachCallCount */, fakePlugin))
assert.NoError(t, volumetesting.VerifyMountDeviceCallCount(
    0 /* expectedMountDeviceCallCount */, fakePlugin))
Contributor:
Without this code change to pre-check volumesInUse and volumesAttached, would the test also pass? It would fail VerifyControllerAttachedVolume, so it would not reach the WaitForAttach call either?

Member Author:
Yes, that sounds right. But then the IsOperationSafeToRetry check would fail. Since we can't count VerifyControllerAttachedVolume calls, though, this test should be fine?

Contributor:
Yeah, it is OK. We can try to improve the test logic later.

@@ -270,6 +270,20 @@ func (kvh *kubeletVolumeHost) GetNodeLabels() (map[string]string, error) {
return node.Labels, nil
}

func (kvh *kubeletVolumeHost) GetAttachedVolumes() (map[v1.UniqueVolumeName]string, error) {
Contributor:
GetAttachedVolumes() is used in actual_state_of_world on both the controller and kubelet sides.
How about using a different name, like GetAttachedVolumesFromNodeStatus?

Member Author:
Fixed.
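For reference, a self-contained sketch of roughly what a node-status-backed lookup like this returns (the types and names below are simplified stand-ins, not the actual kubeletVolumeHost code):

```go
package main

import "fmt"

// attachedVolume and nodeStatus are simplified stand-ins for the v1 API types;
// in kubelet the data would come from the informer-backed node lister.
type attachedVolume struct {
	name       string // v1.UniqueVolumeName in the real API
	devicePath string
}

type nodeStatus struct {
	volumesAttached []attachedVolume
}

// attachedVolumesFromNodeStatus builds a unique-volume-name -> device-path map
// from node.Status.VolumesAttached, which is roughly the shape a method named
// GetAttachedVolumesFromNodeStatus would return.
func attachedVolumesFromNodeStatus(s nodeStatus) map[string]string {
	out := make(map[string]string, len(s.volumesAttached))
	for _, av := range s.volumesAttached {
		out[av.name] = av.devicePath
	}
	return out
}

func main() {
	s := nodeStatus{volumesAttached: []attachedVolume{
		{name: "kubernetes.io/csi/driver^vol-1", devicePath: "/dev/xvdf"},
	}}
	fmt.Println(attachedVolumesFromNodeStatus(s))
}
```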

@@ -1514,6 +1515,16 @@ func (og *operationGenerator) GenerateVerifyControllerAttachedVolumeFunc(
return volumetypes.GeneratedOperations{}, volumeToMount.GenerateErrorDetailed("VerifyControllerAttachedVolume.FindPluginBySpec failed", err)
}

if volumeToMount.PluginIsAttachable {
cachedAttachedVolumes, _ := og.volumePluginMgr.Host.GetAttachedVolumes()
Contributor:
Maybe add a comment indicating that this is an early check of attached volumes from the node status via the node lister, to avoid backoff? Later we call the API server directly to get the latest node status and confirm the VolumesAttached list, to avoid a race condition.

Member Author:
Added a comment.
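To make the intent of that comment concrete, here is a hedged sketch of the two-step flow being described (illustrative names and types, not the PR's actual code in GenerateVerifyControllerAttachedVolumeFunc):

```go
package main

import "fmt"

// nodeView is a simplified stand-in for a Node object's VolumesAttached set.
type nodeView struct {
	volumesAttached map[string]bool
}

// verifyAttached models the two checks discussed above:
//  1. a cheap early check against the informer-cached node (node lister); if
//     the volume is not there yet, we bail out before the expensive operation,
//     so no exponential backoff is started;
//  2. a final check against a node fetched directly from the API server,
//     because the cache may lag behind the attach-detach controller.
func verifyAttached(volume string, cachedNode, apiNode nodeView) (bool, string) {
	if !cachedNode.volumesAttached[volume] {
		return false, "not yet in cached node.Status.VolumesAttached; skip without backoff"
	}
	if !apiNode.volumesAttached[volume] {
		return false, "not attached according to the API server"
	}
	return true, "attached"
}

func main() {
	cached := nodeView{volumesAttached: map[string]bool{"vol-1": true}}
	api := nodeView{volumesAttached: map[string]bool{"vol-1": true}}
	fmt.Println(verifyAttached("vol-1", cached, api))
}
```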

jingxu97 (Contributor):
The code looks good to me, just a few small comments.

Thank you for the PR! I think this change can really improve performance. Hopefully we can get some data about it. We should also consider cherry-picking this.

@gnufied gnufied force-pushed the disable-exp-backoff-volume-not-inuse branch from 90bb32f to 7989f27 Compare December 20, 2021 16:57
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 20, 2021
gnufied (Member Author) commented Dec 20, 2021

The usual improvement from measurements:

Before:

volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="0.1"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="0.25"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="0.5"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="1"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="2.5"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="5"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="10"} 10
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="15"} 10
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="25"} 12
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="50"} 12
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="120"} 12
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="300"} 12
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="600"} 12
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="+Inf"} 12
volume_operation_total_seconds_sum{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume"} 123.11098267599999
volume_operation_total_seconds_count{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume"} 12

After:
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="0.25"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="0.5"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="1"} 0
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="2.5"} 1
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="5"} 2
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="10"} 10
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="15"} 13
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="25"} 13
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="50"} 13
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="120"} 13
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="300"} 13
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="600"} 13
volume_operation_total_seconds_bucket{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume",le="+Inf"} 13
volume_operation_total_seconds_sum{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume"} 107.01795236
volume_operation_total_seconds_count{operation_name="volume_mount",plugin_name="kubernetes.io/vsphere-volume"} 13

So overall the average mount time drops from about 10.26s (123.11 / 12) to 8.23s (107.02 / 13). Also, as you can see, most operations now complete within 15s, whereas before some operations took around 20s to complete.

jsafrane (Member):
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 21, 2021
jingxu97 (Contributor):
> (quoting the measurements above: average mount time roughly 10.26s before vs 8.23s after)

That's great, already a 20% improvement!
/lgtm
/approve

jingxu97 (Contributor):
/assign @Random-Liu
Could you help review and approve this PR? Thanks!

Random-Liu (Member):
/lgtm
/approve

k8s-ci-robot (Contributor):
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, jingxu97, jsafrane, Random-Liu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 23, 2021
pacoxu (Member) commented Dec 23, 2021

Unrelated flake in pull-kubernetes-integration: FAIL: TestCronJobLaunchesPodAndCleansUp

/retest

@k8s-ci-robot k8s-ci-robot merged commit f0dbc32 into kubernetes:master Dec 23, 2021
SIG Node PR Triage automation moved this from Triage to Done Dec 23, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Dec 23, 2021
k8s-ci-robot added a commit that referenced this pull request Jan 25, 2022
…853-upstream-release-1.22

Automated cherry pick of #106853: When volume is not marked in-use, do not backoff
k8s-ci-robot added a commit that referenced this pull request Jan 25, 2022
…853-upstream-release-1.23

Automated cherry pick of #106853: When volume is not marked in-use, do not backoff
jingxu97 (Contributor) commented Apr 9, 2022

@gnufied should we consider cherrypick this change?

gnufied (Member Author) commented Apr 12, 2022

@jingxu97 I already backported this to 1.23 and 1.22; were you thinking of earlier versions than those?

jingxu97 (Contributor):
Oh, I missed it.
Maybe 1.21 could also be useful.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.