This repository has been archived by the owner on Nov 9, 2020. It is now read-only.

Enabling timeout based usage of docker client API during startup #1097

Merged (7 commits) Mar 30, 2017

Conversation

pshahzeb
Contributor

  1. Using a timeout-based context when calling docker client APIs on the plugin side to
    populate the existing volume data

  2. Upgrading the docker client API version to 1.25

  3. Removing the environment variable that enabled volume discovery

Testing:

  1. Restarting docker
    Create volume test1
    Attach it to three containers and keep containers running in -d mode
    Following is the refcount from the logs
2017-03-27 16:24:19.548291542 -0700 PDT [INFO] Mounting volume name=test1
2017-03-27 16:24:19.548342906 -0700 PDT [INFO] Already mounted, skipping mount. refcount=3 name=test1

restart docker using
service docker restart

Timeout based context prevents deadlock

2017-03-27 16:30:24.331275264 -0700 PDT [INFO] Plugin options - port=1019
2017-03-27 16:30:24.331309301 -0700 PDT [INFO] Getting volume data from unix:///var/run/docker.sock
2017-03-27 16:30:26.332302249 -0700 PDT [INFO] Can't connect to unix:///var/run/docker.sock due to (An error occurred trying to connect: context deadline exceeded), skipping discovery
2017-03-27 16:30:26.332612952 -0700 PDT [INFO] Docker VMDK plugin started version="vSphere Volume Driver v0.4" port=1019 mock_esx=false

Docker engine initializes properly and the created volume can be seen.

  2. Restarting docker-volume-vsphere managed plugin

Create volume test1
Attach it to three containers and keep containers running in -d mode
Following is the refcount from the logs

2017-03-27 16:34:25.864897963 -0700 PDT [INFO] Mounting volume name=test1
2017-03-27 16:34:25.864943268 -0700 PDT [INFO] Already mounted, skipping mount. refcount=3 name=test1

Kill the plugin using docker-runc; the plugin is restarted by docker.

Discovery completes:

2017-03-27 17:03:36.418383937 -0700 PDT [INFO] Starting plugin driver=vsphere log_level=info config="/etc/docker-volume-vsphere.conf"
2017-03-27 17:03:36.418642322 -0700 PDT [INFO] Plugin options - port=1019
2017-03-27 17:03:36.418855025 -0700 PDT [INFO] Getting volume data from unix:///var/run/docker.sock
2017-03-27 17:03:36.434455603 -0700 PDT [INFO] Discovered 1 volumes in use.
2017-03-27 17:03:36.434493635 -0700 PDT [INFO] Volume name=test1 count=3 mounted=true device='/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0'
2017-03-27 17:03:36.434502803 -0700 PDT [INFO] Docker VMDK plugin started version="vSphere Volume Driver v0.4" port=1019 mock_esx=false
2017-03-27 17:03:36.434768084 -0700 PDT [INFO] Going into ServeUnix - Listening on Unix socket address="/run/docker/plugins/vsphere.sock"

Fixes #1050

@pshahzeb pshahzeb changed the title Enabling timeout baesd usage of docker client API during startup Enabling timeout based usage of docker client API during startup Mar 28, 2017
Contributor

@pdhamdhere pdhamdhere left a comment

Thanks for the detailed PR description. Some comments below.

"value": "info",
"Settable": [ "value"]
},
"Env": [
{"name": "VDVS_DISCOVER_VOLUMES",
Contributor

You took out the wrong config option. Keep VDVS_LOG_LEVEL and remove VDVS_DISCOVER_VOLUMES.

@@ -88,9 +88,11 @@ import (
)

const (
ApiVersion = "v1.22"
ApiVersion = "v1.24"
Contributor

Which Docker Daemon (1.12/1.13?) supports this API version. Please add a comment.

DockerUSocket = "unix:///var/run/docker.sock"
defaultSleepIntervalSec = 1
timeoutInSecs = 2
Contributor

timeoutInSecs => dockerConnTimeout = 2 //Secs ?

DockerUSocket = "unix:///var/run/docker.sock"
defaultSleepIntervalSec = 1
timeoutInSecs = 2
mountLocation = "/mnt/vmdk"
Contributor

There is already "mountRoot"

@govint
Contributor

govint commented Mar 28, 2017

Per issue 31903 on docker the plugin restore will not be able to use the Docker API (yes?). In which case this change will not work as the docker API isn't available?

Believe a simpler approach is to use a refcount-restore task to run the "post init" for the plugin (essentially restoring any ref counts for volumes) and skip doing any ref count restore during plugin startup itself. The plugin initialization starts the task and completes initialization and lets Docker continue on its way.

The refcount-restore task attempts to connect and use the Docker API within some number of attempts with an "exponential backoff" and ultimately restores or gives up on the refcounts.

The plugin can then service volume requests based on what Docker is asking while the refcount restore goes on. As long as refcount-restore task is running, the plugin will skip doing unmounts. The refcount-restore task checks all mounted volumes and those that aren't having any refcounts will be unmounted by it. While those volumes in use have their refcounts restored.

@pdhamdhere
Contributor

BTW, you also need to enable Tests

@pshahzeb
Contributor Author

@govint thanks for pointing this.
Essentially, this PR will not let plugin enter into a deadlock where plugin hangs on docker api and docker waits to talk to plugin.

If docker doesn't respond to API calls, we time out and mark the ref counts as zero.
I get your point that docker could be up but busy trying to talk to the plugin with exponential backoff (docker might do this due to some volume command it receives in the middle, while the plugin is initializing). And we time out on our docker client API call and give up on the refcounts.

Agree that a safer way to work with refcounts would be to calculate them post init of the plugin, but I think that change can be done separately.
CC @pdhamdhere @msterin

@pshahzeb
Contributor Author

@pdhamdhere addressed your comments.

Contributor

@pdhamdhere pdhamdhere left a comment

Minor comments. LGTM.

# TBD: bring it back when bringing discovery back (see PR #1050)
echo "Skippint crash recovery check (see #1050)... "
return
alive_containers = `docker ps | grep busybox | wc -l`
Contributor

Why did you have to add this?

DockerUSocket = "unix:///var/run/docker.sock"
defaultSleepIntervalSec = 1
dockerConnTimeout = 2 // Seconds
Contributor

Let's keep it consistent with L93 and rename it to dockerConnTimeoutSec

// check if the mount location belongs to vmdk plugin
// managed plugin has mount source in format:
// '/var/lib/docker/plugins/{plugin uuid}/rootfs/mnt/vmdk/{volume name}'
if strings.Contains(mount.Source, mountRoot) {
Contributor

Can the check to compare with the driver name be retained? Is that changed with the container based plugin?

Contributor

yes, the driver name in managed plugin is unpredictable and depends on location in dockerhub

Another point: the check above has many holes; I suggest tightening it down a bit, e.g. "starts with /mnt/vmdk OR (starts with /var/lib/docker/plugins/ AND has /mnt/vmdk)".

return err
}

log.Debugf("Found %d running or paused containers", len(containers))
for _, ct := range containers {
containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID)
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second)
defer cancel_inspect()
Contributor

Does the cancel_inspect var, given it's re-assigned in each iteration, work ok to remove each created context? Would just the last context created be removed? Since it's a loop, should the cancel func be called explicitly vs. reassigning the var?

Contributor Author

Yes, it removes each created context. Could you please explain the last point? The cancel func is called explicitly through the reassigned cancel_inspect var.

Contributor

Ok, my doubt was the reassignment of the cancel_inspect var. Will the defer statement ensure that the cancel function is called for each created context, or just the last one? Looks like the defer statement evaluates the args when the statement is executed and saves the deferred function to call. It should be ok then.

echo "FAILED CRASH RECOVERY TEST. Not all test containers are running"
exit 1
fi
echo "$count containers are running with $vname attached"
Contributor

The way this test script works, the timeouts for each of the containers have been kept so that the containers are still around by the time the plugin restarts and verifies the volume refcount. Does alive_containers include the plugin also? Do we need this check at all?

Contributor

@govint govint left a comment

A few questions on the use of contexts inside a loop. Can it be confirmed that with the current Docker release the plugin is able to connect with Docker during plugin initialization, and doesn't time out in this code?

Contributor

@msterin msterin left a comment

Looks good, with the comments below.
Also, please file an issue for the ./vendor update; we probably need to update all our dependencies and fix the fvt documentation right after 0.13 is cut.


containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID)
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second)
defer cancel_inspect()
containerJSONInfo, err := c.ContainerInspect(ctx_inspect, ct.ID)
Contributor

Need to return (not continue) on error.
It would be useful to mention in the comment that we deliberately leave the refcount table half-populated in this case, since it will improve the chances of correct operation after this specific failure to communicate with Docker.

@@ -272,25 +272,34 @@ func (r RefCountsMap) discoverAndSync(c *client.Client, d drivers.VolumeDriver)
filters.Add("status", "running")
filters.Add("status", "paused")
filters.Add("status", "restarting")
containers, err := c.ContainerList(context.Background(), types.ContainerListOptions{

ctx, cancel := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second)
Contributor

the context can be reused down in this function

Contributor Author

The cancel function releases the context's resources from here. So if the docker API call (in which we use the context) returns before the timeout, the context is cancelled. This context, if used again, won't time out again after the specified timeout, because it was cancelled once. Tried this with a small go program that created the context once and reused it again and again in a loop. So I think we can't reuse the context.

return
alive_containers = `docker ps | grep busybox | wc -l`
if [ $alive_containers -ne $count] ; then
echo "FAILED CRASH RECOVERY TEST. Not all test containers are running"
Contributor

+1. Unless there is something we want to find out with this count, I suggest to drop it.

@pshahzeb
Contributor Author

Addressed the changes requested in latest commit.

Contributor

@msterin msterin left a comment

LGTM.
A few small questions/nits; feel free to address or ignore.

@@ -133,13 +128,15 @@ echo $last_line | $GREP -q refcount=$count ; if [ $? -ne 0 ] ; then
exit 1
fi

'
Disabling this check due to race. See issue #1112
Contributor

Is it still failing?

Contributor Author

Yes. Every time.

@@ -257,6 +257,24 @@ func (r RefCountsMap) Decr(vol string) (uint, error) {
}
return rc.count, nil
}
// check if volume with source as mount_source belongs to vsphere plugin
func matchNameforVMDK(mount_source string) bool {
var managedPluginMountStart string = "/var/lib/docker/plugins/"
Contributor

nit: I suspect that just managedPluginMountStart := "/var/lib/docker/plugins/" will work equally fine

return err
}

log.Debugf("Found %d running or paused containers", len(containers))
for _, ct := range containers {
containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID)
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeoutSec*time.Second)
defer cancel_inspect()
Contributor

So, is it really needed, or can you reuse the ctx object from Line 294?

Contributor Author

can't reuse. Cancel releases the resources of the context.

Contributor

The context will be released only after the function returns or in each loop? It should be once the function returns.

Contributor Author

It will be released after function returns in each iteration of the loop.

Contributor

> can't reuse. Cancel releases the resources of the context.

Of course it does. The question is: why do you need to cancel before the final exit from the function? The context defined on L263 will do exactly that... what am I missing?

// if plugin is used as managed plugin
// managed plugin has mount source in format:
// '/var/lib/docker/plugins/{plugin uuid}/rootfs/mnt/vmdk/{volume name}'
if strings.HasPrefix(mount_source, managedPluginMountStart) && strings.Contains(mount_source, mountRoot) {
Contributor

This will break Photon driver which continues to use RPM. @govint

Contributor

Yes, till PhotonOS upgrades to Docker 1.13 or later. Not sure that the plugin is used with Photon Controller yet. I'll make a separate issue to support the refcounter with Photon once this change is submitted.

@pshahzeb pshahzeb merged commit 83ef8b1 into master Mar 30, 2017
@shuklanirdesh82 shuklanirdesh82 deleted the enable_discovery.pshahzeb branch March 31, 2017 07:59