This repository has been archived by the owner on Nov 9, 2020. It is now read-only.

Enabling timeout based usage of docker client API during startup #1097

Merged (7 commits) Mar 30, 2017

Conversation

pshahzeb
Contributor

  1. Using a timeout-based context when calling docker client APIs on the plugin side to
    populate the existing volume data

  2. Upgrading the docker client API version to 1.25

  3. Removing the environment variable that enabled volume discovery

Testing:

  1. Restarting docker
    Create volume test1
    Attach it to three containers and keep containers running in -d mode
    Following is the refcount from the logs
2017-03-27 16:24:19.548291542 -0700 PDT [INFO] Mounting volume name=test1
2017-03-27 16:24:19.548342906 -0700 PDT [INFO] Already mounted, skipping mount. refcount=3 name=test1

restart docker using
service docker restart

Timeout based context prevents deadlock

2017-03-27 16:30:24.331275264 -0700 PDT [INFO] Plugin options - port=1019
2017-03-27 16:30:24.331309301 -0700 PDT [INFO] Getting volume data from unix:///var/run/docker.sock
2017-03-27 16:30:26.332302249 -0700 PDT [INFO] Can't connect to unix:///var/run/docker.sock due to (An error occurred trying to connect: context deadline exceeded), skipping discovery
2017-03-27 16:30:26.332612952 -0700 PDT [INFO] Docker VMDK plugin started version="vSphere Volume Driver v0.4" port=1019 mock_esx=false

Docker engine initializes properly and the created volume can be seen.

  2. Restarting docker-volume-vsphere managed plugin

Create volume test1
Attach it to three containers and keep containers running in -d mode
Following is the refcount from the logs

2017-03-27 16:34:25.864897963 -0700 PDT [INFO] Mounting volume name=test1
2017-03-27 16:34:25.864943268 -0700 PDT [INFO] Already mounted, skipping mount. refcount=3 name=test1

Kill the plugin using docker-runc; the plugin is restarted by docker.

Discovery completes:

2017-03-27 17:03:36.418383937 -0700 PDT [INFO] Starting plugin driver=vsphere log_level=info config="/etc/docker-volume-vsphere.conf"
2017-03-27 17:03:36.418642322 -0700 PDT [INFO] Plugin options - port=1019
2017-03-27 17:03:36.418855025 -0700 PDT [INFO] Getting volume data from unix:///var/run/docker.sock
2017-03-27 17:03:36.434455603 -0700 PDT [INFO] Discovered 1 volumes in use.
2017-03-27 17:03:36.434493635 -0700 PDT [INFO] Volume name=test1 count=3 mounted=true device='/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0'
2017-03-27 17:03:36.434502803 -0700 PDT [INFO] Docker VMDK plugin started version="vSphere Volume Driver v0.4" port=1019 mock_esx=false
2017-03-27 17:03:36.434768084 -0700 PDT [INFO] Going into ServeUnix - Listening on Unix socket address="/run/docker/plugins/vsphere.sock"

Fixes #1050

@pshahzeb pshahzeb changed the title Enabling timeout baesd usage of docker client API during startup Enabling timeout based usage of docker client API during startup Mar 28, 2017
Contributor

@pdhamdhere pdhamdhere left a comment

Thanks for the detailed PR description. Some comments below.

"value": "info",
"Settable": [ "value"]
},
"Env": [
{"name": "VDVS_DISCOVER_VOLUMES",
Contributor

You took out the wrong config option. Keep VDVS_LOG_LEVEL and remove VDVS_DISCOVER_VOLUMES.

@@ -88,9 +88,11 @@ import (
)

const (
ApiVersion = "v1.22"
ApiVersion = "v1.24"
Contributor

Which Docker Daemon (1.12/1.13?) supports this API version. Please add a comment.

DockerUSocket = "unix:///var/run/docker.sock"
defaultSleepIntervalSec = 1
timeoutInSecs = 2
Contributor

timeoutInSecs => dockerConnTimeout = 2 //Secs ?

DockerUSocket = "unix:///var/run/docker.sock"
defaultSleepIntervalSec = 1
timeoutInSecs = 2
mountLocation = "/mnt/vmdk"
Contributor

There is already "mountRoot"

@govint
Contributor

govint commented Mar 28, 2017

Per issue 31903 on docker the plugin restore will not be able to use the Docker API (yes?). In which case this change will not work as the docker API isn't available?

Believe a simpler approach is to use a refcount-restore task to run the "post init" for the plugin (essentially restoring any ref counts for volumes) and skip doing any ref count restore during plugin startup itself. The plugin initialization starts the task and completes initialization and lets Docker continue on its way.

The refcount-restore task attempts to connect and use the Docker API within some number of attempts with an "exponential backoff" and ultimately restores or gives up on the refcounts.

The plugin can then service volume requests based on what Docker is asking while the refcount restore goes on. As long as refcount-restore task is running, the plugin will skip doing unmounts. The refcount-restore task checks all mounted volumes and those that aren't having any refcounts will be unmounted by it. While those volumes in use have their refcounts restored.

@pdhamdhere
Contributor

BTW, you also need to enable Tests

@pshahzeb
Contributor Author

@govint thanks for pointing this.
Essentially, this PR will not let plugin enter into a deadlock where plugin hangs on docker api and docker waits to talk to plugin.

If docker doesn't respond to API calls, we time out and mark the ref counts as zero.
I get your point that docker could be up but busy trying to talk to the plugin with exponential backoff (docker might do this due to some volume command it receives in the middle, while the plugin is initializing). And we time out on our docker client API call and give up on the refcounts.

Agree that a safer way to work with refcounts would be to calculate them post init of the plugin, but I think that change can be done separately.
CC @pdhamdhere @msterin

@pshahzeb
Contributor Author

@pdhamdhere addressed your comments.

Contributor

@pdhamdhere pdhamdhere left a comment

Minor comments. LGTM.

# TBD: bring it back when bringing discovery back (see PR #1050)
echo "Skippint crash recovery check (see #1050)... "
return
alive_containers = `docker ps | grep busybox | wc -l`
Contributor

Why did you have to add this?

DockerUSocket = "unix:///var/run/docker.sock"
defaultSleepIntervalSec = 1
dockerConnTimeout = 2 // Seconds
Contributor

Let's keep it consistent with L93 and rename it to dockerConnTimeoutSec

// check if the mount location belongs to vmdk plugin
// managed plugin has mount source in format:
// '/var/lib/docker/plugins/{plugin uuid}/rootfs/mnt/vmdk/{volume name}'
if strings.Contains(mount.Source, mountRoot) {
Contributor

Can the check to compare with the driver name be retained? Is that changed with the container based plugin?

Contributor

yes, the driver name in managed plugin is unpredictable and depends on location in dockerhub

Another point: the check above has many holes; I suggest tightening it down a bit, e.g. "starts with /mnt/vmdk OR (starts with /var/lib/docker/plugins/ AND has /mnt/vmdk)".

return err
}

log.Debugf("Found %d running or paused containers", len(containers))
for _, ct := range containers {
containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID)
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second)
defer cancel_inspect()
Contributor

Does the cancel_inspect var, given it's re-assigned in each iteration, work ok to remove each created context? Would just the last context created be removed? Since it's a loop, should the cancel func be called explicitly vs. reassigning the var?

Contributor Author

Yes, it removes each created context. Could you please explain the last point? The cancel func is called explicitly through the reassigned cancel_inspect var.

Contributor

Ok, my doubt was the reassignment of the cancel_inspect var. Will the defer statement ensure that the cancel function is called for each created context, or just the last one? Looks like the defer statement evaluates the args when the statement is executed and saves the deferred function to call. It should be ok then.

echo "FAILED CRASH RECOVERY TEST. Not all test containers are running"
exit 1
fi
echo "$count containers are running with $vname attached"
Contributor

The way this test script works, the timeouts for each of the containers have been kept so that the containers are still around by the time the plugin restarts and verifies the volume refcount. Does alive_containers include the plugin also? Do we need this check at all?

Contributor

@govint govint left a comment

A few questions on the use of contexts inside a loop. Can it be confirmed that with the current Docker release the plugin is able to connect with Docker during plugin initialization, and doesn't time out in this code?

Contributor

@msterin msterin left a comment

Looks good, with the comments below.
Also, please file an issue for the ./vendor update; we probably need to update all our dependencies and fix the fvt documentation right after 0.13 is cut.


containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID)
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second)
defer cancel_inspect()
containerJSONInfo, err := c.ContainerInspect(ctx_inspect, ct.ID)
Contributor

Need to return (not continue) on error.
It would be useful to mention in the comment that we deliberately leave the refcount table half-populated in this case, since it will improve the chances of correct operation after this specific failure to communicate with Docker.

@@ -272,25 +272,34 @@ func (r RefCountsMap) discoverAndSync(c *client.Client, d drivers.VolumeDriver)
filters.Add("status", "running")
filters.Add("status", "paused")
filters.Add("status", "restarting")
containers, err := c.ContainerList(context.Background(), types.ContainerListOptions{

ctx, cancel := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second)
Contributor

the context can be reused down in this function

Contributor Author

The cancel function releases the context's resources from here. So if the docker API call (in which we use the context) returns before the timeout, the context is cancelled. This context, if used again, won't time out again after the specified timeout, because it was cancelled once. Tried this with a small go program that created the context once and reused it again and again in a loop. So I think we can't reuse the context.

return
alive_containers = `docker ps | grep busybox | wc -l`
if [ $alive_containers -ne $count] ; then
echo "FAILED CRASH RECOVERY TEST. Not all test containers are running"
Contributor

+1. Unless there is something we want to find out with this count, I suggest to drop it.

@pshahzeb
Contributor Author

Addressed the changes requested in latest commit.

Contributor

@msterin msterin left a comment

LGTM.
A few small questions/nits; feel free to address or ignore.

@@ -133,13 +128,15 @@ echo $last_line | $GREP -q refcount=$count ; if [ $? -ne 0 ] ; then
exit 1
fi

'
Disabling this check due to race. See issue #1112
Contributor

Is it still failing?

Contributor Author

Yes. Every time.

@@ -257,6 +257,24 @@ func (r RefCountsMap) Decr(vol string) (uint, error) {
}
return rc.count, nil
}
// check if volume with source as mount_source belongs to vsphere plugin
func matchNameforVMDK(mount_source string) bool {
var managedPluginMountStart string = "/var/lib/docker/plugins/"
Contributor

nit: I suspect that just managedPluginMountStart := "/var/lib/docker/plugins/" will work equally fine

return err
}

log.Debugf("Found %d running or paused containers", len(containers))
for _, ct := range containers {
containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID)
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeoutSec*time.Second)
defer cancel_inspect()
Contributor

So, is it really needed, or can you reuse the ctx object from Line 294?

Contributor Author

can't reuse. Cancel releases the resources of the context.

Contributor

The context will be released only after the function returns or in each loop? It should be once the function returns.

Contributor Author

It will be released after function returns in each iteration of the loop.

Contributor

> can't reuse. Cancel releases the resources of the context.

Of course it does. The question is: why do you need to cancel before the final exit from the function? The context defined on L263 will do exactly that... what am I missing?

// if plugin is used as managed plugin
// managed plugin has mount source in format:
// '/var/lib/docker/plugins/{plugin uuid}/rootfs/mnt/vmdk/{volume name}'
if strings.HasPrefix(mount_source, managedPluginMountStart) && strings.Contains(mount_source, mountRoot) {
Contributor

This will break Photon driver which continues to use RPM. @govint

Contributor

Yes, till PhotonOS upgrades to Docker 1.13 or later. Not sure that the plugin is used with Photon Controller yet. I'll make a separate issue to support the refcounter with Photon once this change is submitted.

@pshahzeb pshahzeb merged commit 83ef8b1 into master Mar 30, 2017
@shuklanirdesh82 shuklanirdesh82 deleted the enable_discovery.pshahzeb branch March 31, 2017 07:59