-
Notifications
You must be signed in to change notification settings - Fork 95
Enabling timeout based usage of docker client API during startup #1097
Conversation
1. Using timout based context while using docker client apis on plugin side to popluate the existing volume data 2. upgrading the docker client api version to 1.25 3. Removing environment variable to enable the volume discovery Fixes #1050
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for detail PR description. Some comments below.
plugin/config.json
Outdated
"value": "info", | ||
"Settable": [ "value"] | ||
}, | ||
"Env": [ | ||
{"name": "VDVS_DISCOVER_VOLUMES", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You took out wrong config option. Keep VDVS_LOG_LEVEL and remove VDVS_DISCOVER_VOLUMES
vmdk_plugin/utils/refcount/refcnt.go
Outdated
@@ -88,9 +88,11 @@ import ( | |||
) | |||
|
|||
const ( | |||
ApiVersion = "v1.22" | |||
ApiVersion = "v1.24" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which Docker Daemon (1.12/1.13?) supports this API version. Please add a comment.
vmdk_plugin/utils/refcount/refcnt.go
Outdated
DockerUSocket = "unix:///var/run/docker.sock" | ||
defaultSleepIntervalSec = 1 | ||
timeoutInSecs = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timeoutInSecs => dockerConnTimeout = 2 //Secs ?
vmdk_plugin/utils/refcount/refcnt.go
Outdated
DockerUSocket = "unix:///var/run/docker.sock" | ||
defaultSleepIntervalSec = 1 | ||
timeoutInSecs = 2 | ||
mountLocation = "/mnt/vmdk" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already "mountRoot"
Per issue 31903 on docker the plugin restore will not be able to use the Docker API (yes?). In which case this change will not work as the docker API isn't available? Believe a simpler approach is to use a refcount-restore task to run the "post init" for the plugin (essentially restoring any ref counts for volumes) and skip doing any ref count resotre during plugin startup itself. The plugin initialization starts the task and completes initialization and let Docker continue on its way. The refcount-retore task attempts to connect and use Docker API within some number of attempts with an "exponential backoff" and ultimately restores or gives up on the refcounts. The plugin can then service volume requests based on what Docker is asking while the refcount restore goes on. As long as refcount-restore task is running, the plugin will skip doing unmounts. The refcount-restore task checks all mounted volumes and those that aren't having any refcounts will be unmounted by it. While those volumes in use have their refcounts restored. |
BTW, you also need to enable Tests |
@govint thanks for pointing this. If docker doesn't respond to api calls, we timeout and mark ref counts to be zero. Agree that a safer way to work with refcounts would be to calculate them post init of the plugin |
@pdhamdhere addressed your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments. LGTM.
misc/scripts/refcnt_test.sh
Outdated
# TBD: bring it back when bringing discovery back (see PR #1050) | ||
echo "Skippint crash recovery check (see #1050)... " | ||
return | ||
alive_containers = `docker ps | grep busybox | wc -l` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you had to add this?
vmdk_plugin/utils/refcount/refcnt.go
Outdated
DockerUSocket = "unix:///var/run/docker.sock" | ||
defaultSleepIntervalSec = 1 | ||
dockerConnTimeout = 2 // Seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep it consistent with L93 and rename it to dockerConnTimeoutSec
vmdk_plugin/utils/refcount/refcnt.go
Outdated
// check if the mount location belongs to vmdk plugin | ||
// managed plugin has mount source in format: | ||
// '/var/lib/docker/plugins/{plugin uuid}/rootfs/mnt/vmdk/{volume name}' | ||
if strings.Contains(mount.Source, mountRoot) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the check to compare with the driver name be retained? Is that changed with the container based plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the driver name in managed plugin is unpredictable and depends on location in dockerhub
Another point: the check above has many holes, I suggest tightening it down a bit, e.h. "starts with /mnt/vmdk OR (starts with /var/lib/docker/plugins/ AND has /mnt/vmdk "
vmdk_plugin/utils/refcount/refcnt.go
Outdated
return err | ||
} | ||
|
||
log.Debugf("Found %d running or paused containers", len(containers)) | ||
for _, ct := range containers { | ||
containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID) | ||
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second) | ||
defer cancel_inspect() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the cancel_inspect var given its re-assigned in each iteration work ok to remove each created context? Would just the last context created be removed? Since its a loop should the cancel func be called explicitly vs. reassigning the var.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it removes each created context. Could you please explain the last point? the cancel func is called explicity through the reassigned cancel_inspect var
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, my doubt was the reassignment of cancel_inspect var. Will the defer statement ensure that the cancel function is called for each created context or just the last one? Looks like defer statement evaluates the args when the statement is executed and saves the defer'ed function to call. It should be ok then.
misc/scripts/refcnt_test.sh
Outdated
echo "FAILED CRASH RECOVERY TEST. Not all test containers are running" | ||
exit 1 | ||
fi | ||
echo "$count containers are running with $vname attached" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this test script works the timeouts for each of the containers has been kept so that they are still around by the time the plugin restarts and verifies the volume ref count. Does the alive_containers include the plugin also? Do we need this check at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions on the use of contexts inside a loop. Can it be confirmed that with current Docker release the plugin is able to connect with Docker during plugin initialization. And doesn't timeout in this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, with the comments below.
Also , please file an issue for ./vendor update, we probably need to update all out dependencies and fix fvt documentation right after 0.13 is cut
vmdk_plugin/utils/refcount/refcnt.go
Outdated
// check if the mount location belongs to vmdk plugin | ||
// managed plugin has mount source in format: | ||
// '/var/lib/docker/plugins/{plugin uuid}/rootfs/mnt/vmdk/{volume name}' | ||
if strings.Contains(mount.Source, mountRoot) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the driver name in managed plugin is unpredictable and depends on location in dockerhub
Another point: the check above has many holes, I suggest tightening it down a bit, e.h. "starts with /mnt/vmdk OR (starts with /var/lib/docker/plugins/ AND has /mnt/vmdk "
vmdk_plugin/utils/refcount/refcnt.go
Outdated
containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID) | ||
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second) | ||
defer cancel_inspect() | ||
containerJSONInfo, err := c.ContainerInspect(ctx_inspect, ct.ID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to return (not continue) on error.
It would be usefule to mention in the comment that we delibertly leave the refcount table half-populated in this case, since it will improve the chances of correct operation after this specific failure to communicate to Docker
vmdk_plugin/utils/refcount/refcnt.go
Outdated
@@ -272,25 +272,34 @@ func (r RefCountsMap) discoverAndSync(c *client.Client, d drivers.VolumeDriver) | |||
filters.Add("status", "running") | |||
filters.Add("status", "paused") | |||
filters.Add("status", "restarting") | |||
containers, err := c.ContainerList(context.Background(), types.ContainerListOptions{ | |||
|
|||
ctx, cancel := context.WithTimeout(context.Background(), dockerConnTimeout*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the context can be reused down in this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cancel function releases the context resources. from here. So if the docker api (in which we use the context) returns before timeout, the context is cancelled. This context if used again won't timeout again after the specified timeout, because it was cancelled once. Tried this with a small go code to reuse the context created once and used again and again in a loop. So I think we can't reuse the context.
misc/scripts/refcnt_test.sh
Outdated
return | ||
alive_containers = `docker ps | grep busybox | wc -l` | ||
if [ $alive_containers -ne $count] ; then | ||
echo "FAILED CRASH RECOVERY TEST. Not all test containers are running" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Unless there is something we want to find out with this count, I suggest to drop it.
Addressed the changes requested in latest commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
A few small questions/nits. feel free to address or ignore.
@@ -133,13 +128,15 @@ echo $last_line | $GREP -q refcount=$count ; if [ $? -ne 0 ] ; then | |||
exit 1 | |||
fi | |||
|
|||
' | |||
Disabling this check due to race. See issue #1112 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it still failing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Everytime.
vmdk_plugin/utils/refcount/refcnt.go
Outdated
@@ -257,6 +257,24 @@ func (r RefCountsMap) Decr(vol string) (uint, error) { | |||
} | |||
return rc.count, nil | |||
} | |||
// check if volume with source as mount_source belongs to vsphere plugin | |||
func matchNameforVMDK(mount_source string) bool { | |||
var managedPluginMountStart string = "/var/lib/docker/plugins/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I suspect that just managedPluginMountStart := "/var/lib/docker/plugins/"
will work equally fine
return err | ||
} | ||
|
||
log.Debugf("Found %d running or paused containers", len(containers)) | ||
for _, ct := range containers { | ||
containerJSONInfo, err := c.ContainerInspect(context.Background(), ct.ID) | ||
ctx_inspect, cancel_inspect := context.WithTimeout(context.Background(), dockerConnTimeoutSec*time.Second) | ||
defer cancel_inspect() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so , is it really needed or can you reuse ctx objec from Line 294 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't reuse. Cancel releases the resources of the context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context will be released only after the function returns or in each loop? It should be once the function returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be released after function returns in each iteration of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't reuse. Cancel releases the resources of the context.
of course it does. The question is - why do you need to cancel before the final exit from the function ? Context defined on L263 will do exactly that... what am I missing ?
// if plugin is used as managed plugin | ||
// managed plugin has mount source in format: | ||
// '/var/lib/docker/plugins/{plugin uuid}/rootfs/mnt/vmdk/{volume name}' | ||
if strings.HasPrefix(mount_source, managedPluginMountStart) && strings.Contains(mount_source, mountRoot) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break Photon driver which continues to use RPM. @govint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, till PhotonOS upgrades to Docker 1.13 or later. Not sure that the plugin is used with Photon Controller yet. I'll make a separate issue to support the refcounter with Photon once this change is submitted.
Using timout based context while using docker client apis on plugin side to
popluate the existing volume data
upgrading the docker client api version to 1.25
Removing environment variable to enable the volume discovery
Testing:
Create volume test1
Attach it to three containers and keep containers running in -d mode
Following is the refcount from the logs
restart docker using
service docker restart
Timeout based context prevents deadlock
docker engine initializes properly and volume created can be seen
Create volume test1
Attach it to three containers and keep containers running in -d mode
Following is the refcount from the logs
kill plugin using docker-runc
plugin is restarted by docker
discovery complete
Fixes #1050