[WIP] Re-enable plugin filter #1517

Closed
wants to merge 1 commit into from

Conversation

nishanttotla
Contributor

@nishanttotla nishanttotla commented Sep 9, 2016

Fix #1231. It was disabled in #1224, due to lazy loading by the engine.

Using the plugins API, it is possible to list installed plugins.

One issue is that we don't have a way of storing arbitrary plugins in the TaskSpec, from which to match them against the node on which PluginFilter is applied. For volumes and networks, this happens via a separate code path that allows for attaching to networks or mounting volumes. This is something we might want to build to support general plugins, and will require proto changes. This PR will work for volume and network v2 plugins.

Signed-off-by: Nishant Totla <nishanttotla@gmail.com>

@codecov-io

codecov-io commented Sep 9, 2016

Current coverage is 55.12% (diff: 0.00%)

Merging #1517 into master will decrease coverage by 0.01%

@@             master      #1517   diff @@
==========================================
  Files           102        102          
  Lines         16940      16945     +5   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits           9340       9341     +1   
+ Misses         6438       6434     -4   
- Partials       1162       1170     +8   

Powered by Codecov. Last update 4e3982b...1651614

@aaronlehmann
Collaborator

Isn't the issue that there's a catch-22? I'm not sure my memory is correct, but I thought requiring a certain plugin wouldn't schedule the task on a node until the plugin was loaded on that node, but the plugin might not get loaded until a task which uses it runs there.

@nishanttotla
Contributor Author

@aaronlehmann I wasn't aware of this catch-22 situation. I was under the impression that a plugin is loaded externally on a node (say by an operator) and has nothing to do with SwarmKit. I'll try to get this clarified.

@nishanttotla
Contributor Author

@stevvooe @mrjana what do you think about this change?

@stevvooe
Contributor

The preconditions for adding this back are in the TODO:

We can add this back when we can schedule plugins in the future.

I think the problem is that there are cases where the plugin isn't reported until it is actually in use.

What testing have we done to confirm this works?

@aluzzardi
Member

Welp @tiborvass @vieux

@nishanttotla
Contributor Author

nishanttotla commented Sep 20, 2016

After some discussion with @vieux, here's an update:
In the experimental part of the Docker API, we have a function called PluginList, which lists all installed plugins on the engine (loaded or not). This is what we want.

There are two caveats, though:

  1. PluginList will only list v2 plugins; there is no way to get a list of v1 plugins that haven't been loaded. So as long as users aren't using v1 plugins, we're good.
  2. The v2 plugin functionality is still experimental, so nodes will need to run the engine with the experimental flag. We could either wait for this to move out of experimental, or tell users that if they want the plugin filter to work, they have to run engines with the experimental flag.

WDYT @aluzzardi @stevvooe @aaronlehmann?

@nishanttotla
Contributor Author

cc @anusha-ragunathan

@anusha-ragunathan
Contributor

@nishanttotla: PluginsV2 can be queried using the PluginList API endpoint. Also, pluginsV2 will be stable in 1.13, so you don't need to worry about it being experimental-only.

@nishanttotla
Contributor Author

@anusha-ragunathan thanks! Is there a PR to track, that's moving PluginsV2 out of experimental? We can then update this PR and merge after that one is.

@nishanttotla nishanttotla added this to the 1.13.0 milestone Sep 20, 2016
@anusha-ragunathan
Contributor

@nishanttotla : There's no PR yet, but an issue tracking it. moby/moby#26760

@nishanttotla
Contributor Author

Also related: #1545

@nishanttotla
Contributor Author

Picking this up again, after moby/moby#28226 was merged.

I will update this PR using PluginList to list plugins instead of relying on docker info. This will make installed plugins available. We will need to vendor in docker/docker for that though. Should I do it, or will it cause other issues? cc @aaronlehmann @LK4D4 @mrjana

@anusha-ragunathan
Contributor

In 1.13, we will support pluginv1 (where plugins are queried via docker info) as well as pluginv2 (where plugins are queried via docker plugin list). So the correct approach in swarm would be to query both, rather than use pluginv2 alone.

@nishanttotla
Contributor Author

@anusha-ragunathan thanks for the heads up. In the case of Swarm mode though, v1 plugins are tricky because they won't show up in docker info unless they're actually loaded/running, but the scheduler will not schedule until the plugin actually is running (which won't happen unless a task uses it).

So we might have to stick to v2 plugins for Swarm mode.

@anusha-ragunathan
Contributor

What about the case where the v1 plugin was loaded and running before the node joined the swarm? Node had some containers using the plugin. So plugin was loaded and running and docker info shows the plugin. Then node joins a swarm. For such a node, we should honor plugin v1.

@nishanttotla
Contributor Author

@anusha-ragunathan you're right, I hadn't considered that case. Fortunately, it won't be hard to support both, so I will update the PR accordingly.
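To make the "query both" approach concrete, here is a minimal sketch of merging v1 plugins (as reported by docker info) with v2 plugins (as reported by the plugin list) into one set keyed by type and name. All types here (`PluginDescription`, `V1Plugins`, `V2Plugin`) and the function `CollectPlugins` are hypothetical local stand-ins, not the swarmkit or Docker API types, and the v2 entries are assumed to be already normalized to a short type such as "Volume".

```go
package main

// PluginDescription is a local stand-in for the (Type, Name) pair reported
// per node; it is NOT the swarmkit api type.
type PluginDescription struct {
	Type string
	Name string
}

// V1Plugins is a hypothetical stand-in for the Plugins section of docker info.
type V1Plugins struct {
	Volume        []string
	Network       []string
	Authorization []string
}

// V2Plugin is a hypothetical, already-normalized view of one plugin list entry.
type V2Plugin struct {
	Name string
	Type string // assumed short form, e.g. "Volume" or "Network"
}

// CollectPlugins returns the union of v1 and v2 plugins as a set.
func CollectPlugins(v1 V1Plugins, v2 []V2Plugin) map[PluginDescription]struct{} {
	plugins := make(map[PluginDescription]struct{})
	add := func(typ string, names []string) {
		for _, name := range names {
			plugins[PluginDescription{Type: typ, Name: name}] = struct{}{}
		}
	}

	// v1 plugins, grouped the same way docker info groups them.
	add("Volume", v1.Volume)
	// "overlay" is always included as the builtin multi-host network driver,
	// mirroring the snippet quoted in the review.
	add("Network", append([]string{"overlay"}, v1.Network...))
	add("Authorization", v1.Authorization)

	// v2 plugins; duplicates of v1 entries collapse because this is a set.
	for _, p := range v2 {
		plugins[PluginDescription{Type: p.Type, Name: p.Name}] = struct{}{}
	}
	return plugins
}
```

Using a set means a plugin that shows up in both sources is reported once, which sidesteps the question of whether docker info will eventually list v2 plugins too.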

@nishanttotla
Contributor Author

Are these test failures related? @LK4D4

@aaronlehmann
Collaborator

Looks similar to #1662

It is unrelated.

addPlugins("Volume", info.Plugins.Volume)
// Add builtin driver "overlay" (the only builtin multi-host driver) to
// the plugin list by default.
addPlugins("Network", append([]string{"overlay"}, info.Plugins.Network...))
addPlugins("Authorization", info.Plugins.Authorization)

// retrieve v2 plugins
Member

/cc @tiborvass @vieux @anusha-ragunathan

Could you please take a look at this logic?

@aluzzardi
Member

@nishanttotla This needs testing (at the very least manual). e.g. install a plugin on a machine and then deploy a service using that plugin (and make sure it gets deployed ONLY to that machine)

for _, plgn := range v2plugins {
	for _, typ := range plgn.Config.Interface.Types {
		plugins[api.PluginDescription{
			Type: typ.String(),
Contributor

typ.String() will return something of the format docker/VolumeDriver:1.0. However, test code in scheduler_test.go: L1404 still sets the type to be Volume. Where are the changes in swarmkit to accommodate this new format?

Contributor

Is this new microformat documented anywhere?

Do plugins no longer have a type?

@nishanttotla nishanttotla changed the title Re-enable plugin filter [WIP] Re-enable plugin filter Nov 29, 2016
@nishanttotla nishanttotla force-pushed the enable-plugin-filter branch 4 times, most recently from b7d2cd2 to 9a5c348 Compare December 6, 2016 21:17
@nishanttotla
Contributor Author

@tiborvass @stevvooe @anusha-ragunathan Updated the PR to save plugins based on driver type. PTAL.

@@ -41,12 +41,35 @@ func (e *executor) Describe(ctx context.Context) (*api.NodeDescription, error) {
}
}

// add v1 plugins to 'plugins'
addPlugins("Volume", info.Plugins.Volume)

@anusha-ragunathan do we want to add v2 plugins to docker info or not? If we do, then info.Plugins.Volume will contain both built-in local driver, v1 plugins, and v2 plugins.

Contributor

I don't feel strongly about adding v2 plugins to docker info.

It feels weird not to add them imho. Why would only built-in drivers + v1 plugins appear there?

Contributor

There's already a way to query them from docker using docker plugin ls and hence there's no reason to duplicate the info.

Right, but now there is a field called Plugins in docker info, that is essentially deprecated without being deprecated. Anyway, not sure what's the best solution.

Contributor

@anusha-ragunathan anusha-ragunathan Dec 7, 2016

The reason docker info historically had a Plugins section is that there was no other way to query for that information through docker. Now that plugins are a first-class docker citizen, we should not continue down a legacy path of exposing that info.

We should document that Plugins in docker info lists only pluginsv1 and to query for pluginsv2 use docker plugin ls. When pluginsv1 is deprecated, we should consider removing Plugins from docker info.

// add v2 plugins to 'plugins'
for _, plgn := range v2plugins {
	for _, typ := range plgn.Config.Interface.Types {
		plgnTyp := typ.Capability

you need to also verify that typ.Prefix == "docker"

Contributor Author

Done.

for _, typ := range plgn.Config.Interface.Types {
	plgnTyp := typ.Capability
	if typ.Capability == "volumedriver" {
		plgnTyp = "Volume"

@nishanttotla what happens if we remove these ifs? The capability is volumedriver. I actually wonder whether it should be the whole type: docker/volumedriver:1.0. cc @anusha-ragunathan

My understanding is that we want to rename it to Volume here because of this line that was shipped in 1.12: https://github.com/docker/swarmkit/pull/1517/files#diff-0c043559989dc54f88aaac9214b99260R45
which in turn was there because that's the nomenclature in docker info.

Contributor Author

@tiborvass Not exactly. The way I understand, the reason is in filter.go. Notice the Check() function for PluginFilter

// Check returns true if the task can be scheduled into the given node.
func (f *PluginFilter) Check(n *NodeInfo) bool {
	// Get list of plugins on the node
	nodePlugins := n.Description.Engine.Plugins

	// Check if all volume plugins required by task are installed on node
	container := f.t.Spec.GetContainer()
	if container != nil {
		for _, mount := range container.Mounts {
			if mount.VolumeOptions != nil && mount.VolumeOptions.DriverConfig != nil {
				if !f.pluginExistsOnNode("Volume", mount.VolumeOptions.DriverConfig.Name, nodePlugins) {
					return false
				}
			}
		}
	}

	// Check if all network plugins required by task are installed on node
	for _, tn := range f.t.Networks {
		if !f.pluginExistsOnNode("Network", tn.Network.DriverState.Name, nodePlugins) {
			return false
		}
	}
	return true
}

In particular, I'm pointing to the fact that volumes are stored as mounts and networks are stored separately, and we check whether the corresponding driver exists on that node. This is why I wanted to store them under a simple "Volume" or "Network" key, instead of the longer form returned by typ.String(). For other plugins, this PR stores the long-form version.
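The key mapping described here, combined with the `typ.Prefix == "docker"` check requested earlier in the review, could look roughly like the sketch below. `InterfaceType`, its `Raw` field, and `DriverTypeKey` are hypothetical names for illustration; only the `Prefix` and `Capability` fields come from the thread. The "networkdriver" capability name and the case-insensitive comparison are assumptions.

```go
package main

import "strings"

// InterfaceType is a hypothetical stand-in for a v2 plugin interface type.
// Prefix and Capability are the fields mentioned in the review; Raw is an
// assumed field holding the long form as reported by the engine.
type InterfaceType struct {
	Prefix     string
	Capability string
	Raw        string
}

// DriverTypeKey returns the key under which a plugin of this interface type
// would be stored on the node description. The second return value is false
// when the type should be skipped because its prefix is not "docker".
func DriverTypeKey(t InterfaceType) (string, bool) {
	if t.Prefix != "docker" {
		return "", false
	}
	// Comparing case-insensitively is an assumption, since the thread shows
	// both "VolumeDriver" and "volumedriver".
	switch strings.ToLower(t.Capability) {
	case "volumedriver":
		return "Volume", true
	case "networkdriver": // assumed capability name for network plugins
		return "Network", true
	default:
		// Other plugin kinds keep the long form, as described above.
		return t.Raw, true
	}
}
```

Keeping the short "Volume" and "Network" keys means the existing `pluginExistsOnNode("Volume", ...)` and `pluginExistsOnNode("Network", ...)` lookups in `Check` continue to match without changes.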

Contributor Author

On that note, the issue is that we don't have a way of storing arbitrary plugins in the TaskSpec, from which to match them against the node on which PluginFilter is applied. For volumes and networks, this happens via a separate code path that allows for attaching to networks or mounting volumes. This is something we might want to build to support general plugins, and will require proto changes. This PR will work for volume and network v2 plugins.

cc @stevvooe

@nishanttotla nishanttotla force-pushed the enable-plugin-filter branch 2 times, most recently from 94a85f7 to 1651614 Compare December 7, 2016 23:10
@stevvooe
Contributor

stevvooe commented Dec 7, 2016

I think there may be fundamental problems with the plugin data model that need to be worked out before going forward with this.

For example, the plugin list reported via NodeDescription is actually a list of drivers (we should probably rename this), but the new plugin model creates "capabilities". What are these capabilities? How do they work? I have seen a few examples in discussion but the results look foreign. Do we have a list of capabilities for a plugin?

For this to work properly, we really need a feedback loop from installation of a plugin to whatever the result is for a running plugin. This only lists the plugins and reports them, and doesn't provide any indication of whether they are in the right state or are the correct plugin.


// add v2 plugins to 'plugins'
for _, plgn := range v2plugins {
	for _, typ := range plgn.Config.Interface.Types {
Contributor

We should be looking for enabled (aka running) plugins. For this, use plgn.Enabled.

Contributor Author

This was addressed.
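For reference, restricting the report to enabled plugins amounts to a filter like the one sketched here. The `Plugin` struct and `EnabledNames` are hypothetical stand-ins; only the `Enabled` field comes from the review comment.

```go
package main

// Plugin is a hypothetical stand-in for one plugin list entry.
type Plugin struct {
	Name    string
	Enabled bool
}

// EnabledNames returns the names of plugins that are enabled (running),
// skipping plugins that are installed but disabled.
func EnabledNames(plugins []Plugin) []string {
	var names []string
	for _, p := range plugins {
		if !p.Enabled {
			continue
		}
		names = append(names, p.Name)
	}
	return names
}
```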

@anusha-ragunathan
Contributor

@nishanttotla @aluzzardi: Any updates on this? Are you blocked on something? Is there something I can help with? We need this PR for 1.13-rc5.

Signed-off-by: Nishant Totla <nishanttotla@gmail.com>
v2plugins, err := e.client.PluginList(ctx)
if err != nil {
	return nil, err
}
Collaborator

This should be changed not to make Describe error out if the error contains "plugins are not supported on this platform".

Collaborator

Actually, I'm going to make it ignore all errors. This will also fail against daemons that are too old to support this endpoint.
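A minimal sketch of the "ignore all errors" behavior, so that a failed plugin listing degrades to "no v2 plugins" instead of failing Describe. `ListFunc`, `BestEffortPlugins`, and `ErrUnsupported` are hypothetical names; the error text is taken from the comment above, and the real client call is abstracted behind a function value.

```go
package main

import "errors"

// Plugin is a hypothetical stand-in for one plugin list entry.
type Plugin struct {
	Name string
}

// ErrUnsupported models the platform error quoted in the review; the real
// daemon error type is not assumed here.
var ErrUnsupported = errors.New("plugins are not supported on this platform")

// ListFunc abstracts the call that lists v2 plugins (hypothetical).
type ListFunc func() ([]Plugin, error)

// BestEffortPlugins returns the listed plugins, or nil if listing fails for
// any reason (unsupported platform, daemon too old for the endpoint, ...).
// The error is deliberately dropped so the caller can still describe the node.
func BestEffortPlugins(list ListFunc) []Plugin {
	plugins, err := list()
	if err != nil {
		return nil
	}
	return plugins
}
```

The trade-off is that a transient listing failure is indistinguishable from a node with no v2 plugins, so tasks requiring such a plugin would not be scheduled there until a later description succeeds.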

@aaronlehmann
Collaborator

I am going to carry this PR. I've fixed some of the outstanding issues, but I've discovered a big problem that prevents this from being practically useful for volumes.

  1. Creating a volume is supposed to be idempotent. Some parts of the engine code assume this, for example registerMountPoints. However, an inconsistency around tags causes problems here. If I try to create a volume using the driver vieux/sshfs, using the same parameters I used to create it before, it will fail because the volume is already associated with vieux/sshfs:latest. This is a big problem because it means I can't create a service using vieux/sshfs as a volume driver without including the latest tag.

  2. ...but including :latest doesn't work either. In this case, no suitable node will be found to run the service, because the vieux/sshfs plugin reported by the agent (based on the Name field) won't match vieux/sshfs:latest.

This is tricky to get right. We should be consistent in the way we refer to plugins. We shouldn't just strip the tag off, because that loses information. We also need to make sure that we handle other cases like name:tag@digest matching name@digest (since the tag is purely advisory in this case).

What we probably need as a starting point is a canonical function that compares plugin references, adding a default tag where necessary and handling digest cases properly. However, even this won't fix (1), since the volume code is where this comparison is happening, and it does a simple string comparison on the volume driver name, which already had the tag included. I'm not sure what the right solution to this is.

cc @anusha-ragunathan @vieux @dmcgowan

@aaronlehmann
Collaborator

Discussed with @anusha-ragunathan and @tonistiigi and learned that digests will never be used in plugin naming. The canonical name of a plugin is name:tag. If no tag was specified at pull time, latest is implicit.

I will fix (1) in docker/docker by fixing the driver comparison in the volumes code.

I will fix (2) in this PR (or a carried version) by comparing tags in the plugin filter, and assuming latest if no tag is specified.
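The comparison planned for (2) can be sketched as follows, assuming the canonical form is name:tag with latest implied and no digests, as stated above. `CanonicalPluginName` and `SamePlugin` are hypothetical helpers written for illustration; the actual fix may rely on Docker's own reference parsing instead of string handling.

```go
package main

import "strings"

// CanonicalPluginName appends ":latest" when the name carries no tag.
// A tag separator is a colon that appears after the last slash, so that a
// registry port such as "localhost:5000/foo" is not mistaken for a tag.
// Digests are not handled, per the assumption that plugin names never use them.
func CanonicalPluginName(name string) string {
	lastSlash := strings.LastIndex(name, "/")
	if strings.LastIndex(name, ":") > lastSlash {
		return name // already tagged
	}
	return name + ":latest"
}

// SamePlugin reports whether two plugin references name the same plugin
// once both are brought to canonical name:tag form.
func SamePlugin(a, b string) bool {
	return CanonicalPluginName(a) == CanonicalPluginName(b)
}
```

With this, a service that asks for vieux/sshfs matches a node reporting vieux/sshfs:latest, while vieux/sshfs:1.0 does not, so the tag information is compared rather than stripped.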

@aaronlehmann
Collaborator

Carried in #1808

aaronlehmann added a commit that referenced this pull request Dec 14, 2016
[Carry #1517] Support v2 plugins; re-enable scheduler plugin filter