Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use probe based plugin watcher mechanism in Device Manager #58755

Merged
merged 1 commit into from
Jul 27, 2018

Conversation

vikaschoudhary16
Copy link
Contributor

@vikaschoudhary16 vikaschoudhary16 commented Jan 24, 2018

What this PR does / why we need it:
Uses this probe based utility in the device plugin manager.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #56944

Notes For Reviewers:
Changes are backward compatible and existing device plugins will continue to work. At the same time, any new plugins that has required support for probing model (Identity service implementation), will also work.

Release note

Add support kubelet plugin watcher in device manager.

/sig node
/area hw-accelerators
/cc /cc @jiayingz @RenaudWasTaken @vishh @ScorpioCPH @sjenning @derekwaynecarr @jeremyeder @lichuqiang @tengqm @saad-ali @chakri-nelluri @ConnorDoyle

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. area/hw-accelerators cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 24, 2018
// Schedulable resource name. As of now it's expected to be a DNS Label
string resource_name = 3;
string resource_name = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kubelet shouldn't be telling which resource_name the device plugin has.
You need a RegisterResponse

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for authentication. kubelet will probe a canonical path and will treat part of socket name as resource name. Device plugin can further authenticate that.

sock := strings.Split(r.SockPath, "/")
sock = strings.Split(sock[len(sock)-1], ".")
if len(sock) != 3 {
// NOTE: socket path must be in the form of /var/lib/kubelet/device-plugins/<domain-name>.<resource-name>.sock
Copy link
Contributor

@RenaudWasTaken RenaudWasTaken Jan 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have the name be explicit in the gRPC protocol rather than try to fit that into the FS naming convention and then having to manipulate the string

}

func WatchPlugins(path string, outChan chan PluginInfo) error {
watcher, err := fsnotify.NewWatcher()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for implementing this change very quickly!

Haven't looked through the whole PR, but looking at this part, I am curious whether kubelet can re-discover all of plugins after it restarts. Do we need to scan through the directory initially?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done. PTAL!

@vikaschoudhary16 vikaschoudhary16 force-pushed the probing-mode branch 2 times, most recently from 6f8b6f4 to cf8d6ba Compare January 25, 2018 06:30
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 25, 2018
// GetResourceNameFrom returns base resource name from the plugin socket absolute path
func GetResourceNameFrom(sockPath string) (string, error) {
if !((strings.Contains(sockPath, ".sock")) && (strings.Contains(sockPath, "/"))) {
return "", fmt.Errorf("invalid sock name: %s. Expected format: <sock-dir>/<domain>.<resource>.sock", sockPath)
Copy link
Contributor

@lichuqiang lichuqiang Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the probe func is meant to be built as a common mechanism for both device-plugin and other potential components of kubelet.
So, it’s specially for device-plugin indeed?

Copy link
Contributor

@RenaudWasTaken RenaudWasTaken Jan 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I was mentioning higher, we shouldn't impose a name format for the sockets. We're trying to fit a DNS name into an FS naming convention, that's not a good model (as shown by the further transformation we have to do on the string) and is prone to implementation errors and user errors.

I also don't think it matches the simple API format we're trying to introduce here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pkg is to be shared by other components, like CSI, as well. +1 to avoid specific naming conventions.

@saad-ali @verult could you comment on any socket naming requirements/preference from CSI or flexvolume side?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, according to CSI proposal, the expected naming convention is /var/lib/kubelet/plugins/[SanitizedCSIDriverName]/csi.sock

For simplicity, wonder whether we can have csi plugins go to /var/lib/kubelet/plugins/csi/ directory and device plugins go to /var/lib/kubelet/plugins/device_plugins/? The parent directory can be a parameter passed in to NewPluginWatcher. Perhaps we can just pass the absolute path in the channel, and the PluginWatcher caller can do its own naming convention verification?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point me to that convention ? I can't seem to find any directive in the spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flexvolume doesn't use sockets to communicate with kubelet; instead, kubelet executes Flexvolume drivers directly on host. I don't think this model works well for Flex, and IMO that's OK.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For simplicity, wonder whether we can have csi plugins go to /var/lib/kubelet/plugins/csi/ directory and device plugins go to /var/lib/kubelet/plugins/device_plugins/?

Hmm, we mwntioned moving the checkpointing data out of this directory, I think that's a good idea. In that case where would we put it?
Wouldn't it make more sense to have:

  • var/lib/kubelet/csi/
  • var/lib/kubelet/device-plugin/
  • var/lib/kubelet/device-plugin/data.checkpoint
  • var/lib/kubelet/device-plugin/plugins/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use csi/plugins to keep the plugin hierarchy same and just in case CSI would need any checkpointing it has a place to do it.


err = watcher.Add(path)
if err != nil {
watcher.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if err == nil?
Maybe things like defer watcher.Close() should be addressed outside to ensure its collection

return nil
}

// Dial establishes the gRPC communication with the picked up plugin socket. https://godoc.org/google.golang.org/grpc#Dial
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: put reference link in a newline.

if err != nil {
return err
}
watcher, err := fsnotify.NewWatcher()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about use package filesystem .FSWatcher here:

// FSWatcher is a callback-based filesystem watcher abstraction for fsnotify.
type FSWatcher interface {
	// Initializes the watcher with the given watch handlers.
	// Called before all other methods.
	Init(FSEventHandler, FSErrorHandler) error

	// Starts listening for events and errors.
	// When an event or error occurs, the corresponding handler is called.
	Run()

	// Add a filesystem path to watch
	AddWatch(path string) error
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Maybe take a look at the existing pkg/volume/flexvolume/probe.go and see whether we can follow a similar flow. Some example functionalities that the current pkg/volume/flexvolume/probe.go has and we may also consider:

  • create the plugin directory if it doesn't exist
  • recursively add watcher for sub directories
  • watchEventInterval and watchEventLimit
  • watch remove events

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend taking advantage of this interface, and consider extending it if it's insufficient.

watchEventInterval and watchEventLimit are there for rate limiting: it prevents a malicious actor from overwhelming kubelet by triggering too many inotify events.

In Flexvolume the requirement is that if ANYTHING inside a driver directory changes, the driver should update. For this reason, the Flexvolume prober recursively watches all subdirectories. But here AFAIK each device plugin is a single socket, so recursive watch won't be necessary.

My understanding is it's not necessary to watch remove events because gRPC will terminate when socket is deleted, triggering the endpoint deletion in managerImpl.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that code lacks a stop function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@verult Thanks a lot for your helpful inputs! Agree we should try to reuse FSWatcher interface. For now, I think we may leave remove events and recursively watch out of this PR. For possible future extension, I would suggest that PluginWatcher passes the absolute path in the channel, and the PluginWatcher caller can do its own naming convention verification.

For CSI, is it possible that we use /var/lib/kubelet/plugins/csi/[SanitizedCSIDriverName].sock naming convention to work with a flat directory? If we have to use the proposed /var/lib/kubelet/plugins/[SanitizedCSIDriverName]/csi.sock format, we can add recursive watch in PluginWatcher in a later PR.

For device plugin, I think the reason why we can't just pass resource name in Registration response is because the advertised resource name may fail Kubelet's validation, if e.g., it doesn't follow extended resource name convention, or it collides with an existing name. I am fine with the naming format currently used in this PR. Something we may consider is whether we want to allow socket name with some timestamp suffix to allow a new instance to come up without tearing down the old plugin instance to ensure zero downtime. But I think we can leave that part out for now, given that we expect most plugins to be deployed as DaemonSet containers whose upgrades require container restarts.

Copy link
Contributor

@RenaudWasTaken RenaudWasTaken Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I thought about it more and I think a good solution could be to have to following model:

  • The socket has the same format as the resource name (e.g: /var/lib/kubelet/plugins/device-plugins/nvidia.com/gpu.sock for the socket)
  • The gRPC server advertises the same name as it's socket (e.g: "nvidia.com/gpu")
  • Kubelet validates after calling Register that the socket name matches the advertised name (and applies resourceName validation)

or it collides with an existing name

That would be the same behavior as re-registration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also fine with this model. It is just that ProbeWatcher has to support recursive watching then, and I don't know whether that would create some performance concerns as inotify man page mentions that "Inotify monitoring of directories is not recursive: to monitor subdirectories under a directory, additional watches must be created. This can take a significant amount time for large directory trees."

Copy link
Contributor

@RenaudWasTaken RenaudWasTaken Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inotify monitoring of directories is not recursive: to monitor subdirectories under a directory, additional watches must be created. This can take a significant amount time for large directory trees.

Pretty sure that's meant for large amount of directories (>10000) rather than our 3 device plugins (or the not so many CSI sockets).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CSI drivers should in almost all the cases be < 10.

Copy link
Contributor

@RenaudWasTaken RenaudWasTaken Feb 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something we may consider is whether we want to allow socket name with some timestamp suffix to allow a new instance to come up without tearing down the old plugin instance to ensure zero downtime

We could just require the directory to match the vendor name instead of exact match of the resource name. e.g:

  • /var/lib/kubelet/plugins/device-plugins/nvidia.com/gpu-v1.8.sock
  • /var/lib/kubelet/plugins/device-plugins/nvidia.com/gpu-v1.9.sock

Could both advertise resources under nvidia.com/*
Naming convention after nvidia.com/ would then be left to the vendor.

}

// GetResourceNameFrom returns base resource name from the plugin socket absolute path
func GetResourceNameFrom(sockPath string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this GetResourceNameFrom() only used for device plugin? Or other plugin too?


go func() {
for _, f := range files {
outChan <- PluginInfo{path + f.Name()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this return checkpoint files? as we use the same dir:

utilstore.NewFileStore(socketDir, utilfs.DefaultFs{})

// Tests that the device plugin manager correctly handles registration and re-registration by
// making sure that after registration, devices are correctly updated and if a re-registration
// happens, we will NOT delete devices; and no orphaned devices left.
func TestDevicePluginReRegistration(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReRegistration is still possible with this behavior and needs to be tested

@jiayingz
Copy link
Contributor

@verult with your experience on pkg/volume/flexvolume/probe.go, it would be great if you can help take a look at pkg/kubelet/util/pluginwatcher/ part of this PR. Thanks a lot!

}

_, err := e.client.Register(context.Background(), reqt)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just return err?

service Registration {
rpc Register(RegisterRequest) returns (Empty) {}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an incompatible API change. I think we will need a new API directory. Is it too early to create a deviceplugin/v1beta1 directory? I know we haven't flipped the feature gate flag yet, but it seems generally api version is a pre-requirement for feature gate version. Perhaps we should discuss this on this week's RW meeting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is an important question. Unfortunately missed the opportunity to discuss it at last meeting. Lets target for next meeting and meanwhile will try to discuss and get opinion on slack and list.

}
return nil
}

const (
// kubeletDevicePluginCheckpoint is the file name of device plugin checkpoint
kubeletDevicePluginCheckpoint = "kubelet_internal_checkpoint"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider to move checkpoint file to a separate location, like /var/run/kubelet/device-plugins/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be in next PR?

// GetResourceNameFrom returns base resource name from the plugin socket absolute path
func GetResourceNameFrom(sockPath string) (string, error) {
if !((strings.Contains(sockPath, ".sock")) && (strings.Contains(sockPath, "/"))) {
return "", fmt.Errorf("invalid sock name: %s. Expected format: <sock-dir>/<domain>.<resource>.sock", sockPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, according to CSI proposal, the expected naming convention is /var/lib/kubelet/plugins/[SanitizedCSIDriverName]/csi.sock

For simplicity, wonder whether we can have csi plugins go to /var/lib/kubelet/plugins/csi/ directory and device plugins go to /var/lib/kubelet/plugins/device_plugins/? The parent directory can be a parameter passed in to NewPluginWatcher. Perhaps we can just pass the absolute path in the channel, and the PluginWatcher caller can do its own naming convention verification?

name = "all-srcs",
srcs = [":package-srcs"],
tags = ["automanaged"],
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a plugin_watcher_test.go with some unit test coverage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@vikaschoudhary16
Copy link
Contributor Author

Thanks guys for the inputs. Will soon update the PR.

if err != nil {
return err
}
watcher, err := fsnotify.NewWatcher()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CSI drivers should in almost all the cases be < 10.

// GetResourceNameFrom returns base resource name from the plugin socket absolute path
func GetResourceNameFrom(sockPath string) (string, error) {
if !((strings.Contains(sockPath, ".sock")) && (strings.Contains(sockPath, "/"))) {
return "", fmt.Errorf("invalid sock name: %s. Expected format: <sock-dir>/<domain>.<resource>.sock", sockPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use csi/plugins to keep the plugin hierarchy same and just in case CSI would need any checkpointing it has a place to do it.


// GetResourceNameFrom returns base resource name from the plugin socket absolute path
func GetResourceNameFrom(sockPath string) (string, error) {
if !((strings.Contains(sockPath, ".sock")) && (strings.Contains(sockPath, "/"))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the check for "Contains (sockPath, "/")"? May be add a stricter prefix check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This serves for extended resources naming format. Do you have something in mind related to a stricter prefix match?

"fmt"
"io/ioutil"
"os"
"reflect"
"sync/atomic"
// "sync/atomic"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*Nit - remove

return &pluginapi.Empty{}, nil
existingDevs := make(map[string]pluginapi.Device)
e := newEndpointImpl(r.SockPath, name, conn, existingDevs, m.callback)
err = e.register(name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect the plugins to support register call? CSI does not support any register call today.

Copy link
Contributor

@RenaudWasTaken RenaudWasTaken Jan 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should re-use the service Identity of the CSI spec.
The Register call doesn't need to send the name to the plugin.

@chakri-nelluri
Copy link
Contributor

@vikaschoudhary16 Added comments.
CSI today doesn't support extra register call today. Other than that, it looks ok to me. Let me circle with the other folks and see what we can do to support the register call.

@chakri-nelluri
Copy link
Contributor

@vikaschoudhary16 @jingxu97 I am looking for more information on how different components of Kubelet like VolumeManager(CSI) & Network(CNI) can consume this? Do we have a write-up on how different components can consume this?

@chakri-nelluri
Copy link
Contributor

@vikaschoudhary16 @jingxu97 Any update on the consumption model from different Kubelet components?

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 12, 2018
@@ -76,6 +77,12 @@ func (cm *containerManagerStub) GetCapacity() v1.ResourceList {
return c
}

func (cm *containerManagerStub) GetPluginRegistrationHandlerCbkFunc() pluginwatcher.RegisterCallbackFn {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@@ -28,12 +28,16 @@ import (
"google.golang.org/grpc"

pluginapi "k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1"
watcherapi "k8s.io/kubernetes/pkg/kubelet/apis/pluginregistration/v1alpha1"
//"k8s.io/kubernetes/pkg/kubelet/util/pluginwatcher"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the commented line?

}
}

func (m *ManagerImpl) isVersionCompatibleWithPlugin(versions []string) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this is fine as we only have a single supported version. When we do need to support multiple versions in the future, we may need to extend this function to return a supported version. E.g., say kubelet supports v1beta1 and v1beta2, and we get v1alpha1 and v1beta1 from a device plugin, this function should return v1beta1 and we probably should pass that version to addEndpointProbeMode so that we can use correct version for communication. Maybe we can add some TODO here to make sure we don't forget in the future?

}
m, p1, w := setupInProbeMode(t, devs, callback, socketName, pluginSocketName)
atomic.StoreInt32(&expCallbackCount, 1)
//p1.Register(socketName, testResourceName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the commented line?

err = p2.Start()
require.NoError(t, err)
atomic.StoreInt32(&expCallbackCount, 2)
//p2.Register(socketName, testResourceName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this line?

@@ -239,6 +240,33 @@ func (m *ManagerImpl) Start(activePods ActivePodsFunc, sourcesReady config.Sourc
return nil
}

// GetWatcherCallback returns callback function to be registered with plugin watcher
func (m *ManagerImpl) GetWatcherCallback() watcher.RegisterCallbackFn {
return func(name string, endpoint string, versions []string, sockPath string) (chan bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think whether we should create DEPRECATION file under m.socketDir before returning the callback function as we discussed in #58755 (comment)? This way, people can start testing the migration with pluginwatcher enabled and we can extend the device plugin e2e_node test and e2e test to start getting test coverage.


//By("Register resources")
//err = dp1.Register(pluginapi.KubeletSocket, resourceName)
//framework.ExpectNoError(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the commented lines?

@@ -160,6 +162,112 @@ var _ = framework.KubeDescribe("Device Plugin [Feature:DevicePlugin] [Serial] [D
f.PodClient().DeleteSync(pod1.Name, &metav1.DeleteOptions{}, framework.DefaultPodDeletionTimeout)
f.PodClient().DeleteSync(pod2.Name, &metav1.DeleteOptions{}, framework.DefaultPodDeletionTimeout)
})
It("Verifies the Kubelet device plugin functionality with Probe-Mode plugin discovery mechanism", func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we do need to test both cases, but instead of having a separate similar test case, what do you think that we create a DEPRECATION file under /var/lib/kubelet/device_plugins/ directory in device manager GetWatcherCallback(), and add a helper function here that calls dp.Register only if that file doesn't exist? Perhaps even better is to do this in stub device plugin, and then we may add some test on upgrade scenario.

Copy link
Contributor

@jiayingz jiayingz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple nit comments. Overall lgtm.

func (m *Stub) Register(kubeletEndpoint, resourceName string, pluginSockDir string) error {
if pluginSockDir != "" {
if _, err := os.Stat(pluginSockDir + "DEPRECATION"); err == nil {
log.Println("Deprecation file found. Registration from plugin side.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be the other way around, i.e., if Deprecation file found, skip Register gRPC call with Kubelet; if not, proceed?

}

return func(name string, endpoint string, versions []string, sockPath string) (chan bool, error) {
//chanForAckOfNotification := make(chan bool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this commented line?

initialConfig.FeatureGates[string(features.KubeletPluginsWatcher)] = true
})
devicePluginSockPaths := []string{pluginapi.DevicePluginPath, "/var/lib/kubelet/plugins/"}
//devicePluginSockPaths := []string{pluginapi.DevicePluginPath}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this?

)

// Serial because the test restarts Kubelet
var _ = framework.KubeDescribe("Device Plugin [Feature:DevicePlugin][NodeFeature:DevicePlugin][Serial]", func() {
f := framework.NewDefaultFramework("device-plugin-errors")

Context("DevicePlugin", func() {
By("Enabling support for Kubelet Plugins Watcher")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think if it is possible to wrap lines 52-205 in a testDevicePlugin(f *framework.Framework, enableProbeWatcher bool) function? Then we can get test coverage on both settings with the following test config:

var _ = framework.KubeDescribe("NVIDIA GPU Device Plugin [Feature:GPUDevicePlugin][NodeFeature:GPUDevicePlugin][Serial] [Disruptive]", func() {
f := framework.NewDefaultFramework("device-plugin-gpus-errors")
testDevicePlugin(f, false)
}

var _ = framework.KubeDescribe("NVIDIA GPU Device Plugin [Feature:GPUDevicePluginProbe][NodeFeature:GPUDevicePluginProbe][Serial] [Disruptive]", func() {
f := framework.NewDefaultFramework("device-plugin-gpus-errors-probe-mode")
testDevicePlugin(f, true)
}

@vikaschoudhary16 vikaschoudhary16 changed the title Use probe based plugin watcher mechanism in Device Manager [WIP]Use probe based plugin watcher mechanism in Device Manager Jul 17, 2018
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 17, 2018
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 17, 2018
@vikaschoudhary16 vikaschoudhary16 changed the title [WIP]Use probe based plugin watcher mechanism in Device Manager Use probe based plugin watcher mechanism in Device Manager Jul 17, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 17, 2018
@jiayingz
Copy link
Contributor

/lgtm
/approve
Thanks a lot for the work!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 17, 2018
@vikaschoudhary16
Copy link
Contributor Author

/assign @derekwaynecarr for higher level approval

@vikaschoudhary16
Copy link
Contributor Author

/test pull-kubernetes-e2e-kops-aws
/test pull-kubernetes-local-e2e-containerized

@vikaschoudhary16
Copy link
Contributor Author

/test pull-kubernetes-local-e2e-containerized

2 similar comments
@vikaschoudhary16
Copy link
Contributor Author

/test pull-kubernetes-local-e2e-containerized

@vikaschoudhary16
Copy link
Contributor Author

/test pull-kubernetes-local-e2e-containerized

@vikaschoudhary16
Copy link
Contributor Author

ping @derekwaynecarr

@Random-Liu
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jiayingz, Random-Liu, vikaschoudhary16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 27, 2018
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 58755, 66414). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 32e38b6 into kubernetes:master Jul 27, 2018
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Jul 27, 2018

@vikaschoudhary16: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-local-e2e-containerized a584250 link /test pull-kubernetes-local-e2e-containerized
pull-kubernetes-e2e-gce a584250 link /test pull-kubernetes-e2e-gce

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-github-robot pushed a commit that referenced this pull request Jul 30, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Simplify device manager: make endpoint stateless

While reviewing devicemanager code, found the caching layer on endpoint is redundant.
Here are the 3 related objects in picture:

**devicemanager  <->  endpoint <-> plugin**

plugin is the source of truth for devices and device health status.
devicemanager maintain healthyDevices, unhealthyDevices, allocatedDevices based on updates
from plugin.

So there is no point for endpoint to cache devices, this patch is removing the cache layer,
endpoint becomes stateless, which i believe should be the case (but i do welcome review
if i missed something here).

also removing the Manager.Devices() since i didn't find any caller of this other than test.

if we need to get all devices from manager in future, it just need to return healthyDevices + unhealthyDevices, so don't have to call endpoint after all.

This patch makes code more readable, data model been simplified.



**What this PR does / why we need it**:
this patch simplify the device manager code, make it more maintainable.

**Which issue(s) this PR fixes** *:
this is a refactor of device manager code

**Special notes for your reviewer**:
will need to rebase the code if #58755 get checked-in first.

**Release note**:

```release-note
None
```

/sig node
/cc @jiayingz @RenaudWasTaken @vishh @saad-ali @vikaschoudhary16 @vladimirvivien @anfernee
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/hw-accelerators cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

A common Kubelet Plugin Communication Establish Model