Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add kubelet client for Pod resource info #231

Merged
merged 2 commits into from
Mar 22, 2019

Conversation

ahalimx86
Copy link
Contributor

This change introduces kubelet client to get allocated device
information of a Pod from newly added Kubelet grpc service.
For more information please see:
kubernetes/kubernetes#70508

With this kubelet service we no longer need to rely on kubelet
checkpoint file to get device information.

Change-Id: I3ba098cfea9a0e798f065a4d9b39d0085e125407
Signed-off-by: Abdul Halim abdul.halim@intel.com

@coveralls
Copy link

coveralls commented Jan 8, 2019

Pull Request Test Coverage Report for Build 729

  • 55 of 77 (71.43%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+1.2%) to 52.841%

Changes Missing Coverage Covered Lines Changed/Added Lines %
checkpoint/checkpoint.go 3 5 60.0%
k8sclient/k8sclient.go 4 9 44.44%
kubeletclient/kubeletclient.go 48 63 76.19%
Totals Coverage Status
Change from base Build 727: 1.2%
Covered Lines: 651
Relevant Lines: 1232

💛 - Coveralls

@rkamudhan
Copy link
Member

@dougbtv @s1061123 Targeting for master branch.

@dcbw
Copy link
Member

dcbw commented Jan 9, 2019

This brings in a ton of vendored deps, +2million lines. Are we really sure we need all those deps?

@ahalimx86 ahalimx86 force-pushed the dev/kubelet_client branch 2 times, most recently from 7304cfa to 32a3a7c Compare January 9, 2019 16:09
@ahalimx86
Copy link
Contributor Author

ahalimx86 commented Jan 9, 2019

This brings in a ton of vendored deps, +2million lines. Are we really sure we need all those deps?

This PR has following dependencies:

	"k8s.io/kubernetes/pkg/kubelet/apis/podresources"
	"k8s.io/kubernetes/pkg/kubelet/apis/podresources/v1alpha1"
	"k8s.io/kubernetes/pkg/kubelet/util"

So, I've added only these sub-packages in glide.yaml with correct version then run glide up -v. Looks like even though all I needed is those sub-packages only, glide added pretty much everything from Kubernetes! That's why the vendor became gigantic! Any suggestions/alternatives how to overcome this?

@s1061123
Copy link
Member

Hi @ahalim-intel ,

How about to use '--skip-test' option in glide? It may reduce dependent packages.

@ahalimx86
Copy link
Contributor Author

@s1061123

How about to use '--skip-test' option in glide? It may reduce dependent packages.

Actually glide '--skip-test' is counter-intuitive. It does the opposite. From 'glide up --help':

   --skip-test              Resolve dependencies in test files.

I've tried it anyway and it didn't change the vendor/ at all.
I've also tried glide up -v --no-recursive. This reduces vendor/ size dramatically but build fails.

@ahalimx86
Copy link
Contributor Author

Folks, what is the status of this PR?

@dcbw
Copy link
Member

dcbw commented Feb 21, 2019

/lgtm, as long as we get the backwards compat questions answered.

Copy link
Member

@rkamudhan rkamudhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@s1061123
Copy link
Member

@ahalim-intel

  • as Dan's comment, your code removes checkpoint code and you remove checkpoint feature. This change removes previous feature and don't have compatibility. How do you support checkpoint?
  • your diff is slightly old so could you please rebase it to latest?

@ahalimx86
Copy link
Contributor Author

@ahalim-intel

  • as Dan's comment, your code removes checkpoint code and you remove checkpoint feature. This change removes previous feature and don't have compatibility. How do you support checkpoint?
  • your diff is slightly old so could you please rebase it to latest?

@s1061123
I will be working on this in coming days. I am thinking of introducing an option field in Multus netconf something like resourceInfoProvider: with values checkpoint(default) OR kubelet-api. What do you think?

@zshi-redhat
Copy link
Contributor

@ahalim-intel

  • as Dan's comment, your code removes checkpoint code and you remove checkpoint feature. This change removes previous feature and don't have compatibility. How do you support checkpoint?
  • your diff is slightly old so could you please rebase it to latest?

Tomo, just want to mention that checkpoint file is a temporary solution for getting device information from kubelet and it will be deprecated in future kube releases. we'd like to switch to kubelet endpoint with kube-1.13 and 1.13+. This feature is mainly used for passing device info to SR-IOV CNI, I dont' know if any other CNI use this feature. I understand that Multus needs maintain backward compatibility, do you know how we can deprecate a feature like checkpoint file in Multus? Like how many kube releases Multus need to maintain backward compat with?

adding new imported package dependencies in vendor which is required
for Kubelet Pod Resource api client.

Change-Id: If6c74598e12af5f8659df69371e72dd064823f49
@ahalimx86
Copy link
Contributor Author

@s1061123 @dougbtv This PR is rebased now. Also addressed the backward compatibility. If Kubelet resource API is available Multus will use that OR else fallback in Kubelet checkpoint.

@rkamudhan
Copy link
Member

@zshi-redhat Could you please test this PR ?

@ahalimx86
Copy link
Contributor Author

@s1061123

I'm bit confused with the travis test output there.

Output from travisci:

# github.com/intel/multus-cni/kubeletclient
kubeletclient/kubeletclient_test.go:106:3: client declared but not used
vet: typecheck failures
FAIL	github.com/intel/multus-cni/kubeletclient [build failed]

If you see the closure variable client on line 106, it is being used within other context block. It's not 'not used'. What do you suggest?

And running test.sh locally with same file(kubeletclient/kubeletclient_test.go):

=== RUN   TestKubeletclient                                                             
Running Suite: Kubeletclient Suite                                                      
==================================                                                      
Random Seed: 1553184260                                                                 
Will run 5 of 5 specs                                                                   

•••••
Ran 5 of 5 Specs in 0.018 seconds
SUCCESS! -- 5 Passed | 0 Failed | 0 Pending | 0 Skipped --- PASS: TestKubeletclient (0.02s)
PASS                                                                                       
coverage: 77.5% of statements                                                              
ok      github.com/intel/multus-cni/kubeletclient       0.023s  coverage: 77.5% of statements

@ahalimx86
Copy link
Contributor Author

Ok, nevermind it. I've move the variable from there and updated the PR.

@s1061123 s1061123 force-pushed the master branch 2 times, most recently from 1b0b39d to c319f6b Compare March 22, 2019 03:14
@zshi-redhat
Copy link
Contributor

zshi-redhat commented Mar 22, 2019

@zshi-redhat Could you please test this PR ?

I ran the following tests with this PR applied to multus master branch, all passed:

  1. Enable KubeletPodResources feature gate with kube-1.13 env, Multus can detect that kubelet podResource endpoint exists, query and pass the correct device ID that allocated by SR-IOV Device Plugin to SR-IOV CNI.
  2. Disable KubeletPodResources (default config) with kube-1.13 env, Multus can fall back to checkpoint file solution to get Device ID information and pass it to SR-IOV CNI.

// Check for kubelet resource API socket file
kubeletSocket = filepath.Join(defaultPodResourcesPath, defaultKubeletSocketFile)
if _, err := os.Stat(kubeletSocket); err != nil {
logging.Errorf("hasKubeletAPIEndpoint(): error looking up kubelet resource api socket file: %q", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to have a warning message instead of error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to have a warning message instead of error.

Agree! I will change this to Printf() as the logging has no Warningf()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, Printf() shall be fine.

Copy link
Contributor

@zshi-redhat zshi-redhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the CI failure is due to that kubeletSocket is overwritten in hasKubeletAPIEndpoint to the value of defaultPodResourcesPath + defaultKubeletSocketFile, perhaps we can just set socketDir and socketName to the same value of defaultPodResourcesPath and defaultKubeletSocketFile.


func getKubeletClient() (types.ResourceClient, error) {
newClient := &kubeletClient{}
if kubeletSocket == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always be false as kubeletSocket is set in hasKubeletAPIEndpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always be false as kubeletSocket is set in hasKubeletAPIEndpoint.

Good catch! I've corrected the variable in hasKubeletAPIEndpoint which should be only local.


func hasKubeletAPIEndpoint() bool {
// Check for kubelet resource API socket file
kubeletSocket = filepath.Join(defaultPodResourcesPath, defaultKubeletSocketFile)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use the same as util.LocalEndpoint(defaultPodResourcesPath, podresources.Socket)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use the same as util.LocalEndpoint(defaultPodResourcesPath, podresources.Socket)?

util.LocalEndpoint(defaultPodResourcesPath, podresources.Socket) returns path with unix:// URL scheme prefix. All we need here simple path string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha, thanks!

)

var (
kubeletSocket string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is a global var?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need this global here for unit-tests to override it.

This change introduces kubelet client to get allocated device
information of a Pod from newly added Kubelet grpc service.
For more information please see:
[kubernetes/kubernetes#70508](kubernetes/kubernetes#70508)

Change-Id: I11e58ccdd52662601f445fa24c7d55c225441efc
Signed-off-by: Abdul Halim <abdul.halim@intel.com>
@ahalimx86
Copy link
Contributor Author

@s1061123 @zshi-redhat @dougbtv

I've addressed pretty much all the comments and feedback. Also fixed the travis-ci tests. All checks now pass. It's ready to be merged!

Thanks everyone for help reviewing/improving this PR... much appreciated!

@zshi-redhat
Copy link
Contributor

/lgtm

@dougbtv dougbtv merged commit d3c92b4 into k8snetworkplumbingwg:master Mar 22, 2019
@ahalimx86 ahalimx86 deleted the dev/kubelet_client branch March 22, 2019 15:06
@dougbtv dougbtv added the release-v3 PRs to make it in release branch label Mar 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-v3 PRs to make it in release branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants