Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes to enable Windows CNI #51063

Merged
merged 1 commit into from Nov 3, 2017
Merged

Conversation

madhanrm
Copy link
Contributor

What this PR does / why we need it:
This PR has fixed which enables Kubelet to use Windows CNI plugin.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #
#49646
Special notes for your reviewer:

Release note:

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 22, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @madhanrm. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Aug 22, 2017
@thockin thockin assigned dnardo, freehan and dcbw and unassigned smarterclayton and thockin Aug 22, 2017
@freehan
Copy link
Contributor

freehan commented Aug 22, 2017

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 22, 2017
@dineshgovindasamy
Copy link

Tagging @jdumars for FYI

return nil
}

// GetPodNetworkStatus : Assuming addToNetwork is idempotent, we can call this API as many times as required to get the IPAddress
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addToNetwork is definitely not idempotent. The CNI spec doesn't require it and no plugins that I'm aware of implement ADD in this manner. The comment from before is... optimistic.

We're currently working on a spec change for a GET method within CNI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plugin that I have is implemented in that way, assuming no one would require adding 2 endpoints of the same type. This also simplifies the implementation of the windows cni plugin, which doesn't have to keep a repo of these data and simply pass these down to the platform based on those assumptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, as soon as GET method is in CNI, we can switch GetPodNetworkStatus to use the new API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's a plugin system. You cannot guarantee that users will use your plugin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with squeed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Plugin doesn't enter the namespace and get the IP directly. It calls into Host Networking Service, to get the data. Kubernetes can also do that directly, but then we would destroy the very idea of abstracting that logic out behind the plugin and not bringing in a lot of platform specific APIs.

The main reason behind this approach is that, current containers are launched with process isolation. It could also be launched with hyper-v isolation (not supported in kubernetes but supported by docker and platform). To keep the implementation same across these modes, it is better to call the centralized to CNI plugin to get back the data, the current API would seamlessly support these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@madhanrm I agree with @squeed. The CNI specification does not declare ADD idempotent. That has nothing to do with the platform at all, that's just how it is. Making Windows act differently is simply confusing. So can we discuss this a bit more so that we all understand the restrictions?

Saying "This is a requirement for Windows CNI Plugins." is a solution to a problem, but there may be other solutions that do not require making ADD idempotent for Windows CNI.

IIRC Windows has no shared "infra" (eg net namespace) container. Instead, every container gets network setup called on it. What I think we're arguing about here, and correct me if I'm wrong, is that in the current PR here we'll get an ADD for the same container/namespace/netns tuple multiple times, which would in fact require idempotency that CNI does not guarantee.

Instead of that, could we figure out some other way to pass the data that your Windows plugin requires to tie the containers together?

One other thing I'm not clear on. What actual CNI plugin are you working on? And why couldn't some other 3rd party Windows plugin have a completely different mechanism for tying containers together, that doesn't require ADD to be idempotent?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcbw Just for my own understanding, what is the argument against making ADD idempotent in the spec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcbw
The current Idempotent nature of ADD is only required because any subsequent ADD is used as a GET (PodStatus) to get the IP. Once we have GET implemented, that will not be a criteria for ADD to be idempotent. (But IMO, it doesn't make sense for ADD not to be idempotent, which is a separate discussion altogether)

Also, your understanding of "every container gets network setup" is getting changed by this PR. This PR is also to start using shared infra container, for which CNI is a requirement.

You should also understand the difference between the CNI plugin in Linux and in Windows. You really do not have to write individual CNI plugins on Windows unlike Linux. Instead CNI plugin is like a proxy to program the Host Networking Service (HNS) on Windows, which takes care of everything. The information is simply constructed and passed on to HNS. So a single CNI plugin would work for most of the network modes that can work independently on a single host. Other may develop similar plugins, but they have to go thro HNS, anything other than that would not be supported in Windows.
The idempotency is required here, because the CNI plugin doenst have to manage anything and simply pass it on to HNS. Else, the CNI would also have to manage redundant data as HNS. This would simplify the requirement of the Windows CNI plugins.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any open items here? Can we not relax the idempotent nature of ADD for Windows here?
Am not seeing any progress in this PR.

@@ -319,6 +263,10 @@ func (plugin *cniNetworkPlugin) buildCNIRuntimeConf(podName string, podNs string
glog.V(4).Infof("Got netns path %v", podNetnsPath)
glog.V(4).Infof("Using netns path %v", podNs)

if podNetnsPath == "" {
podNetnsPath = "none"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"none" is theoretically a valid netns path on Linux - can you just use the empty string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can remove this change.

@jdumars
Copy link
Member

jdumars commented Aug 24, 2017

/sig windows

@k8s-ci-robot k8s-ci-robot added the sig/windows Categorizes an issue or PR as relevant to SIG Windows. label Aug 24, 2017
@jdumars jdumars added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Aug 24, 2017
@jdumars
Copy link
Member

jdumars commented Aug 24, 2017

/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Aug 24, 2017
@jdumars
Copy link
Member

jdumars commented Aug 24, 2017

/sig network

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Aug 24, 2017
@jdumars
Copy link
Member

jdumars commented Aug 24, 2017

@caseydavenport could you or a delegate from SIG Network take a look at this implementation + #51064

Thanks!

@freehan
Copy link
Contributor

freehan commented Aug 24, 2017

Thanks for the PR!

This come up late in the 1.8 cycle and I am not comfortable to put it in. There is no test coverage for windows cni plugin and the test coverage for linux cni plugin is not great either. I would recommend we get it in during early 1.9 cycle and improve test coverage. I would also recommend you to join the sig-network meeting to get involved in feature planning. That would help the feature that you are working on to gain more visibility in the k8s community.

Same applies to #51064

@JMesser81
Copy link

@freehan - if you look at the code, these are actually minor changes and we've isolated by separating into Windows and Linux helper function. AFAIK, there is no CNI plugin support on Windows today so this is net-new. We are very close to getting tests to pass.

Can we discuss in sig-network meet-up? This and #51064 ?

@madhanrm
Copy link
Contributor Author

/test pull-kubernetes-e2e-kops-aws

@madhanrm
Copy link
Contributor Author

/test pull-kubernetes-unit

@dineshgovindasamy
Copy link

@thockin @freehan Could you please review this PR?

@freehan
Copy link
Contributor

freehan commented Nov 1, 2017

Will take a look today. Thanks!

if err != nil {
glog.Errorf("Error while adding to cni lo network: %s", err)
return err
if plugin.loNetwork != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a comment to say linux comes with loNetwork and windows doesn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taken care of

@freehan
Copy link
Contributor

freehan commented Nov 2, 2017

Just a nit.
/approve

At some point, we might just want to separate cni windows and linux. But it still seems maintainable for now.

func getNetworkNamespace(c *dockertypes.ContainerJSON) (string, error) {
if c.State.Pid == 0 {
// Docker reports pid 0 for an exited container.
return "", fmt.Errorf("Cannot find network namespace for the terminated container %q", c.ID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not your fault, but since you moved it.
s/Cannot/cannot

@@ -19,6 +19,7 @@ limitations under the License.
package dockershim

import (
"fmt"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a new line after this to separate imports of builtin packages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -47,3 +48,7 @@ func (ds *dockerService) determinePodIPBySandboxID(uid string) string {
glog.Warningf("determinePodIPBySandboxID is unsupported in this build")
return ""
}

func getNetworkNamespace(c *dockertypes.ContainerJSON) (string, error) {
return "", fmt.Errorf("Unsupported platform")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Unsupported/unsupported

// +build !windows

/*
Copyright 2014 The Kubernetes Authors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/2014/2017

// +build windows

/*
Copyright 2014 The Kubernetes Authors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/2014/2017

@@ -47,6 +47,8 @@ func (ds *dockerService) updateCreateConfig(
podSandboxID string, securityOptSep rune, apiVersion *semver.Version) error {
if networkMode := os.Getenv("CONTAINER_NETWORK"); networkMode != "" {
createConfig.HostConfig.NetworkMode = dockercontainer.NetworkMode(networkMode)
} else {
modifyHostNetworkOptionForContainer(false, podSandboxID, createConfig.HostConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is in security_context.go. Calling this in directly seems weird. I understand that this part of code for windows and linux are very different, maybe refactor/clean this up in the future.

glog.Errorf("Error while cni parsing result: %s", err)
return nil, err
}
// Parse the result and get the IPAddress
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't see much parsing here, so perhaps you can just remove the comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe the comment was meant for line 51-55?


result, err := plugin.addToNetwork(plugin.getDefaultNetwork(), name, namespace, id, netnsPath)

glog.V(5).Infof("GetPodNetworkStatus result %+v", result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why printing the result before checking the error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to know the result either way. Error could contain some meaningful result to investigate issues.

//
// This place is chosen as a hack for now, until we have a better place to do this,
// like immediately after ContainerCreation,
// Also this is very windows specific for now
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove Also this is very windows specific for now
I think it's pretty clear from the context that this is windows specific.

// Using the CONTAINER_NETWORK flag to determine to choose old impl or current impl.
// CONTAINER_NETWORK is only kept for backward compatibility, since old OS version doens;t support
// Sandbox concept
if networkMode := os.Getenv("CONTAINER_NETWORK"); networkMode == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you at least document what version relies on this env variable?

@yujuhong
Copy link
Contributor

yujuhong commented Nov 2, 2017

Mostly nits since @freehan already lgtm'd it.

Please ping me for approval after the comments are addressed.

@madhanrm
Copy link
Contributor Author

madhanrm commented Nov 2, 2017

@yujuhong I have addressed all the comments. But while testing I found an issue in the latest mainline code, where os.Rename is failing with file already in use.
I see that util/store/filestore.go is newly added and func writeFile(fs utilfs.Filesystem, path string, data []byte) (retErr error) is calling fs.Rename at the end of the method, without closing the file (tmpFile.Close()). File closing is done in defer.

Can you check?

Following are part of this commit
+++++++++++++++++++++++++++++++++

* Windows CNI Support
	(1) Support to use --network-plugin=cni
	(2) Handled platform requirement of calling CNI ADD for all the containers.
	(2.1) For POD Infra container, netNs has to be empty
	(2.2) For all other containers, sharing the network namespace of POD container,
	      should pass netNS name as "container:<Pod Infra Container Id>", same as the
              NetworkMode of the current container
	(2.3) The Windows CNI plugin has to handle this to call into Platform.
              Sample Windows CNI Plugin code to be shared soon.
* Sandbox support for Windows
	(1) Sandbox support for Windows. Works only with Docker runtime.
	(2) Retained CONTAINER_NETWORK as a backward compatibilty flag,
	    to not break existing deployments using it.
	(3) Works only with CNI plugin enabled.

(*) Changes to reinvoke CNI ADD for every new container created. This is hooked up with PodStatus,
    but would be ideal to move it outside of this, once we have CNI GET support
@yujuhong
Copy link
Contributor

yujuhong commented Nov 3, 2017

Can you check?

Oops...I sent a fix #55034. Sorry about that.

@madhanrm
Copy link
Contributor Author

madhanrm commented Nov 3, 2017

/test pull-kubernetes-node-e2e

@madhanrm
Copy link
Contributor Author

madhanrm commented Nov 3, 2017

@yujuhong Validated your fix. Thanks for the quick response.
Can you approve this PR?

@yujuhong
Copy link
Contributor

yujuhong commented Nov 3, 2017

@yujuhong I have addressed all the comments. But while testing I found an issue in the latest mainline code, where os.Rename is failing with file already in use.
I see that util/store/filestore.go is newly added and func writeFile(fs utilfs.Filesystem, path string, data []byte) (retErr error) is calling fs.Rename at the end of the method, without closing the file (tmpFile.Close()). File closing is done in defer.

@madhanrm were you testing on Windows?

@madhanrm
Copy link
Contributor Author

madhanrm commented Nov 3, 2017

@yujuhong Yes, testing was done on windows.

@yujuhong
Copy link
Contributor

yujuhong commented Nov 3, 2017

/approve
based on @freehan's LGTM

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 3, 2017
@madhanrm
Copy link
Contributor Author

madhanrm commented Nov 3, 2017

@freehan @yujuhong can one of you add a lgtm label.

@freehan
Copy link
Contributor

freehan commented Nov 3, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 3, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: freehan, jdumars, madhanrm, yujuhong

Associated issue: 49646

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 53679, 51063). If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet