New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes to enable Windows CNI #51063
Fixes to enable Windows CNI #51063
Conversation
Hi @madhanrm. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/ok-to-test |
Tagging @jdumars for FYI |
return nil | ||
} | ||
|
||
// GetPodNetworkStatus : Assuming addToNetwork is idempotent, we can call this API as many times as required to get the IPAddress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addToNetwork
is definitely not idempotent. The CNI spec doesn't require it and no plugins that I'm aware of implement ADD in this manner. The comment from before is... optimistic.
We're currently working on a spec change for a GET method within CNI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plugin that I have is implemented in that way, assuming no one would require adding 2 endpoints of the same type. This also simplifies the implementation of the windows cni plugin, which doesn't have to keep a repo of these data and simply pass these down to the platform based on those assumptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, as soon as GET method is in CNI, we can switch GetPodNetworkStatus to use the new API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's a plugin system. You cannot guarantee that users will use your plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with squeed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Plugin doesn't enter the namespace and get the IP directly. It calls into Host Networking Service, to get the data. Kubernetes can also do that directly, but then we would destroy the very idea of abstracting that logic out behind the plugin and not bringing in a lot of platform specific APIs.
The main reason behind this approach is that, current containers are launched with process isolation. It could also be launched with hyper-v isolation (not supported in kubernetes but supported by docker and platform). To keep the implementation same across these modes, it is better to call the centralized to CNI plugin to get back the data, the current API would seamlessly support these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@madhanrm I agree with @squeed. The CNI specification does not declare ADD idempotent. That has nothing to do with the platform at all, that's just how it is. Making Windows act differently is simply confusing. So can we discuss this a bit more so that we all understand the restrictions?
Saying "This is a requirement for Windows CNI Plugins." is a solution to a problem, but there may be other solutions that do not require making ADD idempotent for Windows CNI.
IIRC Windows has no shared "infra" (eg net namespace) container. Instead, every container gets network setup called on it. What I think we're arguing about here, and correct me if I'm wrong, is that in the current PR here we'll get an ADD for the same container/namespace/netns tuple multiple times, which would in fact require idempotency that CNI does not guarantee.
Instead of that, could we figure out some other way to pass the data that your Windows plugin requires to tie the containers together?
One other thing I'm not clear on. What actual CNI plugin are you working on? And why couldn't some other 3rd party Windows plugin have a completely different mechanism for tying containers together, that doesn't require ADD to be idempotent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dcbw Just for my own understanding, what is the argument against making ADD idempotent in the spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dcbw
The current Idempotent nature of ADD is only required because any subsequent ADD is used as a GET (PodStatus) to get the IP. Once we have GET implemented, that will not be a criteria for ADD to be idempotent. (But IMO, it doesn't make sense for ADD not to be idempotent, which is a separate discussion altogether)
Also, your understanding of "every container gets network setup" is getting changed by this PR. This PR is also to start using shared infra container, for which CNI is a requirement.
You should also understand the difference between the CNI plugin in Linux and in Windows. You really do not have to write individual CNI plugins on Windows unlike Linux. Instead CNI plugin is like a proxy to program the Host Networking Service (HNS) on Windows, which takes care of everything. The information is simply constructed and passed on to HNS. So a single CNI plugin would work for most of the network modes that can work independently on a single host. Other may develop similar plugins, but they have to go thro HNS, anything other than that would not be supported in Windows.
The idempotency is required here, because the CNI plugin doenst have to manage anything and simply pass it on to HNS. Else, the CNI would also have to manage redundant data as HNS. This would simplify the requirement of the Windows CNI plugins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any open items here? Can we not relax the idempotent nature of ADD for Windows here?
Am not seeing any progress in this PR.
pkg/kubelet/network/cni/cni.go
Outdated
@@ -319,6 +263,10 @@ func (plugin *cniNetworkPlugin) buildCNIRuntimeConf(podName string, podNs string | |||
glog.V(4).Infof("Got netns path %v", podNetnsPath) | |||
glog.V(4).Infof("Using netns path %v", podNs) | |||
|
|||
if podNetnsPath == "" { | |||
podNetnsPath = "none" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"none" is theoretically a valid netns path on Linux - can you just use the empty string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can remove this change.
/sig windows |
/sig node |
/sig network |
@caseydavenport could you or a delegate from SIG Network take a look at this implementation + #51064 Thanks! |
Thanks for the PR! This come up late in the 1.8 cycle and I am not comfortable to put it in. There is no test coverage for windows cni plugin and the test coverage for linux cni plugin is not great either. I would recommend we get it in during early 1.9 cycle and improve test coverage. I would also recommend you to join the sig-network meeting to get involved in feature planning. That would help the feature that you are working on to gain more visibility in the k8s community. Same applies to #51064 |
@freehan - if you look at the code, these are actually minor changes and we've isolated by separating into Windows and Linux helper function. AFAIK, there is no CNI plugin support on Windows today so this is net-new. We are very close to getting tests to pass. Can we discuss in sig-network meet-up? This and #51064 ? |
/test pull-kubernetes-e2e-kops-aws |
/test pull-kubernetes-unit |
Will take a look today. Thanks! |
pkg/kubelet/network/cni/cni.go
Outdated
if err != nil { | ||
glog.Errorf("Error while adding to cni lo network: %s", err) | ||
return err | ||
if plugin.loNetwork != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a comment to say linux comes with loNetwork and windows doesn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taken care of
Just a nit. At some point, we might just want to separate cni windows and linux. But it still seems maintainable for now. |
func getNetworkNamespace(c *dockertypes.ContainerJSON) (string, error) { | ||
if c.State.Pid == 0 { | ||
// Docker reports pid 0 for an exited container. | ||
return "", fmt.Errorf("Cannot find network namespace for the terminated container %q", c.ID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not your fault, but since you moved it.
s/Cannot/cannot
@@ -19,6 +19,7 @@ limitations under the License. | |||
package dockershim | |||
|
|||
import ( | |||
"fmt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a new line after this to separate imports of builtin packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -47,3 +48,7 @@ func (ds *dockerService) determinePodIPBySandboxID(uid string) string { | |||
glog.Warningf("determinePodIPBySandboxID is unsupported in this build") | |||
return "" | |||
} | |||
|
|||
func getNetworkNamespace(c *dockertypes.ContainerJSON) (string, error) { | |||
return "", fmt.Errorf("Unsupported platform") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Unsupported/unsupported
// +build !windows | ||
|
||
/* | ||
Copyright 2014 The Kubernetes Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/2014/2017
// +build windows | ||
|
||
/* | ||
Copyright 2014 The Kubernetes Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/2014/2017
@@ -47,6 +47,8 @@ func (ds *dockerService) updateCreateConfig( | |||
podSandboxID string, securityOptSep rune, apiVersion *semver.Version) error { | |||
if networkMode := os.Getenv("CONTAINER_NETWORK"); networkMode != "" { | |||
createConfig.HostConfig.NetworkMode = dockercontainer.NetworkMode(networkMode) | |||
} else { | |||
modifyHostNetworkOptionForContainer(false, podSandboxID, createConfig.HostConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is in security_context.go
. Calling this in directly seems weird. I understand that this part of code for windows and linux are very different, maybe refactor/clean this up in the future.
glog.Errorf("Error while cni parsing result: %s", err) | ||
return nil, err | ||
} | ||
// Parse the result and get the IPAddress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't see much parsing here, so perhaps you can just remove the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe the comment was meant for line 51-55?
|
||
result, err := plugin.addToNetwork(plugin.getDefaultNetwork(), name, namespace, id, netnsPath) | ||
|
||
glog.V(5).Infof("GetPodNetworkStatus result %+v", result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why printing the result before checking the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to know the result either way. Error could contain some meaningful result to investigate issues.
// | ||
// This place is chosen as a hack for now, until we have a better place to do this, | ||
// like immediately after ContainerCreation, | ||
// Also this is very windows specific for now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove Also this is very windows specific for now
I think it's pretty clear from the context that this is windows specific.
// Using the CONTAINER_NETWORK flag to determine to choose old impl or current impl. | ||
// CONTAINER_NETWORK is only kept for backward compatibility, since old OS version doens;t support | ||
// Sandbox concept | ||
if networkMode := os.Getenv("CONTAINER_NETWORK"); networkMode == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you at least document what version relies on this env variable?
Mostly nits since @freehan already lgtm'd it. Please ping me for approval after the comments are addressed. |
@yujuhong I have addressed all the comments. But while testing I found an issue in the latest mainline code, where os.Rename is failing with file already in use. Can you check? |
Following are part of this commit +++++++++++++++++++++++++++++++++ * Windows CNI Support (1) Support to use --network-plugin=cni (2) Handled platform requirement of calling CNI ADD for all the containers. (2.1) For POD Infra container, netNs has to be empty (2.2) For all other containers, sharing the network namespace of POD container, should pass netNS name as "container:<Pod Infra Container Id>", same as the NetworkMode of the current container (2.3) The Windows CNI plugin has to handle this to call into Platform. Sample Windows CNI Plugin code to be shared soon. * Sandbox support for Windows (1) Sandbox support for Windows. Works only with Docker runtime. (2) Retained CONTAINER_NETWORK as a backward compatibilty flag, to not break existing deployments using it. (3) Works only with CNI plugin enabled. (*) Changes to reinvoke CNI ADD for every new container created. This is hooked up with PodStatus, but would be ideal to move it outside of this, once we have CNI GET support
Oops...I sent a fix #55034. Sorry about that. |
/test pull-kubernetes-node-e2e |
@yujuhong Validated your fix. Thanks for the quick response. |
@madhanrm were you testing on Windows? |
@yujuhong Yes, testing was done on windows. |
/approve |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: freehan, jdumars, madhanrm, yujuhong Associated issue: 49646 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/test all [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue (batch tested with PRs 53679, 51063). If you want to cherry-pick this change to another branch, please follow the instructions here. |
What this PR does / why we need it:
This PR has fixed which enables Kubelet to use Windows CNI plugin.
Which issue this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close that issue when PR gets merged): fixes ##49646
Special notes for your reviewer:
Release note: