kubenet: disable DAD in the container. #55247
Hi @squeed. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/sig network
@shyamjvs FYI
/ok-to-test
@@ -820,3 +825,43 @@ func (plugin *kubenetNetworkPlugin) syncEbtablesDedupRules(macAddr net.HardwareA
return
}
}

// disableContainerDAD disables duplicate address detection in the container.
// DAD has a negative effect on pod creation latency, since we have to wait
I'd link to the issue here.
@@ -304,6 +304,11 @@ func (plugin *kubenetNetworkPlugin) Capabilities() utilsets.Int {
// TODO: Don't pass the pod to this method, it only needs it for bandwidth
// shaping and hostport management.
func (plugin *kubenetNetworkPlugin) setup(namespace string, name string, id kubecontainer.ContainerID, pod *v1.Pod, annotations map[string]string) error {
// Disable DAD so we skip the kernel delay on bringing up new interfaces.
if err := plugin.disableContainerDAD(id); err != nil {
glog.V(4).Infof("Failed to disable DAD in container: %v", err)
I don't have a strong opinion, but this seems important enough to be V(2). It's an important, setup-related, one-off message.
I thought about it for a bit, but it doesn't break anything or affect functionality in any way - it only costs an extra second of pod latency. But I don't have a strong opinion.
Since kubenet externally guarantees that IP addresses will not conflict, we can short-circuit the kernel's normal DAD wait. This lets us avoid the 1-second network wait.
f4722c2 to 23f4afc
/retest
/test pull-kubernetes-e2e-gce
/area ipv6
@bowei Could you or someone from @kubernetes/sig-network-misc review this soonish? Need to get this in asap.
@@ -304,6 +304,11 @@ func (plugin *kubenetNetworkPlugin) Capabilities() utilsets.Int {
// TODO: Don't pass the pod to this method, it only needs it for bandwidth
// shaping and hostport management.
func (plugin *kubenetNetworkPlugin) setup(namespace string, name string, id kubecontainer.ContainerID, pod *v1.Pod, annotations map[string]string) error {
// Disable DAD so we skip the kernel delay on bringing up new interfaces.
if err := plugin.disableContainerDAD(id); err != nil {
glog.V(3).Infof("Failed to disable DAD in container: %v", err)
It's my understanding that a pod will not get an IPv6 address if both hairpin mode and DAD are enabled. If that is the case, should setup check for this and return an error if disableContainerDAD returns an error?
The CNI bridge plugin correctly handles this case - but it's a bit more subtle. It tries to use enhanced_dad, which all recent kernels have. That includes a nonce so hairpin is correctly detected.
Understood. I looked at the bridge plugin code and saw exactly what you describe. I am fine with kubenet not supporting enhanced_dad. However, I still think it makes sense to throw an error instead of a log message if hairpin is enabled and disableContainerDAD returns an error. Otherwise, the pod will not get an IP address. Do you agree?
Whether this particular call succeeds or fails won't affect the plugin, which will raise errors appropriately. Since this is only a minor performance tweak, I don't think setup needs to fail here.
Ah... now it makes sense. kubenet is essentially backed by the bridge/local-ipam plugins. I got it.
/lgtm
What about tests? Is there nothing worth testing here?
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: danehans, squeed, thockin
Associated issue: 54651
The full list of commands accepted by this bot can be found here.
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
What this PR does / why we need it:
Fixes the pod startup latency identified in #54651 and #55060
Release note: