Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubenet: disable DAD in the container. #55247

Merged
merged 1 commit into from
Nov 9, 2017

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Nov 7, 2017

Since kubenet externally guarantees that IP address will not conflict, we can short-circuit the kernel's normal wait. This lets us avoid the 1 second network wait.

What this PR does / why we need it:
Fixes the pod startup latency identified in #54651 and #55060

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 7, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @squeed. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@squeed
Copy link
Contributor Author

squeed commented Nov 7, 2017

/sig network

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Nov 7, 2017
@porridge
Copy link
Member

porridge commented Nov 7, 2017

@shyamjvs FYI

@porridge
Copy link
Member

porridge commented Nov 7, 2017

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 7, 2017
@@ -820,3 +825,43 @@ func (plugin *kubenetNetworkPlugin) syncEbtablesDedupRules(macAddr net.HardwareA
return
}
}

// disableContainerDAD disables duplicate address detection in the container.
// DAD has a negative affect on pod creation latency, since we have to wait
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd link to the issue here.

@@ -304,6 +304,11 @@ func (plugin *kubenetNetworkPlugin) Capabilities() utilsets.Int {
// TODO: Don't pass the pod to this method, it only needs it for bandwidth
// shaping and hostport management.
func (plugin *kubenetNetworkPlugin) setup(namespace string, name string, id kubecontainer.ContainerID, pod *v1.Pod, annotations map[string]string) error {
// Disable DAD so we skip the kernel delay on bringing up new interfaces.
if err := plugin.disableContainerDAD(id); err != nil {
glog.V(4).Infof("Failed to disable DAD in container: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion - but this seems to be important enough to be V(2). It's important setup-related and one-off msg.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it for a bit, but it doesn't break anything or affect functionality in any way - it only costs an extra second of pod latency. But I don't have a strong opinion.

Since kubenet externally guarantees that IP address will not conflict,
we can short-circuit the kernel's normal wait. This lets us avoid the 1
second network wait.
@porridge
Copy link
Member

porridge commented Nov 7, 2017

/retest

@porridge
Copy link
Member

porridge commented Nov 7, 2017

/test pull-kubernetes-e2e-gce

@danehans
Copy link

danehans commented Nov 7, 2017

/area ipv6

@shyamjvs
Copy link
Member

shyamjvs commented Nov 7, 2017

@bowei Could you or someone from @kubernetes/sig-network-misc review this soonish? Need to get this in asap.

@@ -304,6 +304,11 @@ func (plugin *kubenetNetworkPlugin) Capabilities() utilsets.Int {
// TODO: Don't pass the pod to this method, it only needs it for bandwidth
// shaping and hostport management.
func (plugin *kubenetNetworkPlugin) setup(namespace string, name string, id kubecontainer.ContainerID, pod *v1.Pod, annotations map[string]string) error {
// Disable DAD so we skip the kernel delay on bringing up new interfaces.
if err := plugin.disableContainerDAD(id); err != nil {
glog.V(3).Infof("Failed to disable DAD in container: %v", err)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my understanding that a pod will not get an IPv6 address if hairpin and dad are enabled. If that is the case, should a check be implemented and have setup return an error if disableContainerDAD returns an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CNI bridge plugin correctly handles this case - but it's a bit more subtle. It tries to use enhanced_dad, which all recent kernels have. That includes a nonce so hairpin is correctly detected.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I looked at the bridge plugin code and saw exactly what you describe. I am fine with kubenet not supporting enhanced_dad. However, I still think it makes sense to throw an error instead of a log message if hairpin is enabled and disableContainerDAD returns an error. Otherwise, the pod will not get an IP address. Do you agree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not this particular line succeeds or fails won't affect the plugin, which will raise errors appropriately. So, I don't think this needs to fail here, which is a minor performance tweak.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... now it makes sense. kubenet is essentially backed by the bridge/local-ipam plugins. I got it.

@danehans
Copy link

danehans commented Nov 8, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2017
@danehans
Copy link

danehans commented Nov 8, 2017

@bowei @thockin would be much appreciated if you could review.

@thockin
Copy link
Member

thockin commented Nov 9, 2017

What about tests? Is there nothing worth testing here?
/approve

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans, squeed, thockin

Associated issue: 54651

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 9, 2017
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/ipv6 cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/network Categorizes an issue or PR as relevant to SIG Network. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants