Remove dynamic mac entry from fdb on endpoint deletion #1792

Merged (1 commit) on Jun 5, 2017

sanimej commented Jun 1, 2017

Fixes #1768

Noticed this issue while debugging a few test case failures in the e2e test suite. This is likely the root cause of docker #33076 and of the service access issue during container bring-up reported in docker #30321.

This PR implements a different approach to fix this issue. #1783 has an alternative approach, which could have a performance impact.

libnetwork IPAM recycles the IP address when a task goes down on one node and is brought up on another. For remote tasks, the overlay network namespace has two fdb entries: a static one programmed by the driver, and a dynamic one learned by the bridge from the data path when a packet is received from the remote container. The dynamic entry ages out after 300 seconds. If a task on a remote node goes down and is rescheduled on a different node, the dynamic fdb entry still remains, and it won't be updated unless the container generates some data traffic. This makes access to the container unpredictable: if the container sends some traffic and the MAC entry gets updated, access works fairly quickly; if the container is completely silent, it can lead to up to 300 seconds of traffic loss.
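To make the failure mode concrete, here is a toy model of the dynamic entry's lifetime. It is only a sketch, not libnetwork or kernel code: the `toyBridge` type, the port names (`vxlan0`, `veth1`) and the MAC address are made up for illustration, and time is a plain integer in seconds. It assumes nothing beyond what is described above: the entry is learned from received traffic, is refreshed only by new traffic, and ages out after 300 seconds.

```go
package main

import "fmt"

// ageingSeconds mirrors the 300-second ageing time mentioned in the description.
const ageingSeconds = 300

// dynEntry is a toy stand-in for one dynamically learned fdb entry.
type dynEntry struct {
	port     string // port the MAC was last seen on
	lastSeen int    // time of the last frame from this MAC, in seconds
}

// toyBridge is a hypothetical, simplified bridge that only tracks dynamic entries.
type toyBridge struct {
	dynamic map[string]dynEntry
}

func newToyBridge() *toyBridge {
	return &toyBridge{dynamic: map[string]dynEntry{}}
}

// learn records that a frame from mac arrived on port at time now.
func (b *toyBridge) learn(mac, port string, now int) {
	b.dynamic[mac] = dynEntry{port: port, lastSeen: now}
}

// lookup returns the learned port for mac, dropping the entry once it has aged out.
func (b *toyBridge) lookup(mac string, now int) (string, bool) {
	e, ok := b.dynamic[mac]
	if !ok {
		return "", false
	}
	if now-e.lastSeen >= ageingSeconds {
		delete(b.dynamic, mac)
		return "", false
	}
	return e.port, true
}

func main() {
	b := newToyBridge()
	mac := "02:42:0a:00:00:05" // made-up MAC, reused because the IP is recycled

	// t=0: a frame from the remote container is received, so the bridge learns the MAC.
	b.learn(mac, "vxlan0", 0)

	// The task then moves and stays silent: the stale entry is still returned.
	port, ok := b.lookup(mac, 200)
	fmt.Println("t=200:", port, ok)

	// Only once the ageing time has passed does the stale entry disappear.
	port, ok = b.lookup(mac, 300)
	fmt.Println("t=300:", port, ok)

	// Traffic from the container at its new location refreshes the entry at once.
	b.learn(mac, "veth1", 310)
	port, ok = b.lookup(mac, 311)
	fmt.Println("t=311:", port, ok)
}
```

In this model the silent container is unreachable through the right port for the whole ageing window, which is the up-to-300-second loss described above.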

The fix is to delete the dynamic fdb entry as well. This doesn't work on some kernel versions, though, because untagged fdb entries are assumed to be in the default VLAN 1. To address this, the default VLAN for the bridge has to be set through the bridge's default_pvid sysfs attribute.
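A minimal sketch of the default VLAN step, for readers who want to try it outside the driver. It writes `0` to `<sysfs root>/class/net/<bridge>/bridge/default_pvid`, the same path and value the diff in this PR uses. The function names and the `sysfsRoot` parameter are hypothetical and exist only so the write can be exercised against a temporary directory instead of the real `/sys`; the actual change runs inside the overlay namespace via reexec, which this sketch does not attempt.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// defaultPVIDPath builds the path of a bridge's default_pvid attribute.
// sysfsRoot is a hypothetical parameter; on a real host it would be "/sys".
func defaultPVIDPath(sysfsRoot, brName string) string {
	return filepath.Join(sysfsRoot, "class/net", brName, "bridge/default_pvid")
}

// setDefaultPVID writes pvid followed by a newline to the bridge attribute.
func setDefaultPVID(sysfsRoot, brName string, pvid int) error {
	path := defaultPVIDPath(sysfsRoot, brName)
	data := []byte(fmt.Sprintf("%d\n", pvid))
	if err := os.WriteFile(path, data, 0644); err != nil {
		return fmt.Errorf("setting default_pvid on bridge %q failed: %v", brName, err)
	}
	return nil
}

// runAgainstFakeSysfs exercises setDefaultPVID on a throwaway directory tree
// and returns what ended up in the attribute file.
func runAgainstFakeSysfs(brName string) (string, error) {
	root, err := os.MkdirTemp("", "fake-sysfs-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	// Create the directory layout that sysfs would normally provide.
	brDir := filepath.Join(root, "class/net", brName, "bridge")
	if err := os.MkdirAll(brDir, 0755); err != nil {
		return "", err
	}
	if err := setDefaultPVID(root, brName, 0); err != nil {
		return "", err
	}
	got, err := os.ReadFile(defaultPVIDPath(root, brName))
	return string(got), err
}

func main() {
	got, err := runAgainstFakeSysfs("br0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("default_pvid now contains %q\n", got)
}
```

On a real host the write needs root and a kernel that exposes the attribute; the error path above is where a missing attribute would surface.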

Signed-off-by: Santhosh Manohar <santhosh@docker.com>
Contributor

mavenugo left a comment

Just a couple of minor comments.

LGTM otherwise.

@@ -505,6 +556,25 @@ func (n *network) setupSubnetSandbox(s *subnet, brName, vxlanName string) error
		return fmt.Errorf("vxlan interface creation failed for subnet %q: %v", s.subnetIP.String(), err)
	}

	if !hostMode {
Contributor

Why is this specific to non-hostMode?

Author

The reexec is there to avoid calling the potentially blocking sysfs mount operations, which can cause namespace corruption. It's not needed in hostMode since there is no per-overlay-network namespace. But in hostMode we might also have to set the default VLAN, directly from the daemon without a reexec. I will do that in a separate PR after trying it out in hostMode.

	path := filepath.Join("/sys/class/net", brName, "bridge/default_pvid")
	data := []byte{'0', '\n'}

	if err = ioutil.WriteFile(path, data, 0644); err != nil {
Contributor

I hope this works equally well across multiple distros.

Author

I tested on Ubuntu 15.10 running the 4.2 and 3.19 kernels, and on CentOS running 3.10.0-327.10.1.el7.x86_64.

mavenugo commented Jun 5, 2017

Thanks @sanimej

Successfully merging this pull request may close these issues.

service discovery between service task and unmanaged container takes long at times