retry after iptables failed #1573

Closed
chenyf opened this Issue Aug 17, 2013 · 7 comments


@chenyf
chenyf commented Aug 17, 2013

During my stress test, I found that docker create may fail because of iptables; the errno is 4, which means "interrupted by signal", so I suggest adding a retry when iptables fails with errno 4.

// Wrapper around the iptables command
func iptables(args ...string) error {
	path, err := exec.LookPath("iptables")
	if err != nil {
		return fmt.Errorf("command not found: iptables")
	}
	if err := exec.Command(path, args...).Run(); err != nil {
		return fmt.Errorf("iptables failed: iptables %v", strings.Join(args, " "))
	}
	return nil
}

@jpetazzo
Contributor

Tagged as "bug" to make sure we pay attention to this after 0.7!

@apatil
Contributor
apatil commented Oct 11, 2013

You can reproduce the issue using this script.

@bosky101

You can also reproduce this with this script, which creates 1000 containers (even though 1000 is well below the ulimit).
https://gist.github.com/bosky101/7041254

The first error came after 374 containers had been created on a fresh DO machine with a 40GB HDD, 2GB RAM, and docker 0.6.3.

@jpetazzo
Contributor

Could this be a duplicate of #1319?

@jpoimboe
Contributor

Can anybody recreate it with the latest version of docker? I wasn't able to recreate with docker 0.7.6 on Ubuntu 12.04 with either script.

@crosbymichael crosbymichael modified the milestone: 1.0, 0.9.0 Mar 3, 2014
@eandre
eandre commented May 14, 2014

I can recreate this quite easily when spawning a lot of containers concurrently, all linked to the same container (we do this for integration testing; a lot of concurrent tests linked to a postgres database container).

The issue I am encountering is that iptables errors with exit code 4 if the xtables lock cannot be taken immediately. This happens in https://github.com/dotcloud/docker/blob/master/daemon/networkdriver/bridge/driver.go (LinkContainers).

To me it seems like a simple fix to pass "-w" to iptables to cause it to wait for the lock rather than exiting immediately if it cannot acquire the lock. If that sounds sane to you I can submit a pull request to wait for the lock.

@crosbymichael
Member

@eandre Do you think you could make a PR and help test that this actually fixes the issue?

@crosbymichael crosbymichael modified the milestone: 1.0, old_1.0 May 15, 2014
@crosbymichael crosbymichael pushed a commit to crosbymichael/docker that referenced this issue May 23, 2014
Michael Crosby Add wait flag to iptables
Fixes #1573
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
b315c38
@tiborvass tiborvass closed this in #5998 May 23, 2014
@vishh vishh added a commit to vishh/docker that referenced this issue May 28, 2014
@vishh Michael Crosby + vishh Add wait flag to iptables
Fixes #1573
Docker-DCO-1.1-Signed-off-by: Michael Crosby <michael@crosbymichael.com> (github: crosbymichael)
daf3127