Bad test: CRI could not add IP address to "cni0": file exists #3507

Closed
crosbymichael opened this issue Aug 8, 2019 · 5 comments

@crosbymichael
Member

 Failure in Spec Setup (BeforeEach) [12.061 seconds]
[k8s.io] Container
/home/travis/gopath/src/github.com/kubernetes-sigs/cri-tools/pkg/framework/framework.go:72
  runtime should support basic operations on container
  /home/travis/gopath/src/github.com/kubernetes-sigs/cri-tools/pkg/validate/container.go:68
    runtime should support execSync with timeout [Conformance] [BeforeEach]
    /home/travis/gopath/src/github.com/kubernetes-sigs/cri-tools/pkg/validate/container.go:132
    failed to create PodSandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "aa2ebdf7cea1621c1c3bad8e96b68f8de1d772ab143bb63e4a8e5f2e7ad2e721": failed to set bridge addr: could not add IP address to "cni0": file exists
    Unexpected error:
        <*status.statusError | 0xc0004280a0>: {
            Code: 2,
            Message: "failed to setup network for sandbox \"aa2ebdf7cea1621c1c3bad8e96b68f8de1d772ab143bb63e4a8e5f2e7ad2e721\": failed to set bridge addr: could not add IP address to \"cni0\": file exists",
            Details: nil,
            XXX_NoUnkeyedLiteral: {},
            XXX_unrecognized: nil,
            XXX_sizecache: 0,
        }
        rpc error: code = Unknown desc = failed to setup network for sandbox "aa2ebdf7cea1621c1c3bad8e96b68f8de1d772ab143bb63e4a8e5f2e7ad2e721": failed to set bridge addr: could not add IP address to "cni0": file exists
    occurred
    /home/travis/gopath/src/github.com/kubernetes-sigs/cri-tools/pkg/framework/util.go:214
@Random-Liu
Member

Random-Liu commented Aug 9, 2019

Talked with @cadmuxe from our networking team.

We believe this is a bug in the CNI bridge plugin.

When the test starts up, multiple pods are started concurrently and containerd sets up their networks with CNI.
Since the cni0 interface doesn't exist yet, several goroutines try to create the bridge at the same time.

The bridge creation seems to be thread safe, because the syscall.EEXIST error is handled: https://github.com/containernetworking/plugins/blob/v0.7.5/plugins/main/bridge/bridge.go#L222.

However, that is not the case for bridge address assignment. Although the function first goes through the bridge's existing addresses, it is still possible for 2 goroutines to both pass the check and then both try to assign the IP.
https://github.com/containernetworking/plugins/blob/v0.7.5/plugins/main/bridge/bridge.go#L143

Solutions:

  • Option 1: Fix the CNI plugin;
  • Option 2: Make sure the first pod's network setup runs serially and blocks all other network setups.

I prefer option 1, because:

  1. This is not a big issue for Kubernetes in production, because kubelet will usually retry creating the pod;
  2. Option 2 feels dangerous to me: if the first pod's network setup deadlocks because of a transient issue, every other network setup is blocked behind it.

@Random-Liu
Member

containernetworking/plugins#366 should fix this, if they accept that patch.

@crosbymichael
Member Author

Will resolving this issue require a plugin bump in CRI?

@Random-Liu
Member

@crosbymichael Yeah. containerd/cri#1237

@Random-Liu
Member

Fixed by "Update cni plugins to v0.7.6".
