Tests TestDockerNetworkIPAMOptions and TestDockerNetworkCustomIPAM are flaky #42216

Open
Snorch opened this issue Mar 29, 2021 · 1 comment

Snorch commented Mar 29, 2021

Description

Tests TestDockerNetworkIPAMOptions and TestDockerNetworkCustomIPAM are flaky

In our runs of the Docker integration tests (Docker running in Virtuozzo system containers), the test suite can hang on these test cases until it times out (after a couple of hours).

Steps to reproduce the issue:

It reproduces occasionally in our normal test runs, so in principle it reproduces with just:

make test-integration-cli

To reproduce it reliably (100% of the time), do the following:

# Prepare docker development container (checkout v20.10.5)

make BIND_DIR=. shell

# Prepare dockerd inside

hack/make.sh binary install-binary
dockerd -D &>daemon.log &

# Create bridge with addr 172.28.0.1/16

ip link add name my_bridge type bridge
ip addr add dev my_bridge 172.28.0.1/16
ip link set my_bridge up

# Run test

TESTFLAGS='-test.run DockerNetworkSuite/TestDockerNetworkIPAMOptions' hack/make.sh test-integration-cli

Describe the results you received:

The test TestDockerNetworkIPAMOptions hangs forever on docker network create --ipam-driver dummy-ipam-driver ...

Describe the results you expected:

No hang.

Additional information you deem important (e.g. issue happens only occasionally):

This happens because the Docker daemon busy-loops in requestPoolHelper: it requests a pool, gets pool "172.28.0.0/16" from dummyIPAMDriver, and netutils.FindAvailableNetwork then detects that the pool overlaps with an existing device address, so the request is retried, apparently with the same result each time.
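To make the overlap condition concrete, here is a minimal Go sketch of a CIDR overlap test. It is not the libnetwork code; the function name cidrsOverlap is hypothetical, and it only assumes that two CIDR blocks overlap when one contains the other's base address.

```go
package main

import (
	"fmt"
	"net"
)

// cidrsOverlap is a hypothetical helper (not part of libnetwork).
// It reports whether two CIDR ranges share any address: two CIDR blocks
// overlap exactly when one of them contains the other's base address.
func cidrsOverlap(a, b string) (bool, error) {
	_, netA, err := net.ParseCIDR(a)
	if err != nil {
		return false, err
	}
	_, netB, err := net.ParseCIDR(b)
	if err != nil {
		return false, err
	}
	return netA.Contains(netB.IP) || netB.Contains(netA.IP), nil
}

func main() {
	// Pool returned by the dummy IPAM driver vs. the bridge address
	// configured in the reproduction steps.
	overlap, err := cidrsOverlap("172.28.0.0/16", "172.28.0.1/16")
	if err != nil {
		panic(err)
	}
	fmt.Println("overlap:", overlap)
}
```

With the addresses from this report the check reports an overlap, which is the condition that sends the daemon back to request a pool again.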

In the actual reproduction (with make test-integration-cli), the address is held by a bridge left over from the swarm test suite:

72: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:0d:75:2a:14 brd ff:ff:ff:ff:ff:ff
    inet 172.28.0.1/16 brd 172.28.255.255 scope global docker_gwbridge
       valid_lft forever preferred_lft forever

With dlv I can see stacks like:

goroutine 102121 [select (scan), 1 minutes]:
net/http.(*persistConn).roundTrip(0xc000f6a6c0, 0xc1185fe930, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/transport.go:2431 +0x8de
net/http.(*Transport).roundTrip(0xc001098000, 0xc1185de200, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/transport.go:535 +0x142d
net/http.(*Transport).RoundTrip(0xc001098000, 0xc1185de200, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/roundtrip.go:17 +0x60
net/http.send(0xc1185de200, 0x7ff748209ec0, 0xc001267110, 0x0, 0x0, 0x0, 0x0, 0x560de093a660, 0x0, 0x0)
	/usr/local/go/src/net/http/client.go:250 +0x4e3
net/http.(*Client).send(0xc001267140, 0xc1185de200, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/client.go:174 +0x1f2
net/http.(*Client).do(0xc001267140, 0xc1185de200, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/client.go:641 +0xfcd
net/http.(*Client).Do(0xc001267140, 0xc1185de200, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/http/client.go:509 +0x60
github.com/docker/docker/pkg/plugins.(*Client).callWithRetry(0xc0011bc200, 0xc11845f680, 0x16, 0x560de0981380, 0xc1185fe870, 0x1, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/pkg/plugins/client.go:172 +0x3c0
github.com/docker/docker/pkg/plugins.(*Client).CallWithOptions(0xc0011bc200, 0xc11845f680, 0x16, 0x560de03691c0, 0xc118604640, 0x560de0568480, 0xc118604680, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/pkg/plugins/client.go:113 +0x223
github.com/docker/docker/pkg/plugins.(*Client).Call(0xc0011bc200, 0xc11845f680, 0x16, 0x560de03691c0, 0xc118604640, 0x560de0568480, 0xc118604680, 0x0, 0x0)
	/go/src/github.com/docker/docker/pkg/plugins/client.go:102 +0x9e
github.com/docker/docker/vendor/github.com/docker/libnetwork/ipams/remote.(*allocator).call(0xc0011bc400, 0x560ddf270590, 0xb, 0x560de03691c0, 0xc118604640, 0x560de09b2a60, 0xc118604680, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/ipams/remote/remote.go:92 +0x12b
github.com/docker/docker/vendor/github.com/docker/libnetwork/ipams/remote.(*allocator).RequestPool(0xc0011bc400, 0xc0010739e8, 0x7, 0x0, 0x0, 0x0, 0x0, 0xc001b42ba0, 0xc001b43300, 0x0, ...)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/ipams/remote/remote.go:123 +0x205
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*network).requestPoolHelper(0xc000759880, 0x560de09fca60, 0xc0011bc400, 0xc0010739e8, 0x7, 0x0, 0x0, 0x0, 0x0, 0xc001b42ba0, ...)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/network.go:1563 +0x117
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*network).ipamAllocateVersion(0xc000759880, 0x4, 0x560de09fca60, 0xc0011bc400, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/network.go:1636 +0x774
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*network).ipamAllocate(0xc000759880, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/network.go:1542 +0x2f6
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*controller).NewNetwork(0xc00082a100, 0xc000e49a00, 0x6, 0xc000e49a06, 0x3, 0xc000e91680, 0x40, 0xc001352d20, 0x7, 0xc, ...)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/controller.go:818 +0x1ab2
github.com/docker/docker/daemon.(*Daemon).createNetwork(0xc00000c1e0, 0x1, 0xc000e49a00, 0x6, 0x0, 0x0, 0x0, 0xc001b42b70, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/daemon/network.go:365 +0x105e
github.com/docker/docker/daemon.(*Daemon).CreateNetwork(0xc00000c1e0, 0x1, 0xc000e49a00, 0x6, 0x0, 0x0, 0x0, 0xc001b42b70, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/daemon/network.go:286 +0xad
github.com/docker/docker/api/server/router/network.(*networkRouter).postNetworkCreate(0xc000c385c0, 0x560de09e0760, 0xc001b42a80, 0x560de09d21a0, 0xc000ba8380, 0xc00134ad00, 0xc001b42570, 0x0, 0x0)
	/go/src/github.com/docker/docker/api/server/router/network/network_routes.go:229 +0x540
github.com/docker/docker/api/server/middleware.ExperimentalMiddleware.WrapHandler.func1(0x560de09e0760, 0xc001b42a80, 0x560de09d21a0, 0xc000ba8380, 0xc00134ad00, 0xc001b42570, 0x0, 0x0)
	/go/src/github.com/docker/docker/api/server/middleware/experimental.go:26 +0xf8
github.com/docker/docker/api/server/middleware.VersionMiddleware.WrapHandler.func1(0x560de09e0760, 0xc001b42a80, 0x560de09d21a0, 0xc000ba8380, 0xc00134ad00, 0xc001b42570, 0x0, 0x0)
	/go/src/github.com/docker/docker/api/server/middleware/version.go:62 +0x5c0
github.com/docker/docker/pkg/authorization.(*Middleware).WrapHandler.func1(0x560de09e0760, 0xc001b426c0, 0x560de09d21a0, 0xc000ba8380, 0xc00134ad00, 0xc001b42570, 0x0, 0x0)
	/go/src/github.com/docker/docker/pkg/authorization/middleware.go:59 +0xec
github.com/docker/docker/api/server/middleware.DebugRequestMiddleware.func1(0x560de09e0760, 0xc001b426c0, 0x560de09d21a0, 0xc000ba8380, 0xc00134ad00, 0xc001b42570, 0x0, 0x0)
	/go/src/github.com/docker/docker/api/server/middleware/debug.go:53 +0x7a6
github.com/docker/docker/api/server.(*Server).makeHTTPHandler.func1(0x560de09d21a0, 0xc000ba8380, 0xc00134ad00)
	/go/src/github.com/docker/docker/api/server/server.go:141 +0x218
net/http.HandlerFunc.ServeHTTP(0xc000f017a0, 0x560de09d21a0, 0xc000ba8380, 0xc00134ac00)
	/usr/local/go/src/net/http/server.go:2036 +0x46
github.com/docker/docker/vendor/github.com/gorilla/mux.(*Router).ServeHTTP(0xc000490780, 0x560de09d21a0, 0xc000ba8380, 0xc00134ac00)
	/go/src/github.com/docker/docker/vendor/github.com/gorilla/mux/mux.go:210 +0x1e8
net/http.serverHandler.ServeHTTP(0xc00085e0e0, 0x560de09d21a0, 0xc000ba8380, 0xc00134aa00)
	/usr/local/go/src/net/http/server.go:2831 +0x211
net/http.(*conn).serve(0xc00196afa0, 0x560de09e06a0, 0xc00168d140)
	/usr/local/go/src/net/http/server.go:1919 +0x173c
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:2957 +0x933

and

syscall.ParseNetlinkMessage(0xc2c1953000, 0xf34, 0xf34, 0xc2c1953000, 0x0, 0xf34, 0x556e517ed4e0, 0xc2c0ecf1e0)
	/usr/local/go/src/syscall/netlink_linux.go:124 +0x1e6
github.com/docker/docker/vendor/github.com/vishvananda/netlink/nl.(*NetlinkSocket).Receive(0xc0000538f0, 0xc000004d43, 0x0, 0x0, 0x556e4ebbf601, 0x556e511d7e00, 0xc00094f500)
	/go/src/github.com/docker/docker/vendor/github.com/vishvananda/netlink/nl/nl_linux.go:645 +0x162
github.com/docker/docker/vendor/github.com/vishvananda/netlink/nl.(*NetlinkRequest).Execute(0xc001014218, 0x0, 0x18, 0x0, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/vishvananda/netlink/nl/nl_linux.go:425 +0x1ee
github.com/docker/docker/vendor/github.com/vishvananda/netlink.(*Handle).RouteListFiltered(0xc000b981d0, 0x2, 0x0, 0x40, 0xc1c5d02000, 0xc001014450, 0x556e4ec4c896, 0xc1c5d02000, 0x249249249249)
	/go/src/github.com/docker/docker/vendor/github.com/vishvananda/netlink/route_linux.go:717 +0x15a
github.com/docker/docker/vendor/github.com/vishvananda/netlink.(*Handle).RouteList(0xc000b981d0, 0x0, 0x0, 0x2, 0x2030b0, 0x7f0933d1f31f, 0x556e4ea6d101, 0x2030b0, 0x2030b0)
	/go/src/github.com/docker/docker/vendor/github.com/vishvananda/netlink/route_linux.go:701 +0x6f
github.com/docker/docker/vendor/github.com/docker/libnetwork/netutils.CheckRouteOverlaps(0xc2c195e090, 0x2, 0x2)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/netutils/utils_linux.go:29 +0x5e
github.com/docker/docker/vendor/github.com/docker/libnetwork/netutils.FindAvailableNetwork(0xc001014840, 0x1, 0x1, 0x0, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/netutils/utils_linux.go:122 +0xb4
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*network).requestPoolHelper(0xc000051500, 0x556e5187a960, 0xc0014226a0, 0xc000d122b8, 0x7, 0x0, 0x0, 0x0, 0x0, 0xc001023770, ...)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/network.go:1575 +0x1a9
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*network).ipamAllocateVersion(0xc000051500, 0x4, 0x556e5187a960, 0xc0014226a0, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/network.go:1636 +0x420
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*network).ipamAllocate(0xc000051500, 0x0, 0x0)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/network.go:1542 +0xfd
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*controller).NewNetwork(0xc000876100, 0xc000d6b7d8, 0x6, 0xc000d6b800, 0x3, 0xc007b60bc0, 0x40, 0xc0012538c0, 0x7, 0xc, ...)
	/go/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/controller.go:818 +0x9e9
github.com/docker/docker/daemon.(*Daemon).createNetwork(0xc00000c1e0, 0x1, 0xc000d6b7d8, 0x6, 0x0, 0x0, 0x0, 0xc001023740, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/daemon/network.go:365 +0x4c6
github.com/docker/docker/daemon.(*Daemon).CreateNetwork(0xc00000c1e0, 0x1, 0xc000d6b7d8, 0x6, 0x0, 0x0, 0x0, 0xc001023740, 0x0, 0x0, ...)
	/go/src/github.com/docker/docker/daemon/network.go:286 +0x7f
github.com/docker/docker/api/server/router/network.(*networkRouter).postNetworkCreate(0xc00009b1c0, 0x556e51852560, 0xc001023620, 0x556e518424e0, 0xc001555b20, 0xc000f3cc00, 0xc0010231a0, 0xc000074701, 0xc000a3de40)
	/go/src/github.com/docker/docker/api/server/router/network/network_routes.go:229 +0x34f
github.com/docker/docker/api/server/middleware.ExperimentalMiddleware.WrapHandler.func1(0x556e51852560, 0xc001023620, 0x556e518424e0, 0xc001555b20, 0xc000f3cc00, 0xc0010231a0, 0x556e51852560, 0xc001023620)
	/go/src/github.com/docker/docker/api/server/middleware/experimental.go:26 +0x177
github.com/docker/docker/api/server/middleware.VersionMiddleware.WrapHandler.func1(0x556e51852560, 0xc001023260, 0x556e518424e0, 0xc001555b20, 0xc000f3cc00, 0xc0010231a0, 0x0, 0x0)
	/go/src/github.com/docker/docker/api/server/middleware/version.go:62 +0x5fb
github.com/docker/docker/pkg/authorization.(*Middleware).WrapHandler.func1(0x556e51852560, 0xc001023260, 0x556e518424e0, 0xc001555b20, 0xc000f3cc00, 0xc0010231a0, 0x1, 0x556e51446a01)
	/go/src/github.com/docker/docker/pkg/authorization/middleware.go:59 +0x826
github.com/docker/docker/api/server/middleware.DebugRequestMiddleware.func1(0x556e51852560, 0xc001023260, 0x556e518424e0, 0xc001555b20, 0xc000f3cc00, 0xc0010231a0, 0x556e51852560, 0xc001023260)
	/go/src/github.com/docker/docker/api/server/middleware/debug.go:53 +0x4a0
github.com/docker/docker/api/server.(*Server).makeHTTPHandler.func1(0x556e518424e0, 0xc001555b20, 0xc000f3cb00)
	/go/src/github.com/docker/docker/api/server/server.go:141 +0x241
net/http.HandlerFunc.ServeHTTP(0xc000e109c0, 0x556e518424e0, 0xc001555b20, 0xc000f3cb00)
	/usr/local/go/src/net/http/server.go:2036 +0x46
github.com/docker/docker/vendor/github.com/gorilla/mux.(*Router).ServeHTTP(0xc000be8cc0, 0x556e518424e0, 0xc001555b20, 0xc000f3c900)
	/go/src/github.com/docker/docker/vendor/github.com/gorilla/mux/mux.go:210 +0xe4
net/http.serverHandler.ServeHTTP(0xc0001be2a0, 0x556e518424e0, 0xc001555b20, 0xc000f3c900)
	/usr/local/go/src/net/http/server.go:2831 +0xa6
net/http.(*conn).serve(0xc000e18280, 0x556e518524a0, 0xc001176cc0)
	/usr/local/go/src/net/http/server.go:1919 +0x877
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:2957 +0x386

I've seen an old issue, moby/libnetwork#1386, about this, but it was simply closed. Perhaps we should not retry forever and should have a timeout for the request-pool operation, or at least fix the "dummy" driver to use a different IP range when its pool intersects with something?
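As a rough illustration of the "do not retry forever" idea, here is a Go sketch of a bounded retry loop. It is not a patch for libnetwork: requestPoolBounded, the requestPool type, and errNoUsablePool are hypothetical names, and the sketch leaves out details such as releasing a rejected pool before asking again.

```go
package main

import (
	"errors"
	"fmt"
)

// requestPool stands in for a call to an IPAM driver (hypothetical type).
type requestPool func() (string, error)

// errNoUsablePool is a hypothetical error returned when every attempt
// produced a pool that is already in use on the host.
var errNoUsablePool = errors.New("no non-overlapping pool after retries")

// requestPoolBounded asks the driver for a pool at most maxAttempts times
// and gives up with an error instead of looping forever.
func requestPoolBounded(request requestPool, inUse func(pool string) bool, maxAttempts int) (string, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		pool, err := request()
		if err != nil {
			return "", err
		}
		if !inUse(pool) {
			return pool, nil
		}
	}
	return "", errNoUsablePool
}

func main() {
	calls := 0
	// A driver that always returns the same pool, like the dummy test driver.
	dummy := func() (string, error) {
		calls++
		return "172.28.0.0/16", nil
	}
	// Pretend a host interface already holds an address in that pool.
	busy := func(pool string) bool { return pool == "172.28.0.0/16" }

	_, err := requestPoolBounded(dummy, busy, 3)
	fmt.Println(calls, err)
}
```

With a cap like this, docker network create would fail with an error after a fixed number of attempts rather than hang; a deadline based on elapsed time would serve the same purpose.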

Output of docker version:

# docker --version
Docker version 20.10.5, build 55c4c88

Output of docker info:

[root@legasy-94-237 ~]# docker info     
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)

Server:
 Containers: 16
  Running: 1
  Paused: 0
  Stopped: 15
 Images: 13
 Server Version: 20.10.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 4GiB
 Name: legasy-94-237.sw.ru
 ID: SSFE:64TJ:5EQY:6362:QOUG:BM3K:B4X3:NFOQ:X4C6:SPCU:QLQL:C7PV
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Additional environment details (AWS, VirtualBox, physical, etc.):

Please don't mind the kernel version: the Virtuozzo kernel is RHEL7-based plus many container-related patches, and we also run the Docker integration tests on it, as you can see =). The synthetic reproducer should work everywhere.

Snorch commented May 18, 2021

ping.
