
Kernel panic (3.16) on debian jessie when running docker containers with healthchecks #30402

Closed
ko-christ opened this issue Jan 24, 2017 · 18 comments


@ko-christ


BUG REPORT INFORMATION

Description

Kernel panic (3.16) on debian jessie when running docker containers with healthchecks.

Steps to reproduce the issue:

  1. Install the latest debian jessie (8.7), run apt update, and install docker following the official instructions.
  2. Pull the solr:alpine image and build a new one with healthchecks using a Dockerfile.
  3. Deploy 20 - 25 containers.
  4. The server crashes with a kernel panic within about 1h.

Describe the results you received:
kernel panic, server crashes.

Describe the results you expected:
no crash

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

$ docker version
Client:
 Version:      1.13.0
 API version:  1.25
 Go version:   go1.7.3
 Git commit:   49bf474
 Built:        Tue Jan 17 09:44:08 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.0
 API version:  1.25 (minimum version 1.12)
 Go version:   go1.7.3
 Git commit:   49bf474
 Built:        Tue Jan 17 09:44:08 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

$ docker info
Containers: 25
 Running: 3
 Paused: 0
 Stopped: 22
Images: 5
Server Version: 1.13.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 94
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 03e5862ec0d8d3b3f750e19fca3ee367e13c090e
runc version: 2f7393a47307a16f8cee44a37b262e8b81021e3e
init version: 949e6fa
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 5
Total Memory: 6.402 GiB
Name: deb00
ID: RE6C:VVHI:KH5X:ANQK:NCAC:APCP:JD47:OUBG:C4LZ:MGUR:AMPD:7FIH
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 33
 Goroutines: 34
 System Time: 2017-01-24T04:56:35.547683852-05:00
 EventsListeners: 0
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
Experimental: false
Insecure Registries:
 my-registry:41238
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

I was able to reproduce with both docker versions 1.12.5 and 1.13.0, on physical servers and VirtualBox VMs, all running jessie with the 3.16 kernel. Kernel panic screenshots are available.
I was able to reproduce with 1m healthcheck intervals as well.

20170124_kp_healthchecks_9

$ uname -a
Linux deb00 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1 (2016-12-30) x86_64 GNU/Linux
$ cat /etc/debian_version 
8.7

CPU/RAM

$ free -m
             total       used       free     shared    buffers     cached
Mem:          6555       5045       1509         10         18        390
-/+ buffers/cache:       4636       1919
Swap:            0          0          0
$ nproc
5

systemd config (running in debug mode with the private registry enabled)

$ cat /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target docker.socket firewalld.service
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -D -H fd:// --insecure-registry=my-registry:41238
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target

Dockerfile and image build

$ cat df/Dockerfile
FROM solr:alpine
USER root
RUN apk --no-cache add curl
USER $SOLR_USER
HEALTHCHECK --interval=10s --timeout=30s --retries=3 \
  CMD curl -sb -H "Accept: application/json" "http://localhost:8983/solr/" | grep "solr" || exit 1


$ cd df/
$ docker build -t solr:alpine_foo .
$ cd ~
$ docker images | grep alpine_foo
solr                                        alpine_foo          de916b07eb1f        3 minutes ago       286 MB

deployment of 25 containers

$ for i in `seq 1 25`; do docker run --name=solr$i -d -P solr:alpine_foo solr-create -c mycore; done
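To watch the health state of the deployed containers without scanning the full `docker ps` table, each container's status can be read directly (a sketch assuming the solr1..solr25 names used above):

```shell
# Print each container's current health status via the inspect template.
for i in $(seq 1 25); do
  docker inspect --format '{{.Name}}: {{.State.Health.Status}}' "solr$i"
done
```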

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                             PORTS                     NAMES
0e2d4127b121        solr:alpine_foo     "docker-entrypoint..."   2 seconds ago       Up 1 second (health: starting)     0.0.0.0:32792->8983/tcp   solr25
d6e204e16e1e        solr:alpine_foo     "docker-entrypoint..."   5 seconds ago       Up 4 seconds (health: starting)    0.0.0.0:32791->8983/tcp   solr24
f27d85c6d85e        solr:alpine_foo     "docker-entrypoint..."   7 seconds ago       Up 6 seconds (health: starting)    0.0.0.0:32790->8983/tcp   solr23
edb99c47dce4        solr:alpine_foo     "docker-entrypoint..."   10 seconds ago      Up 10 seconds (health: starting)   0.0.0.0:32789->8983/tcp   solr22
6afe1b850284        solr:alpine_foo     "docker-entrypoint..."   23 seconds ago      Up 22 seconds (healthy)            0.0.0.0:32788->8983/tcp   solr21
c62ad83e267e        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32787->8983/tcp   solr20
785a185a0fcc        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32786->8983/tcp   solr19
6d5f0613b87a        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32785->8983/tcp   solr18
8956d70c7eba        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32784->8983/tcp   solr17
02a98144aa09        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32783->8983/tcp   solr16
16b5de44ba96        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32782->8983/tcp   solr15
d65215e558a5        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32781->8983/tcp   solr14
e625c1371df2        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32780->8983/tcp   solr13
73372f71b447        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32779->8983/tcp   solr12
f5972ccf1e91        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32778->8983/tcp   solr11
4ecc7b1f77dd        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32777->8983/tcp   solr10
d574d528446b        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32776->8983/tcp   solr9
54042f7bbb25        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32775->8983/tcp   solr8
c91e7be83158        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32774->8983/tcp   solr7
5c1dc6ef2984        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32773->8983/tcp   solr6
ba01f507100b        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32772->8983/tcp   solr5
df9dd6f20d85        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32771->8983/tcp   solr4
c6efe430c12c        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32770->8983/tcp   solr3
466711c8bb38        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32769->8983/tcp   solr2
9a72d27349b9        solr:alpine_foo     "docker-entrypoint..."   51 minutes ago      Up 51 minutes (healthy)            0.0.0.0:32768->8983/tcp   solr1

kern.log excerpt

root@deb00:~# tail -f /var/log/kern.log 
Jan 24 05:14:13 deb00 kernel: [61066.239971] docker0: port 3(veth1f50700) entered disabled state
Jan 24 05:14:13 deb00 kernel: [61066.295894] docker0: port 3(veth1f50700) entered disabled state
Jan 24 05:14:13 deb00 kernel: [61066.296537] device veth1f50700 left promiscuous mode
Jan 24 05:14:13 deb00 kernel: [61066.296543] docker0: port 3(veth1f50700) entered disabled state
Jan 24 05:14:13 deb00 kernel: [61066.348712] docker0: port 1(veth7983bb8) entered disabled state
Jan 24 05:14:13 deb00 kernel: [61066.349502] device veth7983bb8 left promiscuous mode
Jan 24 05:14:13 deb00 kernel: [61066.349509] docker0: port 1(veth7983bb8) entered disabled state
Jan 24 05:14:13 deb00 kernel: [61066.451630] docker0: port 2(vethed02b3b) entered disabled state
Jan 24 05:14:13 deb00 kernel: [61066.452480] device vethed02b3b left promiscuous mode
Jan 24 05:14:13 deb00 kernel: [61066.452486] docker0: port 2(vethed02b3b) entered disabled state
 
Jan 24 06:05:06 deb00 kernel: [64119.060363] aufs au_opts_verify:1570:dockerd[7742]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:06 deb00 kernel: [64119.106266] aufs au_opts_verify:1570:dockerd[7742]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:06 deb00 kernel: [64119.138685] aufs au_opts_verify:1570:dockerd[5590]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:06 deb00 kernel: [64119.155874] device vethb111342 entered promiscuous mode
Jan 24 06:05:06 deb00 kernel: [64119.155903] IPv6: ADDRCONF(NETDEV_UP): vethb111342: link is not ready
Jan 24 06:05:06 deb00 kernel: [64119.155905] docker0: port 1(vethb111342) entered forwarding state
Jan 24 06:05:06 deb00 kernel: [64119.155907] docker0: port 1(vethb111342) entered forwarding state
Jan 24 06:05:06 deb00 kernel: [64119.156296] docker0: port 1(vethb111342) entered disabled state
Jan 24 06:05:06 deb00 kernel: [64119.915571] IPv6: ADDRCONF(NETDEV_CHANGE): vethb111342: link becomes ready
Jan 24 06:05:06 deb00 kernel: [64119.915591] docker0: port 1(vethb111342) entered forwarding state
Jan 24 06:05:06 deb00 kernel: [64119.915597] docker0: port 1(vethb111342) entered forwarding state
Jan 24 06:05:19 deb00 kernel: [64132.101267] aufs au_opts_verify:1570:dockerd[22281]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:19 deb00 kernel: [64132.132114] aufs au_opts_verify:1570:dockerd[22281]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:19 deb00 kernel: [64132.186436] aufs au_opts_verify:1570:dockerd[13684]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:19 deb00 kernel: [64132.187419] device vethc70fcda entered promiscuous mode
Jan 24 06:05:19 deb00 kernel: [64132.187448] IPv6: ADDRCONF(NETDEV_UP): vethc70fcda: link is not ready
Jan 24 06:05:19 deb00 kernel: [64132.187450] docker0: port 2(vethc70fcda) entered forwarding state
Jan 24 06:05:19 deb00 kernel: [64132.187453] docker0: port 2(vethc70fcda) entered forwarding state
Jan 24 06:05:19 deb00 kernel: [64132.188356] docker0: port 2(vethc70fcda) entered disabled state
Jan 24 06:05:19 deb00 kernel: [64132.415258] IPv6: ADDRCONF(NETDEV_CHANGE): vethc70fcda: link becomes ready
Jan 24 06:05:19 deb00 kernel: [64132.415282] docker0: port 2(vethc70fcda) entered forwarding state
Jan 24 06:05:19 deb00 kernel: [64132.415289] docker0: port 2(vethc70fcda) entered forwarding state
Jan 24 06:05:21 deb00 kernel: [64134.929534] docker0: port 1(vethb111342) entered forwarding state
Jan 24 06:05:22 deb00 kernel: [64135.107574] aufs au_opts_verify:1570:dockerd[13684]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:22 deb00 kernel: [64135.154440] aufs au_opts_verify:1570:dockerd[13684]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:22 deb00 kernel: [64135.214953] aufs au_opts_verify:1570:dockerd[22281]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:22 deb00 kernel: [64135.227971] device vethec89273 entered promiscuous mode
Jan 24 06:05:22 deb00 kernel: [64135.228005] IPv6: ADDRCONF(NETDEV_UP): vethec89273: link is not ready
Jan 24 06:05:22 deb00 kernel: [64135.228007] docker0: port 3(vethec89273) entered forwarding state
Jan 24 06:05:22 deb00 kernel: [64135.228011] docker0: port 3(vethec89273) entered forwarding state
Jan 24 06:05:22 deb00 kernel: [64135.228691] docker0: port 3(vethec89273) entered disabled state
Jan 24 06:05:22 deb00 kernel: [64135.439325] IPv6: ADDRCONF(NETDEV_CHANGE): vethec89273: link becomes ready
Jan 24 06:05:22 deb00 kernel: [64135.439350] docker0: port 3(vethec89273) entered forwarding state
Jan 24 06:05:22 deb00 kernel: [64135.439356] docker0: port 3(vethec89273) entered forwarding state
Jan 24 06:05:24 deb00 kernel: [64137.875979] aufs au_opts_verify:1570:dockerd[798]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:24 deb00 kernel: [64137.904122] aufs au_opts_verify:1570:dockerd[798]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:24 deb00 kernel: [64137.943760] aufs au_opts_verify:1570:dockerd[798]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:24 deb00 kernel: [64137.944868] device veth0e73444 entered promiscuous mode
Jan 24 06:05:24 deb00 kernel: [64137.944907] IPv6: ADDRCONF(NETDEV_UP): veth0e73444: link is not ready
Jan 24 06:05:24 deb00 kernel: [64137.944910] docker0: port 24(veth0e73444) entered forwarding state
Jan 24 06:05:24 deb00 kernel: [64137.944914] docker0: port 24(veth0e73444) entered forwarding state
Jan 24 06:05:24 deb00 kernel: [64137.945418] docker0: port 24(veth0e73444) entered disabled state
Jan 24 06:05:25 deb00 kernel: [64138.124250] IPv6: ADDRCONF(NETDEV_CHANGE): veth0e73444: link becomes ready
Jan 24 06:05:25 deb00 kernel: [64138.124273] docker0: port 24(veth0e73444) entered forwarding state
Jan 24 06:05:25 deb00 kernel: [64138.124279] docker0: port 24(veth0e73444) entered forwarding state
Jan 24 06:05:27 deb00 kernel: [64140.469756] aufs au_opts_verify:1570:dockerd[31573]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:27 deb00 kernel: [64140.528564] aufs au_opts_verify:1570:dockerd[31573]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:27 deb00 kernel: [64140.580007] aufs au_opts_verify:1570:dockerd[14822]: dirperm1 breaks the protection by the permission bits on the lower branch
Jan 24 06:05:27 deb00 kernel: [64140.580961] device veth4e45fad entered promiscuous mode
Jan 24 06:05:27 deb00 kernel: [64140.581001] IPv6: ADDRCONF(NETDEV_UP): veth4e45fad: link is not ready
Jan 24 06:05:27 deb00 kernel: [64140.581004] docker0: port 25(veth4e45fad) entered forwarding state
Jan 24 06:05:27 deb00 kernel: [64140.581008] docker0: port 25(veth4e45fad) entered forwarding state
Jan 24 06:05:27 deb00 kernel: [64140.583610] docker0: port 25(veth4e45fad) entered disabled state
Jan 24 06:05:27 deb00 kernel: [64140.799238] IPv6: ADDRCONF(NETDEV_CHANGE): veth4e45fad: link becomes ready
Jan 24 06:05:27 deb00 kernel: [64140.799281] docker0: port 25(veth4e45fad) entered forwarding state
Jan 24 06:05:27 deb00 kernel: [64140.799288] docker0: port 25(veth4e45fad) entered forwarding state
Jan 24 06:05:34 deb00 kernel: [64147.470937] docker0: port 2(vethc70fcda) entered forwarding state
Jan 24 06:05:37 deb00 kernel: [64150.478787] docker0: port 3(vethec89273) entered forwarding state
Jan 24 06:05:40 deb00 kernel: [64153.166907] docker0: port 24(veth0e73444) entered forwarding state
Jan 24 06:05:42 deb00 kernel: [64155.854926] docker0: port 25(veth4e45fad) entered forwarding state

journalctl docker.service log

Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.687202920-05:00" level=debug msg="Health check for container f27d85c6d85ecb5d4b4da33d2f80db6a0c1e6e25ff47322a9b27010d8286874a done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.826251206-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"start-process\", Id:\"5c1dc6ef2984b779b2908e65227a7b08340316a8d885c46af3026ce2b33a60fd\", Status:0x0, Pid:\"75cdbe367c5d4a39bcc36abe55d2075bd0e53c584b7ade767b541ac63a91af78\", Timestamp:(*timestamp.Timestamp)(0xc42258b540)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.826306190-05:00" level=debug msg="libcontainerd: event unhandled: type:\"start-process\" id:\"5c1dc6ef2984b779b2908e65227a7b08340316a8d885c46af3026ce2b33a60fd\" pid:\"75cdbe367c5d4a39bcc36abe55d2075bd0e53c584b7ade767b541ac63a91af78\" timestamp:<seconds:1485256066 nanos:825781474 > "
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.923630832-05:00" level=debug msg="containerd: process exited" id=c6efe430c12cb26df844ce89684fdf635094cf1d7753f47ef144e5c80d7a3961 pid=3acabb3e02b3a94208f0ad82ded31e35cf493d900131060242f3aad275b5eb37 status=0 systemPid=10874
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.923768909-05:00" level=debug msg="containerd: process exited" id=466711c8bb38384e2ded29ee65c74745938518bc3d154a36ea68dc810743701f pid=1d3c900a7bf1fb131551eb42df4946fa9aa8f53f9fa56c575bd9cdd34a1c4b2a status=0 systemPid=10899
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.923870866-05:00" level=debug msg="containerd: process exited" id=e625c1371df27b6b584aa6a89246dd317ae7284b63ee7a0f13b221cec9ebdf0e pid=1dcdfc1504b96d067b1709ebb6f5fc7cd0ce20bd9beeb247bbde1c79df3b8d3b status=0 systemPid=10924
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.923955838-05:00" level=debug msg="containerd: process exited" id=6d5f0613b87a0a6b0d9287244d060d465c4e8fc58b1f8e5e939897dfc1a02331 pid=da793799a6f09b2c08746d5a9541150fcd46655bf92e016e0cf1c845d0097a26 status=0 systemPid=10952
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.924061267-05:00" level=debug msg="containerd: process exited" id=f5972ccf1e91e227bca28daaf306a4428a352ff20f22e37917bd478e475c0474 pid=1d4d89df42b2727d75e2806049aaff4511cd05aa9bed9164e2c3c8fb912c9965 status=0 systemPid=10983
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.924145329-05:00" level=debug msg="containerd: process exited" id=d574d528446b370208a1a7bb6feb9c68c3286cbb5e709b5f4bc8ec0c0687224e pid=302e71450682dd715058183d75e921366cac12ef73163a275b579d5210b3217b status=0 systemPid=11009
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.924232799-05:00" level=debug msg="containerd: process exited" id=54042f7bbb25861e2c185fb1f688df3e387f6b3267033b3c7009795e0518ab65 pid=57fd314c8c3ca9014470a5f2fad63218dcd81382953029175f63060edec5b429 status=0 systemPid=11035
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.924313674-05:00" level=debug msg="containerd: process exited" id=4ecc7b1f77dde35cdc3e4f2b66af227a08d7002de55ed969c1fe73a1771e5e13 pid=44d26c7e44b3ba2d35e4e4917e1580279af80b76d770de13c8b52bb36f3cc712 status=0 systemPid=11060
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.92439678-05:00" level=debug msg="containerd: process exited" id=5c1dc6ef2984b779b2908e65227a7b08340316a8d885c46af3026ce2b33a60fd pid=75cdbe367c5d4a39bcc36abe55d2075bd0e53c584b7ade767b541ac63a91af78 status=0 systemPid=11087
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925035538-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"start-process\", Id:\"02a98144aa096b066ec8208cf20e8b73f97a12af117f8e71b0f009c3eb156803\", Status:0x0, Pid:\"d93650074d3bc447a15035e8b88b325187fad5491941d82eaa5c21daf76f470d\", Timestamp:(*timestamp.Timestamp)(0xc42258b780)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925082085-05:00" level=debug msg="libcontainerd: event unhandled: type:\"start-process\" id:\"02a98144aa096b066ec8208cf20e8b73f97a12af117f8e71b0f009c3eb156803\" pid:\"d93650074d3bc447a15035e8b88b325187fad5491941d82eaa5c21daf76f470d\" timestamp:<seconds:1485256066 nanos:923592156 > "
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925134521-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"5c1dc6ef2984b779b2908e65227a7b08340316a8d885c46af3026ce2b33a60fd\", Status:0x0, Pid:\"75cdbe367c5d4a39bcc36abe55d2075bd0e53c584b7ade767b541ac63a91af78\", Timestamp:(*timestamp.Timestamp)(0xc42258b980)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925198780-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"c6efe430c12cb26df844ce89684fdf635094cf1d7753f47ef144e5c80d7a3961\", Status:0x0, Pid:\"3acabb3e02b3a94208f0ad82ded31e35cf493d900131060242f3aad275b5eb37\", Timestamp:(*timestamp.Timestamp)(0xc42258ba80)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925245670-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"466711c8bb38384e2ded29ee65c74745938518bc3d154a36ea68dc810743701f\", Status:0x0, Pid:\"1d3c900a7bf1fb131551eb42df4946fa9aa8f53f9fa56c575bd9cdd34a1c4b2a\", Timestamp:(*timestamp.Timestamp)(0xc42258bb80)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925285513-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"e625c1371df27b6b584aa6a89246dd317ae7284b63ee7a0f13b221cec9ebdf0e\", Status:0x0, Pid:\"1dcdfc1504b96d067b1709ebb6f5fc7cd0ce20bd9beeb247bbde1c79df3b8d3b\", Timestamp:(*timestamp.Timestamp)(0xc42258bc80)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925336801-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"6d5f0613b87a0a6b0d9287244d060d465c4e8fc58b1f8e5e939897dfc1a02331\", Status:0x0, Pid:\"da793799a6f09b2c08746d5a9541150fcd46655bf92e016e0cf1c845d0097a26\", Timestamp:(*timestamp.Timestamp)(0xc42258bd80)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925389067-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"f5972ccf1e91e227bca28daaf306a4428a352ff20f22e37917bd478e475c0474\", Status:0x0, Pid:\"1d4d89df42b2727d75e2806049aaff4511cd05aa9bed9164e2c3c8fb912c9965\", Timestamp:(*timestamp.Timestamp)(0xc42258be80)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925453559-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"d574d528446b370208a1a7bb6feb9c68c3286cbb5e709b5f4bc8ec0c0687224e\", Status:0x0, Pid:\"302e71450682dd715058183d75e921366cac12ef73163a275b579d5210b3217b\", Timestamp:(*timestamp.Timestamp)(0xc42258bf90)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925493496-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"54042f7bbb25861e2c185fb1f688df3e387f6b3267033b3c7009795e0518ab65\", Status:0x0, Pid:\"57fd314c8c3ca9014470a5f2fad63218dcd81382953029175f63060edec5b429\", Timestamp:(*timestamp.Timestamp)(0xc422530090)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925560423-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"4ecc7b1f77dde35cdc3e4f2b66af227a08d7002de55ed969c1fe73a1771e5e13\", Status:0x0, Pid:\"44d26c7e44b3ba2d35e4e4917e1580279af80b76d770de13c8b52bb36f3cc712\", Timestamp:(*timestamp.Timestamp)(0xc422530190)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925628581-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925651890-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925672400-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925688682-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925702981-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925714115-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925725705-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925741190-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925760157-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925767679-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925783010-05:00" level=debug msg="Health check for container 4ecc7b1f77dde35cdc3e4f2b66af227a08d7002de55ed969c1fe73a1771e5e13 done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925798701-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925813227-05:00" level=debug msg="Health check for container 5c1dc6ef2984b779b2908e65227a7b08340316a8d885c46af3026ce2b33a60fd done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925826147-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925840091-05:00" level=debug msg="Health check for container c6efe430c12cb26df844ce89684fdf635094cf1d7753f47ef144e5c80d7a3961 done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925856902-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925870755-05:00" level=debug msg="Health check for container 466711c8bb38384e2ded29ee65c74745938518bc3d154a36ea68dc810743701f done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925883904-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925895322-05:00" level=debug msg="Health check for container e625c1371df27b6b584aa6a89246dd317ae7284b63ee7a0f13b221cec9ebdf0e done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925904500-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925914345-05:00" level=debug msg="Health check for container 6d5f0613b87a0a6b0d9287244d060d465c4e8fc58b1f8e5e939897dfc1a02331 done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925923895-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925932839-05:00" level=debug msg="Health check for container f5972ccf1e91e227bca28daaf306a4428a352ff20f22e37917bd478e475c0474 done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925941144-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925951870-05:00" level=debug msg="Health check for container d574d528446b370208a1a7bb6feb9c68c3286cbb5e709b5f4bc8ec0c0687224e done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925964701-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.925976209-05:00" level=debug msg="Health check for container 54042f7bbb25861e2c185fb1f688df3e387f6b3267033b3c7009795e0518ab65 done (exitCode=0)"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.931357764-05:00" level=debug msg="containerd: process exited" id=02a98144aa096b066ec8208cf20e8b73f97a12af117f8e71b0f009c3eb156803 pid=d93650074d3bc447a15035e8b88b325187fad5491941d82eaa5c21daf76f470d status=0 systemPid=11113
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.931753083-05:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"02a98144aa096b066ec8208cf20e8b73f97a12af117f8e71b0f009c3eb156803\", Status:0x0, Pid:\"d93650074d3bc447a15035e8b88b325187fad5491941d82eaa5c21daf76f470d\", Timestamp:(*timestamp.Timestamp)(0xc42278a850)}"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.931882110-05:00" level=debug msg="attach: stderr: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.931892183-05:00" level=debug msg="attach: stdout: end"
Jan 24 06:07:46 deb00 dockerd[510]: time="2017-01-24T06:07:46.931907482-05:00" level=debug msg="Health check for container 02a98144aa096b066ec8208cf20e8b73f97a12af117f8e71b0f009c3eb156803 done (exitCode=0)"
Jan 24 06:07:49 deb00 dockerd[510]: time="2017-01-24T06:07:49.327408426-05:00" level=debug msg="Running health check for container 9a72d27349b98d703be01c0824cd79651b4ba264a296bf795b76aeb13b2a7819 ..."
Jan 24 06:07:49 deb00 dockerd[510]: time="2017-01-24T06:07:49.327477546-05:00" level=debug msg="starting exec command 839b336433a84b11c21306cf91b8e1106e93322fe760ab409790a0c34d016b13 in container 9a72d27349b98d703be01c0824cd79651b4ba264a296bf795b76aeb13b2a7819"
Jan 24 06:07:49 deb00 dockerd[510]: time="2017-01-24T06:07:49.327964123-05:00" level=debug msg="attach: stdout: begin"
Jan 24 06:07:49 deb00 dockerd[510]: time="2017-01-24T06:07:49.328020158-05:00" level=debug msg="attach: stderr: begin"
@cpuguy83
Member

Do you have the full kernel stack trace?

@ko-christ
Author

Not as plain text, sorry. I did manage to capture 7 screenshots before resetting the VM, in the following order:

20170124_kp_healthchecks_1
20170124_kp_healthchecks_2
20170124_kp_healthchecks_3
20170124_kp_healthchecks_4
20170124_kp_healthchecks_6
20170124_kp_healthchecks_7
20170124_kp_healthchecks_9

@justincormack
Contributor

Can you try running Debian unstable or testing instead? The kernel in stable is pretty old, and the new Debian release, due out in a few months, should be a lot better.

@ko-christ
Author

Sure, I could try to reproduce on 3.19 and 4+ kernel versions and share my results.

The problem with the 3.16 kernel, apart from the fact that it crashes with Docker and ships with the latest stable Debian release, is that it is still supported according to the official Docker installation guide, with no requirement to backport a newer kernel.

https://docs.docker.com/engine/installation/linux/debian/

You need at least version 3.10 of the Linux kernel. Debian Wheezy ships with version 3.2, so you may need to update the kernel.

@ko-christ
Author

I didn't manage to reproduce this on the following environments

  • ubuntu on 4.2 kernel running docker 1.12.6/aufs
  • a debian jessie on 4.8 kernel (backports) running docker 1.13.0/overlay2

It seems the kernel bug is exposed only on the 3.16 kernel. Changing various options such as MountFlags=slave or --network=host didn't help. My workaround for now is to drop healthchecks completely, but obviously the underlying problem is still there.
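For anyone needing the same workaround, an inherited healthcheck can be dropped without rebuilding the base image. A minimal sketch (the image name is only a placeholder):

```dockerfile
# Derived image that disables the HEALTHCHECK inherited from the base image.
# "myimage-with-healthcheck" is a placeholder for the image that defines it.
FROM myimage-with-healthcheck
HEALTHCHECK NONE
```

The same can also be done per container at run time with `docker run --no-healthcheck`.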

@mbentley
Contributor

I've had the same issues on a Debian Jessie box when using health checks:

[19932.895866] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[19932.895904] IP: [<ffffffff810a10d0>] check_preempt_wakeup+0xd0/0x1d0
[19932.895934] PGD 72efad067 PUD 54f96a067 PMD 0
[19932.895955] Oops: 0000 [#1] SMP
[19932.895970] Modules linked in: ipt_REJECT veth xt_nat xt_tcpudp ipt_MASQUERADE xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat nf_conntrack aufs(C) vhost_net vhost macvtap macvlan tun xt_multiport iptable_filter ip_tables x_tables binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc bridge stp llc snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp intel_rapl kvm_intel kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper iTCO_wdt iTCO_vendor_support cryptd mxm_wmi snd_hda_codec_realtek i915 snd_hda_codec_generic evdev snd_hda_intel snd_hda_controller serio_raw pcspkr snd_hda_codec snd_hwdep i2c_i801 drm_kms_helper snd_pcm drm snd_timer battery intel_smartconnect tpm_infineon snd tpm_tis
[19932.896319]  soundcore i2c_algo_bit wmi tpm video acpi_pad mei_me i2c_core button shpchp lpc_ich mei mfd_core processor coretemp loop fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 dm_mod sg sd_mod crc_t10dif crct10dif_generic hid_generic usbhid hid crct10dif_pclmul crct10dif_common crc32c_intel psmouse ahci libahci libata ehci_pci scsi_mod ehci_hcd e1000e ptp pps_core xhci_hcd usbcore thermal fan usb_common thermal_sys
[19932.896520] CPU: 1 PID: 17290 Comm: exe Tainted: G         C    3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u2
[19932.896556] Hardware name: MSI MS-7817/B85M ECO (MS-7817), BIOS V25.1 06/30/2014
[19932.896584] task: ffff880606c1d570 ti: ffff8805918c0000 task.ti: ffff8805918c0000
[19932.896612] RIP: 0010:[<ffffffff810a10d0>]  [<ffffffff810a10d0>] check_preempt_wakeup+0xd0/0x1d0
[19932.896647] RSP: 0018:ffff88081ea83e58  EFLAGS: 00010006
[19932.896667] RAX: 0000000000000000 RBX: ffff8807c730d280 RCX: 0000000000000008
[19932.896694] RDX: 0000000000000000 RSI: ffff880752c54390 RDI: ffff88081ea92fb8
[19932.896720] RBP: 0000000000000000 R08: ffffffff816108c0 R09: 0000000000000001
[19932.896747] R10: ffff880000000030 R11: 0000000000000010 R12: ffff880606c1d570
[19932.896773] R13: ffff88081ea92f40 R14: 0000000000000000 R15: 0000000000000000
[19932.896800] FS:  00007f23cbb48700(0000) GS:ffff88081ea80000(0000) knlGS:0000000000000000
[19932.896830] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19932.896852] CR2: 0000000000000078 CR3: 00000004b2807000 CR4: 00000000001427e0
[19932.896878] Stack:
[19932.896886]  ffffffff8109ffe2 ffff88081ea92f40 ffff880752c54390 ffff88081ea92f40
[19932.896918]  0000000000000046 0000000000000001 0000000000000001 ffffffff81095b75
[19932.896950]  ffff880752c54390 ffffffff81095ba4 ffff880752c54390 ffff88081ea92f40
[19932.896982] Call Trace:
[19932.896992]  <IRQ>
[19932.897000]
[19932.897009]  [<ffffffff8109ffe2>] ? enqueue_task_fair+0x7f2/0xe20
[19932.897029]  [<ffffffff81095b75>] ? check_preempt_curr+0x85/0xa0
[19932.897052]  [<ffffffff81095ba4>] ? ttwu_do_wakeup+0x14/0xf0
[19932.897074]  [<ffffffff81098176>] ? try_to_wake_up+0x1b6/0x2f0
[19932.897097]  [<ffffffff8108bfe0>] ? hrtimer_get_res+0x50/0x50
[19932.897119]  [<ffffffff8108bffe>] ? hrtimer_wakeup+0x1e/0x30
[19932.897142]  [<ffffffff8108c667>] ? __run_hrtimer+0x67/0x210
[19932.897164]  [<ffffffff8108ca69>] ? hrtimer_interrupt+0xe9/0x220
[19932.897188]  [<ffffffff8151b46b>] ? smp_apic_timer_interrupt+0x3b/0x50
[19932.897214]  [<ffffffff815194fd>] ? apic_timer_interrupt+0x6d/0x80
[19932.897237]  <EOI>
[19932.897246]
[19932.897254]  [<ffffffff81517d0e>] ? _raw_spin_unlock_irqrestore+0xe/0x20
[19932.897275]  [<ffffffff8109818b>] ? try_to_wake_up+0x1cb/0x2f0
[19932.897300]  [<ffffffff810d3c3d>] ? wake_futex+0x5d/0x80
[19932.897321]  [<ffffffff810d3d5f>] ? futex_wake+0xff/0x120
[19932.897342]  [<ffffffff810d5c5e>] ? do_futex+0x11e/0xb60
[19932.898513]  [<ffffffff811aa091>] ? new_sync_read+0x71/0xa0
[19932.899682]  [<ffffffff810d670e>] ? SyS_futex+0x6e/0x150
[19932.900850]  [<ffffffff81079ee1>] ? __set_current_blocked+0x31/0x50
[19932.902014]  [<ffffffff8107a0b6>] ? sigprocmask+0x56/0x90
[19932.903168]  [<ffffffff8151858d>] ? system_call_fast_compare_end+0x10/0x15
[19932.904337] Code: 39 c2 7d 27 0f 1f 80 00 00 00 00 83 e8 01 48 8b 5b 70 39 d0 75 f5 48 8b 7d 78 48 3b 7b 78 74 15 0f 1f 00 48 8b 6d 70 48 8b 5b 70 <48> 8b 7d 78 48 3b 7b 78 75 ee 48 85 ff 74 e9 e8 8c cb ff ff 48
[19932.906878] RIP  [<ffffffff810a10d0>] check_preempt_wakeup+0xd0/0x1d0
[19932.908143]  RSP <ffff88081ea83e58>
[19932.909411] CR2: 0000000000000078

I upgraded my kernel, but I also removed health checks temporarily because I didn't have easy access to reboot the box, so I am not certain whether upgrading the kernel fixed it.

@jeremydenoun

+1

I tried a lot of things (a sleep in the healthcheck, a static binary, ...) and hit the same issue; I captured the same kernel oops (a few seconds before the oops, the CPU load seems to increase a lot)...
This happens with the default kernel as well as the latest Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1 (2016-12-30) x86_64 GNU/Linux

The only way to get back to a normal state is to not use healthchecks at all...

However, it seems that with a less aggressive config like:
HEALTHCHECK --interval=5m --timeout=3s --retries=1
CMD xxx

the oops appears less often.
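For reference, the relaxed settings above in a complete (hypothetical) Dockerfile instruction; the curl command is only a placeholder for the actual check:

```dockerfile
# Probe every 5 minutes, time out after 3 seconds, mark unhealthy after
# a single failed probe. Replace the curl line with your real check.
HEALTHCHECK --interval=5m --timeout=3s --retries=1 \
  CMD curl -f http://localhost:8983/solr/ || exit 1
```

A longer `--interval` means the exec-based probe runs far less often, which in this thread seems to reduce how frequently the 3.16 scheduler bug is triggered, though it does not eliminate it.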

@thaJeztah
Member

Is there anything actionable here, or is this a kernel bug? Debian 9 "stretch" is currently in "full freeze" (https://wiki.debian.org/DebianStretch) and will hopefully be available as "stable" soon.

@gyorgyabraham

You can upgrade to kernel 4.9 from jessie-backports. We are currently facing this exact situation and are doing a kernel-only upgrade.

@unixfox

unixfox commented May 22, 2017

@gyorgyabraham Unfortunately the kernel 4.9 from jessie-backports doesn't include btrfs, so users have to save and migrate their images before upgrading.

@gyorgyabraham

@unixfox Yeah, and 4.9 lacks aufs too, so a manual storage migration is necessary.

@Spectrik

Spectrik commented Sep 9, 2017

I can confirm this bug still occurs. Once we removed the healthcheck definitions from our Dockerfiles, the problem disappeared.

@gyorgyabraham

gyorgyabraham commented Sep 10, 2017 via email

@EccoB

EccoB commented Jan 15, 2018

3.16.0-4-amd64 #1 SMP Debian 3.16.51-3 (2017-12-13) x86_64 GNU/Linux
With 9 running containers, one of them with a healthcheck enabled, the same problem occurs approximately every 5 days on a virtual machine:

[436713.569983] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[436713.573641] IP: [] check_preempt_wakeup+0xd0/0x1d0
[436713.573641] PGD 1ed2a8067 PUD 1e5533067 PMD 0
[436713.573641] Oops: 0000 [#1] SMP
[436713.573641] Modules linked in: cfg80211 rfkill veth xt_nat ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype nf_nat bridge stp llc aufs(C) xt_multiport nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables kvm_intel kvm crc32_pclmul ppdev joydev ttm aesni_intel drm_kms_helper aes_x86_64 drm lrw parport_pc gf128mul evdev processor parport virtio_balloon glue_helper thermal_sys ablk_helper cryptd serio_raw button autofs4 hid_generic usbhid hid ext4 crc16 mbcache jbd2 ata_generic virtio_net virtio_blk ata_piix uhci_hcd ehci_hcd libata usbcore crct10dif_pclmul crct10dif_common usb_common crc32c_intel i2c_piix4 psmouse virtio_pci virtio_ring scsi_mod i2c_core virtio floppy
[436713.573641] CPU: 1 PID: 15261 Comm: runc:[2:INIT] Tainted: G C 3.16.0-4-amd64 #1 Debian 3.16.51-3
[436713.573641] Hardware name: Fedora Project OpenStack Nova, BIOS 1.9.1-5.el7_3.2 04/01/2014
[436713.573641] task: ffff8800ba791490 ti: ffff8801e5964000 task.ti: ffff8801e5964000
[436713.573641] RIP: 0010:[] [] check_preempt_wakeup+0xd0/0x1d0
[436713.573641] RSP: 0000:ffff88023fc43e58 EFLAGS: 00010006
[436713.573641] RAX: 0000000000000000 RBX: ffff8800ba878280 RCX: 0000000000000008
[436713.573641] RDX: 0000000000000000 RSI: ffff88009195c310 RDI: ffff88023fc53278
[436713.573641] RBP: 0000000000000000 R08: ffffffff81610680 R09: 0000000000000001
[436713.573641] R10: 000000000000339b R11: 0000000000000010 R12: ffff8800ba791490
[436713.573641] R13: ffff88023fc53200 R14: 0000000000000000 R15: 0000000000000000
[436713.573641] FS: 00007f0a9e408700(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
[436713.573641] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[436713.573641] CR2: 0000000000000078 CR3: 00000001e57d1000 CR4: 00000000003407e0
[436713.573641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[436713.573641] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[436713.573641] Stack:
[436713.573641] ffffffff810a26b2 ffff88023fc53200 ffff88009195c310 ffff88023fc53200
[436713.573641] 0000000000000046 0000000000000001 0000000000000001 ffffffff810980e5
[436713.573641] ffff88009195c310 ffffffff81098114 ffff88009195c310 ffff88023fc53200
[436713.573641] Call Trace:
[436713.573641]
[436713.573641] [] ? enqueue_task_fair+0x7f2/0xe20
[436713.573641] [] ? check_preempt_curr+0x85/0xa0
[436713.573641] [] ? ttwu_do_wakeup+0x14/0xf0
[436713.573641] [] ? try_to_wake_up+0x1b6/0x2f0
[436713.573641] [] ? hrtimer_get_res+0x50/0x50
[436713.573641] [] ? hrtimer_wakeup+0x1e/0x30
[436713.573641] [] ? __run_hrtimer+0x67/0x210
[436713.573641] [] ? hrtimer_interrupt+0xe9/0x220
[436713.573641] [] ? smp_apic_timer_interrupt+0x3b/0x50
[436713.573641] [] ? apic_timer_interrupt+0x6d/0x80
[436713.573641]
[436713.573641] [] ? zone_statistics+0x85/0x90
[436713.573641] [] ? get_page_from_freelist+0x483/0x910
[436713.573641] [] ? __alloc_pages_nodemask+0x166/0xb50
[436713.573641] [] ? __rb_insert_augmented+0x1da/0x200
[436713.573641] [] ? vma_compute_subtree_gap+0x70/0x70
[436713.573641] [] ? alloc_pages_vma+0x98/0x160
[436713.573641] [] ? handle_mm_fault+0xd39/0x1140
[436713.573641] [] ? mprotect_fixup+0x14f/0x270
[436713.573641] [] ? __do_page_fault+0x177/0x410
[436713.573641] [] ? async_page_fault+0x28/0x30
[436713.573641] Code: 39 c2 7d 27 0f 1f 80 00 00 00 00 83 e8 01 48 8b 5b 70 39 d0 75 f5 48 8b 7d 78 48 3b 7b 78 74 15 0f 1f 00 48 8b 6d 70 48 8b 5b 70 <48> 8b 7d 78 48 3b 7b 78 75 ee 48 85 ff 74 e9 e8 8c cb ff ff 48
[436713.573641] RIP [] check_preempt_wakeup+0xd0/0x1d0
[436713.573641] RSP
[436713.573641] CR2: 0000000000000078
[436713.573641] ---[ end trace 0350f6c55b57d998 ]---

@jeremydenoun

@EccoB If you check my old comment (6 Feb 2017), I recommend increasing the container healthcheck interval; this seems to limit how often the panic appears (or upgrade the kernel if possible).

@gyorgyabraham

gyorgyabraham commented Jan 15, 2018 via email

@djkenne

djkenne commented Mar 8, 2018

I have this issue as well, running just two Docker containers on Debian Jessie with kernel 3.16. Although this setup is officially supported by Docker, I guess I will need to look at upgrading the kernel or moving to Stretch. Thanks to everyone for all of the information.

@AkihiroSuda
Member

Jessie is no longer supported: docker/docker-ce-packaging#253

If somebody is still hitting this, please ask the distro's kernel maintainers.
