100% CPU usage on docker 1.13.1 #31060

Closed
SamSaffron opened this issue Feb 15, 2017 · 10 comments

@SamSaffron commented Feb 15, 2017

Getting 100% CPU usage erratically on 1.13.1. Once this starts happening, dockerd uses 100% on all cores until it is restarted.

~# docker version
Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 06:42:29 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 06:42:29 2017
 OS/Arch:      linux/amd64
 Experimental: false
root@tiefighter21:~# docker info
Containers: 10
 Running: 10
 Paused: 0
 Stopped: 0
Images: 309
Server Version: 1.13.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 521
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1
runc version: 9df8b306d01f59d3a8029be411de015b7304dd8f
init version: 949e6fa
Security Options:
 apparmor
Kernel Version: 3.19.0-65-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.37 GiB
Name: tiefighter21.sjc1.discourse.org
ID: OUAL:ATX4:IVBF:YGEV:XXSH:CZZG:B634:PGUZ:2ZK6:7QTL:TY7E:ZWWI
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: 5eedb751
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: true

I have a data-structure dump, but it contains sensitive env vars. The thread dump is here:

https://gist.github.com/SamSaffron/84da3bb794cb901cd9a36194100ffc71

Also, while this is happening, I'm noticing this in the logs:

time="2017-02-15T21:20:51.744872734Z" level=error msg="Error closing logger: invalid argument"
time="2017-02-15T21:21:03.907878784Z" level=error msg="Error closing logger: invalid argument"

Anything else I can provide to help debug this?

@cpuguy83 (Member) commented

Thanks for the report.
I think it's spinning here: https://github.com/docker/docker/blob/v1.13.1/daemon/logger/jsonfilelog/read.go#L255
I still need to look into it deeper to know what's causing the spin.

The invalid argument error is a bit weird; it seems close is called multiple times somehow. Maybe related, maybe not.
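
As a minimal sketch of how that kind of spin can happen in Go (the names below are made up for illustration; this is not the jsonfilelog code): a receive from a closed channel succeeds immediately on every iteration, so a select built around it never blocks.

package main

import "time"

// follow is a toy log-follower loop, not Docker's code. If the events channel
// is closed (say, when a file watcher shuts down) instead of being left open,
// the first case fires on every iteration with the zero value, so the loop
// burns a full core until done is closed.
func follow(events <-chan string, done <-chan struct{}) {
	for {
		select {
		case msg := <-events: // never checks whether the channel is still open
			_ = msg
		case <-done:
			return
		}
	}
}

func main() {
	events := make(chan string)
	done := make(chan struct{})
	close(events) // simulate the event source going away
	go follow(events, done)
	time.Sleep(100 * time.Millisecond) // follow spins at 100% CPU during this sleep
	close(done)
}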

cpuguy83 self-assigned this Feb 15, 2017
@SamSaffron (Author) commented

Thanks heaps, @cpuguy83! Let me know if you would like me to test anything; I am comfortable installing a pre-release on our CI server that is exhibiting this issue.

@mpalmer commented Feb 15, 2017

If it helps (doing a bisect or something), we (I work with SamSaffron) also saw this on 1.13.0. It didn't happen with any 1.12.x, and doesn't appear to be happening on a couple of machines running 1.13.0-rc2, although their workload is very different, so it might not be triggering the bug.

@cpuguy83 (Member) commented

OK, I think I figured out the spin issue:
#31070
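
For context, a sketch of the usual fix pattern for that kind of spin (illustrative only; this is not the actual diff in the PR) is to use the two-value receive and return once the channel reports it is closed:

// followFixed is an illustrative counterpart to the spin sketch above, not the
// change in the PR: the two-value receive lets the loop notice that the events
// channel is closed and exit instead of busy-looping.
func followFixed(events <-chan string, done <-chan struct{}, handle func(string)) {
	for {
		select {
		case msg, ok := <-events:
			if !ok {
				// Event source is gone; stop following rather than spinning.
				return
			}
			handle(msg)
		case <-done:
			return
		}
	}
}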

@SamSaffron (Author) commented

Wow, awesome! Thanks so much.

@tswift242 (Contributor) commented

We are also seeing this on 1.13.1. top shows dockerd taking 100-600% CPU on our hosts, depending on how long the host has been up. What's the priority on releasing a fix for this? If it is at all easy to reproduce (and it seems to be; I see the CPU usage spike on hosts that have been up for only an hour), this seems like a very high-priority issue. We are having to roll back until a fix is out.

@cpuguy83 (Member) commented

@tswift242 The PR is already open; it will be in 1.13.2.

@tswift242 (Contributor) commented

@cpuguy83 Thanks for responding and working on the fix for this. Any ETA for when 1.13.2 will be released?

cpuguy83 added this to the 1.13.2 milestone Feb 16, 2017
@cpuguy83 (Member) commented

@tswift242 No date yet. Not this week.

@SamSaffron (Author) commented

@cpuguy83 I just cherry-picked this commit onto my problem server; I will let you know if it starts misbehaving again!

[screenshot attached]

Thanks heaps
