journald restart crashes Docker #19728
Comments
@jeffjohnston Could you find the logs from the crashed daemon, please? I don't see such behaviour; more than that, containers continue to write logs and |
Where would I find the logs? I also have syslog installed, but in Redhat 7 that is proxied through journald. Here is an example of running a minimal container. Then I "journalctl -f" to see messages coming through. I bolded the steps to see the errors and then what the logs look like. Usually I run weave and 8 containers, but in this example I am just firing up an ubuntu container. Hope this helps. First I run this:
Then I restart journald:
Then I look at docker processes: (sometimes I have to run this a few times before getting the error)
Then I run the docker process again:
|
I got the same error after upgrading to a new kernel version on CentOS 7.2.
And I can't start the docker daemon because of that 'driver not supported' error.
|
We are also having this problem with RHEL 7.2.
Rancher-agent is set to autostart.
Also, I don't know if this is the same problem, but after docker has been running for a few days it may crash, taking all containers down with it, when I run any docker command. I started our containers on Friday, and today I tried to stop 2 containers with docker-compose, but docker crashed instead. This has happened 2 times before. Unfortunately I don't have any logs from those events, but they are similar: docker has been untouched for a few days, and after running a docker command the daemon crashes/restarts, killing all containers. Once I had to remove all liferay docker images and re-import the latest one because docker wouldn't start.
|
We are investigating this in a bugzilla. https://bugzilla.redhat.com/show_bug.cgi?id=1300076 Right now it looks like a golang problem. Golang will not ignore sigpipe. #7087 also has this problem. |
The basic problem is that when you run a unit file, systemd hooks up stdout/stderr to the journal. If the journal goes away, these sockets get closed, and when you write to them you get a SIGPIPE. systemd says programs should ignore SIGPIPE on stdout and stderr, but golang seems to refuse to do this more than 10 times. This bugzilla explains the behaviour from a systemd/journald point of view. |
One fix might be to redirect stdout from docker to syslog or journald directly. Most system services do not write the volume of content to stdout/stderr that the docker daemon does; they usually write it to a logging system, which uses dgram sockets and would not be affected by this issue. |
Maybe we can just catch SIGPIPE? |
@LK4D4 The problem is with golang < 1.6. Golang has some wacky code that ignores SIGPIPE 10 times and then dies, no matter what you do in your code. We are looking to ship a stopgap which would pipe the output of docker directly to journald until we can build with golang 1.6. After we can build with 1.6 this tool will go away. |
@rhatdan cool, thanks for update. |
cc/ me |
@rhatdan I see the linked issue was resolved; https://bugzilla.redhat.com/show_bug.cgi?id=1300076 I guess it's the stopgap solution; can you tell me if there are any changes needed in this repository (e.g., updates to our unit files)? https://github.com/docker/docker/tree/master/contrib/init/systemd |
@thaJeztah unfortunately no. We had to add a helper binary which spawns Docker itself and captures the logs. |
I mean, I believe we can update the contrib/ folder and unit files so that this binary is used if you guys wish |
Yes, the only way to fix this was to get a "C" program that launches docker, grabs its stdout and stderr, and sends them to journald. When we get to go 1.6 we will throw away the "C" program.
Using golang 1.6, it is now possible to ignore SIGPIPE events on stdout/stderr. Previous versions of the golang library cached 10 events and then killed the process receiving the events. systemd-journald sends SIGPIPE events when journald is restarted and the target of the unit file writes to stdout/stderr. Docker logs to stdout/stderr. This patch silently ignores all SIGPIPE events. Signed-off-by: Jhon Honce <jhonce@redhat.com>
It appears that a journald restart on Redhat 7.2 crashes Docker. I was getting my logging settings in place when I noticed that after restarting journald my Docker daemon would crash. I backed things out to the point where, even without any containers, restarting journald makes Docker crash and then restart.
It is very easy to reproduce. Just restart journald.
Then check Docker processes. After just a few times you get this.
You can see it recovers. However, if I have containers running it will stop all of them and even require me to manually remove containers that show up as Dead. I also have to remove the containers sometimes, as they appear to be in a state where they will not start without errors.
docker version:
Client:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5
Built: Fri Nov 20 13:25:01 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5
Built: Fri Nov 20 13:25:01 UTC 2015
OS/Arch: linux/amd64
docker info:
Containers: 13
Images: 62
Server Version: 1.9.1
Storage Driver: devicemapper
Pool Name: docker-253:0-67131623-pool
Pool Blocksize: 65.54 kB
Base Device Size: 107.4 GB
Backing Filesystem:
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 4.038 GB
Data Space Total: 107.4 GB
Data Space Available: 31.62 GB
Metadata Space Used: 5.886 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.142 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.107-RHEL7 (2015-10-14)
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.10.0-327.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.2 (Maipo)
CPUs: 2
Total Memory: 7.64 GiB
Name: vmtstcon01.roomandboard.com
ID: 7J2R:VPDV:RZRX:NM3X:6JG3:5HGD:ZUIW:FGMW:RCJO:NULN:FS7Z:UD42
WARNING: bridge-nf-call-ip6tables is disabled
uname -a:
Linux vmtstcon01.roomandboard.com 3.10.0-327.el7.x86_64 #1 SMP Thu Oct 29 17:29:29 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
Environment details (AWS, VirtualBox, physical, etc.):
vmware