grpc-interop1 worker goes offline regularly #3847

jtattermusch · 2015-10-15T17:55:47Z

Every 1 or 2 days, the grpc-interop1 worker goes offline and cannot be reached over SSH.
The workaround is to restart the VM, after which it starts working again.

One sign that the worker is going offline: the CPU load goes up to 100%.
My theory is that the problem is caused by some issue with docker, because this machine spins up a LOT of short-lived docker containers (in the past we've seen issues with docker on linux workers when running too many docker containers in parallel).

jtattermusch · 2015-10-15T17:57:54Z

The last build before the worker goes offline usually shows this:

ERROR: Connection was broken: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2332)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2801)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

jtattermusch · 2015-10-16T23:20:22Z

Probable cause: moby/moby#13885

jtattermusch · 2015-10-21T22:21:55Z

I added worker grpc-interop2, which is exactly the same as the grpc-interop1 worker, but instead of the Debian 7.8 image it uses Ubuntu 14.04 LTS. So far, I haven't seen the issue again. Closing this for now; will reopen if it is seen again.

jtattermusch added area/interop priority/P1 infra/Jenkins labels Oct 15, 2015

jtattermusch self-assigned this Oct 15, 2015

jtattermusch closed this as completed Oct 21, 2015

lock bot locked as resolved and limited conversation to collaborators Oct 5, 2018
