grpc-interop1 worker goes offline regularly #3847

Closed
jtattermusch opened this issue Oct 15, 2015 · 3 comments

Comments

@jtattermusch
Contributor

Every 1 or 2 days, the grpc-interop1 worker goes offline and cannot be reached over SSH.
The workaround is to restart the VM, after which it starts working again.

One sign that the worker is going offline is that its CPU load goes up to 100%.
My theory is that the problem is caused by an issue with Docker: this machine spins up a LOT of short-lived Docker containers, and in the past we've seen issues on Linux workers when too many Docker containers run in parallel.
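
One way to check the Docker theory would be to log the load average next to the number of running containers and look at the last entries after the next hang. The following is only a minimal sketch under assumptions: the function names are hypothetical, nothing like this was run on grpc-interop1, and it assumes a Unix host with Python 3.7+ where the `docker` CLI is on the PATH.

```python
import os
import subprocess
import time


def count_container_ids(docker_ps_output: str) -> int:
    """Count non-empty lines in `docker ps -q` output (one container ID per line)."""
    return sum(1 for line in docker_ps_output.splitlines() if line.strip())


def load_per_cpu(load_1min: float, cpu_count: int) -> float:
    """Normalize the 1-minute load average by the number of CPUs."""
    return load_1min / max(cpu_count, 1)


def sample() -> str:
    """Return one log line with a timestamp, load per CPU, and running containers."""
    load_1min = os.getloadavg()[0]  # Unix only
    cpus = os.cpu_count() or 1
    try:
        result = subprocess.run(
            ["docker", "ps", "-q"],
            capture_output=True,
            text=True,
            timeout=30,
        )
        containers = count_container_ids(result.stdout)
    except (OSError, subprocess.TimeoutExpired):
        # -1 marks "docker CLI missing or not responding", which is itself a useful signal.
        containers = -1
    timestamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return f"{timestamp} load_per_cpu={load_per_cpu(load_1min, cpus):.2f} containers={containers}"
```

Calling `sample()` once a minute (for example from cron) and appending the result to a file on disk would leave a trace that survives the VM restart.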

@jtattermusch
Contributor Author

The last build before the worker goes offline usually shows this:

ERROR: Connection was broken: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2332)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2801)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
    at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:40)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)

@jtattermusch
Contributor Author

Probable cause: moby/moby#13885

@jtattermusch
Contributor Author

I added a worker, grpc-interop2, that is set up exactly like grpc-interop1 but uses an Ubuntu 14.04 LTS image instead of Debian 7.8. So far, I haven't seen the issue again. Closing this for now; I will reopen if it is seen again.

@lock lock bot locked as resolved and limited conversation to collaborators Oct 5, 2018