Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker stop does not work #552

Open
renes opened this issue Sep 6, 2016 · 70 comments
Open

docker stop does not work #552

renes opened this issue Sep 6, 2016 · 70 comments

Comments

@renes
Copy link

renes commented Sep 6, 2016

Hello,

I have a maven project and the docker plug in configuration for it is listed below. Starting works fine.
But after the integration tests are done and the plug in shuts down the container it just hangs with this output forever

[INFO] --- docker-maven-plugin:0.14.1:stop (stop) @ wh-cdc-web ---
08:52:37.421 DB> 2016-09-06 06:52:37 UTC [1-2] LOG: received smart shutdown request
08:52:37.421 DB> 2016-09-06 06:52:37 UTC [10-2] LOG: autovacuum launcher shutting down

The container is not running anymore so it was shutdown, it's just the plugin waiting for something.

PlugIn Version: 0.15.16
Docker version 1.12.1, build 23cf638

Dockerfile.txt

<plugin>
                <groupId>io.fabric8</groupId>
                <artifactId>docker-maven-plugin</artifactId>
                <version>${docker.maven.plugin.fabric8.version}</version>
                <configuration>
                    <logDate>default</logDate>
                    <autoPull>true</autoPull>
                    <images>
                        <image>
                            <alias>db</alias>
                            <name>postgres:9</name>
                            <build>
                                <ports>
                                    <port>5432</port>
                                </ports>
                                <dockerFileDir>${project.basedir}/docker</dockerFileDir>
                            </build>
                            <run>
                                <ports>
                                    <port>${dbPort}:5432</port>
                                </ports>
                                <wait>
                                    <log>database system is ready to accept connections</log>
                                    <time>20000</time>
                                </wait>
                                <log>
                                    <prefix>DB</prefix>
                                    <color>yellow</color>
                                </log>
                            </run>
                        </image>
                    </images>
                </configuration>

                <!-- Hooking into the lifecycle -->
                <executions>
                    <execution>

                        <id>start</id>
                        <phase>pre-integration-test</phase>
                        <goals>
                            <goal>build</goal>
                            <goal>start</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>stop</id>
                        <phase>post-integration-test</phase>
                        <goals>
                            <goal>stop</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
@rhuss
Copy link
Collaborator

rhuss commented Oct 7, 2016

I just tried here locally and it works fine for me (although I have no real integration tests running, of course). In your case could create a thread dump when it hangs and attach it to this issue ?

@renes
Copy link
Author

renes commented Oct 7, 2016

Thanks for the effort.

my dump looks like this:

Thread dump at 0:15.093.616

  • Thread group "main":

Thread Dump

  Thread "main":
    at com.kenai.jffi.Foreign.invokeN3O1(long, long, long, long, long, java.lang.Object, int, int, int)
    at com.kenai.jffi.Invoker.invokeN3(com.kenai.jffi.CallContext, long, long, long, long, int, java.lang.Object, com.kenai.jffi.ObjectParameterStrategy, com.kenai.jffi.ObjectParameterInfo) (line: 1061)
    at jnr.enxio.channels.Native$LibC$jnr$ffi$1.read(int, java.nio.ByteBuffer, long)
    at jnr.enxio.channels.Native.read(int, java.nio.ByteBuffer) (line: 95)
    at jnr.enxio.channels.NativeSocketChannel.read(java.nio.ByteBuffer) (line: 68)
    at sun.nio.ch.ChannelInputStream.read(java.nio.channels.ReadableByteChannel, java.nio.ByteBuffer, boolean) (line: 59)
    at sun.nio.ch.ChannelInputStream.read(java.nio.ByteBuffer)
    at sun.nio.ch.ChannelInputStream.read(byte[ ], int, int)
    at java.io.FilterInputStream.read(byte[ ], int, int)
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(byte[ ], int, int)
    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer() (line: 155)
    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(org.apache.http.util.CharArrayBuffer) (line: 284)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(org.apache.http.io.SessionInputBuffer) (line: 140)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(org.apache.http.io.SessionInputBuffer)
    at org.apache.http.impl.io.AbstractMessageParser.parse() (line: 261)
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader() (line: 165)
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader()
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(org.apache.http.HttpRequest, org.apache.http.HttpClientConnection, org.apache.http.protocol.HttpContext) (line: 272)
    at org.apache.http.protocol.HttpRequestExecutor.execute(org.apache.http.HttpRequest, org.apache.http.HttpClientConnection, org.apache.http.protocol.HttpContext) (line: 124)
    at org.apache.http.impl.execchain.MainClientExec.execute(org.apache.http.conn.routing.HttpRoute, org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext, org.apache.http.client.methods.HttpExecutionAware) (line: 271)
    at org.apache.http.impl.execchain.ProtocolExec.execute(org.apache.http.conn.routing.HttpRoute, org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext, org.apache.http.client.methods.HttpExecutionAware) (line: 184)
    at org.apache.http.impl.execchain.RetryExec.execute(org.apache.http.conn.routing.HttpRoute, org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext, org.apache.http.client.methods.HttpExecutionAware) (line: 88)
    at org.apache.http.impl.execchain.RedirectExec.execute(org.apache.http.conn.routing.HttpRoute, org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext, org.apache.http.client.methods.HttpExecutionAware) (line: 110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) (line: 184)
    at org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext)
    at org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.client.ResponseHandler, org.apache.http.protocol.HttpContext) (line: 220)
    at org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.client.methods.HttpUriRequest, org.apache.http.client.ResponseHandler, org.apache.http.protocol.HttpContext)
    at org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.client.methods.HttpUriRequest, org.apache.http.client.ResponseHandler)
    at io.fabric8.maven.docker.access.hc.ApacheHttpClientDelegate.post(java.lang.String, java.lang.Object, org.apache.http.client.ResponseHandler, int[ ])
    at io.fabric8.maven.docker.access.hc.ApacheHttpClientDelegate.post(java.lang.String, int[ ])
    at io.fabric8.maven.docker.access.hc.DockerAccessWithHcClient.stopContainer(java.lang.String, int) (line: 200)
    at io.fabric8.maven.docker.service.RunService.shutdown(io.fabric8.maven.docker.service.ContainerTracker$ContainerShutdownDescriptor, boolean, boolean) (line: 422)
    at io.fabric8.maven.docker.service.RunService.stopStartedContainers(boolean, boolean, boolean, io.fabric8.maven.docker.util.PomLabel) (line: 165)
    at io.fabric8.maven.docker.StopMojo.executeInternal(io.fabric8.maven.docker.service.ServiceHub) (line: 60)
    at io.fabric8.maven.docker.AbstractDockerMojo.execute() (line: 196)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(org.apache.maven.execution.MavenSession, org.apache.maven.plugin.MojoExecution) (line: 134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(org.apache.maven.execution.MavenSession, org.apache.maven.plugin.MojoExecution, org.apache.maven.lifecycle.internal.ProjectIndex, org.apache.maven.lifecycle.internal.DependencyContext) (line: 207)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(org.apache.maven.execution.MavenSession, org.apache.maven.plugin.MojoExecution, org.apache.maven.lifecycle.internal.ProjectIndex, org.apache.maven.lifecycle.internal.DependencyContext, org.apache.maven.lifecycle.internal.PhaseRecorder) (line: 153)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(org.apache.maven.execution.MavenSession, java.util.List, org.apache.maven.lifecycle.internal.ProjectIndex) (line: 145)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(org.apache.maven.execution.MavenSession, org.apache.maven.execution.MavenSession, org.apache.maven.lifecycle.internal.ReactorContext, org.apache.maven.project.MavenProject, org.apache.maven.lifecycle.internal.TaskSegment) (line: 116)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(org.apache.maven.execution.MavenSession, org.apache.maven.lifecycle.internal.ReactorContext, org.apache.maven.project.MavenProject, org.apache.maven.lifecycle.internal.TaskSegment) (line: 80)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(org.apache.maven.execution.MavenSession, org.apache.maven.lifecycle.internal.ReactorContext, org.apache.maven.lifecycle.internal.ProjectBuildList, java.util.List, org.apache.maven.lifecycle.internal.ReactorBuildStatus) (line: 51)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(org.apache.maven.execution.MavenSession) (line: 128)
    at org.apache.maven.DefaultMaven.doExecute(org.apache.maven.execution.MavenExecutionRequest, org.apache.maven.execution.MavenSession, org.apache.maven.execution.MavenExecutionResult, org.eclipse.aether.DefaultRepositorySystemSession) (line: 307)
    at org.apache.maven.DefaultMaven.doExecute(org.apache.maven.execution.MavenExecutionRequest) (line: 193)
    at org.apache.maven.DefaultMaven.execute(org.apache.maven.execution.MavenExecutionRequest) (line: 106)
    at org.apache.maven.cli.MavenCli.execute(org.apache.maven.cli.CliRequest) (line: 863)
    at org.apache.maven.cli.MavenCli.doMain(org.apache.maven.cli.CliRequest) (line: 288)
    at org.apache.maven.cli.MavenCli.main(java.lang.String[ ], org.codehaus.plexus.classworlds.ClassWorld)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method, java.lang.Object, java.lang.Object[ ])
    at sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[ ])
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[ ])
    at java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[ ])
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(java.lang.String[ ]) (line: 289)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(java.lang.String[ ]) (line: 229)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(java.lang.String[ ]) (line: 415)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(java.lang.String[ ]) (line: 356)* Thread group "org.apache.maven.artifact.resolver.DefaultArtifactResolver":

  Thread "resolver-1":
    at sun.misc.Unsafe.park(boolean, long)
    at java.util.concurrent.locks.LockSupport.park(java.lang.Object) (line: 175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() (line: 2039)
    at java.util.concurrent.LinkedBlockingQueue.take() (line: 442)
    at java.util.concurrent.ThreadPoolExecutor.getTask() (line: 1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) (line: 1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run() (line: 617)
    at java.lang.Thread.run()

  Thread "resolver-2":
    at sun.misc.Unsafe.park(boolean, long)
    at java.util.concurrent.locks.LockSupport.park(java.lang.Object) (line: 175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() (line: 2039)
    at java.util.concurrent.LinkedBlockingQueue.take() (line: 442)
    at java.util.concurrent.ThreadPoolExecutor.getTask() (line: 1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) (line: 1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run() (line: 617)
    at java.lang.Thread.run()

  Thread "resolver-3":
    at sun.misc.Unsafe.park(boolean, long)
    at java.util.concurrent.locks.LockSupport.park(java.lang.Object) (line: 175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() (line: 2039)
    at java.util.concurrent.LinkedBlockingQueue.take() (line: 442)
    at java.util.concurrent.ThreadPoolExecutor.getTask() (line: 1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) (line: 1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run() (line: 617)
    at java.lang.Thread.run()

  Thread "resolver-4":
    at sun.misc.Unsafe.park(boolean, long)
    at java.util.concurrent.locks.LockSupport.park(java.lang.Object) (line: 175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() (line: 2039)
    at java.util.concurrent.LinkedBlockingQueue.take() (line: 442)
    at java.util.concurrent.ThreadPoolExecutor.getTask() (line: 1067)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) (line: 1127)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run() (line: 617)
    at java.lang.Thread.run()

@rhuss
Copy link
Collaborator

rhuss commented Oct 7, 2016

Thanks. It looks like, that your Docker daemon doesn't close the connect. The Maven plugin is stuck in read of a post request to containers/<container id>/stop. So it seems that your Docker daemon has trouble in shutting down the container.

How long have you waited until to killed the maven call ?

You can try to tune the <kill> parameter in the <run><wait> configuration, to tell the docker daemon to be more aggresive. See https://dmp.fabric8.io/#start-wait for details (actually <kill> is used for stopping containers).

@renes
Copy link
Author

renes commented Oct 7, 2016

it will wait forever I tried to play around with the kill parameter already

the container itself is stopped at this time so just the feedback seems not to work

@rhuss
Copy link
Collaborator

rhuss commented Oct 7, 2016

That's really strange. Do you have a chance to test a different Docker daemone ? (e.g. with Docker machine or from a VM). BTW your are running on which plattform (OX X, Windows, Linux) ?

@rhuss
Copy link
Collaborator

rhuss commented Oct 7, 2016

Could you also upgrade to d-m-p version 0.16.6 ?

@renes
Copy link
Author

renes commented Oct 7, 2016

Linux

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial

I will try another docker machine and the update

@renes
Copy link
Author

renes commented Oct 19, 2016

ok so with

Docker version 1.12.2, build bb80604

and plugin version

0.16.8

it is working

thanks for your time

@renes renes closed this as completed Oct 19, 2016
@benhubert
Copy link

benhubert commented May 3, 2017

I have the same issue here, also with a postgres database. When I look at docker ps -a I see that the database exits sometimes (maybe always) with status code 137.

Docker version: 17.03.1-ce (git commit c6d412e329)
docker-maven-plugin version: 0.20.1
Postgres image: postgres:9.6.1
Maven version: 3.0.5
Java version 1.8.0_121
Linux kernel release: 4.10.6-1-ARCH

Furthermore I don't have any integration tests using this database at the moment.

@rhuss
Copy link
Collaborator

rhuss commented May 3, 2017

Does it happen for you all the time that the container cant be stopped or only sporadically ?
Any chance you could share the pom.xml ?

@benhubert
Copy link

It worked one time. So I'd say sporadically it works, but usually not. I also played around with the <kill> and <shutdown> parameters and in fact when it worked i had the <kill> parameter set to 10000.

<plugins>
  <plugin>
    <groupId>io.fabric8</groupId>
    <artifactId>docker-maven-plugin</artifactId>
    <version>0.20.1</version>
    <executions>
      <execution>
        <id>prepare-it-database</id>
        <phase>pre-integration-test</phase>
        <goals>
          <goal>start</goal>
        </goals>
        <configuration>
          <images>
            <image>
              <name>postgres:9.6.1</name>
              <alias>postgres-integration-test</alias>
              <run>
                <ports>
                  <port>it.db.port:5432</port>
                </ports>
                <wait>
                  <log>database system is ready to accept connections</log>
                  <time>20000</time>
                </wait>
              </run>
            </image>
          </images>
        </configuration>
      </execution>
      <execution>
        <id>remove-it-database</id>
        <phase>post-integration-test</phase>
        <goals>
          <goal>stop</goal>
        </goals>
      </execution>
    </executions>
  </plugin>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-failsafe-plugin</artifactId>
    <executions>
      <execution>
        <goals>
          <goal>integration-test</goal>
          <goal>verify</goal>
        </goals>
        <configuration>
          <environmentVariables>
            <it.db.port>${it.db.port}</it.db.port>
          </environmentVariables>
        </configuration>
      </execution>
    </executions>
  </plugin>
</plugins>

@while1eq1
Copy link

Im experiencing the same thing with the postgres container, It is sporadic for me but fails most times.

@while1eq1
Copy link

@benhubert by any chance are you running this postgres image docker-in-docker and/or with gitlab-runner

@benhubert
Copy link

benhubert commented May 4, 2017

I found the issue why the container end up with exit code 137. The problem was the log line I waited for. On first startup postgres initializes the database and then restarts it. So The log line database system is ready to accept connections appears twice in the log which starts the integration test too early.

The solution was to change the log line configuration to (?s)PostgreSQL init process complete.*database system is ready to accept connections


But the original issue - that the maven process hangs - is still not fixed.

I run it on my physical machine. No docker-in-docker or gitlab-runner.

Here's a thread dump
2017-05-04 12:29:45
Full thread dump OpenJDK 64-Bit Server VM (25.121-b13 mixed mode):

"RMI TCP Connection(3)-127.0.0.1" #31 daemon prio=9 os_prio=0 tid=0x00007f5724007800 nid=0x36f0 runnable [0x00007f5773af3000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
- locked <0x00000000f590b330> (a java.io.BufferedInputStream)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:550)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$10/286149792.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- <0x00000000f5810cb8> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"RMI TCP Connection(2)-127.0.0.1" #30 daemon prio=9 os_prio=0 tid=0x00007f5724007000 nid=0x36ec in Object.wait() [0x00007f5773bf3000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000f57a5968> (a com.sun.jmx.remote.internal.ArrayNotificationBuffer)
at com.sun.jmx.remote.internal.ArrayNotificationBuffer.fetchNotifications(ArrayNotificationBuffer.java:449)
- locked <0x00000000f57a5968> (a com.sun.jmx.remote.internal.ArrayNotificationBuffer)
at com.sun.jmx.remote.internal.ArrayNotificationBuffer$ShareBuffer.fetchNotifications(ArrayNotificationBuffer.java:227)
at com.sun.jmx.remote.internal.ServerNotifForwarder.fetchNotifs(ServerNotifForwarder.java:274)
at javax.management.remote.rmi.RMIConnectionImpl$4.run(RMIConnectionImpl.java:1270)
at javax.management.remote.rmi.RMIConnectionImpl$4.run(RMIConnectionImpl.java:1268)
at javax.management.remote.rmi.RMIConnectionImpl.fetchNotifications(RMIConnectionImpl.java:1274)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$10/286149792.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- <0x00000000f5810748> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"JMX server connection timeout 29" #29 daemon prio=9 os_prio=0 tid=0x00007f5720014000 nid=0x36e3 in Object.wait() [0x00007f5778152000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000e1666838> (a [I)
at com.sun.jmx.remote.internal.ServerCommunicatorAdmin$Timeout.run(ServerCommunicatorAdmin.java:168)
- locked <0x00000000e1666838> (a [I)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"RMI Scheduler(0)" #28 daemon prio=9 os_prio=0 tid=0x00007f5720012000 nid=0x36e2 waiting on condition [0x00007f577837a000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e16679b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"RMI TCP Connection(1)-127.0.0.1" #27 daemon prio=9 os_prio=0 tid=0x00007f5724006000 nid=0x36e1 runnable [0x00007f5778278000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
- locked <0x00000000e1920540> (a java.io.BufferedInputStream)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:550)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$10/286149792.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- <0x00000000e17f48b0> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"RMI TCP Accept-0" #26 daemon prio=9 os_prio=0 tid=0x00007f573009f000 nid=0x36de runnable [0x00007f5779126000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
at java.net.ServerSocket.implAccept(ServerSocket.java:545)
at java.net.ServerSocket.accept(ServerSocket.java:513)
at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:400)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:372)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"Attach Listener" #25 daemon prio=9 os_prio=0 tid=0x00007f5754001000 nid=0x36da waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"resolver-5" #20 daemon prio=5 os_prio=0 tid=0x00007f579109e800 nid=0x3642 waiting on condition [0x00007f5779227000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e0e718d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"resolver-4" #14 daemon prio=5 os_prio=0 tid=0x00007f579082a000 nid=0x3567 waiting on condition [0x00007f5779629000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e0e718d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"resolver-3" #13 daemon prio=5 os_prio=0 tid=0x00007f5790830000 nid=0x3566 waiting on condition [0x00007f577972a000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e0e718d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"resolver-2" #12 daemon prio=5 os_prio=0 tid=0x00007f579082e800 nid=0x3565 waiting on condition [0x00007f5779ef0000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e0e718d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"resolver-1" #11 daemon prio=5 os_prio=0 tid=0x00007f579082e000 nid=0x3564 waiting on condition [0x00007f577a0cc000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e0e718d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"process reaper" #10 daemon prio=10 os_prio=0 tid=0x00007f57905e8800 nid=0x3563 waiting on condition [0x00007f577a7f2000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000e0d5bfe0> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Locked ownable synchronizers:
- None

"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007f57900bf800 nid=0x3560 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f57900bc800 nid=0x355f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f57900ba800 nid=0x355e waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f57900b8000 nid=0x355d waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f57900b6800 nid=0x355c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

Locked ownable synchronizers:
- None

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f5790083800 nid=0x355b in Object.wait() [0x00007f577b4f3000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000e093c300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000000e093c300> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

Locked ownable synchronizers:
- None

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f579007f000 nid=0x355a in Object.wait() [0x00007f577b5f4000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000e093b870> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000e093b870> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

Locked ownable synchronizers:
- None

"main" #1 prio=5 os_prio=0 tid=0x00007f579000a000 nid=0x3554 runnable [0x00007f5798c11000]
java.lang.Thread.State: RUNNABLE
at com.kenai.jffi.Foreign.invokeN3O1(Native Method)
at com.kenai.jffi.Invoker.invokeN3(Invoker.java:1061)
at jnr.enxio.channels.Native$LibC$jnr$ffi$1.read(Unknown Source)
at jnr.enxio.channels.Native.read(Native.java:95)
at jnr.enxio.channels.NativeSocketChannel.read(NativeSocketChannel.java:68)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59)
- locked <0x00000000e19228f0> (a java.lang.Object)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
- locked <0x00000000e1922900> (a sun.nio.ch.ChannelInputStream)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:220)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:164)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:139)
at io.fabric8.maven.docker.access.hc.ApacheHttpClientDelegate.post(ApacheHttpClientDelegate.java:102)
at io.fabric8.maven.docker.access.hc.ApacheHttpClientDelegate.post(ApacheHttpClientDelegate.java:108)
at io.fabric8.maven.docker.access.hc.DockerAccessWithHcClient.stopContainer(DockerAccessWithHcClient.java:208)
at io.fabric8.maven.docker.service.RunService$2.call(RunService.java:480)
at io.fabric8.maven.docker.service.RunService$2.call(RunService.java:477)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at io.fabric8.maven.docker.wait.WaitUtil.wait(WaitUtil.java:27)
at io.fabric8.maven.docker.service.RunService.shutdownAndWait(RunService.java:477)
at io.fabric8.maven.docker.service.RunService.shutdown(RunService.java:427)
at io.fabric8.maven.docker.service.RunService.stopStartedContainers(RunService.java:166)
at io.fabric8.maven.docker.StopMojo.executeInternal(StopMojo.java:68)
at io.fabric8.maven.docker.AbstractDockerMojo.execute(AbstractDockerMojo.java:223)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)

Locked ownable synchronizers:
- None

"VM Thread" os_prio=0 tid=0x00007f5790077000 nid=0x3559 runnable

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f579001f000 nid=0x3555 runnable

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f5790020800 nid=0x3556 runnable

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f5790022800 nid=0x3557 runnable

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f5790024000 nid=0x3558 runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007f57900c2000 nid=0x3561 waiting on condition

JNI global references: 249

@rhuss
Copy link
Collaborator

rhuss commented May 4, 2017

"main" #1 prio=5 os_prio=0 tid=0x00007f579000a000 nid=0x3554 runnable [0x00007f5798c11000]

java.lang.Thread.State: RUNNABLE

at com.kenai.jffi.Foreign.invokeN3O1(Native Method)

at com.kenai.jffi.Invoker.invokeN3(Invoker.java:1061)

at jnr.enxio.channels.Native$LibC$jnr$ffi$1.read(Unknown Source)

at jnr.enxio.channels.Native.read(Native.java:95)

.....
......
at io.fabric8.maven.docker.access.hc.ApacheHttpClientDelegate.post(ApacheHttpClientDelegate.java:102)

at io.fabric8.maven.docker.access.hc.ApacheHttpClientDelegate.post(ApacheHttpClientDelegate.java:108)

at io.fabric8.maven.docker.access.hc.DockerAccessWithHcClient.stopContainer(DockerAccessWithHcClient.java:208)

The interesting thing is that its hangs while reading the response from the Docker daemon when sending the stop command over the Unix socket

Could you please try out the following (since unfortunately, I can't reproduce it):

  • Could you switch to a TCP connection to the Docker daemon and try again ?
  • Could you try the latest snapshot 0.20-SNAPSHOT of the plugin which comes an update of jnr-unixsocket ?

What really puzzles me a bit is that the thread is in state RUNNABLE. Could you do a second thread dump sometimes later and see whether this thread is still RUNNABLE ? Also could you check your CPU load?

Sorry to bug you with so many questions, but unfortunately, as said, I cant reproduce it.

@rhuss rhuss reopened this May 4, 2017
@benhubert
Copy link

  • It never happened for me when starting docker with -H 0.0.0.0:2375 and setting <docker.host>http://localhost:2375</docker.host> in the pom (and I ran it in a while loop for about one hour)
  • It still happens with the current master branch (0.20-SNAPSHOT, 77b83e87).

It seems that the thread stays RUNNABLE forever. I don't see high CPU load while the process hangs.

@rhuss
Copy link
Collaborator

rhuss commented May 4, 2017

@benhubert thanks a lot for your investigations ! Looks interesting :)

sorry, the jnr-unisocket update is not on master (but on https://github.com/rhuss/docker-maven-plugin/tree/pr/update-jnr). However, the snapshot pushed to Maven central contains the update.

Could you try it with this version ?

@rhuss
Copy link
Collaborator

rhuss commented May 4, 2017

I probably need to check how a proper timeout could be set on the unix socket connection.

@benhubert
Copy link

I ran it with the branch but this ends up with DOCKER> Cannot create docker access object [Cannot extract API version from server unix://127.0.0.1:1 : already connected].

@benhubert
Copy link

Yes, a timeout would be a workaround for this, but then I end up with many docker container running or at least not being removed because sometimes the removal does not work.

@rhuss
Copy link
Collaborator

rhuss commented May 4, 2017

Thanks for the feedback. I will have a deeper look into the jnr-unix upgrade. However, I'm not so positive if this upgrade doesn't fix it, because there is very likely not much I can do except maybe adding the timeout. No idea why the Docker daemon doesn't send the response and keeps the connection open (or the client). Would be nice to debug into the Docker daemon, too, but tbh, no idea where to start on the docker side ;-(

@toran414
Copy link

toran414 commented May 4, 2017

I am having this same issue with a MySQL container, but only when I specify a volume for it to use. I was trying to fix some slowness that was happening on Travis by running MySQL on a ramdisk, and then starting running into this issue.

  <plugin>
    <groupId>io.fabric8</groupId>
    <artifactId>docker-maven-plugin</artifactId>
    <version>0.20.1</version>
    <executions>
      <execution>
        <id>prepare-it-containers</id>
        <phase>pre-integration-test</phase>
        <goals>
          <goal>start</goal>
        </goals>
        <configuration>
          <images>
            <image>
              <name>mysql:5.7</name>
              <alias>db_it</alias>
              <run>
                <namingStrategy>alias</namingStrategy>
                <env>
                  ...
                </env>
                <ports>
                  <port>${mysql.port}:3306</port>
                </ports>
                <wait>
                  <log>ready for connections</log>
                  <time>20000</time>
                  <shutdown>5000</shutdown>
                  <kill>5000</kill>
                </wait>
                <volumes>
                  <bind>
                    <volume>/mnt/ramdisk:/var/lib/mysql</volume>
                  </bind>
                </volumes>
              </run>
            </image>
          </images>
        </configuration>
      </execution>
      <execution>
        <id>remove-it-containers</id>
        <phase>post-integration-test</phase>
        <goals>
          <goal>stop</goal>
        </goals>
        <configuration>
          <removeVolumes>true</removeVolumes>
        </configuration>
      </execution>
    </executions>
  </plugin>

I tried attaching the volume to my standard volume rather than ramdisk, and it still happens. I also check and the containers are gone almost immediately after it starts hanging, it just never returns.

@rhuss
Copy link
Collaborator

rhuss commented May 4, 2017

@toran414 do youse Unix or TCP sockets for connecting to the Docker daemon ?

@toran414
Copy link

toran414 commented May 4, 2017

Unix

@jljouannic
Copy link

jljouannic commented May 31, 2017

Same problem as everyone here, also with the official postgres image.

Docker version: Docker version 17.03.1-ce, build c6d412e
docker-maven-plugin version: 0.21.0
Postgres image: postgres: 9.6.3
Maven version: 3.5.0
Java version 1.7.0_80
Linux kernel release: 4.4.0-78-generic

docker:stop fails anytime the build is started in Jenkins, but passes when run on my computer. 😕

@jljouannic
Copy link

Seems to work when using a TCP socket instead of UNIX socket to connect to docker daemon.

@pepperbob
Copy link

pepperbob commented May 31, 2017

We see the same problem but with different images - with some hanging 90% at the time others work just fine.

Unfortunately we can't just switch to TCP sockets so a workaround / fix would be highly appreciated. Looking at docker events everything shuts down totally normal. Is there anything that can be provided to bring more light into this issue?

@rhuss
Copy link
Collaborator

rhuss commented May 31, 2017

@jljouannic @pepperbob I just pushed an experimental version 0.21-SNAPSHOT to Maven snapshot repository with an update to jnr-unixsocket 0.18 which we use for connecting via Unix Socket.

It would be awesome if you could try this plugin version on your CI to check whether it makes any difference.

@GiovanniPaoloGibilisco
Copy link

Still happening on 0.26.0

@rhuss
Copy link
Collaborator

rhuss commented May 24, 2018

@andreasaronsson I tried jnr-unixsocket-0.19 already but I got tons of issues with closed streams. Not sure that's d-m-p doing something wrong or the lib. Going to investigate this further, as it seems that it contains some fix which could be related to this issue.

@andreasaronsson
Copy link

Also using 0.26.0.
I am building often on two setups. My local development machine and a vm in a local cloud.
The development machine gets this problem quite consistently. It has not yet happened when I use tcp://127.0.01:2375, only unix socket.
On the vm I do not see this at all despite using unix socket on all builds.
Not sure if it is relevant, just thought I'd share the observed behavior.

@rhuss
Copy link
Collaborator

rhuss commented Jun 13, 2018

@andreasaronsson thanks for reporting back. When I have on day in row to spend I will debug this very annoying issue again. This should be fixed eventually !

@jansu76
Copy link

jansu76 commented Jun 27, 2018

This could be related to how the postgres container starts postgres twice?

On my build, container initializes like this

...
2018-06-27T06:32:08.620361712Z LOG:  database system is ready to accept connections
2018-06-27T06:32:08.620596412Z LOG:  autovacuum launcher started
2018-06-27T06:32:09.543918933Z  done
2018-06-27T06:32:09.543946064Z server started
2018-06-27T06:32:09.553168080Z ALTER ROLE
2018-06-27T06:32:09.553324000Z 
2018-06-27T06:32:09.553386493Z 
2018-06-27T06:32:09.553482005Z /usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/init-db.sh
2018-06-27T06:32:09.823030836Z CREATE DATABASE
2018-06-27T06:32:10.095982063Z CREATE DATABASE
2018-06-27T06:32:10.374175763Z CREATE DATABASE
2018-06-27T06:32:10.670122446Z CREATE DATABASE
2018-06-27T06:32:10.938015349Z CREATE DATABASE
2018-06-27T06:32:11.230115987Z CREATE DATABASE
2018-06-27T06:32:11.503438617Z CREATE DATABASE
2018-06-27T06:32:11.803037499Z CREATE DATABASE
2018-06-27T06:32:12.071810882Z CREATE DATABASE
2018-06-27T06:32:12.338221669Z CREATE DATABASE
2018-06-27T06:32:12.601096615Z CREATE DATABASE
2018-06-27T06:32:12.891608679Z CREATE DATABASE
2018-06-27T06:32:13.150962243Z CREATE DATABASE
2018-06-27T06:32:13.453191324Z CREATE DATABASE
2018-06-27T06:32:13.453455288Z 
2018-06-27T06:32:13.454900258Z waiting for server to shut down....LOG:  received fast shutdown request
2018-06-27T06:32:13.454919604Z LOG:  aborting any active transactions
2018-06-27T06:32:13.455206337Z LOG:  autovacuum launcher shutting down
2018-06-27T06:32:13.455362349Z LOG:  shutting down
2018-06-27T06:32:13.492598122Z LOG:  database system is shut down
2018-06-27T06:32:14.455678905Z  done
2018-06-27T06:32:14.455712100Z server stopped
2018-06-27T06:32:14.455783219Z 
2018-06-27T06:32:14.455796724Z PostgreSQL init process complete; ready for start up.
2018-06-27T06:32:14.455802733Z 
2018-06-27T06:32:14.520521960Z LOG:  database system was shut down at 2018-06-27 06:32:13 UTC
2018-06-27T06:32:14.533110150Z LOG:  MultiXact member wraparound protections are now enabled
2018-06-27T06:32:14.533796230Z LOG:  database system is ready to accept connections

If I wait for the second "database system is ready" log message, which comes from the second postgres startup, like @benhubert:

<log>(?s)PostgreSQL init process complete.*database system is ready to accept connections</log>

then stopping hangs.

If I wait for the first "database system is ready" log, which comes from the first postgres startup,

?s)database system is ready to accept

(and add a dirty <exec> <postStart>sleep 10</postStart> </exec>)

then stopping does not hang.

@akoshman
Copy link

It's still reproduced in the latest 0.27.2 version of the plugin. Is there any decent solution or a workaround of this problem?

@gbloisi
Copy link

gbloisi commented Dec 4, 2018

I'm facing this issue too by launching maven within a jenkins docker container that is configured to reuse the host docker. For example:
docker run -p 8080:8080
-v /var/run/docker.sock:/var/run/docker.sock
--name jenkins
jenkins/jenkins:lts

@amitgol
Copy link

amitgol commented Feb 28, 2019

I'm facing this issue with 0.28.0 as well. My workaround is to run socat and connect to docker with TCP. This setup works for me in OSX and Ubuntu14:

            <plugin>
                <!-- Start a socat proxy for docker (if one does not exist) - to workaround issue with connecting to unix socket -->
                <artifactId>maven-antrun-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>initialize</phase>
                        <goals>
                            <goal>run</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <target>
                        <exec executable="bash">
                            <arg value="-c"/>
                            <arg line='"docker top socat || docker run --name socat -d -v /var/run/docker.sock:/var/run/docker.sock -p 127.0.0.1:2375:2375 bobrik/socat TCP-LISTEN:2375,fork UNIX-CONNECT:/var/run/docker.sock"'/>
                        </exec>
                    </target>
                </configuration>
            </plugin>
...
                <plugin>
                    <groupId>io.fabric8</groupId>
                    <artifactId>docker-maven-plugin</artifactId>
                    <version>${fabric8.version}</version>
                    <configuration>
                        <dockerHost>tcp://127.0.0.1:2375</dockerHost>
                    </configuration>
                </plugin>

@mklinger
Copy link

mklinger commented Aug 4, 2019

Same problem here with 0.30.0. Is there any news on this topic?

@mklinger
Copy link

mklinger commented Aug 4, 2019

I tested various versions of the plugin while doing integration tests in an endless loop (as the problem doesn't show up with every test run) and found that my use-case works fine with version 0.24.0. With 0.25.0 and above, stopping the container sometimes blocks forever.

My use-case is:

  • Build container
  • Start container
  • Wait for log
  • Do integration tests
  • Stop container

@ncoif-tw
Copy link

ncoif-tw commented Aug 20, 2019

Having the same bug too, when trying to spawn a RabbitMQ side-car for integration testing. My build is sometimes stuck for the stop step of the plugin: docker-maven-plugin:0.31.0:stop. I also noticed that the docker image isn't running anymore when the maven build is stuck.

docker-maven-plugin version 0.31.0

docker -v
Docker version 19.03.1-ce, build 74b1e89e8a
mvn --version
Apache Maven 3.6.1 (NON-CANONICAL_2019-07-24T20:49:02Z_root; 2019-07-24T21:49:02+01:00)
Maven home: /opt/maven
Java version: 1.8.0_162, vendor: Oracle Corporation, runtime: /usr/lib/jvm/jdk8-oracle/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "linux", version: "5.1.21-1-manjaro", arch: "amd64", family: "unix"
uname -a
Linux nicolasbox 5.1.21-1-MANJARO #1 SMP PREEMPT Sun Jul 28 09:33:43 UTC 2019 x86_64 GNU/Linux

Threaddump of the Java process attached thread-dump.txt (seems stuck with jnr):

"main" #1 prio=5 os_prio=0 tid=0x00007f3ea400a800 nid=0x159f runnable [0x00007f3eabac7000]
   java.lang.Thread.State: RUNNABLE
	at com.kenai.jffi.Foreign.invokeN3O1(Native Method)
	at com.kenai.jffi.Invoker.invokeN3(Invoker.java:1061)
	at jnr.enxio.channels.Native$LibC$jnr$ffi$1.read(Unknown Source)
	at jnr.enxio.channels.Native.read(Native.java:101)
	at jnr.unixsocket.impl.Common.read(Common.java:50)
	at jnr.unixsocket.impl.AbstractNativeSocketChannel.read(AbstractNativeSocketChannel.java:65)
	at jnr.unixsocket.UnixSocketChannel.read(UnixSocketChannel.java:250)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59)

Maven debug informations just before the process is hanging:

[INFO] --- docker-maven-plugin:0.31.0:stop (stop) @ redacted ---
[DEBUG] Configuring mojo io.fabric8:docker-maven-plugin:0.31.0:stop from plugin realm ClassRealm[plugin>io.fabric8:docker-maven-plugin:0.31.0, parent: sun.misc.Launcher$AppClassLoader@232204a1]
[DEBUG] Configuring mojo 'io.fabric8:docker-maven-plugin:0.31.0:stop' with basic configurator -->
[DEBUG]   (f) allContainers = false
[DEBUG]   (f) autoCreateCustomNetworks = true
[DEBUG]   (f) execution = io.fabric8:docker-maven-plugin:0.31.0:stop {execution: stop}
[DEBUG]   (f) alias = rabbitmq
[DEBUG]   (s) name = redacted/rabbitmq:3.7.10-1
[DEBUG]   (f) ports = [5672:5672, 15672:15672]
[DEBUG]   (f) log = .*Server startup complete*
[DEBUG]   (f) time = 20000
[DEBUG]   (f) kill = 20000
[DEBUG]   (f) wait = io.fabric8.maven.docker.config.WaitConfiguration@44fb1361
[DEBUG]   (f) run = io.fabric8.maven.docker.config.RunImageConfiguration@ba1d5fa
[DEBUG]   (f) images = [ImageConfiguration {name='redacted/rabbitmq:3.7.10-1', alias='rabbitmq'}]
[DEBUG]   (f) keepContainer = false
[DEBUG]   (f) keepRunning = false
[DEBUG]   (f) logStdout = false
[DEBUG]   (f) maxConnections = 100
[DEBUG]   (f) project = MavenProject: redacted
[DEBUG]   (f) removeVolumes = false
[DEBUG]   (f) session = org.apache.maven.execution.MavenSession@deb3b60
[DEBUG]   (f) settings = org.apache.maven.execution.SettingsAdapter@5f834ca0
[DEBUG]   (f) skip = false
[DEBUG]   (f) skipExtendedAuth = false
[DEBUG]   (f) skipMachine = false
[DEBUG]   (f) sledgeHammer = false
[DEBUG]   (f) useColor = true
[DEBUG] -- end configuration --
[DEBUG] DOCKER> shutdown will wait max of 20 seconds before removing container
[DEBUG] DOCKER> Closing LogWaitChecker callback

Sometimes (maybe ~1% of the time), the build doesn't have the problem, and here are the maven details in theses cases:

[DEBUG] DOCKER> shutdown will wait max of 20 seconds before removing container
[INFO] DOCKER> [redacted/rabbitmq:3.7.10-1] "rabbitmq": Stop and removed container a2983b5ff810 after 2096 ms
[INFO] 
[DEBUG] DOCKER> Closing LogWaitChecker callback

Also worth noting that the problem doesn't happen with a side-car that is much faster to shutdown (i.e. a redis side-car instead of a rabbitMQ), here are the logs for it:

[DEBUG] DOCKER> shutdown will wait max of 20 seconds before removing container
[DEBUG] DOCKER> Closing LogWaitChecker callback
[INFO] DOCKER> [redacted/redis:5.0.3] "redis": Stop and removed container a93cc1169593 after 290 ms

Any news/known work-around for this problem? Please shout if there is more debug information I can provide.

@chonton
Copy link
Contributor

chonton commented Sep 10, 2019

I was having a similar problem with one of my containers. On OSX, (using tcp to communicate with daemon), no problem. On Linux Jenkins slave, (using unix socket to communicate with daemon), docker:stop caused docker container to stop but never completed the maven mojo.

The image configuration includes <wait><log> to ensure that service is properly started and <log><file> to capture the service docker logs.

Once I added an integration test which caused additional docker logs output to occur between startup and shutdown, the docker:stop works consistently on both local development and Jenkins hosts.

@selckin
Copy link

selckin commented Dec 2, 2019

Disabling connection pooling and passing -Ddocker.maxConnections=1 to maven makes it work for me

@selckin
Copy link

selckin commented Dec 2, 2019

apologies celebrated too early, went from 100% reproducing it to like 20% tho, trying to make standalone reproducer but those all magically seem to work as well .. :/

@selckin
Copy link

selckin commented Dec 2, 2019

Can reproduce on 2 machines with https://github.com/selckin/mvn-docker-reproduce
mvn 3.6.3
adoptopenjdk 1.8.0_232
docker 19.03.5

@selckin
Copy link

selckin commented Dec 2, 2019

jnr socket with the following (existing) patch applied seems to fix it !

jnr/jnr-unixsocket#68

@rhuss
Copy link
Collaborator

rhuss commented Dec 16, 2019

@selckin cool, very good find ! Now lets push jnr-unixsocket over there to get the fix merged and hope for a new release.

@rhuss
Copy link
Collaborator

rhuss commented Jan 9, 2020

@selckin and all others: jnr-unixsocket 0.25 contains the potential fix and I updated on master to it. Also, I pushed a 0.33-SNAPSHOT release to Maven central, so it would be awesome if you could give it a try and check whether it solves the issue at hand. If so, I can release a 0.33.0 (0.32.1) rather quickly with this updated dependency.

@selckin
Copy link

selckin commented Jan 9, 2020

@rhuss tested with my build, everything looks good, thanks!

@bmarcj
Copy link

bmarcj commented Feb 13, 2020

I'm still experiencing this issue with 0.33.0 of the fabric8 docker-maven-plugin.

@selckin
Copy link

selckin commented Feb 18, 2020

@bmarcj do you have public build that can reproduce it ?

@bmarcj
Copy link

bmarcj commented Feb 20, 2020

I'm afraid I've nothing public I can provide. I'm getting this with a MySQL docker image on Ubuntu 18.04, if that's any help.

@nicomz
Copy link

nicomz commented Mar 9, 2020

Hi! @selckin, I'm facing same situation.
Docker version: 19.03.5
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Java version: 1.8.0_242, vendor: AdoptOpenJDKProject's sample POM:

'

<properties>
    <docker.maven.plugin.version>0.33.0</docker.maven.plugin.version>
</properties>


<build>
    <plugins>
        <plugin>
            <groupId>io.fabric8</groupId>
            <artifactId>docker-maven-plugin</artifactId>
            <version>${docker.maven.plugin.version}</version>
            <configuration>
                <images>
                    <image>
                        <name>store/oracle/database-enterprise:12.2.0.1-slim</name>
                        <run>
                            <ports>
                                <port>oracle.db.port:1521</port>
                            </ports>
                            <ulimits>
                                <ulimit>
                                    <name>nofile</name>
                                    <hard>10240</hard>
                                    <soft>10240</soft>
                                </ulimit>
                            </ulimits>
                            <log>
                                <enabled>true</enabled>
                            </log>
                            <wait>
                                <log>Done ! The database is ready for use .</log>
                                <time>500000</time>
                            </wait>
                            <stopMode>kill</stopMode>
                        </run>
                    </image>
                </images>
                <verbose>true</verbose>
            </configuration>
            <dependencies>
                <dependency>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-api</artifactId>
                    <version>1.7.24</version>
                </dependency>
                <dependency>
                    <groupId>org.slf4j</groupId>
                    <artifactId>jcl-over-slf4j</artifactId>
                    <version>1.7.24</version>
                </dependency>
            </dependencies>
            <executions>
                <execution>
                    <id>start</id>
                    <phase>test-compile</phase>
                    <goals>
                        <goal>build</goal>
                        <goal>start</goal>
                    </goals>
                    <configuration>
                        <showLogs>true</showLogs>
                        <logStdout>true</logStdout>
                    </configuration>
                </execution>
                <execution>
                    <id>stop</id>
                    <phase>post-integration-test</phase>
                    <goals>
                        <goal>stop</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

`
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests