New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stopping container threads, probably inadvertently #29

Closed
TomiTakussaari opened this Issue Apr 8, 2015 · 21 comments

Comments

Projects
None yet
5 participants
@TomiTakussaari

We are experiencing some strange things when using this library on our application deployed to Tomcat7.

ClassLoaderLeakPreventor: Stopping Thread 'Thread[catalina-exec-720,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app after 5000 ms

This happens randomly during application deployment, but eventually halts all Tomcat executor threads, which then makes Tomcat quite unresponsive as all executors are in this kind of state:

"catalina-exec-919" #32896 daemon prio=5 os_prio=0 tid=0x00007f780c0f7000 nid=0x46d5 waiting on condition [0x00007f77e4c1b000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000e08c9478> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2083)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)

After a while, all threads in Tomcat threadpool (maxThreadPoolSize) will be in this state, and Tomcat will be stuck until restart

Has anyone else experienced anything like this ?
Any ideas to resolve this ?

I would not want to skip shutting down threads as we do have some JDBC related threads that we do need to close..

@TomiTakussaari TomiTakussaari changed the title from Stopping container threads to Stopping container threads, probably inadvertently Apr 8, 2015

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Apr 11, 2015

Owner

Hi Tomi and thanks for the report. Indeed this would be considered a bug. Rather than doing some hard coded exclusion of org.apache.tomcat.util.threads.TaskThread I tried doing a generic improvement affecting threads only running in the web app, but not loaded by it. Would it be possible for you to build my '#29' branch by yourself and do some testing with it?

Owner

mjiderhamn commented Apr 11, 2015

Hi Tomi and thanks for the report. Indeed this would be considered a bug. Rather than doing some hard coded exclusion of org.apache.tomcat.util.threads.TaskThread I tried doing a generic improvement affecting threads only running in the web app, but not loaded by it. Would it be possible for you to build my '#29' branch by yourself and do some testing with it?

@TomiTakussaari

This comment has been minimized.

Show comment
Hide comment
@TomiTakussaari

TomiTakussaari Apr 26, 2015

Hi Mattias. Your #29 branch does seem to fix this issue.

Threads no longer get killed even if they are processing requests for context while context is being undeployed, and there does not seem to be any side-effects from this either.

Thanks!

Hi Mattias. Your #29 branch does seem to fix this issue.

Threads no longer get killed even if they are processing requests for context while context is being undeployed, and there does not seem to be any side-effects from this either.

Thanks!

@mjiderhamn mjiderhamn closed this in 5e72be3 May 2, 2015

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn May 2, 2015

Owner

The behaviour has been changed in the just released version 1.13.0

Owner

mjiderhamn commented May 2, 2015

The behaviour has been changed in the just released version 1.13.0

@wilkejj

This comment has been minimized.

Show comment
Hide comment
@wilkejj

wilkejj Mar 24, 2016

I am seeing very similar behavior - and even after I use 1.13.0. I am still having all of my tomcat threads be killed. I have multiple webapps (all the same war - I realize this is a bad setup) in 1 tomcat container.

The steps i'm doing are:

  1. lock database table for app1
  2. try and insert something through app1 (this will hang)
  3. shutdown app1 through tomcat manager

I would expect app1 to go down gracefully, and app2, app3, appN to stay up, along with Tomcat Manager.

Sometimes this will work cleanly - but a lot of times blows up. All of the following threads disappear (anything with http-bio-8080-exec):

47 = {TaskThread@30485} "Thread[http-bio-8080-exec-16,5,main]" 48 = {TaskThread@30369} "Thread[http-bio-8080-exec-19,5,main]" 49 = {TaskThread@30486} "Thread[http-bio-8080-exec-20,5,main]" 50 = {TaskThread@30487} "Thread[http-bio-8080-exec-21,5,main]" 51 = {TaskThread@30488} "Thread[http-bio-8080-exec-22,5,main]"

My other threads remain. Previously I was on version 1.7.0 - and kept getting:

ClassLoaderLeakPreventor: Stopping Thread 'Thread[http-bio-8080-exec-11,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app after 5000 ms Mar 24, 2016 5:29:47 PM org.restlet.engine.application.StatusFilter doHandle WARNING: Exception or error caught in status service java.lang.ThreadDeath   at java.lang.Thread.stop(Thread.java:797)   at se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor.stopThreads(ClassLoaderLeakPreventor.java:706)   at se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor.contextDestroyed(ClassLoaderLeakPreventor.java:385)   at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:4819)   at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5466)   at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232)   at org.apache.catalina.manager.ManagerServlet.stop(ManagerServlet.java:1297)   at org.apache.catalina.manager.ManagerServlet.doGet(ManagerServlet.java:374)   at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)

But after reading this post - i upgraded to 1.13.0 - but now get the following in catalina.out - and Tomcat is unresponsive.
`
ClassLoaderLeakPreventor: se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor shutting down context by removing known leaks (CL: 0x18a1d32)
ClassLoaderLeakPreventor: JDBC driver loaded by web app deregistered: class com.mysql.jdbc.Driver
ClassLoaderLeakPreventor: Looping 4 RMI Targets to find leaks
ClassLoaderLeakPreventor: Looping 4 RMI Targets to find leaks
ClassLoaderLeakPreventor: Shutting down java.util.concurrent.ThreadPoolExecutor running within the classloader.
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-1,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 5000
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-1,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader
ClassLoaderLeakPreventor: ThreadLocal of type java.lang.ThreadLocal: java.lang.ThreadLocal@d060ae with value org.restlet.engine.util.ChildContext@7b5b9f of type org.restlet.engine.util.ChildContext that is loaded by web app will be made stale for later expunging from Thread[http-bio-8080-exec-1,5,main]Mar 24, 2016 6:28:39 PM org.restlet.engine.log.LogFilter afterHandle

ClassLoaderLeakPreventor: ThreadLocal of type java.lang.ThreadLocal: java.lang.ThreadLocal@1e099e2 with value HTTP/1.1 - Unauthorized (401) - The request requires user authentication of type org.restlet.engine.adapter.HttpResponse that is loaded by web app will be made stale for later expunging from Thread[http-bio-8080-exec-1,5,main]
ClassLoaderLeakPreventor: ThreadLocal of type java.lang.ThreadLocal: java.lang.ThreadLocal@96f38d with value com.company.company.api.companyResourceApplication@d97e28 of type com.company.company.api.companyResourceApplication that is loaded by web app will be made stale for later expunging from Thread[http-bio-8080-exec-1,5,main]
ClassLoaderLeakPreventor: Since Java 1.6+ is used, we can call public static final void java.util.ResourceBundle.clearCache(java.lang.ClassLoader)
ClassLoaderLeakPreventor: Releasing web app classloader from Apache Commons Logging
Mar 24, 2016 6:28:39 PM org.apache.catalina.loader.WebappClassLoader loadClass
INFO: Illegal access: this web application instance has been stopped already. Could not load org.restlet.engine.util.ImmutableDate. The eventual following stack trace is caused by an error thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access, and has no functional impact.
java.lang.IllegalStateException
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1587)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1546)
at org.restlet.engine.header.DateWriter.write(DateWriter.java:72)
at org.restlet.engine.header.DateWriter.write(DateWriter.java:55)
at org.restlet.engine.header.HeaderUtils.addGeneralHeaders(HeaderUtils.java:261)
at org.restlet.engine.adapter.ServerAdapter.addResponseHeaders(ServerAdapter.java:90)
at org.restlet.engine.adapter.ServerAdapter.commit(ServerAdapter.java:193)
at org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:153)
at org.restlet.ext.servlet.ServerServlet.service(ServerServlet.java:1089)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
`

Any help is appreciated.

wilkejj commented Mar 24, 2016

I am seeing very similar behavior - and even after I use 1.13.0. I am still having all of my tomcat threads be killed. I have multiple webapps (all the same war - I realize this is a bad setup) in 1 tomcat container.

The steps i'm doing are:

  1. lock database table for app1
  2. try and insert something through app1 (this will hang)
  3. shutdown app1 through tomcat manager

I would expect app1 to go down gracefully, and app2, app3, appN to stay up, along with Tomcat Manager.

Sometimes this will work cleanly - but a lot of times blows up. All of the following threads disappear (anything with http-bio-8080-exec):

47 = {TaskThread@30485} "Thread[http-bio-8080-exec-16,5,main]" 48 = {TaskThread@30369} "Thread[http-bio-8080-exec-19,5,main]" 49 = {TaskThread@30486} "Thread[http-bio-8080-exec-20,5,main]" 50 = {TaskThread@30487} "Thread[http-bio-8080-exec-21,5,main]" 51 = {TaskThread@30488} "Thread[http-bio-8080-exec-22,5,main]"

My other threads remain. Previously I was on version 1.7.0 - and kept getting:

ClassLoaderLeakPreventor: Stopping Thread 'Thread[http-bio-8080-exec-11,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app after 5000 ms Mar 24, 2016 5:29:47 PM org.restlet.engine.application.StatusFilter doHandle WARNING: Exception or error caught in status service java.lang.ThreadDeath   at java.lang.Thread.stop(Thread.java:797)   at se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor.stopThreads(ClassLoaderLeakPreventor.java:706)   at se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor.contextDestroyed(ClassLoaderLeakPreventor.java:385)   at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:4819)   at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5466)   at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232)   at org.apache.catalina.manager.ManagerServlet.stop(ManagerServlet.java:1297)   at org.apache.catalina.manager.ManagerServlet.doGet(ManagerServlet.java:374)   at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)

But after reading this post - i upgraded to 1.13.0 - but now get the following in catalina.out - and Tomcat is unresponsive.
`
ClassLoaderLeakPreventor: se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor shutting down context by removing known leaks (CL: 0x18a1d32)
ClassLoaderLeakPreventor: JDBC driver loaded by web app deregistered: class com.mysql.jdbc.Driver
ClassLoaderLeakPreventor: Looping 4 RMI Targets to find leaks
ClassLoaderLeakPreventor: Looping 4 RMI Targets to find leaks
ClassLoaderLeakPreventor: Shutting down java.util.concurrent.ThreadPoolExecutor running within the classloader.
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-1,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 5000
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-1,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader
ClassLoaderLeakPreventor: ThreadLocal of type java.lang.ThreadLocal: java.lang.ThreadLocal@d060ae with value org.restlet.engine.util.ChildContext@7b5b9f of type org.restlet.engine.util.ChildContext that is loaded by web app will be made stale for later expunging from Thread[http-bio-8080-exec-1,5,main]Mar 24, 2016 6:28:39 PM org.restlet.engine.log.LogFilter afterHandle

ClassLoaderLeakPreventor: ThreadLocal of type java.lang.ThreadLocal: java.lang.ThreadLocal@1e099e2 with value HTTP/1.1 - Unauthorized (401) - The request requires user authentication of type org.restlet.engine.adapter.HttpResponse that is loaded by web app will be made stale for later expunging from Thread[http-bio-8080-exec-1,5,main]
ClassLoaderLeakPreventor: ThreadLocal of type java.lang.ThreadLocal: java.lang.ThreadLocal@96f38d with value com.company.company.api.companyResourceApplication@d97e28 of type com.company.company.api.companyResourceApplication that is loaded by web app will be made stale for later expunging from Thread[http-bio-8080-exec-1,5,main]
ClassLoaderLeakPreventor: Since Java 1.6+ is used, we can call public static final void java.util.ResourceBundle.clearCache(java.lang.ClassLoader)
ClassLoaderLeakPreventor: Releasing web app classloader from Apache Commons Logging
Mar 24, 2016 6:28:39 PM org.apache.catalina.loader.WebappClassLoader loadClass
INFO: Illegal access: this web application instance has been stopped already. Could not load org.restlet.engine.util.ImmutableDate. The eventual following stack trace is caused by an error thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access, and has no functional impact.
java.lang.IllegalStateException
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1587)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1546)
at org.restlet.engine.header.DateWriter.write(DateWriter.java:72)
at org.restlet.engine.header.DateWriter.write(DateWriter.java:55)
at org.restlet.engine.header.HeaderUtils.addGeneralHeaders(HeaderUtils.java:261)
at org.restlet.engine.adapter.ServerAdapter.addResponseHeaders(ServerAdapter.java:90)
at org.restlet.engine.adapter.ServerAdapter.commit(ServerAdapter.java:193)
at org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:153)
at org.restlet.ext.servlet.ServerServlet.service(ServerServlet.java:1089)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
`

Any help is appreciated.

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Mar 26, 2016

Owner

Hi @wilkejj. I find it a bit hard to determine what parts of your comment are relevant to this library, and what the actual issue you are seeing is.

What version of Tomcat are you using? Are you aware that since 7.0.6 Tomcat renews it's thread pool every time a webapp is redeployed? Could this be what you are seeing...? Or what actual problems are you facing due to threads disappearing? Do the non-redeployed apps stop working too?

Owner

mjiderhamn commented Mar 26, 2016

Hi @wilkejj. I find it a bit hard to determine what parts of your comment are relevant to this library, and what the actual issue you are seeing is.

What version of Tomcat are you using? Are you aware that since 7.0.6 Tomcat renews it's thread pool every time a webapp is redeployed? Could this be what you are seeing...? Or what actual problems are you facing due to threads disappearing? Do the non-redeployed apps stop working too?

@wilkejj

This comment has been minimized.

Show comment
Hide comment
@wilkejj

wilkejj Mar 28, 2016

The following line seems to be killing all threads in the Tomcat pool

se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor.stopThreads(ClassLoaderLeakPreventor.java:706)

And Tomcat manager and all webapps become unresponsive.

Server version: Apache Tomcat/7.0.52 (Ubuntu) Server built: Jun 19 2015 08:54:46 Server number: 7.0.52.0 OS Name: Linux OS Version: 3.13.0-76-generic Architecture: amd64 JVM Version: 1.7.0_91-b02 JVM Vendor: Oracle Corporation

When shutting down 1 webapp that is mid-transaction - it brings down all the other webapps - as well as tomcat manager.

wilkejj commented Mar 28, 2016

The following line seems to be killing all threads in the Tomcat pool

se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor.stopThreads(ClassLoaderLeakPreventor.java:706)

And Tomcat manager and all webapps become unresponsive.

Server version: Apache Tomcat/7.0.52 (Ubuntu) Server built: Jun 19 2015 08:54:46 Server number: 7.0.52.0 OS Name: Linux OS Version: 3.13.0-76-generic Architecture: amd64 JVM Version: 1.7.0_91-b02 JVM Vendor: Oracle Corporation

When shutting down 1 webapp that is mid-transaction - it brings down all the other webapps - as well as tomcat manager.

@hdeadman

This comment has been minimized.

Show comment
Hide comment
@hdeadman

hdeadman Mar 28, 2016

I have multiple webapps (all the same war - I realize this is a bad setup) in 1 tomcat container.

What do you mean by multiple webapps "all the same war"? Do you have multiple copies of the same war deployed to different contexts on a tomcat server? Or do you have one war containing the functionality for multiple applications?

I have multiple webapps (all the same war - I realize this is a bad setup) in 1 tomcat container.

What do you mean by multiple webapps "all the same war"? Do you have multiple copies of the same war deployed to different contexts on a tomcat server? Or do you have one war containing the functionality for multiple applications?

@wilkejj

This comment has been minimized.

Show comment
Hide comment
@wilkejj

wilkejj Mar 28, 2016

Multiple copies of the same war deployed to different contexts (all pointing to their own database)

wilkejj commented Mar 28, 2016

Multiple copies of the same war deployed to different contexts (all pointing to their own database)

@hdeadman

This comment has been minimized.

Show comment
Hide comment
@hdeadman

hdeadman Mar 28, 2016

Are there any jars from the applications (other than jdbc drivers) deployed
to the Tomcat lib folder or does each application have all of its own jars?

On Mon, Mar 28, 2016 at 11:31 AM, wilkejj notifications@github.com wrote:

Multiple copies of the same war deployed to different contexts (all
pointing to their own database)


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#29 (comment)

Are there any jars from the applications (other than jdbc drivers) deployed
to the Tomcat lib folder or does each application have all of its own jars?

On Mon, Mar 28, 2016 at 11:31 AM, wilkejj notifications@github.com wrote:

Multiple copies of the same war deployed to different contexts (all
pointing to their own database)


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#29 (comment)

@wilkejj

This comment has been minimized.

Show comment
Hide comment
@wilkejj

wilkejj Mar 28, 2016

In production there are some jars that are shared. But I was able to replicate this in a dev environment where each app has all it's own jars.

Adding the following to web.xml seemed to solve our issue - but now we are concerned that threads that should be stopped - won't be.

<context-param> <param-name>ClassLoaderLeakPreventor.stopThreads</param-name> <param-value>false</param-value> </context-param>

wilkejj commented Mar 28, 2016

In production there are some jars that are shared. But I was able to replicate this in a dev environment where each app has all it's own jars.

Adding the following to web.xml seemed to solve our issue - but now we are concerned that threads that should be stopped - won't be.

<context-param> <param-name>ClassLoaderLeakPreventor.stopThreads</param-name> <param-value>false</param-value> </context-param>

@szakrewsky

This comment has been minimized.

Show comment
Hide comment
@szakrewsky

szakrewsky Nov 28, 2016

I am having this same issue with code version 1.15.2 and tomcat version 7.0.50

Basically, sometimes if the system is in the middle of processing stuff and we hot deploy we get something like:

ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 30000
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader

and then we see errors like

Nov 28, 2016 3:30:36 PM org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor run
WARNING: Socket reprocessing request was rejected for:org.apache.tomcat.util.net.SocketWrapper@43315a3e:Socket[addr=/10.224.2.123,port=48310,localport=8080]
java.util.concurrent.RejectedExecutionException: Executor not running, can't force a command into the queue

At this point the system is completely hosed and only restarting tomcat will fix it. As far as I can tell this only happens if the system is in the middle of a request when this occurs. If the system is idle, then everything works as expected.

I also noticed that the leak prevention doesn't seem to be waiting for the threads to finish even though it says it does. That is evident by the logs surrounding the leak prevention output (I have it set to wait 30 secs).

Nov 28, 2016 3:30:35 PM oracle.jdbc.driver.OracleDriver unRegisterMBeans
...
ClassLoaderLeakPreventor: se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor shutting down context by removing known leaks (CL: 0x248b326e)
...
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 30000
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader
...
Nov 28, 2016 3:30:36 PM org.apache.catalina.startup.HostConfig undeploy

I am having this same issue with code version 1.15.2 and tomcat version 7.0.50

Basically, sometimes if the system is in the middle of processing stuff and we hot deploy we get something like:

ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 30000
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader

and then we see errors like

Nov 28, 2016 3:30:36 PM org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor run
WARNING: Socket reprocessing request was rejected for:org.apache.tomcat.util.net.SocketWrapper@43315a3e:Socket[addr=/10.224.2.123,port=48310,localport=8080]
java.util.concurrent.RejectedExecutionException: Executor not running, can't force a command into the queue

At this point the system is completely hosed and only restarting tomcat will fix it. As far as I can tell this only happens if the system is in the middle of a request when this occurs. If the system is idle, then everything works as expected.

I also noticed that the leak prevention doesn't seem to be waiting for the threads to finish even though it says it does. That is evident by the logs surrounding the leak prevention output (I have it set to wait 30 secs).

Nov 28, 2016 3:30:35 PM oracle.jdbc.driver.OracleDriver unRegisterMBeans
...
ClassLoaderLeakPreventor: se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventor shutting down context by removing known leaks (CL: 0x248b326e)
...
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 30000
ClassLoaderLeakPreventor: Thread 'Thread[http-bio-8080-exec-4746,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader
...
Nov 28, 2016 3:30:36 PM org.apache.catalina.startup.HostConfig undeploy

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Nov 29, 2016

Owner

@szakrewsky , could you try upgrading to 2.1.0 or at least 1.15.4?

Owner

mjiderhamn commented Nov 29, 2016

@szakrewsky , could you try upgrading to 2.1.0 or at least 1.15.4?

@szakrewsky

This comment has been minimized.

Show comment
Hide comment
@szakrewsky

szakrewsky Nov 29, 2016

HI @mjiderhamn,

I have the same problem using version 2.1.0

ClassLoaderLeakPreventorListener: se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventorListener shutting down context by removing known leaks (CL: 0x6940c399)
...
Nov 29, 2016 12:21:33 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Shutting down java.util.concurrent.ThreadPoolExecutor running within the classloader.
Nov 29, 2016 12:21:33 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Thread 'Thread[http-bio-8080-exec-66,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 5000
Nov 29, 2016 12:21:33 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Thread 'Thread[http-bio-8080-exec-66,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader

And then once the app reloads and starts serving requests again

Nov 29, 2016 12:21:44 PM org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor run
WARNING: Socket reprocessing request was rejected for:org.apache.tomcat.util.net.SocketWrapper@12acdf95:Socket[addr=/10.0.2.2,port=63302,localport=8080]
java.util.concurrent.RejectedExecutionException: Executor not running, can't force a command into the queue
at org.apache.tomcat.util.threads.TaskQueue.force(TaskQueue.java:64)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:181)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:156)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:343)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

HI @mjiderhamn,

I have the same problem using version 2.1.0

ClassLoaderLeakPreventorListener: se.jiderhamn.classloader.leak.prevention.ClassLoaderLeakPreventorListener shutting down context by removing known leaks (CL: 0x6940c399)
...
Nov 29, 2016 12:21:33 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Shutting down java.util.concurrent.ThreadPoolExecutor running within the classloader.
Nov 29, 2016 12:21:33 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Thread 'Thread[http-bio-8080-exec-66,5,main]' of type org.apache.tomcat.util.threads.TaskThread running in web app; waiting 5000
Nov 29, 2016 12:21:33 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Thread 'Thread[http-bio-8080-exec-66,5,main]' of type org.apache.tomcat.util.threads.TaskThread still running in web app; changing context classloader to system classloader

And then once the app reloads and starts serving requests again

Nov 29, 2016 12:21:44 PM org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor run
WARNING: Socket reprocessing request was rejected for:org.apache.tomcat.util.net.SocketWrapper@12acdf95:Socket[addr=/10.0.2.2,port=63302,localport=8080]
java.util.concurrent.RejectedExecutionException: Executor not running, can't force a command into the queue
at org.apache.tomcat.util.threads.TaskQueue.force(TaskQueue.java:64)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:181)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:156)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:343)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Dec 27, 2016

Owner

Hi again @szakrewsky . Sorry for the long delay.

I've finally found time for looking into this, and I believe the problem is that Tomcat does not handle TaskThreads being interrupt()ed. I think the most correct might be not to interrupt() threads which we do not intend to stop (but only change context ClassLoader), since there is no guarantee that third party code will handle interruptions "correctly". I have also improved the logging, so that we can see where a thread is currently executing, before being stopped or having it's context ClassLoader replaced.

What you noticed in the logs as "the leak prevention doesn't seem to be waiting for the threads to finish" further indicates that the interruption is the actual problem here, since we only wait the full time if needed. If interruption causes the thread to die, there is no need to continue waiting.

My changes are available on the master branch. Would it be possible for you to compile from sources, test and report back? It could be interresting to see the stack traces that will now be logged also, as they may provide clues to an even better solution. Thanks in advance.

Owner

mjiderhamn commented Dec 27, 2016

Hi again @szakrewsky . Sorry for the long delay.

I've finally found time for looking into this, and I believe the problem is that Tomcat does not handle TaskThreads being interrupt()ed. I think the most correct might be not to interrupt() threads which we do not intend to stop (but only change context ClassLoader), since there is no guarantee that third party code will handle interruptions "correctly". I have also improved the logging, so that we can see where a thread is currently executing, before being stopped or having it's context ClassLoader replaced.

What you noticed in the logs as "the leak prevention doesn't seem to be waiting for the threads to finish" further indicates that the interruption is the actual problem here, since we only wait the full time if needed. If interruption causes the thread to die, there is no need to continue waiting.

My changes are available on the master branch. Would it be possible for you to compile from sources, test and report back? It could be interresting to see the stack traces that will now be logged also, as they may provide clues to an even better solution. Thanks in advance.

@szakrewsky

This comment has been minimized.

Show comment
Hide comment
@szakrewsky

szakrewsky Jan 3, 2017

@mjiderhamn Thanks for looking into this. I will give it a try and report back soon. I had another idea on revisiting this. Originally, I had thought that the issue was that the threads were not being stopped, and stuff was being ripped out underneath, causing everything to get stuck in a weird state. But on second look, the errors seem to be that the thread pool executor has been stopped and tomcat is trying to use it to schedule other requests. Maybe there is something to check to decide whether or not to stop the thread pool executor; like if it belongs to the web application classloader being stopped? What do you think?

@mjiderhamn Thanks for looking into this. I will give it a try and report back soon. I had another idea on revisiting this. Originally, I had thought that the issue was that the threads were not being stopped, and stuff was being ripped out underneath, causing everything to get stuck in a weird state. But on second look, the errors seem to be that the thread pool executor has been stopped and tomcat is trying to use it to schedule other requests. Maybe there is something to check to decide whether or not to stop the thread pool executor; like if it belongs to the web application classloader being stopped? What do you think?

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Jan 3, 2017

Owner

Such pooled threads already are not stopped, but seemingly the interruption caused them to die undeliberately.

Owner

mjiderhamn commented Jan 3, 2017

Such pooled threads already are not stopped, but seemingly the interruption caused them to die undeliberately.

@szakrewsky

This comment has been minimized.

Show comment
Hide comment
@szakrewsky

szakrewsky Jan 3, 2017

@mjiderhamn I was not able to recreate the problem by overloading the system alone, but I also wasn't seeing any of the logs so I was just not able to catch the system mid-thread execution. Instead I created a very simple servlet which sleeps for 1min simulating a very long running task so I would be sure to trigger the issue.

catalina.txt

I am not sure I understand your comment above. You can see in the stack trace

Jan 03, 2017 4:49:16 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Shutting down java.util.concurrent.ThreadPoolExecutor running within the classloader.

followed by the interuption of the sleep.

java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at TestServlet.doGet(TestServlet.java:17)

The Java Docs for ThreadPoolExecutor.shutdownNow say that the threads will be interrupted. Next it prints

Jan 03, 2017 4:49:16 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Thread 'Thread[http-bio-8080-exec-4,5,]' of type org.apache.tomcat.util.threads.TaskThread running in protected ClassLoader; waiting 5000 ms. Thread stack trace:

I suspect that the thread actually finished running here since there is no stacktrace and it successfully returned the 500 error to the web browser like it was supposed to.

Finally, after it reloads you see the issue I was referring to about the ThreadPoolExecutor. No other web applications work after this point.

Jan 03, 2017 4:49:16 PM org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor run
WARNING: Socket reprocessing request was rejected for:org.apache.tomcat.util.net.SocketWrapper@6d31b35a:Socket[addr=/10.0.2.2,port=52684,localport=8080]
java.util.concurrent.RejectedExecutionException: Executor not running, can't force a command into the queue
at org.apache.tomcat.util.threads.TaskQueue.force(TaskQueue.java:64)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:181)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:156)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:343)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

@mjiderhamn I was not able to recreate the problem by overloading the system alone, but I also wasn't seeing any of the logs so I was just not able to catch the system mid-thread execution. Instead I created a very simple servlet which sleeps for 1min simulating a very long running task so I would be sure to trigger the issue.

catalina.txt

I am not sure I understand your comment above. You can see in the stack trace

Jan 03, 2017 4:49:16 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Shutting down java.util.concurrent.ThreadPoolExecutor running within the classloader.

followed by the interuption of the sleep.

java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at TestServlet.doGet(TestServlet.java:17)

The Java Docs for ThreadPoolExecutor.shutdownNow say that the threads will be interrupted. Next it prints

Jan 03, 2017 4:49:16 PM se.jiderhamn.classloader.leak.prevention.JULLogger warn
WARNING: Thread 'Thread[http-bio-8080-exec-4,5,]' of type org.apache.tomcat.util.threads.TaskThread running in protected ClassLoader; waiting 5000 ms. Thread stack trace:

I suspect that the thread actually finished running here since there is no stacktrace and it successfully returned the 500 error to the web browser like it was supposed to.

Finally, after it reloads you see the issue I was referring to about the ThreadPoolExecutor. No other web applications work after this point.

Jan 03, 2017 4:49:16 PM org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor run
WARNING: Socket reprocessing request was rejected for:org.apache.tomcat.util.net.SocketWrapper@6d31b35a:Socket[addr=/10.0.2.2,port=52684,localport=8080]
java.util.concurrent.RejectedExecutionException: Executor not running, can't force a command into the queue
at org.apache.tomcat.util.threads.TaskQueue.force(TaskQueue.java:64)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:181)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:156)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:343)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

@szakrewsky

This comment has been minimized.

Show comment
Hide comment
@szakrewsky

szakrewsky Jan 4, 2017

@mjiderhamn I did some more digging and added some more logging and here is what I have found.

  1. The latest master branch (not interrupting) is still not waiting for the thread to complete and has the lockup issue.
  2. Not stopping the ThreadPoolExecutor is the solution.
    1. The leak-prevention code is now waiting for the thread to complete. Why? It turns out that the leak-prevention code is running in the same ThreadPoolExecutor as the thread it is trying to stop. With shutdownNow, the currently running threads are interrupted. So if the leak-prevention code is waiting in the join when it is interrupted it would give the appearance of not waiting.
    2. But deciding whether to stop the ThreadPoolExecutor is complicated.
      1. It could be stopped if either the Thread, ThreadGroup, or Runnable are loaded by the classloader. This may not be 100% correct but seems likely that the executor is under the control of something in the classloader. That is the executor will not be stopped if the thread is only running in the classloader because it is likely that the system controls the executor.
      2. Stopping of the executor could be skipped if the current thread and the thread being stopped are running in the same executor. Not sure what the cons are here. It probably makes sense to check this regardless, to make sure the current thread's executor is not shutdown.

I have some of this code already. If you like anyone of these solutions let me know and I can submit a pull request.

@mjiderhamn I did some more digging and added some more logging and here is what I have found.

  1. The latest master branch (not interrupting) is still not waiting for the thread to complete and has the lockup issue.
  2. Not stopping the ThreadPoolExecutor is the solution.
    1. The leak-prevention code is now waiting for the thread to complete. Why? It turns out that the leak-prevention code is running in the same ThreadPoolExecutor as the thread it is trying to stop. With shutdownNow, the currently running threads are interrupted. So if the leak-prevention code is waiting in the join when it is interrupted it would give the appearance of not waiting.
    2. But deciding whether to stop the ThreadPoolExecutor is complicated.
      1. It could be stopped if either the Thread, ThreadGroup, or Runnable are loaded by the classloader. This may not be 100% correct but seems likely that the executor is under the control of something in the classloader. That is the executor will not be stopped if the thread is only running in the classloader because it is likely that the system controls the executor.
      2. Stopping of the executor could be skipped if the current thread and the thread being stopped are running in the same executor. Not sure what the cons are here. It probably makes sense to check this regardless, to make sure the current thread's executor is not shutdown.

I have some of this code already. If you like anyone of these solutions let me know and I can submit a pull request.

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Jan 4, 2017

Owner

@szakrewsky Thanks for the log. Seems I misunderstood the issue previously (which lead to some other improvements though :-) ).

I've been digging into why I'm shutting down ThreadPoolExecutors in the first place, and the only reason I could find is that I was inspired by Tomcats built in leak prevention mechanism doing it. (Though I'm not sure how it avoids shutting down its own ThreadPoolExecutor.) I cannot recall having seen a real life scenario where a ThreadPoolExecutor has been the cause of a leak.

I therefore decided to change strategy and become much more restrictive about when a ThreadPoolExecutor is shut down. Instead of your suggestions in 2 ii a, I went with checking if the ThreadPoolExecutor itself or its ThreadFactory is loaded by the ClassLoader that needs protecting. If neither of those conditions are fulfilled, I can at this time only imagine one scenario where this could cause a leak, and that is if the Worker has a firstTask that implies a reference to the ClassLoader, and that task does not finish. Possibly I'll look into that - or maybe I'll wait until someone reports an actual case of this happening.

This should be available on master, together with even more improved logging. I'd appreciate if you could test and get back with the results.

I'm not sure I understand your point in 2 i and 2 ii b would supposedly be irrelevant with the changes I made (or you'd be doing something really strange).

Owner

mjiderhamn commented Jan 4, 2017

@szakrewsky Thanks for the log. Seems I misunderstood the issue previously (which lead to some other improvements though :-) ).

I've been digging into why I'm shutting down ThreadPoolExecutors in the first place, and the only reason I could find is that I was inspired by Tomcats built in leak prevention mechanism doing it. (Though I'm not sure how it avoids shutting down its own ThreadPoolExecutor.) I cannot recall having seen a real life scenario where a ThreadPoolExecutor has been the cause of a leak.

I therefore decided to change strategy and become much more restrictive about when a ThreadPoolExecutor is shut down. Instead of your suggestions in 2 ii a, I went with checking if the ThreadPoolExecutor itself or its ThreadFactory is loaded by the ClassLoader that needs protecting. If neither of those conditions are fulfilled, I can at this time only imagine one scenario where this could cause a leak, and that is if the Worker has a firstTask that implies a reference to the ClassLoader, and that task does not finish. Possibly I'll look into that - or maybe I'll wait until someone reports an actual case of this happening.

This should be available on master, together with even more improved logging. I'd appreciate if you could test and get back with the results.

I'm not sure I understand your point in 2 i and 2 ii b would supposedly be irrelevant with the changes I made (or you'd be doing something really strange).

@szakrewsky

This comment has been minimized.

Show comment
Hide comment
@szakrewsky

szakrewsky Jan 6, 2017

@mjiderhamn I think that this thread pool executor stuff is definitely something to keep an eye on. Logging the different cases is definitely a good start. This solves my problem; the executor service is not shutdown, the wait timeout is waiting for the thread to complete, the context classloader is reassigned if it doesn't, and tomcat doesn't break.

Do you think you will be cutting a release version soon? I can build a version for my repo myself, but would like to just grab from maven central if this will be included anytime soon.

@mjiderhamn I think that this thread pool executor stuff is definitely something to keep an eye on. Logging the different cases is definitely a good start. This solves my problem; the executor service is not shutdown, the wait timeout is waiting for the thread to complete, the context classloader is reassigned if it doesn't, and tomcat doesn't break.

Do you think you will be cutting a release version soon? I can build a version for my repo myself, but would like to just grab from maven central if this will be included anytime soon.

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Jan 6, 2017

Owner

I just released 2.2.0 - should propagate to Maven Central shortly.

Owner

mjiderhamn commented Jan 6, 2017

I just released 2.2.0 - should propagate to Maven Central shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment