New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Oracle thread started but not stopped, causes classloader leak #23

Closed
bradcupit opened this Issue Oct 2, 2014 · 16 comments

Comments

Projects
None yet
4 participants
@bradcupit
Contributor

bradcupit commented Oct 2, 2014

After using your excellent library we (unfortunately) had a new Oracle thread start running. This thread did not get shutdown on reload, thus causing a classloader leak.

We have the Oracle JDBC driver in our <TOMCAT_HOME>/lib (though we're currently not using the Oracle JDBC driver). I'm not sure if this is relevant.

I must confess I don't understand why the oracle thread's context classloader should be the System classloader. Does that prevent stopThreads() from stopping it?

Work around
I extended ClassLoaderLeakPreventor and overrode contextInitialized(), skipping the code that spins up the Oracle thread.

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Oct 2, 2014

Owner

The forced start of the Oracle thread was added to avoid that thread stopping classloader from being garbage collected.

Can you provide some information, in what way it caused a classloader leak? Do you have a heapdump or GC root trace?

Owner

mjiderhamn commented Oct 2, 2014

The forced start of the Oracle thread was added to avoid that thread stopping classloader from being garbage collected.

Can you provide some information, in what way it caused a classloader leak? Do you have a heapdump or GC root trace?

@hdeadman

This comment has been minimized.

Show comment
Hide comment
@hdeadman

hdeadman Oct 3, 2014

I believe I was responsible for trying to prevent that Oracle thread from getting associated with a particular web app by causing the thread to be started while the system classloader was the context classloader. It's been awhile since I looked it but I seem to recall the fix not appearing to work (still had leaks that pointed at that thread). I vaguely remember trying to use the test framework to demonstrate the bug but I was unsuccessful. The attempted fix should probably be removed if it seems to be causing someone a problem. I would think the thread would get kicked off eventually by the app anyway and likely tied to that app's classloader (or whichever app used Oracle first).

hdeadman commented Oct 3, 2014

I believe I was responsible for trying to prevent that Oracle thread from getting associated with a particular web app by causing the thread to be started while the system classloader was the context classloader. It's been awhile since I looked it but I seem to recall the fix not appearing to work (still had leaks that pointed at that thread). I vaguely remember trying to use the test framework to demonstrate the bug but I was unsuccessful. The attempted fix should probably be removed if it seems to be causing someone a problem. I would think the thread would get kicked off eventually by the app anyway and likely tied to that app's classloader (or whichever app used Oracle first).

@bradcupit

This comment has been minimized.

Show comment
Hide comment
@bradcupit

bradcupit Oct 6, 2014

Contributor

I'm on Tomcat 7.0.53 with the Oracle JDBC driver version 11.2.0.3.0 (ojdbc6.jar) loaded in TOMCAT_HOME/lib

Before ClassLoaderLeakPreventor the Oracle thread did not start (since we're not using the driver). After ClassLoaderLeakPreventor it does start.

GC roots:
image

On a related note
I like how the contextDestroyed() method calls other (protected) methods to do the work. If I want to tweak how a method works or even skip it completely I can extend ClassLoaderLeakPreventor and override that method.

If contextInitialized() did the same then I could skip any step that caused me a problem, like this Oracle one.

Contributor

bradcupit commented Oct 6, 2014

I'm on Tomcat 7.0.53 with the Oracle JDBC driver version 11.2.0.3.0 (ojdbc6.jar) loaded in TOMCAT_HOME/lib

Before ClassLoaderLeakPreventor the Oracle thread did not start (since we're not using the driver). After ClassLoaderLeakPreventor it does start.

GC roots:
image

On a related note
I like how the contextDestroyed() method calls other (protected) methods to do the work. If I want to tweak how a method works or even skip it completely I can extend ClassLoaderLeakPreventor and override that method.

If contextInitialized() did the same then I could skip any step that caused me a problem, like this Oracle one.

@bradcupit

This comment has been minimized.

Show comment
Hide comment
@bradcupit

bradcupit Oct 7, 2014

Contributor

This is related to #8

Especially note the 2nd-to-last-paragraph here, which says the thread keeps running.

The GC root screenshot (see comment above) shows the still-running thread referencing the webapp classloader, causing the leak.

Contributor

bradcupit commented Oct 7, 2014

This is related to #8

Especially note the 2nd-to-last-paragraph here, which says the thread keeps running.

The GC root screenshot (see comment above) shows the still-running thread referencing the webapp classloader, causing the leak.

@bradcupit

This comment has been minimized.

Show comment
Hide comment
@bradcupit

bradcupit Oct 8, 2014

Contributor

Pull request #24 adds a good workaround: we can extend ClassLoaderLeakPreventor and override initOracleJdbcThread(). This override can be an empty method so it does nothing, effectively ignoring that step.

Leaving this ticket open in case we want to completely remove initOracleJdbcThread().

Contributor

bradcupit commented Oct 8, 2014

Pull request #24 adds a good workaround: we can extend ClassLoaderLeakPreventor and override initOracleJdbcThread(). This override can be an empty method so it does nothing, effectively ignoring that step.

Leaving this ticket open in case we want to completely remove initOracleJdbcThread().

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Jan 6, 2015

Owner

Having some time to contemplate on this now. What do you think about keeping initOracleJdbcThread(), but after loading the class (and thus spawning the thread) finding the oracle.jdbc.driver.OracleTimeoutPollingThread and setting its inheritedAccessControlContext to a dummy AccessControlContext?

      for(Thread thread : getAllThreads()) {
        if("oracle.jdbc.driver.OracleTimeoutPollingThread".equals(thread.getClass().getName())) {
          Field inheritedAccessControlContext = findField(Thread.class, "inheritedAccessControlContext");
          if(inheritedAccessControlContext != null) {
            inheritedAccessControlContext.set(thread, new AccessControlContext(new ProtectionDomain[0]));
          }
        }
      }
Owner

mjiderhamn commented Jan 6, 2015

Having some time to contemplate on this now. What do you think about keeping initOracleJdbcThread(), but after loading the class (and thus spawning the thread) finding the oracle.jdbc.driver.OracleTimeoutPollingThread and setting its inheritedAccessControlContext to a dummy AccessControlContext?

      for(Thread thread : getAllThreads()) {
        if("oracle.jdbc.driver.OracleTimeoutPollingThread".equals(thread.getClass().getName())) {
          Field inheritedAccessControlContext = findField(Thread.class, "inheritedAccessControlContext");
          if(inheritedAccessControlContext != null) {
            inheritedAccessControlContext.set(thread, new AccessControlContext(new ProtectionDomain[0]));
          }
        }
      }
@hdeadman

This comment has been minimized.

Show comment
Hide comment
@hdeadman

hdeadman Jan 6, 2015

That might help. I do worry about the case where an application that doesn't use Oracle but finds the Oracle jar in its classpath (which would be the case for an app running in Weblogic that didn't use Oracle) since a thread is being spawned that otherwise wouldn't be spawned. It might be worth adding a configuration parameter that allowed someone to turn off the Oracle check if they choose.

ClassLoaderLeakPreventor.disableOracleChecks (not sure what the default should be, false may be OK b/c for many users (of Tomcat, Jetty, etc) the existence of the Oracle class in the classpath is probably a reasonable indicator they are using Oracle).

hdeadman commented Jan 6, 2015

That might help. I do worry about the case where an application that doesn't use Oracle but finds the Oracle jar in its classpath (which would be the case for an app running in Weblogic that didn't use Oracle) since a thread is being spawned that otherwise wouldn't be spawned. It might be worth adding a configuration parameter that allowed someone to turn off the Oracle check if they choose.

ClassLoaderLeakPreventor.disableOracleChecks (not sure what the default should be, false may be OK b/c for many users (of Tomcat, Jetty, etc) the existence of the Oracle class in the classpath is probably a reasonable indicator they are using Oracle).

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Jan 6, 2015

Owner

I spent some more time with this and I believe I found a more proper workaround, hopefully also covering more similar cases. At the same time I introduced a setting for the spawning of the Oracle timeout thread in question, ClassLoaderLeakPreventor.startOracleTimeoutThread which defaults to true.

I don't have the time to test right now, so if anyone would care to help me out - especially with the Oracle cases - that would be appreciated. You'd need to pull the AccessControlContext branch and compile from there.
Just reviewing and commenting on the changes would also be appreciated.

Owner

mjiderhamn commented Jan 6, 2015

I spent some more time with this and I believe I found a more proper workaround, hopefully also covering more similar cases. At the same time I introduced a setting for the spawning of the Oracle timeout thread in question, ClassLoaderLeakPreventor.startOracleTimeoutThread which defaults to true.

I don't have the time to test right now, so if anyone would care to help me out - especially with the Oracle cases - that would be appreciated. You'd need to pull the AccessControlContext branch and compile from there.
Just reviewing and commenting on the changes would also be appreciated.

@hdeadman

This comment has been minimized.

Show comment
Hide comment
@hdeadman

hdeadman Jan 7, 2015

I added the listener from the AccessControlContext branch to one of the applications I work on and deployed it to Weblogic 12 and Jetty 9.0.5 (from Maven). I deployed to Weblogic several times to see if it leaked but it wasn't a good test b/c I hadn't checked this application to see if it leaked recently and there are other applications deployed to the server. I didn't see any errors logged by the listener on deploy or undeploy except for:
Internal registry of java.beans.PropertyEditorManager not found

I like the changes.

hdeadman commented Jan 7, 2015

I added the listener from the AccessControlContext branch to one of the applications I work on and deployed it to Weblogic 12 and Jetty 9.0.5 (from Maven). I deployed to Weblogic several times to see if it leaked but it wasn't a good test b/c I hadn't checked this application to see if it leaked recently and there are other applications deployed to the server. I didn't see any errors logged by the listener on deploy or undeploy except for:
Internal registry of java.beans.PropertyEditorManager not found

I like the changes.

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Jan 11, 2015

Owner

Version 1.11.0 that includes these changes has now been released.

Owner

mjiderhamn commented Jan 11, 2015

Version 1.11.0 that includes these changes has now been released.

@LukeWoodward

This comment has been minimized.

Show comment
Hide comment
@LukeWoodward

LukeWoodward Sep 23, 2015

Sorry, I don't believe this issue has been fixed. As far as I can see there is still a leak that causes webapp classloaders not to be garbage collected.

Courtesy of JVisualVM, here's a path from a (destroyed) Tomcat WebappClassLoader to a GC root:

this     - value: org.apache.catalina.loader.WebappClassLoader #1
 <- <classLoader>     - class: example.ClassLoaderLeakPreventor$2, value: org.apache.catalina.loader.WebappClassLoader #1
  <- <class>     - class: example.ClassLoaderLeakPreventor$2, value: example.ClassLoaderLeakPreventor$2 class ClassLoaderLeakPreventor$2
   <- combiner     - class: java.security.AccessControlContext, value: example.ClassLoaderLeakPreventor$2 #1
    <- inheritedAccessControlContext (thread object)     - class: oracle.jdbc.driver.OracleTimeoutPollingThread, value: java.security.AccessControlContext #27

Secondly, if you're using ojdbc7.jar, there is now another class that needs to be initialized in the same way that oracle.jdbc.driver.OracleTimeoutThreadPerVM is initialized. This additional class is oracle.jdbc.driver.BlockSource$ThreadedCachingBlockSource$BlockReleaser. Like OracleTimeoutThreadPerVM, this class is a Thread.

In case it helps, I'm using JDK1.8.0_45 on Windows 10 x64, the Oracle JDBC driver jar is in <TOMCAT_HOME>/lib, not in WEB-INF/lib, and the driver JAR is being used by a test web application I've knocked up to reproduce this problem (i.e. it's not sitting around idle as it was in the initial comment on this issue).

LukeWoodward commented Sep 23, 2015

Sorry, I don't believe this issue has been fixed. As far as I can see there is still a leak that causes webapp classloaders not to be garbage collected.

Courtesy of JVisualVM, here's a path from a (destroyed) Tomcat WebappClassLoader to a GC root:

this     - value: org.apache.catalina.loader.WebappClassLoader #1
 <- <classLoader>     - class: example.ClassLoaderLeakPreventor$2, value: org.apache.catalina.loader.WebappClassLoader #1
  <- <class>     - class: example.ClassLoaderLeakPreventor$2, value: example.ClassLoaderLeakPreventor$2 class ClassLoaderLeakPreventor$2
   <- combiner     - class: java.security.AccessControlContext, value: example.ClassLoaderLeakPreventor$2 #1
    <- inheritedAccessControlContext (thread object)     - class: oracle.jdbc.driver.OracleTimeoutPollingThread, value: java.security.AccessControlContext #27

Secondly, if you're using ojdbc7.jar, there is now another class that needs to be initialized in the same way that oracle.jdbc.driver.OracleTimeoutThreadPerVM is initialized. This additional class is oracle.jdbc.driver.BlockSource$ThreadedCachingBlockSource$BlockReleaser. Like OracleTimeoutThreadPerVM, this class is a Thread.

In case it helps, I'm using JDK1.8.0_45 on Windows 10 x64, the Oracle JDBC driver jar is in <TOMCAT_HOME>/lib, not in WEB-INF/lib, and the driver JAR is being used by a test web application I've knocked up to reproduce this problem (i.e. it's not sitting around idle as it was in the initial comment on this issue).

@mjiderhamn mjiderhamn reopened this Sep 26, 2015

@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Sep 26, 2015

Owner

Hi Luke and thanks for the report. Your trace made me realize that the DomainCombiner I created to avoid the initial ProtectionDomain issue, itself implied an unwanted reference... I have pushed changes that I hope will fix that, and also the oracle.jdbc.driver.BlockSource$ThreadedCachingBlockSource$BlockReleaser problem. Could you please build the library from latest sources and try it out?

Owner

mjiderhamn commented Sep 26, 2015

Hi Luke and thanks for the report. Your trace made me realize that the DomainCombiner I created to avoid the initial ProtectionDomain issue, itself implied an unwanted reference... I have pushed changes that I hope will fix that, and also the oracle.jdbc.driver.BlockSource$ThreadedCachingBlockSource$BlockReleaser problem. Could you please build the library from latest sources and try it out?

@LukeWoodward

This comment has been minimized.

Show comment
Hide comment
@LukeWoodward

LukeWoodward Sep 26, 2015

Sorry Mattias, I'm afraid that your latest change doesn't fix the problem. I've had a brief look into why.

Each of the two threads that the Oracle JDBC driver creates has a different AccessControlContext, nos 26 and 27 in the image below:

image

Neither 26 nor 27 is the one that the ClassLoaderLeakPreventor has a reference to (no 23). However, nos 26 and 27 both reference no 23 via a privilegedContext field:

image

Your code nulls out the combiner field in no 23, but it's nos 26 and 27 that you would want to do this to instead.

LukeWoodward commented Sep 26, 2015

Sorry Mattias, I'm afraid that your latest change doesn't fix the problem. I've had a brief look into why.

Each of the two threads that the Oracle JDBC driver creates has a different AccessControlContext, nos 26 and 27 in the image below:

image

Neither 26 nor 27 is the one that the ClassLoaderLeakPreventor has a reference to (no 23). However, nos 26 and 27 both reference no 23 via a privilegedContext field:

image

Your code nulls out the combiner field in no 23, but it's nos 26 and 27 that you would want to do this to instead.

@mjiderhamn mjiderhamn reopened this Sep 27, 2015

mjiderhamn pushed a commit that referenced this issue Sep 27, 2015

Mattias Jiderhamn
Avoid our classloader being referenced by the DomainCombiner we use t…
…o filter ProtectionDomains in the AccessControlContext, by unsetting during shutdown.

Also adding test case for Oracle JDBC #23
@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Sep 27, 2015

Owner

Thanks Luke. I forgot about the inheritance of AccessControlContexts. I hope to have found a way around that too now. (Created a test case to look specifically at the Oracle JDBC case and it looks good). Would you care to give the version on the 'AccessControlContext' branch a spin?

Owner

mjiderhamn commented Sep 27, 2015

Thanks Luke. I forgot about the inheritance of AccessControlContexts. I hope to have found a way around that too now. (Created a test case to look specifically at the Oracle JDBC case and it looks good). Would you care to give the version on the 'AccessControlContext' branch a spin?

@LukeWoodward

This comment has been minimized.

Show comment
Hide comment
@LukeWoodward

LukeWoodward Sep 27, 2015

@mjiderhamn: the version on your AccessControlContext branch works for me. I can now reload a test web app that talks to an Oracle database without it leaking any classloaders.

LukeWoodward commented Sep 27, 2015

@mjiderhamn: the version on your AccessControlContext branch works for me. I can now reload a test web app that talks to an Oracle database without it leaking any classloaders.

mjiderhamn pushed a commit that referenced this issue Sep 28, 2015

Mattias Jiderhamn
Avoid our classloader being referenced by the DomainCombiner we use t…
…o filter ProtectionDomains in the AccessControlContext, by unsetting during shutdown.

Also adding test case for Oracle JDBC #23
@mjiderhamn

This comment has been minimized.

Show comment
Hide comment
@mjiderhamn

mjiderhamn Sep 28, 2015

Owner

Thanks for verifying @LukeWoodward. Version 1.14.0 of the library should propagate to Maven Central shortly.

Owner

mjiderhamn commented Sep 28, 2015

Thanks for verifying @LukeWoodward. Version 1.14.0 of the library should propagate to Maven Central shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment