Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LogFactory should use same classloader for feature detection and loading the correct logcreator #1922

Closed
javabrett opened this issue Feb 11, 2018 · 14 comments

Comments

@javabrett
Copy link

javabrett commented Feb 11, 2018

What version of Flyway are you using?

5.0.7.

Which client are you using? (Command-line, Java API, Maven plugin, Gradle plugin, SBT plugin, ANT tasks)

Java API.

What database are you using (type & version)?

MySQL, Oracle DB, multiple versions.

What operating system are you using?

Linux, OS/X

TL;DR

org.flywaydb.core.api.logging.LogFactory uses the Thread Context Class Loader (TCCL) when discovering which logging APIs/backends are available for its own logging. It should not do this as the TCCL is not subsequently used to load the logging-API, which is loaded as-normal by the classloader of LogFactory itself, which is not necessarily the same classloader with the same class-visibility. This can lead to a NoClassDefFoundError: org/slf4j/LoggerFactory, especially in certain JavaEE packaging scenarios.

Problem demonstration

This problem was noticed in a real application-scenario, but can be simplified and reproduced as-follows:

  • Package a Java EE application as an EAR. For simplicity it contains some common libraries (including Flyway) and a single web-application (WAR).
  • The web-application uses Flyway, either directly or wrapped in some other logic. The important factor is that it is desired to deploy Flyway to some parent-classloader - either the EAR's libs, or some application-server/domain common-classloader.
  • The WAR file has SLF4J packaged for its own logging.
  • SLF4J is not deployed in the parent classloader where Flyway is deployed.

When Flyway is first classloaded, it initializes its own logging, and uses the TCCL classloader for detection of the available logging APIs (SLF4J, Apache Commons Logging), but when loading the API-wrapper e.g. Slf4jLogCreator, normal classloading is used, which is not the TCCL. In the above scenario this will yield:

Caused by: java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
    at org.flywaydb.core.internal.util.logging.slf4j.Slf4jLogCreator.createLogger(Slf4jLogCreator.java:27)
    at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:82)
    at org.flywaydb.core.internal.util.ClassUtils.<clinit>(ClassUtils.java:37)
    at org.flywaydb.core.internal.util.FeatureDetector.isSlf4jAvailable(FeatureDetector.java:96)
    at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:71)
    at org.flywaydb.core.internal.util.FeatureDetector.<clinit>(FeatureDetector.java:25)
    at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:68)
    at org.flywaydb.core.Flyway.<clinit>(Flyway.java:72)
    at test.flywaydb.ContextListener.contextInitialized(ContextListener.java:26)

Code

LogFactory creates a FeatureDetector passing the TCCL as the classloader to be used in feature-tests. Later at https://github.com/flyway/flyway/blob/master/flyway-core/src/main/java/org/flywaydb/core/api/logging/LogFactory.java#L72 a Slf4jLogCreator is instantiated directly using LogFactory's classloader, which is not necessarily the TCCL, leading to the possibility of a NCDFE.

Analysis

In a JavaEE application the TCCL will often be set to a leaf classloader e.g a WebappClassLoader with visibility of parent classloaders, and perhaps other libraries like SLF4J. It is reasonable that Flyway might be loaded by some parent/shared classloader. It should not use the TCCL classloaders when testing what classes it can load for itself e.g. for the purposes of obtaining a logging-API, since the TCCL may be a child classloader with classes not-visible to Flyway. This is the case in a JavaEE deployment where FLyway is loaded by a classloader which is a parent to the TCCL e.g. an EarLibClassLoader .

Desired behaviour (Flyway logging) - DRAFT

Overall logging goal

When an application loads Flyway, Flyway's own internal-logging and public logging API attempt to discern the logging-framework used by the application, and load and use that framework along with any associated configuration. Applications should expect log-messages from Flyway to be logged similarly to the application's own logging, and any extensions which use org.flywaydb.core.api.logging.LogFactory likewise expect similar logging.

Simple case - single application/classloader, or no parent/shared classloader

  • Application A in classloader C1 uses logging-framework F1 with configuration-file f1.properties. Flyway is also deployed-to and also loaded from C1, and detects and uses F1 for logging.
  • Application B in classloader C2 uses logging-framework F2 with configuration-file f2.properties. Flyway is also deployed-to and also loaded from C2, and detects and uses F2 for logging.
  • Applications A and B can have different logging-frameworks and configurations and Flyway will log for each one accordingly.

Multiple applications, Flyway is deployed to a parent/shared classloader of the applications

Same goal as for the simple case. Now Flyway is installed in a classloader which is shared between the applications, and doesn't have visibility of each application's classes (or logging APIs, unless they are co-deployed with Flyway) through it's own classloader. Difficulties:

  • Care with classloaders.
  • Cannot use static members to store logging-instances, lest they be shared by Applications A and B, and a first-caller-wins-initialization occurs.
javabrett added a commit to javabrett/flyway-issues-1922 that referenced this issue Feb 11, 2018
Running './gradlew clean assemble test' will cause a deployment failure in embedded Glassfish
caused by:

Caused by: java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
	at org.flywaydb.core.internal.util.logging.slf4j.Slf4jLogCreator.createLogger(Slf4jLogCreator.java:27)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:82)
	at org.flywaydb.core.internal.util.ClassUtils.<clinit>(ClassUtils.java:37)
	at org.flywaydb.core.internal.util.FeatureDetector.isSlf4jAvailable(FeatureDetector.java:96)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:71)
	at org.flywaydb.core.internal.util.FeatureDetector.<clinit>(FeatureDetector.java:25)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:68)
	at org.flywaydb.core.Flyway.<clinit>(Flyway.java:72)
	at test.flywaydb.ContextListener.contextInitialized(ContextListener.java:26)
javabrett added a commit to javabrett/flyway-issues-1922 that referenced this issue Feb 11, 2018
Running './gradlew clean assemble test' will cause a deployment failure in embedded Glassfish
caused by:

Caused by: java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
	at org.flywaydb.core.internal.util.logging.slf4j.Slf4jLogCreator.createLogger(Slf4jLogCreator.java:27)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:82)
	at org.flywaydb.core.internal.util.ClassUtils.<clinit>(ClassUtils.java:37)
	at org.flywaydb.core.internal.util.FeatureDetector.isSlf4jAvailable(FeatureDetector.java:96)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:71)
	at org.flywaydb.core.internal.util.FeatureDetector.<clinit>(FeatureDetector.java:25)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:68)
	at org.flywaydb.core.Flyway.<clinit>(Flyway.java:72)
	at test.flywaydb.ContextListener.contextInitialized(ContextListener.java:26)
javabrett added a commit to javabrett/flyway-issues-1922 that referenced this issue Feb 11, 2018
Running './gradlew clean assemble test' will cause a deployment failure in embedded Glassfish
caused by:

Caused by: java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
	at org.flywaydb.core.internal.util.logging.slf4j.Slf4jLogCreator.createLogger(Slf4jLogCreator.java:27)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:82)
	at org.flywaydb.core.internal.util.ClassUtils.<clinit>(ClassUtils.java:37)
	at org.flywaydb.core.internal.util.FeatureDetector.isSlf4jAvailable(FeatureDetector.java:96)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:71)
	at org.flywaydb.core.internal.util.FeatureDetector.<clinit>(FeatureDetector.java:25)
	at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:68)
	at org.flywaydb.core.Flyway.<clinit>(Flyway.java:72)
	at test.flywaydb.ContextListener.contextInitialized(ContextListener.java:26)
@axelfontaine axelfontaine added this to the Flyway 5.1.0 milestone Feb 11, 2018
@axelfontaine
Copy link
Contributor

Thanks for pointing out the inconsistency.

I guess we have two choices:

  1. Load each individual LogCreator using the TCCL as well
  2. Expose a setClassLoader(ClassLoader) on LogFactoryand use that ClassLoader both for detection and LogCreator instantiation. Default to TCCL if not set.

What do you think? Which one would you favor and why?

@javabrett
Copy link
Author

Thanks for the follow-up and questions.

Before deciding how the TCCL should be used - it would be useful (to me anyway) to understand why it was introduced in the first-place, especially for the logging-case. Is there any brief write-up (other than in the following linked issues)?

History: FeatureDetector and ClassUtils.isPresent first learnt to be passed a ClassLoader in 8c800df and it seems issues #626 and #696 were drivers. Before that the normal classloader was used.

I figured logging in Flyway code (and location/configuration of logging-apis) would be mostly an internal-affair, but I do notice that at one point logging was moved from the internal to public API - maybe to support extension and plugins (I'm not familiar-enough with Flyway to know)?

Note that LogCreator and all of LogFactory is static. If it is allowed to be deployed to a shared-classloader and called by multiple applications/modules with their context-classloaders, we could end-up with a race where the first-caller to getLog brings their available logging-API.

So in-summary, I'm interested in knowing what the TCCL provides here. Maybe it is access to resource configuration-files for the logging API that only exist in the child/context classloader. Just seems like for something as fundamental as logging, Flyway should be self-sufficient within its own classloader.

@axelfontaine
Copy link
Contributor

Logging is part of the public API to allow Java migrations, custom resolvers and executors, Java callbacks, etc to use the same logging abstraction and work flawlessly across environments (API, Maven, CLI, etc).

Flyway's own ClassLoader may not be sufficient as the logging framework it loads most likely needs access to its config file which resides somewhere within your app.

I am currently leaning towards option 2. This lets users like you define initialize their classloader with what they need (this would have to be very early on ear startup) and keeps the existing behavior for everyone else.

@javabrett
Copy link
Author

Let me think on it a bit and maybe run a test or two. Of course these sorts of problems only occur if Flyway is in a shared/common classloader, parent of the TCCLs.

Central to my concern is that LogFactory logCreator is static, so there is currently a potential first-caller-wins if there are multiple child applications/classloaders ... the first to call getLog will make the decision on available logging-libraries, and its classloader and classes will be loaded for logging (and retained for the life of Flyway's classloader). Applications might make their first getLog call in a different order across different system start-ups.

If option 2, all applications would need to take great care in setting the classloader before logging is initialized ... noting that classloading Flyway triggers this. If TCCL remains a default then any single application code-path that doesn't follow this (including any third-party libraries) would immediately invalidate the option which will be silently-ignored.

As I say I'll think on it some more, unless you are in a hurry to proceed.

@axelfontaine
Copy link
Contributor

Not in a hurry. I'm hoping to address this in the coming month to get it out as part of Flyway 5.1. If we miss that window, it'll be 5.2. No biggie.

@axelfontaine
Copy link
Contributor

Note that you can work around this in 5.0.7 by manually calling LogFactory.setLogCreator() as the first thing on ear startup.

@javabrett
Copy link
Author

I still wasn't 100% clear on what the required logging behaviour is, so I updated the issue with a section Desired behaviour (Flyway logging) - DRAFT.

Please review. I'm wondering whether the requirements for the shared-classloader behaviour are too ambitions ... do we shoot for completely independent logging-configurations, or settle for first-initializer-wins?

@axelfontaine
Copy link
Contributor

@javabrett To be honest with you, using different logging frameworks for different wars in an ear is just asking for trouble and creating an unnecessary mess. Also this situation is very rare. I am therefore now leaning towards my suggestion 1.

@javabrett
Copy link
Author

The different logging-frameworks was an obtuse/general example I agree. But won't there be also be simpler examples of logging configuration cross-wires ... if Application A is using SLF4J -> log4j and logging to a.log and Application B same but logging to b.log ... won't it be true that Flyway will log everything it does to either a.log or b.log, depending on which application initialises logging first?

That might be acceptable, but will probably best be documented. Or is that not going to happen with the current code if Flyway is deployed to a shared/parent classloader?

I'm interested in how either option 1 or 2 are going to create the right classloading partitioning. Maybe dynamic proxies will help.

@axelfontaine
Copy link
Contributor

Well let's not start solving theoretical problems here and stick to the issue at hand. Would my suggestion 1 solve your specific issue?

@javabrett
Copy link
Author

Loading Slf4jLogCreator (or whichever is the active LogCreator based on which API is found in the TCCL) into the TCCL (or something wrapping it) will solve the problem, since it will then go-on to load org.slf4j.LoggerFactory and the rest of the dependencies.

But given the set-up above, where Flyway is not in the direct scope of the TCCL (it is deployed into a parent classloader of the TCCL), this won't just be a matter of asking the TCCL to load that class, since it will delegate to the parent which will load it and then the NCDFE occurs. You are likely already aware of that.

So I'm not sure what is planned - I would expect a non-trivial partitioning classloader would need to be injected to bridge between the classloaders. Or as I say maybe there is something that can be done with dynamic proxies.

If we let the current classloader (rather than the TCCL) do the test for logging APIs, it will only find APIs that are visible to Flyway. I would hope that in the (perhaps typical) case where Flyway and its caller are co-deployed to the same-classloader this would behave exactly as it does now. What would change is that what is currently not-working would never work, that is, logging would never be controlled by a child classloader application, and if the deployer wanted to control Flyway logging when it is loaded in a parent, they would be responsible for supplying the logging API and configuration.

@axelfontaine axelfontaine changed the title LogFactory should not use TCCL when locating logging implementation: can cause NoClassDefFoundError: org/slf4j/LoggerFactory LogFactory should use same classloader for feature detection and loading the correct logcreator Mar 14, 2018
axelfontaine pushed a commit to flyway/flywaydb.org that referenced this issue Mar 14, 2018
@axelfontaine
Copy link
Contributor

The individual logcreators are now also instantiated using the TCCL. Please build the latest sources and confirm this is now working for you.

@javabrett
Copy link
Author

Thanks for the update and the code-change. I tested this with a local master build to 0-SNAPSHOT and it does not fix the problem with this type of deployment-scenario, but I also did not expect it to.

Here's the new stacktrace:

Mar 15, 2018 1:16:06 PM test.flywaydb.ContextListener contextInitialized
    SEVERE: Exception in contextInitialized
    java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
        at org.flywaydb.core.internal.util.logging.slf4j.Slf4jLogCreator.createLogger(Slf4jLogCreator.java:27)
        at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:86)
        at org.flywaydb.core.internal.util.ClassUtils.<clinit>(ClassUtils.java:37)
        at org.flywaydb.core.internal.util.FeatureDetector.isSlf4jAvailable(FeatureDetector.java:96)
        at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:75)
        at org.flywaydb.core.internal.util.FeatureDetector.<clinit>(FeatureDetector.java:25)
        at org.flywaydb.core.api.logging.LogFactory.getLog(LogFactory.java:72)
        at org.flywaydb.core.Flyway.<clinit>(Flyway.java:76)
        at test.flywaydb.ContextListener.contextInitialized(ContextListener.java:29)
        at org.apache.catalina.core.StandardContext.contextListenerStart(StandardContext.java:5043)
        at com.sun.enterprise.web.WebModule.contextListenerStart(WebModule.java:592)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:5612)
        at com.sun.enterprise.web.WebModule.start(WebModule.java:540)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:917)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:900)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:684)
        at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:2044)
        at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1690)
        at com.sun.enterprise.web.WebApplication.start(WebApplication.java:107)
        at org.glassfish.internal.data.EngineRef.start(EngineRef.java:122)
        at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:291)
        at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:352)
        at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:500)
        at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:219)
        at org.glassfish.deployment.admin.DeployCommand.execute(DeployCommand.java:491)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:540)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:536)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl$2.execute(CommandRunnerImpl.java:535)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl$3.run(CommandRunnerImpl.java:566)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl$3.run(CommandRunnerImpl.java:558)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:360)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:557)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:1465)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl.access$1300(CommandRunnerImpl.java:110)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1847)
        at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1723)
        at com.sun.enterprise.admin.cli.embeddable.DeployerImpl.deploy(DeployerImpl.java:133)
        at test.flywaydb.DeployTest.deploy(DeployTest.java:30)

The reason this doesn't work is because you can't just make Slf4jLogCreator et. al. be classloaded by the TCCL just by asking for that. In the deployment scenario in this issue, Flyway is deployed to a shared/parent classloader. So classes such as Slf4jLogCreator are not even directly visible to the TCCL (only via its parent), because I deployed Flyway libs to my EAR lib and not my WAR WEB-INF/lib. So when the TCCL is asked to classload Slf4jLogCreator it doesn't see it in its classpath, so it delegates to its parent classloader in the normal fashion, which does have that class. But it does not have the logger libraries in this case. The JVM will attempt to load the dependencies of Slf4jLogCreator using the classloader which loaded that class, which is the parent classloader.

If I add a little debug output:

diff --git a/flyway-core/src/main/java/org/flywaydb/core/api/logging/LogFactory.java b/flyway-core/src/main/java/org/flywaydb/core/api/logging/LogFactory.java
index ca97e6ee..185fe439 100644
--- a/flyway-core/src/main/java/org/flywaydb/core/api/logging/LogFactory.java
+++ b/flyway-core/src/main/java/org/flywaydb/core/api/logging/LogFactory.java
@@ -69,6 +69,7 @@ public class LogFactory {
     public static Log getLog(Class<?> clazz) {
         if (logCreator == null) {
             ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
+            System.out.println("TCCL: " + classLoader);
             FeatureDetector featureDetector = new FeatureDetector(classLoader);
             if (featureDetector.isAndroidAvailable()) {
                 logCreator = ClassUtils.instantiate(AndroidLogCreator.class.getName(), classLoader);
@@ -83,6 +84,8 @@ public class LogFactory {
             }
         }
 
+        System.out.println("logCreator ClassName: " + logCreator.getClass().getName());
+        System.out.println("logCreator ClassLoader: " + logCreator.getClass().getClassLoader());
         return logCreator.createLogger(clazz);
     }
-}
\ No newline at end of file
+}

... I see this:

Gradle suite > Gradle test > test.flywaydb.DeployTest.deploy STANDARD_OUT
    TCCL: WebappClassLoader (delegate=true; repositories=WEB-INF/classes/)
    TCCL: WebappClassLoader (delegate=true; repositories=WEB-INF/classes/)
    TCCL: WebappClassLoader (delegate=true; repositories=WEB-INF/classes/)
    logCreator ClassName: org.flywaydb.core.internal.util.logging.slf4j.Slf4jLogCreator
    logCreator ClassLoader: EarLibClassLoader : 

Showing that the TCCL (WebappClassLoader) does not load Slf4jLogCreator, EarLibClassLoader does.

@axelfontaine
Copy link
Contributor

Ok, I have decided to mark this as won't fix.

You can fix your scenario by either moving the logging libraries to the ear or Flyway to the war. Both are much more sane and consistent options than what you have now. Alternatively you can also supply sour own logcreator via the setLogCreator on LogFactory as well.

dohrayme pushed a commit to dohrayme/flyway that referenced this issue Feb 3, 2020
… detection and loading the correct logcreator
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants