New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to implement compatibility of the Jolokia JVM agent with org.jboss.logmanager.LogManager loaded form JBoss modules #258

Open
rraez opened this Issue May 17, 2016 · 6 comments

Comments

Projects
None yet
4 participants
@rraez
Contributor

rraez commented May 17, 2016

Problem

Currently we have enabled the Jolokia JVM agent with JBoss EAP 6.0.x & 6.2.x.

To get the JVM Agent running, we have added the jboss-logmanager.jar the the -Xbootclasspath.

This is currently the official workaround described by RedHat: https://access.redhat.com/solutions/312453

Alos other JVM agents have the same problem: https://issues.jboss.org/browse/WFLY-895

But in JBoss EAP 6.4. this is no more working:

java.lang.IllegalStateException: The LogManager was not properly installed (you must set the "java.util.logging.manager" system property to "org.jboss.logmanager.LogManager")
    at org.jboss.logmanager.Logger.getLogger(Logger.java:58)
    at org.jboss.as.server.Main.main(Main.java:84)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.jboss.modules.Module.run(Module.java:312)
    at org.jboss.modules.Main.main(Main.java:473)

Even when you add the system property java.util.logging.manager=org.jboss.logmanager.LogManager and add the jboss-logmanager.jar the the -Xbootclasspath, the following code throws a class cast exception:
https://github.com/jboss-logging/jboss-logmanager/blob/master/src/main/java/org/jboss/logmanager/Logger.java#L57

This exception is caused by changed class loading behaviour in JBoss eap 6.4. It looks like the org.jboss.logmanager.Logger class is loaded a second time from the modules and therefore it can't be casted.

In my option it was never a clean solution to override classes loaded by JBoss modules via the system class loader (-Xbootclasspath).

I have to find a solution for already released version of JBoss eap (6.2.2, 6.4.1, 6.4.7).

Implement a solution for the problem in products is not easy

I agree with you suggestion that a solution within the product (e.g. swarm ) would be nice, but it is not that simple, as the java.util.logging.LogManager of java is a singleton initialized the first time it is used.

  • Manipulate singletons is possible but this is not a clean solution.
  • RedHat could integrate the jboss-logmanager.jar into jboss-modules.jar, but we are loosing some modularity.
  • JBoss modules could load the classes of jboss-logmanager.jar officially from the parent class loader if available (same as for jboss eap 6.0, 6.2), but with this we are losing the possibility to patch this module (using an overlay).

Target

The Jolokia JVM agent should be running fine in environments where the java.util.logging.LogManager is initialized by the application at runtime with a class which is loaded from a custom class loader.

Possible options

You have already thought how to solve this in a clean and portable approach: https://issues.jboss.org/browse/SWARM-204

  1. Create an extra agent only for doing log initialization. Probably not so elegant as it adds extra burden to the user for starting up. It could be included in https://github.com/fabric8io/agent-bond, though (this is the 'super agent' we are using in the Fabric8 base images)
  2. If there would be a condition which could be queried by the agent and which indicates the full initialisation of the logging system, I could delay the initialization of the Jolokia agent by waiting on this condition. This condition must be queryable with Java reflection (i.e. without direct reference to any Swarm classes).
  3. Jolokia could listen on the port, does the initialization and the forward the connection to the httpserver (if this is possible). Quite involved and has still the drawback if the first request comes in early that the logging system is not ready.
  4. Wait a fixed amount of time until doing the initialization.

Option 1 is not solving the issue as the issue in EAP 6.4 is triggered by the too early loading.
Option 2 I definitely like the option 2, more on this later
Option 3 this option would cause a race condition: in case jolokia is called before JBoss has initialized the, we have the same problem agin
Option 4 this could be an option, but it would not be bulled prove: java running on virtual server have no kind of guarantees

I have some additional options:

  1. A JVM agent is is transforming all loaded classes form sun.net.httpserver.* is such a way that all references to java.util.logging.Logger is exchanged with another Logger. I have a working POC using ASM for this delegating to SLF4J + logback (all packages relocated to a different packages)
  2. Replace the sun.net.httpserver with another implementation not using a java java.util.Logger.

Option 5 has a small runtime impact, and is most likely too confusing/magic for maintainers & users.
Option 6 would be an option, I have no idea which implementation to take, its size should be small and I don't would like to follow this as it is too much work and requires extensive testing.

IMO a solution of option 2 is the way to go.

Proposal for option 2

Here a proposal to implement option 2:

  • There is a new boolean configuration option in Jolokia called awaitJavaUtilLoggingManagerConfiguration
  • When set to true, the Jolokia JVM agent performs the following steps before starting the agent:
  1. loop until there system property java.util.logging.manager is set
  2. loop until the class configured by the system property java.util.logging.manager is loaded by the JVM and remember its class loader
  3. remember the value of the current ContextClassLoader
  4. set the ContextClassLoader to the ClassLoader of the logging manager class
  5. Invoke LogManager.getLogManager() and log a warning in case the result is not the expected class
  6. restore the ContextClassLoader with the ContextClassLoader from step 3

To check if a class has been loaded, I propose to poll Instrumentation.getAllLoadedClasses

The steps 3-6 are needed for the case in which JBoss (or any other component) has configured the system property an loaded the class but the LogManager is not jet initialized.

In the case of JBoss, the steps 3-6 are required if:

To get an instance of Instrumentation, the JVM agent needs to be implemented with the following signature:
public static void premain(String args, Instrumentation inst)

I think the solution should not introduce time outs as this will complicate the implementation. I think logging the most important steps should be enough.

What do you think about this proposal for the option 2?

@rhuss

This comment has been minimized.

Show comment
Hide comment
@rhuss

rhuss May 17, 2016

Owner

@rraez thanks for coming back to this (annoying ;-) issue. Lets fix it now :)

To your additional suggestions:

  1. A JVM agent is is transforming all loaded classes form sun.net.httpserver.* is such a way that all references to java.util.logging.Logger is exchanged with another Logger. I have a working POC using ASM for this delegating to SLF4J + logback (all packages relocated to a different packages)
  2. Replace the sun.net.httpserver with another implementation not using a java java.util.Logger.

I guess this would mean to include an extra dependency (asm, another http-server) which I'm not keen on. Jolokia was designed with the constraint to be as small as possible in order to minimize impact. The only dependency is on json-simple (23k), not even logging.

You suggestion around option (2) sounds feasible. Do you think there could be a way to detect early and quickly whether running in an Wildfly environment so that we could set the suggested option awaitJavaUtilLoggingManagerConfiguration implicitly ?

The reason I ask is because its really only the Wildfly platform which has this issue, not one of the other 30+ platform in my integration test suite (including the former JBoss AS servers) has a similar problem. I had different issues on different platform (e.g. for detecting the MBeanServer on Weblogic is also not trivial), so use server detection like the JBossServerDetector to cope with platform peculiarities. It would be cool if we could include our workaround into this facility, too, but I'm afraid the detector stuff kicks in too late. To sacrifice an option (of which I already have too many I'm afraid) should be the last resort.

Said that I would happily integrate a PR or even the vanilla detection code if you have POC.

Thanks again for you investigating this tricky thing.

Owner

rhuss commented May 17, 2016

@rraez thanks for coming back to this (annoying ;-) issue. Lets fix it now :)

To your additional suggestions:

  1. A JVM agent is is transforming all loaded classes form sun.net.httpserver.* is such a way that all references to java.util.logging.Logger is exchanged with another Logger. I have a working POC using ASM for this delegating to SLF4J + logback (all packages relocated to a different packages)
  2. Replace the sun.net.httpserver with another implementation not using a java java.util.Logger.

I guess this would mean to include an extra dependency (asm, another http-server) which I'm not keen on. Jolokia was designed with the constraint to be as small as possible in order to minimize impact. The only dependency is on json-simple (23k), not even logging.

You suggestion around option (2) sounds feasible. Do you think there could be a way to detect early and quickly whether running in an Wildfly environment so that we could set the suggested option awaitJavaUtilLoggingManagerConfiguration implicitly ?

The reason I ask is because its really only the Wildfly platform which has this issue, not one of the other 30+ platform in my integration test suite (including the former JBoss AS servers) has a similar problem. I had different issues on different platform (e.g. for detecting the MBeanServer on Weblogic is also not trivial), so use server detection like the JBossServerDetector to cope with platform peculiarities. It would be cool if we could include our workaround into this facility, too, but I'm afraid the detector stuff kicks in too late. To sacrifice an option (of which I already have too many I'm afraid) should be the last resort.

Said that I would happily integrate a PR or even the vanilla detection code if you have POC.

Thanks again for you investigating this tricky thing.

@rraez

This comment has been minimized.

Show comment
Hide comment
@rraez

rraez May 18, 2016

Contributor

Thanks for you quick response. I definitely agree to keep the dependencies as minimal as possible, this is a key strength of Jolokia.

I like your idea, I will think about the automatic detection of JBoss module based JEE platforms (Wildfily, JBoss EAP 6+). This detection must not kick off just in case some JBoss libraries are on the class path. JBoss modules is used not only for the appserver, e.g. it is used to implement modules in the ceylon language.

Contributor

rraez commented May 18, 2016

Thanks for you quick response. I definitely agree to keep the dependencies as minimal as possible, this is a key strength of Jolokia.

I like your idea, I will think about the automatic detection of JBoss module based JEE platforms (Wildfily, JBoss EAP 6+). This detection must not kick off just in case some JBoss libraries are on the class path. JBoss modules is used not only for the appserver, e.g. it is used to implement modules in the ceylon language.

rraez added a commit to rraez/jolokia that referenced this issue May 20, 2016

Extended ServerDetector interface for the early detection of a server…
… and fixed a startup issue caused by a JBoss modules based server triggered by a custom java LogManager.

For details an discussion see rhuss#258.

rhuss added a commit that referenced this issue Oct 12, 2016

@rhuss

This comment has been minimized.

Show comment
Hide comment
@rhuss

rhuss Oct 12, 2016

Owner

Finally I'm about to integrate this fix into Jolokia (although I think swarm would be the more appropriate place).

The final missing piece is how to detect whether running und wildfly-swarm, which must be available early on.
Its possible to check for

  • system properties
  • available classes
  • MBean (but they come probably to late to the show to be usable)

The current algorithm checks two things:

  • that a class org.jboss.modules.Main exists
  • that a system property boot.module.loader exists which contains "wildfly"

It would be awesome if someone could either verify that this is necessary and sufficient, or if otherwise please update this issue with a unique way to identify swarm. Also it would be awesome if Swarm could be detected to extract its version number so that Jolokia can expose this to the outside (and while we are here also, how to differentiate between Wildfly AS and wildfly-swarm since this detector is used for both platforms)

Owner

rhuss commented Oct 12, 2016

Finally I'm about to integrate this fix into Jolokia (although I think swarm would be the more appropriate place).

The final missing piece is how to detect whether running und wildfly-swarm, which must be available early on.
Its possible to check for

  • system properties
  • available classes
  • MBean (but they come probably to late to the show to be usable)

The current algorithm checks two things:

  • that a class org.jboss.modules.Main exists
  • that a system property boot.module.loader exists which contains "wildfly"

It would be awesome if someone could either verify that this is necessary and sufficient, or if otherwise please update this issue with a unique way to identify swarm. Also it would be awesome if Swarm could be detected to extract its version number so that Jolokia can expose this to the outside (and while we are here also, how to differentiate between Wildfly AS and wildfly-swarm since this detector is used for both platforms)

@heiko-braun

This comment has been minimized.

Show comment
Hide comment
@heiko-braun

heiko-braun Dec 1, 2016

I've successfully tested swarm with jolokia 1.3.6-SNAPSHOT: https://issues.jboss.org/browse/SWARM-204

heiko-braun commented Dec 1, 2016

I've successfully tested swarm with jolokia 1.3.6-SNAPSHOT: https://issues.jboss.org/browse/SWARM-204

@rwiermer

This comment has been minimized.

Show comment
Hide comment
@rwiermer

rwiermer May 15, 2018

We have migrated several (slightly complex) EE applications to Swarm and have seen that the Swarm detection of Jolokia in agent mode is likely sensitive to timing issues. In ca. 20% of application restarts the detection fails, and we get the same exception as in the original description. I tried to reproduce it with a toy application, but was able to do so up too now.
It would be nice to either have the option to force compatibility mode or find out a more reliable way to detect WIldfly (Swarm).

rwiermer commented May 15, 2018

We have migrated several (slightly complex) EE applications to Swarm and have seen that the Swarm detection of Jolokia in agent mode is likely sensitive to timing issues. In ca. 20% of application restarts the detection fails, and we get the same exception as in the original description. I tried to reproduce it with a toy application, but was able to do so up too now.
It would be nice to either have the option to force compatibility mode or find out a more reliable way to detect WIldfly (Swarm).

@rhuss

This comment has been minimized.

Show comment
Hide comment
@rhuss

rhuss May 15, 2018

Owner

I probably need some assistance to find the proper variables which are available right from the start, as mentioned in #258 (comment)

@rwiermer Maybe we should open an issue at wildfly swarm for that ?

Owner

rhuss commented May 15, 2018

I probably need some assistance to find the proper variables which are available right from the start, as mentioned in #258 (comment)

@rwiermer Maybe we should open an issue at wildfly swarm for that ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment