Fix agent for zulu8 #1084
It should use real class to try to load
The problem was that on zulu8, loading OkHttp touches JFR, which in turn touches the log manager, which would break things like JBOSS. The fix is to delay installing the agent (and writer) until the log manager has settled down, in a way similar to jmxfetch. Unfortunately, for the 'main' agent this turns out to be more involved because of classloader shenanigans.
This reverts commit 0b4b07b.
1d8af37 to 63b5504
Using `datadog.trace.agent` to hold the agent class causes problems because other classes get shadowed into this package.
63b5504 to 42cece6
jbachorik left a comment
Looks good. Tricky classloading :)
Just a few minor comments.
    @Override
    public void run() {
      /*
Makes sense, I'm a little surprised we weren't doing this already.
Before, we only used this logic for jmxfetch, which happens to fork its own threads before doing anything useful, so that wasn't really a problem.
dougqh left a comment
Looks good to me. I appreciate the comments. It helps a lot in explaining what we're working around.
Nice work @mar-kolya. I have some concerns about delaying agent install unnecessarily (like Oracle Java 8 with JBOSS), since I'm not sure what impact that would have.
See questions below.
    private static synchronized void installDatadogTracer(
        final Instrumentation inst, final URL bootstrapURL) throws Exception {
      if (AGENT_CLASSLOADER == null) {
        throw new IllegalStateException("Datadog agent should have been started already");
If someone declares the javaagent on the command line multiple times, does this also prevent it from being started multiple times?
This actually should be safe, and there is a comment about that a few lines down.
    final Method agentInstallerMethod =
        agentInstallerClass.getMethod("installBytebuddyAgent", Instrumentation.class);
    agentInstallerMethod.invoke(null, inst);
    AGENT_CLASSLOADER = agentClassLoader;
Should we rename the AgentInstaller and AGENT_CLASSLOADER to be something else?
     * <li>Do not store any static data in this class
     * <li>Do not touch any logging facilities here so we can configure them later
     * </ul>
     */
Nice comment! I wonder if we should add protection somehow to assert this class is never loaded on the bootstrap classpath.
I'm not sure how to do that, or whether it really solves much...
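For reference, one way such a check could look is sketched below. It relies on `Class.getClassLoader()` returning `null` for classes defined by the bootstrap class loader, which is how HotSpot-based JVMs behave. `BootstrapCheck` and both method names are hypothetical and not part of this PR.

```java
// Hypothetical helper, not part of the PR: detects bootstrap-loaded classes.
final class BootstrapCheck {

  /** Returns true when the class reports a null loader, which is how the bootstrap loader is represented. */
  static boolean isLoadedByBootstrap(final Class<?> type) {
    return type.getClassLoader() == null;
  }

  /** Fails fast if the given class ended up on the bootstrap classpath. */
  static void assertNotOnBootstrap(final Class<?> type) {
    if (isLoadedByBootstrap(type)) {
      throw new IllegalStateException(
          type.getName() + " must not be loaded by the bootstrap class loader");
    }
  }

  public static void main(final String[] args) {
    // java.lang.String always comes from the bootstrap loader; this class does not.
    System.out.println(isLoadedByBootstrap(String.class));
    System.out.println(isLoadedByBootstrap(BootstrapCheck.class));
  }
}
```

Whether a guard like this is worth adding is a separate question; it only detects the situation, it does not prevent it.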
        throw new TimeoutException();
      }

      private static class StreamGobbler extends Thread {
If you didn't live in 🇨🇦 I'd say you had 🦃 on your mind. Nice name though.
This originates from SO somewhere.
     * events which in turn loads LogManager. This is not a problem on newer JDKs because there JFR uses a different
     * logging facility.
     */
    if (isJavaBefore9() && appUsingCustomLogManager) {
I wonder if we should isolate this behavior to only when JFR is available on Java 8. I'm concerned about unintended side effects of delaying tracer installation.
should be fixed now
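A rough sketch of what a narrower condition could look like, assuming the delay should only apply on Java 8 when JFR classes are visible and the application sets a custom log manager. `DelayedStartCheck` and its method names are hypothetical; the actual check in the PR may differ. The JFR lookup goes through a resource lookup so that no JFR class is actually loaded.

```java
// Hypothetical sketch, not the PR's code: decides whether agent start should be delayed.
final class DelayedStartCheck {

  /** Java 8 and older report versions such as "1.8.0_232"; Java 9+ dropped the "1." prefix. */
  static boolean isJavaBefore9(final String javaVersion) {
    return javaVersion.startsWith("1.");
  }

  /** Looks for the JFR class file as a resource, so the class itself is never loaded. */
  static boolean isJfrClassVisible(final ClassLoader loader) {
    return loader.getResource("jdk/jfr/Recording.class") != null;
  }

  /** Delay only when all three conditions hold. */
  static boolean shouldDelayStart(
      final String javaVersion, final boolean jfrVisible, final String customLogManager) {
    return isJavaBefore9(javaVersion) && jfrVisible && customLogManager != null;
  }

  public static void main(final String[] args) {
    final boolean delay =
        shouldDelayStart(
            System.getProperty("java.version"),
            isJfrClassVisible(ClassLoader.getSystemClassLoader()),
            // Standard system property used to select a custom java.util.logging.LogManager.
            System.getProperty("java.util.logging.manager"));
    System.out.println("delay agent start: " + delay);
  }
}
```

With this shape, Oracle Java 8 without `jdk.jfr` classes would not be delayed even when a custom log manager is configured.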
    if (entry.getKey().equals(typeName)) {
      entry.getValue().run();
    synchronized (CLASS_LOAD_CALLBACKS) {
      final List<Runnable> callbacks = CLASS_LOAD_CALLBACKS.get(typeName);
Should they get removed after successful execution?
I'm not sure - this code originally didn't do removal
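If removal turns out to be desirable, a minimal sketch could look like the following. It is an illustration only, not the code in this PR: `ClassLoadCallbacks`, `register`, and `fire` are hypothetical names, and the callbacks are run outside the lock so that a slow callback does not block registration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: per-type callbacks that are removed once they have been fired.
final class ClassLoadCallbacks {
  private static final Map<String, List<Runnable>> CALLBACKS = new HashMap<>();

  /** Registers a callback to run when the named type is loaded. */
  static void register(final String typeName, final Runnable callback) {
    synchronized (CALLBACKS) {
      List<Runnable> list = CALLBACKS.get(typeName);
      if (list == null) {
        list = new ArrayList<>();
        CALLBACKS.put(typeName, list);
      }
      list.add(callback);
    }
  }

  /** Removes and runs all callbacks for the type; returns how many were run. */
  static int fire(final String typeName) {
    final List<Runnable> toRun;
    synchronized (CALLBACKS) {
      // Removing under the lock guarantees each callback is handed out at most once.
      toRun = CALLBACKS.remove(typeName);
    }
    if (toRun == null) {
      return 0;
    }
    for (final Runnable callback : toRun) {
      callback.run();
    }
    return toRun.size();
  }
}
```

Note that this removes callbacks before running them, so a callback that throws is not retried; keeping them until success would need extra bookkeeping.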
    // Note: this test fails on IBM JVM, we would need to investigate this at some point
    @Requires({ !System.getProperty("java.vm.name").contains("IBM J9 VM") })
    @Retry
    @Timeout(30)
This test is notorious for locking up the test suite. I'd much prefer to keep the timeout.
This test was locking up the test suite because we didn't properly capture stdout/stderr of the forked process, which locked up the whole thing. I've fixed that, and I haven't seen it block since. I'd suggest we keep the timeout removed and re-add it if the test fails again.
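For context, the general technique is sketched below: a forked process can block once its stdout or stderr pipe buffer fills up, so both streams are drained on separate threads while the parent waits with a timeout. This is an illustration under assumptions, not the test code from this PR; `ForkedProcessRunner` and `runAndWait` are hypothetical names, and only `StreamGobbler` and the `TimeoutException` mirror the diff.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of draining a forked process's output to avoid pipe-buffer deadlocks.
final class ForkedProcessRunner {

  /** Reads a stream line by line on its own thread until end of stream. */
  static final class StreamGobbler extends Thread {
    private final InputStream stream;
    private final List<String> lines = Collections.synchronizedList(new ArrayList<String>());

    StreamGobbler(final InputStream stream) {
      this.stream = stream;
      setDaemon(true); // do not keep the JVM alive because of a stuck child process
    }

    @Override
    public void run() {
      try (BufferedReader reader =
          new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
          lines.add(line);
        }
      } catch (final IOException e) {
        lines.add("<read failed: " + e + ">");
      }
    }

    /** Returns a snapshot of the lines read so far. */
    List<String> lines() {
      return new ArrayList<>(lines);
    }
  }

  /** Starts the process, drains both output streams, and waits up to the given timeout. */
  static int runAndWait(final ProcessBuilder builder, final long timeoutSeconds)
      throws IOException, InterruptedException, TimeoutException {
    final Process process = builder.start();
    final StreamGobbler out = new StreamGobbler(process.getInputStream());
    final StreamGobbler err = new StreamGobbler(process.getErrorStream());
    out.start();
    err.start();
    if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
      process.destroyForcibly();
      throw new TimeoutException();
    }
    out.join();
    err.join();
    return process.exitValue();
  }
}
```

The timeout inside `runAndWait` bounds a single forked process; it is independent of any test-level `@Timeout`.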
tylerbenson left a comment
Please update description.
This patch should fix the agent on zulu8. Zulu8 recently made some changes that touch LogManager when JFR is loaded, and JFR gets loaded with some certificate-handling code provided by the JDK. The whole thing leads to LogManager being configured prematurely.