Skip to content

fix agent for zulu8#1084

Merged
mar-kolya merged 10 commits into
masterfrom
mar-kolya/fix-agent-for-zulu8
Nov 7, 2019
Merged

fix agent for zulu8#1084
mar-kolya merged 10 commits into
masterfrom
mar-kolya/fix-agent-for-zulu8

Conversation

@mar-kolya
Copy link
Copy Markdown
Contributor

@mar-kolya mar-kolya commented Nov 6, 2019

This patch should fix the agent on zulu8. Zulu8 recently made some changes that LoggerManager when JFR is loaded, and JFR gets loaded with some certificate handling code provided by JDK. The whole thing leads to LoggerManager being configured prematurely.

It should use real class to try to load
The problem was that on zulu8 loading OkHttp touches JFR which in turn
touches log manager - which would break things like JBOSS.

The fix is to delay installing agent (and writer) until log manager
things have settled down - in way similar to jmxfetch.

Unfortunately for 'main' agent this turns out to be more involved
because of classloader shenanigans.
@mar-kolya mar-kolya changed the title Mar kolya/fix agent for zulu8 fix agent for zulu8 Nov 6, 2019
@mar-kolya mar-kolya force-pushed the mar-kolya/fix-agent-for-zulu8 branch from 1d8af37 to 63b5504 Compare November 6, 2019 18:49
Using `datadog.trace.agent` to hold agent class causes problems due to
shadowing into this package of other classes.
@mar-kolya mar-kolya force-pushed the mar-kolya/fix-agent-for-zulu8 branch from 63b5504 to 42cece6 Compare November 6, 2019 18:58
@mar-kolya mar-kolya marked this pull request as ready for review November 6, 2019 20:53
@mar-kolya mar-kolya requested a review from a team as a code owner November 6, 2019 20:53
Copy link
Copy Markdown
Contributor

@jbachorik jbachorik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Tricky classloading :)
Just a few minor comments.

Comment thread dd-java-agent/src/main/java/datadog/trace/bootstrap/AgentBootstrap.java Outdated

@Override
public void run() {
/*
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'm little surprised we weren't doing this already.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Before' we only used this logic for jmxfetch which happens to fork out its own threads before doing anything useful so that wasn't really a problem.

Copy link
Copy Markdown
Contributor

@dougqh dougqh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I appreciate the comments. It helps a lot in explaining what we're working around.

Copy link
Copy Markdown
Contributor

@tylerbenson tylerbenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @mar-kolya. I have some concerns about delaying agent install unnecessarily (like Oracle Java 8 with JBOSS), since I'm not sure what impact that would have.

See questions below.

private static synchronized void installDatadogTracer(
final Instrumentation inst, final URL bootstrapURL) throws Exception {
if (AGENT_CLASSLOADER == null) {
throw new IllegalStateException("Datadog agent should have been started already");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone declares the javaagent on the command line multiple times, does this also prevent it from being started multiple times?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually should be safe and there is a comment about that few lines down.

final Method agentInstallerMethod =
agentInstallerClass.getMethod("installBytebuddyAgent", Instrumentation.class);
agentInstallerMethod.invoke(null, inst);
AGENT_CLASSLOADER = agentClassLoader;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename the AgentInstaller and AGENT_CLASSLOADER to be something else?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions?

* <li>Do not store any static data in this class
* <li>Do dot touch any logging facilities here so we can configure them later
* </ul>
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice comment! I wonder if we should add protection somehow to assert this class is never loaded on the bootstrap classpath.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to do that and it is really solves much...

throw new TimeoutException();
}

private static class StreamGobbler extends Thread {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you didn't live in 🇨🇦 I'd say you had 🦃 on your mind. Nice name though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This originates from SO somewhere.

* events which in turn loads LogManager. This is not a problem on newer JDKs because there JFR uses different
* logging facility.
*/
if (isJavaBefore9() && appUsingCustomLogManager) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should isolate this behavior to only when JFR is available on Java 8. I'm concerned about unintended side effects of delaying tracer installation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fixed now

if (entry.getKey().equals(typeName)) {
entry.getValue().run();
synchronized (CLASS_LOAD_CALLBACKS) {
final List<Runnable> callbacks = CLASS_LOAD_CALLBACKS.get(typeName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should they get removed after successful execution?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure - this code originally didn't do removal

// Note: this test is fails on IBM JVM, we would need to investigate this at some point
@Requires({ !System.getProperty("java.vm.name").contains("IBM J9 VM") })
@Retry
@Timeout(30)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is notorious for locking up the test suite. I'd much prefer to keep the timeout.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was locking up the testsuite because we didn't do proper capturing of stdout/stderr of forked process - which locked up the whole thing. I've fixed that. I didn't see it block anymore after the fix. I'd suggest we keep this removed and read that if it fails again.

Copy link
Copy Markdown
Contributor

@tylerbenson tylerbenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update description.

@mar-kolya mar-kolya merged commit 2789324 into master Nov 7, 2019
@mar-kolya mar-kolya deleted the mar-kolya/fix-agent-for-zulu8 branch November 7, 2019 19:39
@tylerbenson tylerbenson added this to the 0.38.0 milestone Nov 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants