Unable to use the OpenTelemetry javaagent #594

Closed
HughPowell opened this issue Apr 21, 2022 · 2 comments

Comments

@HughPowell

I'm trying to use the OpenTelemetry javaagent with Aleph, but it fails. I've created a minimal project that throws the following exception:

OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2022-04-21 11:06:13:000 +1000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 1.13.0
Exception in thread "main" java.lang.NullPointerException
	at io.opentelemetry.javaagent.instrumentation.netty.v4_1.InstrumentedAddressResolverGroup.getResolver(InstrumentedAddressResolverGroup.java:42)
	at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:194)
	at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:46)
	at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:180)
	at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:166)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
	at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605)
	at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
	at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:1012)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:516)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at manifold.executor$thread_factory$reify__529$f__530.invoke(executor.clj:47)
	at clojure.lang.AFn.run(AFn.java:22)
	at java.base/java.lang.Thread.run(Thread.java:829)
[otel.javaagent 2022-04-21 11:06:23:764 +1000] [OkHttp http://localhost:4317/...] ERROR io.opentelemetry.exporter.internal.grpc.OkHttpGrpcExporter - Failed to export metrics. The request could not be executed. Full error message: Failed to connect to localhost/0:0:0:0:0:0:0:1:4317

when built with

clj -T:build uber

and run with

OTEL_TRACES_EXPORTER=logging java -javaagent:opentelemetry-javaagent.jar -jar target/honeycomb-aleph-0.1.2-standalone.jar

I can't work out exactly how that thread is constructed, so I can't tell what is going on. Any thoughts?

@KingMob
Collaborator

KingMob commented Apr 21, 2022

Hmm, I've never encountered this before. The Bootstrap.java line in the NPE trace refers to this.resolver.getResolver(eventLoop), but all the resolver-related code I see is years old, so I would have expected the issue to surface before now.

A search of OpenTelemetry's GH shows 279 Netty-related issues. Given that it works without the OT agent, I can't really justify spending limited time looking into what might not be an Aleph issue, but I can suggest a few things to try:

  1. Bump up the Netty version in your deps. Aleph is currently on 4.1.65.Final, but a newer version might not have the same issue with OT.
  2. Comment out the (.setContextClassLoader curr-loader) call in Manifold's manifold.executor/thread-factory, then rebuild and reinstall. I can see there being weird conflicts over classloaders leading to this.
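
To illustrate what the second suggestion changes, here is a minimal Java sketch of the general pattern, not Manifold's actual code. The class ContextLoaderDemo and its methods factory and observedLoader are hypothetical names made up for this example. It assumes a thread factory that optionally pins a context class loader on each new thread; when the pin is skipped (the equivalent of commenting out the call), the new thread simply inherits the context class loader of the thread that created it.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicReference;

public class ContextLoaderDemo {

    // Hypothetical factory: when `pin` is true, every new thread gets `pinned`
    // as its context class loader; otherwise the JVM default applies, which is
    // to inherit the loader of the thread that constructs the new Thread.
    public static ThreadFactory factory(ClassLoader pinned, boolean pin) {
        return runnable -> {
            Thread t = new Thread(runnable);
            if (pin) {
                t.setContextClassLoader(pinned);
            }
            return t;
        };
    }

    // Runs a task on a thread from the factory and returns the context
    // class loader that the task observed while running.
    public static ClassLoader observedLoader(ThreadFactory f) {
        AtomicReference<ClassLoader> seen = new AtomicReference<>();
        Thread t = f.newThread(
            () -> seen.set(Thread.currentThread().getContextClassLoader()));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        // A distinct loader instance, so pinning is observable.
        ClassLoader pinned = new ClassLoader(ClassLoader.getSystemClassLoader()) {};

        boolean pinnedSeen = observedLoader(factory(pinned, true)) == pinned;
        boolean inherited = observedLoader(factory(pinned, false))
            == Thread.currentThread().getContextClassLoader();

        System.out.println("pinned loader seen on worker: " + pinnedSeen);
        System.out.println("creator's loader inherited when not pinned: " + inherited);
    }
}
```

Whether this difference has any bearing on the NPE above is not established in this thread; the sketch only shows what the suggested experiment would alter.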

@HughPowell
Author

Thanks for looking into this. I opened a similar issue with the OpenTelemetry team in case the problem was on their side, and they've created a fix. I've tested it locally and it works, so I'm going to close this issue.
