CAMEL-18661: Make span current and clear scope properly for async processing #8713
🌟 Thank you for your contribution to the Apache Camel project! 🌟 If necessary Apache Camel Committers may access logs and test results in the job summaries! |
General question: is there a good place/way to wrap callback executions (user or 3rd party library) with something like
Sorry for not giving this PR attention; we have been busy (as usual). Observability with OpenTelemetry and Micrometer is important, and thanks for helping with this.
I'm sorry, too. It was on my to-do list to review this and I forgot about it.
@davsclaus @oscerd thank you! Once you folks have time to take a look, please let me know what approach you'd like me to pursue and I'm happy to work on it further.
@davsclaus @oscerd friendly ping to take a look. I'd be happy to continue working on it based on your initial feedback.
I just found one typo. For placing the camel-tracing dependency in the camel-core-engine, I'm against polluting the core with a tracing component. This will probably create a circular dependency in the maven reactor. So we should find a different way.
core/camel-api/src/main/java/org/apache/camel/AsyncCallback.java
This is not possible; camel-base should depend only on Camel itself.
It's actually the opposite:
if the returned value == true, then it was a sync call and it's done; if the returned value == false, then it's an async call that is currently being processed by another thread (done when the callback is triggered).
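The return-value contract described in this comment can be illustrated with a minimal sketch. The `Callback` and `AsyncStep` interfaces below are hypothetical stand-ins written for this example, not Camel's own types; they only mirror the rule that `true` means "completed synchronously" and `false` means "another thread will finish the work and trigger the callback".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class AsyncContractSketch {

    /** Hypothetical stand-in for a completion callback. */
    interface Callback {
        void done(boolean doneSync);
    }

    /** Hypothetical stand-in for an asynchronous processing step. */
    interface AsyncStep {
        /** Returns true if the work finished before returning, false if it continues on another thread. */
        boolean process(String body, Callback callback);
    }

    /** Finishes on the calling thread, then reports "sync". */
    static final AsyncStep SYNC_STEP = (body, callback) -> {
        callback.done(true);
        return true;
    };

    /** Hands the work to another thread and returns immediately, reporting "async". */
    static final AsyncStep ASYNC_STEP = (body, callback) -> {
        CompletableFuture.runAsync(() -> callback.done(false));
        return false;
    };

    /** Runs a step, waits until its callback has fired, and returns the value process() reported. */
    static boolean runAndWait(AsyncStep step) {
        CountDownLatch finished = new CountDownLatch(1);
        boolean sync = step.process("hello", doneSync -> finished.countDown());
        try {
            if (!finished.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("callback was never invoked");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sync;
    }

    public static void main(String[] args) {
        System.out.println("sync step returned " + runAndWait(SYNC_STEP));
        System.out.println("async step returned " + runAndWait(ASYNC_STEP));
    }
}
```

In the async case the caller cannot assume the work is finished when `process` returns; it only knows that the callback will be triggered later, possibly on a different thread.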
66ff694
to
78389af
Compare
@davsclaus @oscerd thank you for the feedback. I've implemented the current approach with the extra event and added tests. Can you please take another look? |
40ddebf
to
afa146b
Compare
Hi @lmolkova Thanks for your contribution and support to look into this issue. Looking into
Please note that you don't have access to the pipeline's generated |
Using the event notifier to emit an "async processing" event is tricky, as there are potentially more places where this can happen in the core routing engine. However, they all bubble up to the base-engine in the
@bvahdat thank you! it should be fixed now. @davsclaus I removed notifications from everywhere except Do you have in mind any scenarios I can run to validate to make sure I didn't miss something? I tried to cover what I can think of in CurrentSpanTests. |
dc45e19
to
8989daf
Compare
@lmolkova have you tested your latest code with the AWS servicebus to see if the diagram/span looks correct |
@davsclaus, yes, I did (with Azure ServiceBus ) and yes, it works fine and looks the same as in the description. |
need to fix tests
It happens when the JDK fork-join pool is in use,
which is a JDK pool you use via private static final Executor DELAYED = CompletableFuture.delayedExecutor(100L, TimeUnit.MILLISECONDS);
thanks a lot, @davsclaus ! I was able to reproduce |
Thanks for looking into this. Yeah, I only tried with JDK 11, as I use that primarily because Camel v3 is still Java 11 based. Could it be that the JDK ForkJoin pool behaves differently or something?
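For context on the executor discussed above: `CompletableFuture.delayedExecutor` submits tasks to the default async executor after the delay, so the task body does not run on the thread that scheduled it. A small sketch (the class and method names are made up for this example) makes that visible without involving Camel or OpenTelemetry:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

final class DelayedExecutorSketch {

    // Same shape as the executor quoted in the review comment.
    private static final Executor DELAYED =
            CompletableFuture.delayedExecutor(100L, TimeUnit.MILLISECONDS);

    /** Name of the thread that actually runs a task submitted to DELAYED. */
    static String workerThreadName() {
        return CompletableFuture
                .supplyAsync(() -> Thread.currentThread().getName(), DELAYED)
                .join();
    }

    /** True only if the delayed task ran on the thread that submitted it. */
    static boolean runsOnCallerThread() {
        Thread caller = Thread.currentThread();
        return CompletableFuture
                .supplyAsync(() -> Thread.currentThread() == caller, DELAYED)
                .join();
    }

    public static void main(String[] args) {
        System.out.println("caller thread: " + Thread.currentThread().getName());
        System.out.println("worker thread: " + workerThreadName());
        System.out.println("same thread?   " + runsOnCallerThread());
    }
}
```

Because the worker is a pooled thread that outlives any single task, anything left in a `ThreadLocal` by one task can be observed by a later, unrelated task on the same thread, which is one plausible way such test failures show up.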
@lmolkova by any chance, could you please rebase on top of the main branch when you are done? |
Yeah, in the meantime we have upgraded OpenTelemetry to version 1.21.0.
77d0d64
to
93c9d81
Compare
@bvahdat sure, I rebased! I believe (again) that I fixed tests. There were two issues:
thanks @lmolkova and camel team!! |
Thank you all! |
Opentelemetry support is important. We would like to improve it in the future. This is a really good work. Thanks to everyone involved. |
…cessing (apache#8713)
* Prototype: make span current and clear scope
* cleanup
* tests and fixes
* cleanup and tests
* cleanup
* fix build
* Remove unnecessary notifications
* fix tests
* oops
* Fix tests on Java 11
* checkstyle
Fixes https://issues.apache.org/jira/browse/CAMEL-18661
Looking for input on the overall approach!
This change enables current span propagation to underlying libraries and end-user code (for OpenTelemetry): for all spans created by Camel, it calls into span.makeCurrent().
However, OpenTelemetry (and other tracing tools) rely on ThreadLocal to propagate context. They do it carefully for Executors, Reactor, etc., so it works in async scenarios too.
Instrumentations have to close the scope returned by span.makeCurrent() on the same thread where it was created to avoid leaking context (by leaving the current span on the current thread).
This is easy to guarantee with sync operations in Camel: ExchangeStarted/Sending and ExchangeCompleted/Sent events (where spans start and end) are called on the same thread.
However, for async operations, they are called on different threads.
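The same-thread requirement can be illustrated with a plain `ThreadLocal`. The sketch below does not use the OpenTelemetry API; `makeCurrent`, `Scope` and `current` are hypothetical stand-ins that only mimic the pattern of "set a thread-local value and return a handle that restores the previous one".

```java
import java.util.concurrent.CompletableFuture;

final class ScopeLeakSketch {

    // Stand-in for the tracer's thread-local "current span" slot.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Hypothetical scope handle: closing it restores the previous value on the closing thread. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    static Scope makeCurrent(String span) {
        String previous = CURRENT.get();
        CURRENT.set(span);
        // Note: this writes to the ThreadLocal of whichever thread calls close().
        return () -> CURRENT.set(previous);
    }

    static String current() {
        return CURRENT.get();
    }

    /** Scope opened and closed on the calling thread: nothing is left behind. */
    static String closeOnSameThread() {
        try (Scope ignored = makeCurrent("span-sync")) {
            // synchronous work would run here with "span-sync" as the current span
        }
        return current();
    }

    /** Scope opened here but closed on another thread: the calling thread keeps the span. */
    static String closeOnOtherThread() {
        Scope scope = makeCurrent("span-async");
        CompletableFuture.runAsync(scope::close).join();
        String leaked = current();
        CURRENT.remove(); // clean up so the demo does not pollute the calling thread
        return leaked;
    }

    public static void main(String[] args) {
        System.out.println("after same-thread close:  " + closeOnSameThread());
        System.out.println("after other-thread close: " + closeOnOtherThread());
    }
}
```

Closing the handle on a different thread leaves the original thread still pointing at the span, which is the leak the description refers to.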
This change also adds a new event, ExchangeAsyncStarted (naming needs more work), that notifies that the Processor.process call has ended, i.e. the async operation has started. Tracers would end the scope (but not the span) when this event is sent.
Alternative approach: add a dependency on camel-tracing to camel-base-engine and call into ActiveSpanManager directly:
This would be easier to maintain and would more or less guarantee that the scope is disposed on the same thread; however, a dependency on tracing might not be desirable.
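To make the event-based approach (the one this change proposes, not the alternative) more concrete, here is a rough sketch under stated assumptions. It uses no Camel or OpenTelemetry types: `Span`, `ExchangeTrace` and the `onStarted` / `onAsyncStarted` / `onCompleted` handlers are hypothetical names invented for the example, and only the async path is modelled.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncScopeSketch {

    /** Hypothetical span: just a name and an "ended" flag. */
    static final class Span {
        final String name;
        volatile boolean ended;

        Span(String name) {
            this.name = name;
        }
    }

    // Stand-in for the tracer's thread-local "current span" slot.
    private static final ThreadLocal<Span> CURRENT = new ThreadLocal<>();

    /** Hypothetical per-exchange tracing state driven by three events. */
    static final class ExchangeTrace {
        private final Span span;
        private Span previous;

        ExchangeTrace(String name) {
            this.span = new Span(name);
        }

        /** "Started" event: make the span current on the calling thread. */
        void onStarted() {
            previous = CURRENT.get();
            CURRENT.set(span);
        }

        /** "Async started" event, same thread as onStarted: end the scope, keep the span open. */
        void onAsyncStarted() {
            CURRENT.set(previous);
        }

        /** "Completed" event, possibly on another thread: end the span only. */
        void onCompleted() {
            span.ended = true;
        }
    }

    /** Returns {span was current during process, caller thread clean afterwards, span ended}. */
    static boolean[] simulate() {
        ExchangeTrace trace = new ExchangeTrace("testme");
        trace.onStarted();
        boolean currentDuringProcess = CURRENT.get() == trace.span;
        // The async work completes the exchange on a pool thread.
        CompletableFuture<Void> work = CompletableFuture.runAsync(trace::onCompleted);
        // process() has returned on the calling thread: clear the scope here.
        trace.onAsyncStarted();
        boolean callerClean = CURRENT.get() == null;
        work.join();
        return new boolean[] {currentDuringProcess, callerClean, trace.span.ended};
    }

    public static void main(String[] args) {
        boolean[] result = simulate();
        System.out.println("current during process: " + result[0]);
        System.out.println("caller thread clean:    " + result[1]);
        System.out.println("span ended:             " + result[2]);
    }
}
```

The point of the extra event is that the scope is always closed on the thread that opened it, while the span's lifetime still follows the exchange across threads.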
Traces with this change
(tick/testme spans come from Camel and were previously not correlated with ServiceBus spans coming from Azure SDK)