-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-26759 Fix trace continuity through CallRunner #4126
HBASE-26759 Fix trace continuity through CallRunner #4126
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
I think the problem here is that we forget to end the span created in ServerRpcConnection? Anyway, I think we should have a big span to trace the whole rpc processing... |
I did not attempt to perform an audit of the span lifecycle, only reconnected the orphaned span to what appears to be its proper parent. If you think there are bugs in properly closing spans, I hope that we can find them by way of the pom change i have over on https://github.com/apache/hbase/pull/4106/files#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8 |
I believe this patch fixes the issue because I no longer see a litany of these spans attached to a test-root span, and instead see spans of this name attached to a region scanner lifecycle,
|
I do find it strange that we have a span explicitly for a |
This comment was marked as outdated.
This comment was marked as outdated.
I mean we have a span here hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ServerRpcConnection.java Line 634 in 5dc663e
I think it will be the parent span for the spans at server side, and seems it has not been ended properly? |
I see that the trace scope is closed properly, but I think you're correct that the trace is not ended, neither with success nor failure.
My impression was that this span in |
We appear to be generating spans with without a parent context with the name `RegionScanner.close`. This was a symptom of span scope disconnect through the `CallRunner` execution path.
4780035
to
01ab773
Compare
RegionScanner.close
spans
@Apache9 please take another look. |
🎊 +1 overall
This message was automatically generated. |
Ah, sorry, forgot to reply here. No additional scope. Let me check the newest PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
if (!this.rpcServer.scheduler.dispatch(new CallRunner(this.rpcServer, call))) { | ||
if (this.rpcServer.scheduler.dispatch(new CallRunner(this.rpcServer, call, span))) { | ||
// unset span do that it's not closed in the finally block | ||
span = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the code structure here is not friendly for tracing...
Thanks for the detailed comments!
🎊 +1 overall
This message was automatically generated. |
Signed-off-by: Andrew Purtell <apurtell@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org>
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
We appear to be generating spans with without a parent context with the name
RegionScanner.close
.