New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance gRPC plugin #4177
Enhance gRPC plugin #4177
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4177 +/- ##
==========================================
+ Coverage 26.36% 26.37% +<.01%
==========================================
Files 1179 1179
Lines 25863 25863
Branches 3753 3753
==========================================
+ Hits 6819 6821 +2
+ Misses 18440 18438 -2
Partials 604 604
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you enhance the plugin, i think you should adjust Grpc tests cases, then use ci to check first.
Yes, it already in our checklist, and @GuoDuanLZ will help us in tests. |
This is a very good pull request template for the plugin contribution. 👍 |
Anyone want to discuss about new issue and pull request templates? |
What is this? Could you explain a little more? |
I will update the doc of PR later, I am too busy to update PR in the working day. |
* Add grpc on error test * Fixed bugs * Add license and fix bugs * Fixed bugs * Fix bugs * Override expect data * Update expectedData.yaml Co-authored-by: Kanro <higan@live.cn>
@wu-sheng @wayilau @kezhenxu94 |
It is still unclear for me about this section. Client/server side tracing are both required in nearly every RPC plugin. What makes the gRPC different? Manually setting this is very painful for the end user. cc @kezhenxu94 Could you get the point of this? |
@wu-sheng
A call will be traced both in server1(client-side), and server2(server-side). They are redundant, and the server-side is more accurate than the client-side because of the gRPC framework. You can distinguish the server active message(parent is entry span) and passive message(parent is request message span). In the internal server case, we don't need the client-side tracing, but in the external server case, client-side tracing is the only way to tracing gRPC call. |
In the default config, all gRPC calls are internal calls. No client-side tracing will be created. I think the default config is enough for most cases (most of external APIs will not be gRPC). But if you need the client-side tracing for separate peers, just add it to |
If no client-side span, there is no client-side metrics of topology. Then you would detect the error of unreachable or network perf unstable issues from trace and response time page. The thing you posted is exactly an APM should collect. Back to my point, this is a basic design of SkyWalking. We should not argue about this in a single one plugin. If you want to discuss that, it is more than this. You need to change the design and protocol of the project. If you want to change this, I prefer you keep that in private, and only push the both sides tracing in the upstream. |
I agree with you, the error of unreachable or network perf unstable issues are also important for service. I can simplify client-side tracing for this, like no |
I think in most cases, both of them should exist, including |
Update test case
It seems test case passed with client tracing |
...-plugin/src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/client/TracingClientCall.java
Outdated
Show resolved
Hide resolved
...-plugin/src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/server/TracingServerCall.java
Outdated
Show resolved
Hide resolved
...-plugin/src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/server/TracingServerCall.java
Show resolved
Hide resolved
...src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/server/TracingServerCallListener.java
Show resolved
Hide resolved
...src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/server/TracingServerCallListener.java
Show resolved
Hide resolved
...src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/server/TracingServerCallListener.java
Show resolved
Hide resolved
...-plugin/src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/server/TracingServerCall.java
Show resolved
Hide resolved
...src/main/java/org/apache/skywalking/apm/plugin/grpc/v1/server/TracingServerCallListener.java
Show resolved
Hide resolved
Some discuss not about this PR. I found there is bad throughput about OAP server collect tracing. Maybe separate collectors from OAP server is a good idea? If you are interested in this, maybe we could open an issue to discuss. |
You could use another issue or mail list to discuss this. But to be honest, I can't see much difference. Many discussions have been done about this. |
parentSpanId: 2, spanId: 3, spanLayer: RPCFramework, startTime: nq 0, | ||
endTime: nq 0, componentId: 23, componentName: '', isError: false, | ||
spanType: Local, peer: '', peerId: 0} | ||
- {operationName: GreeterBlocking.sayHello/client/Response/onClose, operationId: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do these two use different formats? I know they may be right but it seems strange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do these two use different formats? I know they may be right but it seems strange.
When the span is shorter, it is a simplified format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you follow the same pattern? I prefer we keep this consistent same in all test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you follow the same pattern? I prefer we keep this consistent same in all test cases.
Yes, of course.
spanType: Exit | ||
peer: '127.0.0.1:18080' | ||
peerId: 0 | ||
- {operationName: GreeterBlockingError.sayHello/client/Request/onComplete, operationId: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one too.
This same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @kezhenxu94 @arugal @dmsolr Please recheck.
@kezhenxu94 Could you recheck this recently? |
Skywalking with enhanced gRPC
This PR enhances the original gRPC plugin of Skywalking.
I created PR early for the request for comments. @GuoDuanLZ and I will accept opinion and improve this PR.
Checklist
This PR still in WIP state, you can merge this PR after all check is done.
Description
The original gRPC plugin just provided very basically tracing function, the exception of business code will not be traced.
We refactored this plugin for those functions:
Operations
The operation name of that this plugin created will be like
There are two sides of tracing for gRPC:
client-side
andserver-side
.Trace side provides a view for gRPC request, it means what the gRPC request looks like in the client or the server.
There are five combinations for sides and events.
For simplifying tracing span, this plugin just creates
onMessage
spans for streaming calls.onMessage
span created./Request/onMessage
span created./Response/onMessage
span created./onMessage
span created.For example:
Internal and external gRPC server tracing
If you have two gRPC servers called
Server1
andServer2
,Server1
will call theServer2.test
method in your business code, we just need theserver-side
tracing because of theServer2
is an internal server,Server2
will provideserver-side
view,server-side
view will be more useful for tracing a call thanclient-side
view.If
Server2
is an external gRPC server which you can't getserver-side
view by it. You will need theServer1
client-side
view for the call.Configure plugin
This plugin provides three configs for configuring internal and external gRPC servers.
Default config is no external gRPC server, no
client-side
spans created. If you need theclient-side
view for three-part APIs, add the host and port likeservices.googleapis.com:11800
intoINCLUDED_CLIENT_TRACING_PEERS
.Screenshots
WIP