grpc server plugin reports wrong status code #2170
Comments
For anyone else who comes across this problem: it happens when an unfinished span is still associated with the trace at the time the grpc response finishes. When the grpc.request span is finished, it seems to be put into the in-memory processor queue. However, as soon as the cancelled event fires, the grpc status code is updated on that span, which mutates the same in-memory reference held in the queue, so the wrong status code is reported to the datadog agent. The cancelled event seems to fire as soon as the grpc stream is reset (even for rstCode 0). If the grpc request ends without any unfinished spans, status codes are reported as normal, which is why I was unable to reproduce it before. So either the processor needs to be updated to copy the span (seems expensive), the Span class could be updated to prevent changes to tags after a span is finished (could cause problems with other plugins), or the grpc plugin could be updated to ignore a cancelled event if the span is already finished.
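To make the mechanism above concrete, here is a minimal sketch of the suspected race and of the third option (ignoring a late cancelled event). It does not use the tracer's real classes: `SketchSpan`, `queue`, `traceCall`, and the `guard` flag are hypothetical stand-ins, and the tag name `grpc.status.code` is assumed for illustration. Status 1 is gRPC's CANCELLED code and 5 is NOT_FOUND.

```javascript
const { EventEmitter } = require('node:events')

// Hypothetical stand-in for a tracer span; not the real Span class.
class SketchSpan {
  constructor (name) {
    this.name = name
    this.tags = {}
    this.finished = false
  }

  setTag (key, value) { this.tags[key] = value }
  finish () { this.finished = true }
}

// Hypothetical in-memory queue that holds references (not copies)
// to finished spans until they are flushed to the agent.
const queue = []

function traceCall (call, statusCode, { guard }) {
  const span = new SketchSpan('grpc.request')

  call.once('cancelled', () => {
    // Proposed fix: a cancellation that arrives after finish() is ignored.
    if (guard && span.finished) return
    span.setTag('grpc.status.code', 1) // 1 = CANCELLED
  })

  // The response completes first: tag the real status, finish, enqueue.
  span.setTag('grpc.status.code', statusCode)
  span.finish()
  queue.push(span)
  return span
}

// Without the guard, the late event overwrites the tag on the queued span.
const unguardedCall = new EventEmitter()
const unguarded = traceCall(unguardedCall, 5, { guard: false })
unguardedCall.emit('cancelled')

// With the guard, the queued span keeps the status set by the response.
const guardedCall = new EventEmitter()
const guarded = traceCall(guardedCall, 5, { guard: true })
guardedCall.emit('cancelled')

console.log(unguarded.tags['grpc.status.code']) // 1
console.log(guarded.tags['grpc.status.code']) // 5
```

Because the queue stores a reference, anything that mutates the span between `finish()` and the flush changes what is reported, which is why the guard has to live in the event handler rather than in the exporter.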
I think that's the best option for now, but later on we might also want to freeze tags when the span finishes, although as you mentioned this will require making sure that it doesn't break any plugin that might depend on the current behaviour.
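As a rough illustration of the "freeze tags when the span finishes" idea, the sketch below rejects tag writes after `finish()`. `FreezingSpan` and its methods are hypothetical and only show the shape of the change; the real Span class may need a different approach, and any plugin that tags a span after finishing it would silently lose those tags.

```javascript
// Hypothetical span whose tags become read-only once finish() is called.
class FreezingSpan {
  constructor (name) {
    this.name = name
    this._tags = {}
    this._finished = false
  }

  // Returns true if the tag was stored, false if the span was already finished.
  setTag (key, value) {
    if (this._finished) return false
    this._tags[key] = value
    return true
  }

  getTag (key) { return this._tags[key] }

  finish () {
    this._finished = true
    // Also freeze the object so direct writes to _tags cannot change it.
    Object.freeze(this._tags)
  }
}

const frozenSpan = new FreezingSpan('grpc.request')
frozenSpan.setTag('grpc.status.code', 5)
frozenSpan.finish()

// A late write, such as one triggered by a cancelled event, is rejected.
const accepted = frozenSpan.setTag('grpc.status.code', 1)

console.log(accepted) // false
console.log(frozenSpan.getTag('grpc.status.code')) // 5
```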
I can't seem to reproduce this. Unless I explicitly cancel, no cancellation is happening. @brandontuttle Do you have an example of a case where the cancel event is emitted without an actual cancellation?
I ended up being able to reproduce and fix the issue with #2339.
Expected behaviour
When a grpc server returns a non-0 status code, it is reported through to datadog tags properly.
Actual behaviour
A high percentage of the time, the server grpc.request trace may report a status code of "1" instead of the proper status code, while the grpc.request span for the client side reports the correct one. There seems to be an edge case between the grpc stream being closed and the grpc response being returned that causes the tag to show the wrong status code.
Steps to reproduce
I spent several hours trying to get together a sample that would reproduce this problem and could not come up with anything.
Environment
@grpc/grpc-js versions 1.6.7 and 1.3.4