Error traces sent to New Relic and Datadog are not being recognized as errors #1644

Closed
stephenhong opened this issue Nov 18, 2020 · 14 comments

@stephenhong

Describe the bug
I have a Node.js application sending traces to New Relic and Datadog. I'm using the latest release (0.14.0) of the OpenTelemetry Collector. I generated a 500 error with my app and the trace data was sent to both New Relic and Datadog. In both tools, the trace displayed the stack trace. However, neither tool recognized the trace as an error trace.

Steps to reproduce
Node.js application using 'signalfx-tracing' sends error traces to the latest release (0.14.0) of the OpenTelemetry Collector. Configure the collector to export traces to New Relic and Datadog.

What did you expect to see?
New Relic and Datadog recognizing the error traces as errors

What did you see instead?
New Relic and Datadog not recognizing the error traces as errors

What version did you use?
Version: v0.14.0

What config did you use?
Config: the yaml config file

Environment
OS: Amazon EC2 instance
Compiler(if manually compiled): (e.g., "go 14.2")

Additional context
I can provide screenshots if needed

@bogdandrutu
Member

Can you install the logging exporter and copy-paste the output for one of the trace?

@stephenhong
Author

2020-11-19T16:45:07.394Z INFO loggingexporter/logging_exporter.go:313 TracesExporter {"#spans": 9}
2020-11-19T16:45:07.394Z DEBUG loggingexporter/logging_exporter.go:370 ResourceSpans #0
Resource labels:
-> service.name: STRING(nodejs-mysql)
InstrumentationLibrarySpans #0
Span #0
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 23da87b76f03daba
ID : 330fc72c2d59ce8e
Name : expressTrace
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.542696 +0000 UTC
End time : 2020-11-19 16:45:05.542857 +0000 UTC
Attributes:
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
-> component: STRING(express)
-> signalfx.tracing.library: STRING(nodejs-tracing)
Span #1
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 23da87b76f03daba
ID : 38207c99007e4da6
Name : expressInit
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.54238 +0000 UTC
End time : 2020-11-19 16:45:05.542613 +0000 UTC
Attributes:
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
-> component: STRING(express)
-> signalfx.tracing.library: STRING(nodejs-tracing)
Span #2
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 23da87b76f03daba
ID : 5b980014b24f99bf
Name : query
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.541863 +0000 UTC
End time : 2020-11-19 16:45:05.542232 +0000 UTC
Attributes:
-> signalfx.tracing.library: STRING(nodejs-tracing)
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
-> component: STRING(express)
Span #3
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 7075e928069fe4b1
ID : 4003cf5725119575
Name :
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.543749 +0000 UTC
End time : 2020-11-19 16:45:05.543956 +0000 UTC
Attributes:
-> sfx.error.kind: STRING(ReferenceError)
-> sfx.error.message: STRING(error is not defined)
-> sfx.error.stack: STRING(ReferenceError: error is not defined
at Layer. (/root/distributed-tracing-nodejs/server.js:48:9)
at /root/distributed-tracing-nodejs/node_modules/signalfx-tracing/src/plugins/router.js:111:19
at Scope._activate (/root/distributed-tracing-nodejs/node_modules/signalfx-tracing/src/scope/new/scope.js:45:14)
at Scope.activate (/root/distributed-tracing-nodejs/node_modules/signalfx-tracing/src/scope/new/base.js:13:17)
at Object.wrapMiddleware (/root/distributed-tracing-nodejs/node_modules/signalfx-tracing/src/plugins/util/web.js:115:27)
at callHandle (/root/distributed-tracing-nodejs/node_modules/signalfx-tracing/src/plugins/router.js:106:14)
at /root/distributed-tracing-nodejs/node_modules/signalfx-tracing/src/plugins/router.js:62:14
at Layer.handle [as handle_request] (/root/distributed-tracing-nodejs/node_modules/express/lib/router/layer.js:95:5)
at next (/root/distributed-tracing-nodejs/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/root/distributed-tracing-nodejs/node_modules/express/lib/router/route.js:112:3))
-> signalfx.tracing.library: STRING(nodejs-tracing)
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
-> component: STRING(express)
-> error: STRING(true)
Span #4
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 23da87b76f03daba
ID : 0e0bfe3bf70398d8
Name : urlencodedParser
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.543148 +0000 UTC
End time : 2020-11-19 16:45:05.543236 +0000 UTC
Attributes:
-> component: STRING(express)
-> signalfx.tracing.library: STRING(nodejs-tracing)
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
Span #5
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 23da87b76f03daba
ID : 4296f383cc452ff5
Name : jsonParser
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.542912 +0000 UTC
End time : 2020-11-19 16:45:05.543116 +0000 UTC
Attributes:
-> component: STRING(express)
-> signalfx.tracing.library: STRING(nodejs-tracing)
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
Span #6
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID :
ID : 23da87b76f03daba
Name : /e
Kind : SPAN_KIND_SERVER
Start time : 2020-11-19 16:45:05.540629 +0000 UTC
End time : 2020-11-19 16:45:05.547325 +0000 UTC
Attributes:
-> signalfx.tracing.library: STRING(nodejs-tracing)
-> http.route: STRING(/e)
-> http.status_code: STRING(500)
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
-> http.url: STRING(http://10.203.250.241:3030/e)
-> error: STRING(true)
-> http.method: STRING(GET)
-> component: STRING(http)
Span #7
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 23da87b76f03daba
ID : 7075e928069fe4b1
Name : bound dispatch
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.543629 +0000 UTC
End time : 2020-11-19 16:45:05.545747 +0000 UTC
Attributes:
-> component: STRING(express)
-> signalfx.tracing.library: STRING(nodejs-tracing)
-> signalfx.tracing.version: STRING(0.7.0)
-> environment: STRING(nodejs-mysql)
Span #8
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 7075e928069fe4b1
ID : 7c5f4d080123b102
Name :
Kind : SPAN_KIND_UNSPECIFIED
Start time : 2020-11-19 16:45:05.544032 +0000 UTC
End time : 2020-11-19 16:45:05.545738 +0000 UTC
Attributes:
-> environment: STRING(nodejs-mysql)
-> component: STRING(express)
-> signalfx.tracing.library: STRING(nodejs-tracing)
-> signalfx.tracing.version: STRING(0.7.0)
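
For triage, the error-tagged spans can be picked out of a dump like this one mechanically. The following is a minimal Python sketch, not part of the collector: `parse_spans` and `error_tagged` are hypothetical helper names, the sample is a shortened excerpt of the output above with span headers written as `Span #N`, and the parser assumes single-line `STRING(...)` attributes (a multi-line value such as `sfx.error.stack` is simply skipped).

```python
import re

# Shortened excerpt in the same shape as the logging exporter output above.
SAMPLE_LOG = """\
Span #3
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 7075e928069fe4b1
ID : 4003cf5725119575
Name :
Kind : SPAN_KIND_UNSPECIFIED
Attributes:
-> sfx.error.kind: STRING(ReferenceError)
-> component: STRING(express)
-> error: STRING(true)
Span #4
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID : 23da87b76f03daba
ID : 0e0bfe3bf70398d8
Name : urlencodedParser
Kind : SPAN_KIND_UNSPECIFIED
Attributes:
-> component: STRING(express)
Span #6
Trace ID : 00000000000000002db198d7c8a9da91
Parent ID :
ID : 23da87b76f03daba
Name : /e
Kind : SPAN_KIND_SERVER
Attributes:
-> http.status_code: STRING(500)
-> error: STRING(true)
"""

SPAN_HEADER = re.compile(r"^Span #(\d+)\s*$")
FIELD = re.compile(r"^(Trace ID|Parent ID|ID|Name|Kind)\s*:\s*(.*)$")
ATTRIBUTE = re.compile(r"^-> ([\w.]+): STRING\((.*)\)\s*$")


def parse_spans(log_text):
    """Split the dump into spans, collecting header fields and attributes."""
    spans = []
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        header = SPAN_HEADER.match(line)
        if header:
            spans.append({"index": int(header.group(1)), "fields": {}, "attributes": {}})
            continue
        if not spans:
            continue  # ignore anything before the first span header
        field = FIELD.match(line)
        if field:
            spans[-1]["fields"][field.group(1)] = field.group(2).strip()
            continue
        attribute = ATTRIBUTE.match(line)
        if attribute:
            spans[-1]["attributes"][attribute.group(1)] = attribute.group(2)
    return spans


def error_tagged(spans):
    """Return the spans whose `error` attribute is the string "true"."""
    return [span for span in spans if span["attributes"].get("error") == "true"]
```

Run against the excerpt, `error_tagged(parse_spans(SAMPLE_LOG))` selects spans #3 and #6, the same two spans that carry `error: STRING(true)` in the full output.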

@MrAlias
Contributor

MrAlias commented Nov 19, 2020

Thanks for filing the issue @stephenhong. We recently merged an update to our status code handling, and I'm wondering if that change addressed your issue. Can you retry with the recently released v0.15.0 version (that should include #1587)?

@ericmustin
Contributor

@stephenhong I'm not able to recreate this on my end with errors from a node client. I am using the opentelemetry-js library for instrumenting a sample node app. I'm not very familiar with signalfx-tracing, so two quick questions:

  • in what format does it export traces to the OpenTelemetry Collector, and additionally
  • what receiver is being used here?

One thing that sticks out to me here is that in the debug logs I don't see a span.Status, which is how the datadogexporter currently determines whether there is an error. As far as I know, this is in line with the specification. Is it possible that the issue is that the client library being used here, or the receiver, is not setting a span.Status? Just trying to narrow things down.
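
To make that distinction concrete, here is a minimal Python sketch. It uses a hypothetical, simplified span model (plain dicts), not the collector's internal representation and not the datadogexporter's actual code; the function names and status constants are assumptions for illustration. The root span is reduced to two attributes copied from the debug log.

```python
# Hypothetical status values for this sketch only.
STATUS_UNSET = "UNSET"
STATUS_ERROR = "ERROR"

# Root span from the debug log, reduced to the fields that matter here.
# The log shows no status for it, so the "status" key is left out.
root_span = {
    "name": "/e",
    "attributes": {"error": "true", "http.status_code": "500"},
}


def is_error_by_status(span):
    """Status-only check: a span counts as an error only if its status says so."""
    return span.get("status", STATUS_UNSET) == STATUS_ERROR


def is_error_by_tag(span):
    """Tag-based check: look at the `error` attribute instead of the status."""
    return str(span.get("attributes", {}).get("error", "")).lower() == "true"


print(is_error_by_status(root_span))  # False: no status was ever set
print(is_error_by_tag(root_span))     # True: the error tag is present
```

A backend that only applies the first check would show the stack trace attributes but not flag the span, which matches the reported behavior.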

@MrAlias what do you think of the above? I realize debugging is tricky when we're trying to figure out root cause among ...3 different vendor libraries...so any insight you have here is super helpful

@ericmustin
Contributor

@bogdandrutu feel free to re-assign from @mx-psi to me for the datadog tracing related stuffs 🙇

@stephenhong
Author

stephenhong commented Nov 20, 2020

@MrAlias I tried sending the same trace using the recently released v0.15.0 version but I got the same result. Error traces are still not being recognized as errors in NR and DD.

@ericmustin The signalfx-tracing library sends traces in Zipkin format and I am using the Zipkin receiver with the OpenTelemetry Collector. In the collector log, I see that the error attribute is set to true. Is there a way for the span.Status to adjust accordingly with the error attribute?

@ericmustin
Contributor

ericmustin commented Nov 20, 2020

@stephenhong thanks for the additional context...I'd certainly be open to updating things at the exporter level but I think that doesn't solve the root cause here which is that, if I understand correctly, the zipkin receiver isn't setting span status like it should be. I also want to make sure you can send traces to NR 🙂

@bogdandrutu do you want me to add exporter support for marking a span as an error based on this non-spec compliant span attribute, or should this be something that gets addressed at the zipkin receiver level? I am not very familiar with the zipkin receiver but I can try to dig in here if that helps, just want to unblock the end user here.

@stephenhong
Author

@ericmustin
Contributor

@stephenhong sorry, misphrasing on my part. I mean I want to make sure your traces are marked as errors correctly in both vendors.

@stephenhong
Author

@ericmustin No, neither NR nor DD is recognizing the traces as errors.

@ericmustin
Contributor

@stephenhong yup I think we're on the same page. Working to get this fixed 👍

Investigation
It looks like the zipkin translation code used to rely on the error tag, see: https://github.com/open-telemetry/opentelemetry-collector/pull/1002/files#diff-e8c9e4b66e4631dea1b8be1094a17e31ec8037ba033074ce310341bbdf9860a3L495

but now relies on, I think, a different tag, depending on whether using zipkinv2 or zipkinv1

From a brief investigation of the sfx client, it appears they export zipkin in v2 format: https://github.com/signalfx/signalfx-nodejs-tracing/blob/8b2063aa2e0fbd834bc4f2ad959add37a985f72c/README.md#license-and-versioning

The SignalFx-Tracing Library for JavaScript is a fork of the DataDog APM JavaScript Tracer that has been modified to provide Zipkin v2 JSON formatting, B3 trace propagation functionality, and properly annotated trace data for handling by SignalFx Microservices APM.

So, it would appear the issue is with either the client's definition of the zipkinv2 format, or the collector's zipkinv2 => internal translation logic.
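
As a rough illustration of what a receiver-side fix could look like, here is a minimal Python sketch that derives a status from the tags of a Zipkin v2 JSON span. It is not the collector's translation code. The output shape, the status constants, and the function name `zipkin_to_internal` are hypothetical, and since the thread does not say which tag the translator currently reads, the choice of the `error` tag here is an assumption. The input uses the Zipkin v2 field names (`traceId`, `id`, `name`, `kind`, `tags`) with values copied from the root span in the log.

```python
import json

# Hypothetical status values for this sketch only.
STATUS_UNSET = "UNSET"
STATUS_ERROR = "ERROR"

# Hand-written Zipkin v2 JSON span; values are taken from the root span
# in the debug log, the exact payload sent by the client is not shown there.
ZIPKIN_V2_SPAN = json.loads("""
{
  "traceId": "00000000000000002db198d7c8a9da91",
  "id": "23da87b76f03daba",
  "name": "/e",
  "kind": "SERVER",
  "tags": {
    "component": "http",
    "error": "true",
    "http.method": "GET",
    "http.status_code": "500"
  }
}
""")


def zipkin_to_internal(zipkin_span):
    """Copy tags to attributes and set an error status when `error` is tagged.

    Assumption: any `error` tag whose value is not "false" marks the span
    as failed.
    """
    tags = dict(zipkin_span.get("tags") or {})
    has_error = "error" in tags and tags["error"].strip().lower() != "false"
    return {
        "trace_id": zipkin_span.get("traceId"),
        "span_id": zipkin_span.get("id"),
        "name": zipkin_span.get("name", ""),
        "attributes": tags,
        "status": STATUS_ERROR if has_error else STATUS_UNSET,
    }


internal_span = zipkin_to_internal(ZIPKIN_V2_SPAN)
print(internal_span["status"])
```

Setting the status once at translation time means every exporter that reads the status benefits, instead of each exporter adding its own tag check.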

@bogdandrutu Am I understanding things correctly here? What next steps would you recommend here, is there someone more familiar with zipkin and/or sfx that can help here?

@ericmustin
Contributor

@bogdandrutu we're blocked here by @owais on the above referenced issue, is there anything I can do to help here?

@ericmustin
Contributor

ericmustin commented Dec 19, 2020

👋 @stephenhong just wanted to ping here to let you know we have a PR approved upstream (see above) that should be merged soon and resolve the zipkin issue you were seeing, ideally available in the next release. Definitely let us know if you have any other questions/issues with collector export; feel free to open an issue and tag us or reach out over the gitters as well. Cheers!

@mx-psi
Member

mx-psi commented Sep 2, 2021

This should be fixed. If you are still having issues, feel free to reopen it!

mx-psi closed this as completed Sep 2, 2021