#[tracing::instrument(err)] doesn't work with opentelemetry / datadog #2648
Comments
It occurs to me I forgot to mention some context. Long story short, here's what I'm doing in this application:
* improve rusoto `exec_atomic_write` errors
* manually attach an `error.message` to subspans (on per-op errors) and to the main span. Unfortunately, `#[tracing(err)]` -> Datadog span export loses the error message (see <tokio-rs/tracing#2648>), so the manual approach is better for now; see the sketch after this list. (Would be nice to have a proc macro that produces the right format, but it doesn't seem worth it just for keyvaluestore-rs's stuff.) (It'd be worthwhile to do the same on other operations, but small steps...)
* also fill in `otel.status_code` and `otel.kind`, which `tracing-opentelemetry` translates to the OpenTelemetry magic, and opentelemetry-datadog passes along. https://github.com/tokio-rs/tracing-opentelemetry/blob/f5f898b47e086659a7a78b495ef77661781895cd/src/layer.rs#L330 https://github.com/open-telemetry/opentelemetry-rust/blob/1f1a4fe0540b5df92b7cf5b289d6e89c2e1c02a8/opentelemetry-datadog/src/exporter/model/v05.rs#L183
* on `AtomicWriteConflict`, mention the offending key.
* if there is an error that isn't a condition failure or write conflict, pass it back to the caller rather than erroneously claiming write conflict. We could further modify the interface to pass any errors back per-op rather than just condition failure, but I went with the more modest change for now.
* `error.message` -> `error.msg` to match https://docs.datadoghq.com/tracing/trace_explorer/trace_view/?tab=spantags#more-information
* remove stray `println!`
* remove unnecessary conversion
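A minimal sketch of that manual approach, using the final `error.msg` name from the last rename. The `atomic_write`/`do_write` names and `io::Error` are hypothetical placeholders, not the real keyvaluestore-rs code: declare the fields as empty up front, then record them on the error path.

```rust
use tracing::{field, info_span};

fn atomic_write(key: &str) -> std::io::Result<()> {
    // Fields must be declared (here as Empty) before they can be recorded.
    let span = info_span!(
        "atomic_write",
        otel.status_code = field::Empty,
        error.msg = field::Empty, // the tag name Datadog expects
    );
    let _enter = span.enter();

    do_write(key).map_err(|e| {
        // Attach the Display output as a plain string attribute; the
        // Datadog exporter forwards span attributes as tags.
        span.record("error.msg", field::display(&e));
        span.record("otel.status_code", "ERROR");
        e
    })
}

// Stand-in for the real storage call.
fn do_write(_key: &str) -> std::io::Result<()> {
    Ok(())
}
```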
Using honeycomb here with a similar issue - errors just show up as span events with …
Bug Report
Version
Applies to the latest released versions as of this writing:
tracing = "0.1.37"
tracing-attributes = "0.1.26"
tracing-opentelemetry = "0.19.0"
opentelemetry = "0.19.0"
opentelemetry-datadog = "0.7.0"
opentelemetry-otlp = "0.12.0"
Platform
any
Crates
Interaction between `tracing-attributes`, `tracing-opentelemetry`, `opentelemetry`, and `opentelemetry-datadog`.
Description
I'd like the full integration from `#[tracing(err)]` to just work, following the expected protocols:

* OpenTelemetry semantic conventions: an event named `exception` with attributes `exception.escaped` (which should always be `true` in the case of `#[tracing::instrument(err)]`), `exception.message`, `exception.type`, and `exception.stack`.
* Datadog: `error.msg`, `error.type`, and `error.stack`.

From observation: it doesn't work with `opentelemetry_datadog`. E.g., if I use essentially `tracing_subscriber::registry().with(tracing_opentelemetry::layer().with_tracer(opentelemetry_datadog::new_pipeline().install_batch(...)))`, Datadog will correctly identify that the span has an error, but I can't see any details anywhere. I just see "Missing error message and stack trace" from the Datadog UI.
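Concretely, here is that setup expanded into compilable form. This is a sketch assuming the versions listed above, the exporter's `rt-tokio` runtime feature, and a local agent; the service name and endpoint are illustrative.

```rust
use opentelemetry::runtime;
use tracing_subscriber::prelude::*;

fn init_telemetry() -> Result<(), opentelemetry::trace::TraceError> {
    // Build a Datadog exporter pipeline and install it on the Tokio runtime.
    let tracer = opentelemetry_datadog::new_pipeline()
        .with_service_name("my-service") // illustrative
        .with_agent_endpoint("http://127.0.0.1:8126")
        .install_batch(runtime::Tokio)?;

    // Bridge tracing spans into OpenTelemetry via tracing-opentelemetry.
    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(tracer))
        .init();
    Ok(())
}
```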
From code inspection: I don't think it'd work well with `opentelemetry_otlp` either. In fact, I think `tracing-attributes` and `tracing-opentelemetry` aren't following the same protocol: the `#[tracing(err)]` code here will create an event (with no message, I think?) with an attribute `error` that has the `Display` or `Debug` output.
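For reference, the error arm that `#[tracing::instrument(err)]` generates looks roughly like this (a simplified sketch, not the literal expansion; `fallible` is a placeholder):

```rust
fn wrapped() -> Result<u32, std::io::Error> {
    // Inside the generated wrapper, the Err branch emits an event whose
    // only field is `error` (Display by default, Debug with err(Debug));
    // there is no `message` field.
    match fallible() {
        Ok(v) => Ok(v),
        Err(e) => {
            tracing::error!(error = %e);
            Err(e)
        }
    }
}

fn fallible() -> Result<u32, std::io::Error> {
    Err(std::io::Error::new(std::io::ErrorKind::Other, "boom"))
}
```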
`tracing_opentelemetry` has this promising-looking logic that looks to be based on the OpenTelemetry semantic conventions. It will fill in `exception.message` and `exception.stack`... but only on `record_error`. So the conversion to `Display` or `Debug` (`error = %e` or `error = ?e`) defeats that by calling `record_debug` instead in both cases (`DisplayValue` and `DebugValue`). Also, I think the event name won't get set to `exception` as it's supposed to; no tracing `message` field => no OpenTelemetry event name, right? There's also logic to copy it to the span's attributes (`OpenTelemetryLayer::with_exception_field_propagation` / `ExceptionFieldConfig::propagate`), but that's not the right thing to do by my reading of the semantic conventions above...

...and then I'm not sure the `opentelemetry` crates would do the right thing from there anyway. `opentelemetry_datadog` at least doesn't seem to have a conversion from the OpenTelemetry protocol to the Datadog protocol.

I don't know how to accomplish this, but I wish some group of folks could get together and make this integration work end to end. I can file a separate issue in the `opentelemetry` repo about the paragraph above, but I feel like there still may be a gap between the two that I don't know how to fill.
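As an aside on the `record_debug`/`record_error` point above: if I'm reading tracing's `Value` impls right, passing the error as a trait object rather than via `%e` or `?e` should reach the `record_error` path. An untested sketch:

```rust
use std::error::Error;
use std::io;

fn report(e: &io::Error) {
    // Casting to a trait object makes the visitor's record_error fire,
    // which is the hook the exception.message/exception.stack logic uses.
    tracing::error!(error = e as &(dyn Error + 'static));
}
```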
My first impression is that going through these two conversions just seems really janky. It looks like the `opentelemetry` crate has this `Span{,Ref}::record_error` method that could do the right thing for the given backend (otlp, datadog, etc). But `tracing_opentelemetry` isn't using it (instead trying to directly follow the OpenTelemetry semantic conventions), and I think it's not actually using backend-specific logic, so I don't see any reason to believe it'd work correctly with `opentelemetry_datadog` anyway.
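For reference, calling that method through the current context would look something like this (a sketch assuming the 0.19-era API named above; the error type is illustrative):

```rust
use opentelemetry::trace::TraceContextExt;
use opentelemetry::Context;

fn record_on_current_span(err: &std::io::Error) {
    // Fetch whatever OpenTelemetry span is current and hand it the error.
    let cx = Context::current();
    cx.span().record_error(err);
}
```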
(Bit of a tangent, but the tracing -> opentelemetry -> datadog conversion seems janky in other ways too. E.g. `tracing` has static/low-cardinality span names. `tracing_opentelemetry` copies those to the OpenTelemetry span names (unless overridden via the `otel.name` attribute). But then `opentelemetry_datadog` says in the crate-level doc comment that, because of cardinality concerns, "This exporter therefore takes a different approach of naming the span with the name of the tracing provider, and using the span name to set the resource_name". So even though my code specifies low-cardinality span names, by default they all get replaced with the useless `opentelemetry-datadog`. It's possible to override this, but it's just a stupid papercut from going through two conversions.)
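The `otel.name` override mentioned in the tangent is just a reserved span field; a sketch with illustrative names:

```rust
use tracing::info_span;

fn traced_request() {
    // tracing-opentelemetry uses the reserved `otel.name` field as the
    // OpenTelemetry span name in place of the static tracing span name.
    let span = info_span!("request", otel.name = "GET /users/:id");
    let _enter = span.enter();
    // ... handler body ...
}
```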