OpenTelemetry access logs: Missing span ID breaks trace-context correlation #33906
Labels
area/opentelemetry
area/tracing
bug
stale
Description:
When an Envoy is configured to use both OpenTelemetry `tracing` and `access_log` exporting to OpenTelemetry, the log records sent downstream do not contain the span identifier, which means that the trace context referenced in the OpenTelemetry log records is invalid. The following is a snippet of the debug output of the OpenTelemetry collector debugger from the updated OpenTelemetry example from my fork:
The problem is the empty span ID in the log, which should instead be `fdbb20b1c5d77be5`.
In terms of OpenTelemetry, the trace context reported in the log record is invalid and will be ignored by tools that would use it, like Dash0.
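
To illustrate why a consumer would discard such a record: the OpenTelemetry API treats a span context as valid only when the trace ID (16 bytes) and the span ID (8 bytes) each contain at least one non-zero byte. Below is a minimal standalone sketch of that check on the raw bytes. The function names `isValidId` and `hasValidTraceContext` are hypothetical and are not part of Envoy or the OpenTelemetry SDK.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `id` is exactly `size` bytes long and
// contains at least one non-zero byte.
bool isValidId(const std::string& id, std::size_t size) {
  return id.size() == size &&
         std::any_of(id.begin(), id.end(), [](char c) { return c != 0; });
}

// Mirrors the span-context validity rule: a 16-byte trace ID and an
// 8-byte span ID, both non-zero. An empty span ID fails this check.
bool hasValidTraceContext(const std::string& trace_id, const std::string& span_id) {
  return isValidId(trace_id, 16) && isValidId(span_id, 8);
}
```

With an empty span ID, as in the log record described above, `hasValidTraceContext` returns false even though the trace ID is present.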
The issue seems to me to be here: the span ID is never set. Confusingly, the `log_context` abstraction does not even have a field for the span ID, which makes me wonder how other tracing systems, like Datadog's proprietary ones, are supposed to use this correlation successfully since, as far as I know, pretty much all modern distributed tracing systems use (at least) a pair of trace identifier and span identifier to specify correlation with spans.
The way I would fix it is to add a `getSpanIdAsHex()` to the Span abstraction in Envoy, leave it unimplemented for other tracers (although, as I wrote above, I suspect a few would need it), and use that to invoke `log_entry.mutable_span_id()` on the OpenTelemetry SDK.
Repro steps:
Try the updated OpenTelemetry example from my fork, send a request to one of the envoys, and check the debug output on the console.
Admin and Stats Output: N/A
Config: see the updated OpenTelemetry example from my fork
Logs: See above.
Call Stack: N/A
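
The fix proposed in the description could look roughly like the standalone sketch below. It is not Envoy code: `Span` and `LogRecord` are simplified stand-ins for Envoy's Span abstraction and the generated OTLP log record, `getTraceIdAsHex()` is assumed by analogy with the proposed `getSpanIdAsHex()`, and `hexToBytes`, `fillTraceContext`, and `FixedSpan` are hypothetical names introduced for illustration. It relies on the OTLP log record storing `span_id` as a `bytes` field (8 raw bytes), which protobuf's C++ API exposes as a `std::string`.

```cpp
#include <cstddef>
#include <string>

// Stand-in for Envoy's Span abstraction, reduced to two accessors.
// getSpanIdAsHex() is the accessor proposed in this issue.
class Span {
public:
  virtual ~Span() = default;
  virtual std::string getTraceIdAsHex() const = 0;
  // Default is empty so tracers that do not implement it still compile.
  virtual std::string getSpanIdAsHex() const { return ""; }
};

// Stand-in for the generated OTLP LogRecord message.
struct LogRecord {
  std::string trace_id;
  std::string span_id;
  std::string* mutable_trace_id() { return &trace_id; }
  std::string* mutable_span_id() { return &span_id; }
};

// Hypothetical span with fixed IDs, used only to exercise the sketch.
class FixedSpan : public Span {
public:
  FixedSpan(const std::string& trace_hex, const std::string& span_hex)
      : trace_hex_(trace_hex), span_hex_(span_hex) {}
  std::string getTraceIdAsHex() const override { return trace_hex_; }
  std::string getSpanIdAsHex() const override { return span_hex_; }

private:
  std::string trace_hex_;
  std::string span_hex_;
};

// Decodes a hex string into raw bytes. Returns false on empty input,
// odd length, or any non-hex character; `out` is then left unchanged.
bool hexToBytes(const std::string& hex, std::string& out) {
  if (hex.empty() || hex.size() % 2 != 0) return false;
  auto nibble = [](char c) -> int {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
  };
  std::string bytes;
  bytes.reserve(hex.size() / 2);
  for (std::size_t i = 0; i < hex.size(); i += 2) {
    const int hi = nibble(hex[i]);
    const int lo = nibble(hex[i + 1]);
    if (hi < 0 || lo < 0) return false;
    bytes.push_back(static_cast<char>((hi << 4) | lo));
  }
  out = bytes;
  return true;
}

// Copies the active span's IDs into the log record. A field is left
// untouched when the corresponding hex string is empty or malformed.
void fillTraceContext(const Span& span, LogRecord& log_entry) {
  hexToBytes(span.getTraceIdAsHex(), *log_entry.mutable_trace_id());
  hexToBytes(span.getSpanIdAsHex(), *log_entry.mutable_span_id());
}
```

Giving `getSpanIdAsHex()` an empty default keeps the change non-breaking for tracers that do not override it: their log records simply keep an empty `span_id`, which is the current behavior.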