You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The current trace sampling behavior is not a good default. It conflates the parent trace flag and Trace ID-based sampling. With no config to configure something more specific, the default behavior should match the (OTel parent sampler)[https://github.com/open-telemetry/opentelemetry-go/blob/e6e186bfa485f679e35bb775cba63ca24029590d/sdk/trace/sampling.go#L161C1-L170C2].
In what area(s)?
/area runtime
What version of Dapr?
1.1.x
Expected Behavior
OTel ParentSampler behavior:
Parent sampling flag on -> sample
Parent sampling flag off -> do not sample
No/invalid parent: -> fall back to a root sampler (often traceIdRatioSampler)
Actual Behavior
Dapr Trace Sampler behavior:
Parent sampling flag on -> sample
Parent sampling flag off -> fall back to traceIdRatioSampler
No/invalid parent: -> fall back to traceIdRatioSampler
The current implementation combines the second two cases and calls traceIdRatioSampler. This leads to orphaned traces in observability systems, where a correct sampling decision is made for a trace in an upstream service but Dapr does not respect that decision and will start orphaned traces SamplingRate% of the time.
At a minimum, the default behavior should be changed to correctly implement the above logic (and use the ParentSampler in the Go Otel library instead of reimplementing this logic).
A more complete solution would be to expand the Sampling config to provide a richer configuration. For example, it is impossible right now to configure Dapr to only trace when a parent asks for tracing, because setting SamplingRate to 0 completely disables sampling, even though this a very common-sense configuration to use.
Steps to Reproduce the Problem
Release Note
RELEASE NOTE:
The text was updated successfully, but these errors were encountered:
The current trace sampling behavior is not a good default. It conflates the parent trace flag and Trace ID-based sampling. With no config to configure something more specific, the default behavior should match the (OTel parent sampler)[https://github.com/open-telemetry/opentelemetry-go/blob/e6e186bfa485f679e35bb775cba63ca24029590d/sdk/trace/sampling.go#L161C1-L170C2].
In what area(s)?
/area runtime
What version of Dapr?
1.1.x
Expected Behavior
OTel ParentSampler behavior:
Parent sampling flag on -> sample
Parent sampling flag off -> do not sample
No/invalid parent: -> fall back to a root sampler (often traceIdRatioSampler)
Actual Behavior
Dapr Trace Sampler behavior:
Parent sampling flag on -> sample
Parent sampling flag off -> fall back to traceIdRatioSampler
No/invalid parent: -> fall back to traceIdRatioSampler
The current implementation combines the second two cases and calls traceIdRatioSampler. This leads to orphaned traces in observability systems, where a correct sampling decision is made for a trace in an upstream service but Dapr does not respect that decision and will start orphaned traces SamplingRate% of the time.
At a minimum, the default behavior should be changed to correctly implement the above logic (and use the ParentSampler in the Go Otel library instead of reimplementing this logic).
A more complete solution would be to expand the Sampling config to provide a richer configuration. For example, it is impossible right now to configure Dapr to only trace when a parent asks for tracing, because setting SamplingRate to 0 completely disables sampling, even though this a very common-sense configuration to use.
Steps to Reproduce the Problem
Release Note
RELEASE NOTE:
The text was updated successfully, but these errors were encountered: