1.42.0 regression: RANDOM_TRACE_ID flag (0x02) is set on every span, including spans whose trace_id was inherited from a remote traceparent — breaks Level-1 APM backends #5240

@suxue

Describe your environment

Versions ≤1.41.0 are not affected.

What happened?

After upgrading from 1.41.0 to 1.42.0, every outgoing HTTP request produced by auto-instrumented clients (opentelemetry-instrumentation-openai, -httpx, -aiohttp-client, etc.) carries a traceparent header ending in -03 instead of the long-standing -01:

traceparent: 00-e91d2af669385dba4477d73a1017dc7b-ba292baf102ea196-03
                                                                  ^^
                                                                  was -01 in 1.41
  • 0x03 = 0x01 (SAMPLED) | 0x02 (RANDOM_TRACE_ID).
  • Several APM backends and homegrown collectors treat the 0x02 bit as an unknown flag and silently drop the spans they receive (they parse flags == 0x01 instead of bitmask-checking the SAMPLED bit). Our internal APM is one of them; we've also seen production middleware and proxies in the wild that do the same — anything written against W3C Trace Context Level 1 is in scope.
  • The trace_id in the failing requests was inherited verbatim from the incoming traceparent on our /run endpoint — never generated by our SDK. The RANDOM_TRACE_ID advertisement is therefore not just unwanted but factually incorrect: we are advertising "this trace ID has ≥56 random bits because we generated it that way," when in reality we never touched the ID generator for this span.
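The difference between the two parsing styles can be shown with a minimal sketch. The helper names below (parse_flags, is_sampled_strict, is_sampled_bitmask) are hypothetical and exist only for illustration; they are not taken from any particular backend:

```python
SAMPLED = 0x01          # W3C Trace Context "sampled" flag
RANDOM_TRACE_ID = 0x02  # W3C Trace Context Level 2 "random-trace-id" flag


def parse_flags(traceparent: str) -> int:
    """Return the trace-flags byte of a version-00 traceparent header."""
    _version, _trace_id, _parent_id, flags = traceparent.strip().split("-")[:4]
    return int(flags, 16)


def is_sampled_strict(traceparent: str) -> bool:
    """Fragile: compares the whole flags byte, so any extra bit reads as 'not sampled'."""
    return parse_flags(traceparent) == SAMPLED


def is_sampled_bitmask(traceparent: str) -> bool:
    """Robust: tests only the SAMPLED bit and ignores bits it does not know."""
    return bool(parse_flags(traceparent) & SAMPLED)


old = "00-e91d2af669385dba4477d73a1017dc7b-ba292baf102ea196-01"
new = "00-e91d2af669385dba4477d73a1017dc7b-ba292baf102ea196-03"

print(is_sampled_strict(old), is_sampled_strict(new))
print(is_sampled_bitmask(old), is_sampled_bitmask(new))
```

A consumer written like is_sampled_strict treats the -03 header as unsampled and drops the span, while one written like is_sampled_bitmask accepts both headers.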

Where the regression lives

[opentelemetry/sdk/trace/__init__.py](https://github.com/open-telemetry/opentelemetry-python/blob/v1.42.0/opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py), in Tracer.start_span():

trace_flags = (
    trace_api.TraceFlags(trace_api.TraceFlags.SAMPLED)
    if sampling_result.decision.is_sampled()
    else trace_api.TraceFlags(trace_api.TraceFlags.DEFAULT)
)

if self.id_generator.is_trace_id_random():           # ← added in #4854
    trace_flags = trace_api.TraceFlags(
        trace_flags | trace_api.TraceFlags.RANDOM_TRACE_ID
    )

The check is on self.id_generator.is_trace_id_random() — i.e., a property of the id generator instance, not a property of the span's actual trace_id. The default RandomIdGenerator.is_trace_id_random() returns True, so the flag is OR'd in unconditionally for every span created by that provider — including child spans whose trace_id was inherited from a traceparent and never went through generate_trace_id().

The closest internal discussion of this is one inline comment by @aabmass on PR #4854 (2026-02-19):

Hopefully downstream consumers are all properly handling bitmasking to check for sampled flag and not flags == 1

…which is essentially the failure mode we're reporting. There was no further mitigation, no release-note warning, and no opt-in.

Why this is a regression, not just "new feature, broken consumers"

  1. It mis-advertises trace-id origin. Even if every consumer were Level-2 compliant, the flag on an inherited-trace-id span is a lie. Per the [W3C spec](https://www.w3.org/TR/trace-context-2/#considerations-for-trace-id-field-generation):

    The "random-trace-id" flag … indicates that at least the 7 rightmost bytes (56 bits) of the trace ID were generated randomly with uniform distribution.

    That guarantee is about the trace_id itself, not about the generator the SDK is configured with. A correct implementation can only set the flag for spans where this SDK's id generator actually produced the trace_id.

  2. It's a silent wire-format change. The 1.42.0 release notes describe the change as "Added" — a feature. There is no mention that every outgoing traceparent from existing applications will change shape, that the tracestate propagation behavior changes, or that consumers must now bitmask. Operators who pin opentelemetry-sdk>=1.27 (a very common constraint) get a different wire format on the next pip install with no warning.

  3. The OTel-Go implementation of the same Level 2 feature (opentelemetry-go#8012, "trace: add Random Trace ID Flag", shipped in v1.43.0, 2026-04-03) is API-only: TraceFlags.IsRandom() / WithRandom() are exposed for use, but the SDK does not unconditionally OR 0x02 into every emitted span. Go users were not affected.

Steps to Reproduce

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry import trace, context as otel_context
from opentelemetry.trace import (
    NonRecordingSpan, SpanContext, TraceFlags, format_trace_id, format_span_id, set_span_in_context,
)
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

# Simulate an inbound request whose upstream traceparent had flags = 01.
upstream_trace_id = 0xe91d2af669385dba4477d73a1017dc7b
upstream_span_id  = 0x560e8d7ab1550fbf

parent_ctx = set_span_in_context(
    NonRecordingSpan(SpanContext(
        trace_id=upstream_trace_id,
        span_id=upstream_span_id,
        is_remote=True,
        trace_flags=TraceFlags(0x01),
    )),
    otel_context.Context(),
)

# A provider with the default (random) ID generator — exactly what every
# auto-instrumentor receives when wired via instrumentor.instrument(tracer_provider=...).
provider = TracerProvider()
tracer = provider.get_tracer(__name__)

# Start a CHILD span; trace_id is inherited from parent_ctx, never generated by us.
child = tracer.start_span("child", context=parent_ctx)
carrier: dict[str, str] = {}
TraceContextTextMapPropagator().inject(carrier, context=set_span_in_context(child))
print(carrier["traceparent"])

# Output on 1.42.0:  00-e91d2af669385dba4477d73a1017dc7b-<random>-03
# Output on 1.41.0:  00-e91d2af669385dba4477d73a1017dc7b-<random>-01

The trace_id in the outgoing header is the inherited e91d2af669385dba4477d73a1017dc7b — but the flag now claims we generated it randomly.

Expected Result

N/A

Actual Result

N/A

Additional context

Workaround

Subclass RandomIdGenerator and override is_trace_id_random() to return False:

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.id_generator import RandomIdGenerator

class _HonestRandomIdGenerator(RandomIdGenerator):
    """Suppress the misleading RANDOM_TRACE_ID flag on derived spans.

    The SDK's check looks at the id generator's *capability*, not at whether
    the current span's trace_id was actually generated by it. We have many
    child spans that inherit trace_id from a remote parent; advertising
    'random-trace-id' on those is a lie that older APM backends reject.
    """
    def is_trace_id_random(self) -> bool:
        return False

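# Add whatever other TracerProvider arguments (resource, sampler, etc.) your application already passes.
provider = TracerProvider(id_generator=_HonestRandomIdGenerator())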

This restores the 1.41 wire format while keeping random ID generation for the rare spans we genuinely root.

Suggested fixes (any one, in order of preference)

  1. Move the check from generator capability to actual provenance. Only OR RANDOM_TRACE_ID when the current span's trace_id was just produced by self.id_generator.generate_trace_id() in the same start_span() call. Inherited child spans should keep whatever flags the parent carried (or, more strictly, never set the random flag themselves).
  2. Make the flag opt-in via env var (OTEL_PYTHON_EMIT_RANDOM_TRACE_ID_FLAG=true) for at least one minor release, defaulting to off, so operators on Level-1 backends have time to upgrade. Match opentelemetry-go's behavior (API-only, no unconditional emission) until consumers catch up.
  3. At minimum, document the behavior change in the 1.42.0 release notes as "Changed" not "Added", with explicit warning that downstream consumers parsing flags == 0x01 will drop traces. Currently the only mention is the requirement that custom IdGenerator subclasses should set is_trace_id_random()=True — there is nothing telling operators their outgoing wire format will change.
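Fix 1 can be sketched as a simplified, pure-Python model that does not import the SDK. ParentContext, ModelIdGenerator and both functions are hypothetical stand-ins, not SDK code. The first function mirrors the capability-based check quoted under "Where the regression lives"; the second applies the provenance rule, taking the "keep whatever the parent carried" variant:

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple

DEFAULT = 0x00
SAMPLED = 0x01
RANDOM_TRACE_ID = 0x02


@dataclass(frozen=True)
class ParentContext:
    """Hypothetical stand-in for a parent SpanContext."""
    trace_id: int
    trace_flags: int


class ModelIdGenerator:
    """Hypothetical stand-in for the default random id generator."""

    def generate_trace_id(self) -> int:
        return random.getrandbits(128)

    def is_trace_id_random(self) -> bool:
        return True


def flags_capability_based(
    gen: ModelIdGenerator, parent: Optional[ParentContext], sampled: bool
) -> Tuple[int, int]:
    """Model of the 1.42.0 behavior: the flag depends only on the generator."""
    trace_id = parent.trace_id if parent else gen.generate_trace_id()
    flags = SAMPLED if sampled else DEFAULT
    if gen.is_trace_id_random():
        flags |= RANDOM_TRACE_ID
    return trace_id, flags


def flags_provenance_based(
    gen: ModelIdGenerator, parent: Optional[ParentContext], sampled: bool
) -> Tuple[int, int]:
    """Model of fix 1: the flag depends on where this trace_id came from."""
    flags = SAMPLED if sampled else DEFAULT
    if parent is None:
        # Root span: the trace_id is produced here, so the generator's claim applies.
        trace_id = gen.generate_trace_id()
        if gen.is_trace_id_random():
            flags |= RANDOM_TRACE_ID
    else:
        # Child span: the trace_id is inherited, so only pass through the parent's bit.
        trace_id = parent.trace_id
        flags |= parent.trace_flags & RANDOM_TRACE_ID
    return trace_id, flags


gen = ModelIdGenerator()
remote_parent = ParentContext(0xE91D2AF669385DBA4477D73A1017DC7B, SAMPLED)
print(f"{flags_capability_based(gen, remote_parent, True)[1]:02x}")
print(f"{flags_provenance_based(gen, remote_parent, True)[1]:02x}")
```

Under this model a child of a -01 parent stays at -01, a root span still advertises the random flag, and a parent that already carried 0x02 has it propagated unchanged. A real patch would have to thread the same provenance information through Tracer.start_span().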

Would you like to implement a fix?

None
