feat(contexts): Add TraceId by default #5759

Merged
thetruecpaul merged 4 commits into master from cpaul/031926/default-trace-id on Apr 9, 2026

@thetruecpaul
Contributor

We are beginning to store events in EAP, which — as a Trace-centric datastore — requires all TraceItems to be associated with a TraceId. This is problematic, since we currently don't require that events actually have TraceIds. That means that we're currently just silently dropping a bunch of events before we can successfully ingest them into EAP.

This PR adds a random TraceId & SpanId to events that do not have a TraceContext.

@thetruecpaul thetruecpaul requested a review from a team March 23, 2026 17:54
@thetruecpaul thetruecpaul requested a review from a team as a code owner March 23, 2026 17:54
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Random TraceContext not added when contexts is None
    • normalize_contexts now initializes Annotated<Contexts> with Contexts::new before processor::apply, ensuring a random TraceContext is added even when contexts were missing.

Preview (42352a05a6)
diff --git a/relay-event-normalization/src/event.rs b/relay-event-normalization/src/event.rs
--- a/relay-event-normalization/src/event.rs
+++ b/relay-event-normalization/src/event.rs
@@ -1308,6 +1308,7 @@
 
 /// Normalizes incoming contexts for the downstream metric extraction.
 fn normalize_contexts(contexts: &mut Annotated<Contexts>) {
+    contexts.get_or_insert_with(Contexts::new);
     let _ = processor::apply(contexts, |contexts, _meta| {
         // Reprocessing context sent from SDKs must not be accepted, it is a Sentry-internal
         // construct.

@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch 2 times, most recently from 95690de to 3579819 Compare March 23, 2026 18:15
@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch 4 times, most recently from 0fefa72 to 64d1970 Compare March 23, 2026 21:30
Comment on lines +336 to +351
impl TraceContext {
    /// Generates a random [`TraceId`] and random [`SpanId`].
    /// Leaves all other fields blank.
    pub fn random() -> Self {
        let mut trace_meta = Meta::default();
        trace_meta.add_remark(Remark::new(RemarkType::Substituted, "trace_id.missing"));

        let mut span_meta = Meta::default();
        span_meta.add_remark(Remark::new(RemarkType::Substituted, "span_id.missing"));
        TraceContext {
            trace_id: Annotated(Some(TraceId::random()), trace_meta),
            span_id: Annotated(Some(SpanId::random()), span_meta),
            ..Default::default()
        }
    }
}
Contributor Author

This is one of two non-test changes.

Comment on lines +1320 to +1325
// We need a TraceId to ingest the event into EAP.
// If the event lacks a TraceContext, add a random one.
if !contexts.contains::<TraceContext>() {
    contexts.add(TraceContext::random())
}

Contributor Author

This is one of two non-test changes.

@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch 2 times, most recently from a159791 to eae2186 Compare March 23, 2026 23:15
@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch from eae2186 to e30157b Compare March 23, 2026 23:25
@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch 2 times, most recently from 5406ae5 to 75a2d8a Compare March 23, 2026 23:40
@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch from 75a2d8a to ba308fb Compare March 24, 2026 17:16
@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch 2 times, most recently from 785348e to 9c7ebde Compare March 24, 2026 17:38
@thetruecpaul thetruecpaul requested a review from mjq March 24, 2026 19:50
Comment on lines +346 to +349
trace_meta.add_remark(Remark::new(RemarkType::Substituted, "trace_id.missing"));

let mut span_meta = Meta::default();
span_meta.add_remark(Remark::new(RemarkType::Substituted, "span_id.missing"));
Member

How will these remarks show up in the UI if we merge this right now? Might be worth trying with a local relay hooked up to production sentry.io before merging (see https://github.com/getsentry/relay/?tab=readme-ov-file#building-and-running). I'm worried that UI annotates this as some kind of processing error, and then users see this on almost every error event.

An alternative would be to not set a remark at all, and rather set the origin field of the trace context to something like "relay".

Member

The UI doesn't show meta at all right now from the spans dataset. It's a task for a project upcoming shortly (attribute explorer).

Member

"The UI doesn't show meta at all right now from the spans dataset"

But it does for the errors data set, right? I.e. JSON ends up in nodestore, and the views in Issues will render at least some of the _meta.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: SpanId byte order inconsistency between random and from_str
    • SpanId::random() now uses to_be_bytes() so its internal byte order matches SpanId::from_str() and remains consistent across platforms.

Preview (73d35bc862)
diff --git a/relay-event-schema/src/protocol/contexts/trace.rs b/relay-event-schema/src/protocol/contexts/trace.rs
--- a/relay-event-schema/src/protocol/contexts/trace.rs
+++ b/relay-event-schema/src/protocol/contexts/trace.rs
@@ -172,7 +172,7 @@
 impl SpanId {
     pub fn random() -> Self {
         let value: u64 = rand::random_range(1..=u64::MAX);
-        Self(value.to_ne_bytes())
+        Self(value.to_be_bytes())
     }
 }

pub fn random() -> Self {
    let value: u64 = rand::random_range(1..=u64::MAX);
    Self(value.to_ne_bytes())
}

SpanId byte order inconsistency between random and from_str

Low Severity

SpanId::random() stores the u64 using to_ne_bytes() (native endian), but SpanId::from_str() stores it using to_be_bytes() (big endian). On little-endian platforms (virtually all modern servers), these produce different byte orderings for the same u64 value. While the Display/from_str round-trip happens to preserve bytes correctly, this inconsistency means randomly-generated SpanIds follow a different internal byte convention than parsed ones. Any future code that interprets the inner [u8; 8] as a u64 via from_be_bytes (the convention implied by from_str) would get wrong results for random SpanIds. Using to_be_bytes() here would maintain consistency.
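
The distinction drawn in this comment can be checked with a small, self-contained sketch. It is not Relay's code: `SpanIdSketch` and all of its methods are hypothetical stand-ins. It assumes, as the review comments in this thread describe, that display writes the stored bytes in storage order and that parsing stores the parsed `u64` big-endian.

```rust
/// Hypothetical stand-in for a span ID: eight raw bytes. Not Relay's `SpanId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SpanIdSketch([u8; 8]);

impl SpanIdSketch {
    /// Stores the value big-endian (the convention attributed to `from_str`).
    pub fn from_u64_be(value: u64) -> Self {
        Self(value.to_be_bytes())
    }

    /// Stores the value in native byte order (what the reviewed `random()` did).
    pub fn from_u64_ne(value: u64) -> Self {
        Self(value.to_ne_bytes())
    }

    /// Renders the stored bytes in storage order as lowercase hex (assumed display behavior).
    pub fn to_hex(&self) -> String {
        self.0.iter().map(|b| format!("{b:02x}")).collect()
    }

    /// Parses exactly 16 hex digits as a u64 and stores it big-endian (assumed parse behavior).
    pub fn parse_hex(s: &str) -> Option<Self> {
        if s.len() != 16 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        u64::from_str_radix(s, 16).ok().map(|v| Self(v.to_be_bytes()))
    }

    /// Reinterprets the stored bytes as a big-endian u64.
    pub fn as_u64_be(&self) -> u64 {
        u64::from_be_bytes(self.0)
    }
}
```

Under those assumptions, a hex round trip returns the same bytes for both constructors. Only code that reinterprets the inner bytes with `u64::from_be_bytes` sees a different number for the native-endian variant on little-endian targets, which is the consistency concern raised here.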

@thetruecpaul thetruecpaul requested a review from jjbayer March 30, 2026 18:13
Member

@jjbayer jjbayer left a comment

Code looks OK to me but I would be more comfortable if we could feature flag the auto-generation (see code linked below). Then we can dogfood it, see what it does to average payload sizes (e.g. in S4S) and check what the existing UI looks like.

/// Features exposed by project config.
#[derive(Clone, Copy, Debug, Eq, PartialEq, PartialOrd, Ord, Hash, Serialize, Deserialize)]
pub enum Feature {

@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch from ff8697f to 7d666ae Compare April 1, 2026 21:15
@thetruecpaul
Contributor Author

Added a feature flag; PR adding it to flagpole up in https://github.com/getsentry/sentry-options-automator/pull/7064

@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch from 7d666ae to eeb9e61 Compare April 1, 2026 21:29
impl SpanId {
    pub fn random() -> Self {
        let value: u64 = rand::random_range(1..=u64::MAX);
        Self(value.to_ne_bytes())

Bug: SpanId::random() uses native-endian bytes (to_ne_bytes), but serialization/deserialization (Display/FromStr) expect big-endian. This causes incorrect roundtripping on little-endian systems.
Severity: MEDIUM

Suggested Fix

In SpanId::random(), change value.to_ne_bytes() to value.to_be_bytes(). This will align the byte order during generation with the big-endian order expected by the Display and FromStr implementations, ensuring correct serialization and deserialization roundtrips.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: relay-event-schema/src/protocol/contexts/trace.rs#L174

Potential issue: A byte-ordering inconsistency exists in `SpanId` handling. The
`SpanId::random()` function generates an ID using native-endian byte order
(`to_ne_bytes()`), while the `Display` and `FromStr` implementations, used for
serialization and deserialization, assume big-endian byte order. On common little-endian
architectures like x86_64, this causes a randomly generated `SpanId` to fail a
serialization-deserialization roundtrip. For example, a generated ID will be displayed
with its bytes reversed, and parsing that string back will result in a different ID
value. This can lead to data integrity issues and broken trace continuity if these
randomly generated IDs are ever stored and re-ingested.

@thetruecpaul thetruecpaul requested a review from jjbayer April 2, 2026 20:31
@thetruecpaul
Contributor Author

I'll go through and revert the test changes that are causing the failures once I get an OK on the new direction / use of Feature.

Member

@jjbayer jjbayer left a comment

Feature usage looks good to me!

span_allowed_hosts: &[], // only supported in relay
span_op_defaults: Default::default(), // only supported in relay
performance_issues_spans: Default::default(),
should_add_trace_id_by_default: Default::default(),
Member

nit: I would give this an imperative name, e.g. derive_trace_id. See remove_other, enrich_spans, emit_event_errors.

#[serde(rename = "projects:relay-playstation-uploads")]
PlaystationUploads,
/// Add a random trace ID to events that lack one.
#[serde(rename = "organizations:add-default-trace-id-relay")]
Member

nit: relay- is usually a prefix (see feature above this one).

@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch from eeb9e61 to 43afc56 Compare April 9, 2026 06:28
Comment on lines +172 to +175
pub fn random() -> Self {
    let value: u64 = rand::random_range(1..=u64::MAX);
    Self(value.to_ne_bytes())
}

Bug: SpanId::random() uses native-endian bytes (to_ne_bytes), while parsing assumes big-endian, causing incorrect serialization and parsing on little-endian systems.
Severity: HIGH

Suggested Fix

Ensure consistent endianness across SpanId creation, serialization, and parsing. Modify SpanId::random() to use to_be_bytes() instead of to_ne_bytes(). This will align the byte order of randomly generated IDs with the expectations of the FromStr and Display implementations, guaranteeing correct behavior on all architectures.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: relay-event-schema/src/protocol/contexts/trace.rs#L172-L175

Potential issue: The `SpanId::random()` function generates a `u64` and converts it to
bytes using `to_ne_bytes()`. On little-endian systems, which include most modern server
architectures, this produces a little-endian byte array. However, the `Display`
implementation serializes these bytes in their storage order, creating a byte-reversed
hex string. When this string is later parsed by `FromStr`, it is interpreted as a
big-endian value, resulting in a different `SpanId` value than the one originally
generated. This will cause incorrect span IDs to be sent to downstream systems that
expect standard big-endian hex encoding.

@thetruecpaul thetruecpaul force-pushed the cpaul/031926/default-trace-id branch from e5be3e3 to 9927ecc Compare April 9, 2026 06:41
@thetruecpaul thetruecpaul requested a review from jjbayer April 9, 2026 06:47
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Method named random but trace_id is deterministic
    • Renamed TraceContext::random(event_id) to TraceContext::from_event_id(event_id) and updated call sites/tests/comments to reflect deterministic trace ID derivation.

Preview (eb60b26a0d)
diff --git a/relay-event-normalization/src/event.rs b/relay-event-normalization/src/event.rs
--- a/relay-event-normalization/src/event.rs
+++ b/relay-event-normalization/src/event.rs
@@ -1331,10 +1331,10 @@
         contexts.0.remove("reprocessing");
 
         // We need a TraceId to ingest the event into EAP.
-        // If the event lacks a TraceContext, add a random one.
+        // If the event lacks a TraceContext, derive one from the event id.
 
         if config.derive_trace_id && !contexts.contains::<TraceContext>() {
-            contexts.add(TraceContext::random(event_id))
+            contexts.add(TraceContext::from_event_id(event_id))
         }
 
         for annotated in &mut contexts.0.values_mut() {

diff --git a/relay-event-schema/src/protocol/contexts/trace.rs b/relay-event-schema/src/protocol/contexts/trace.rs
--- a/relay-event-schema/src/protocol/contexts/trace.rs
+++ b/relay-event-schema/src/protocol/contexts/trace.rs
@@ -332,9 +332,9 @@
 }
 
 impl TraceContext {
-    /// Generates a random [`SpanId`] and takes `[TraceId]` from the event's UUID.
+    /// Generates a random [`SpanId`] and derives [`TraceId`] from the event's UUID.
     /// Leaves all other fields blank.
-    pub fn random(event_id: Uuid) -> Self {
+    pub fn from_event_id(event_id: Uuid) -> Self {
         let mut trace_meta = Meta::default();
         trace_meta.add_remark(Remark::new(RemarkType::Substituted, "trace_id.missing"));
 
@@ -641,8 +641,8 @@
     }
 
     #[test]
-    fn test_random_trace_context() {
-        let rand_context = TraceContext::random(Uuid::new_v4());
+    fn test_trace_context_from_event_id() {
+        let rand_context = TraceContext::from_event_id(Uuid::new_v4());
         assert!(rand_context.trace_id.value().is_some());
         assert_eq!(
             rand_context

Reviewed by Cursor Bugbot for commit 9927ecc.

impl TraceContext {
    /// Generates a random [`SpanId`] and takes `[TraceId]` from the event's UUID.
    /// Leaves all other fields blank.
    pub fn random(event_id: Uuid) -> Self {

Method named random but trace_id is deterministic

Low Severity

TraceContext::random(event_id) derives trace_id deterministically from the event UUID, not randomly. Only the span_id is actually random. The method was originally fully random (first iteration of the PR) but was changed to derive from event_id without updating the name. This is misleading — callers seeing TraceContext::random(event_id) would reasonably assume the output is fully random, not that it embeds the event UUID as the trace ID.
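
To make the naming concern concrete, here is a minimal sketch of a constructor whose trace ID is fully determined by its input. `TraceContextSketch` and its methods are hypothetical and are not Relay's `TraceContext`. It assumes the 16 UUID bytes are reused verbatim as the trace ID, and it takes the span ID as a parameter so that the sketch needs no random-number crate.

```rust
/// Hypothetical stand-in for a trace context; not Relay's `TraceContext`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TraceContextSketch {
    pub trace_id: [u8; 16],
    pub span_id: [u8; 8],
}

impl TraceContextSketch {
    /// Reuses the 16 bytes of the event UUID as the trace ID (deterministic).
    /// The span ID is supplied by the caller; in the PR it is generated randomly.
    pub fn from_event_id(event_id: [u8; 16], span_id: [u8; 8]) -> Self {
        Self {
            trace_id: event_id,
            span_id,
        }
    }

    /// Renders the trace ID as 32 lowercase hex characters.
    pub fn trace_id_hex(&self) -> String {
        self.trace_id.iter().map(|b| format!("{b:02x}")).collect()
    }
}
```

Two contexts built from the same event ID share a trace ID even when their span IDs differ, which is why a name such as `from_event_id` describes the behavior more accurately than `random`.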

Member

Could rename to auto.

fn test_normalize_adds_trace_id() {
let json = r#"
{
"type": "transaction",
Member

Do we actually want this normalization to also apply to transactions? If not, we should probably filter on the event type in normalize_contexts.
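
If the answer is no, the filter could look roughly like the sketch below. Everything here is hypothetical: `EventKind` and `should_derive_trace_id` are illustrative names, not Relay's types or functions, and whether transactions should be excluded is the open question of this comment, not a decision recorded in the thread.

```rust
/// Hypothetical event kinds, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    Error,
    Transaction,
    Default,
}

/// Decides whether a trace context should be derived for an event.
///
/// `derive_trace_id` mirrors the feature-gated config flag shown in the diff;
/// `has_trace_context` stands in for `contexts.contains::<TraceContext>()`.
pub fn should_derive_trace_id(
    derive_trace_id: bool,
    kind: EventKind,
    has_trace_context: bool,
) -> bool {
    derive_trace_id && !has_trace_context && kind != EventKind::Transaction
}
```

With this shape, the existing condition only gains one extra clause, so events that already carry a trace context and projects without the feature behave as before.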

@thetruecpaul thetruecpaul added this pull request to the merge queue Apr 9, 2026
Merged via the queue into master with commit f7bf859 Apr 9, 2026
31 checks passed
@thetruecpaul thetruecpaul deleted the cpaul/031926/default-trace-id branch April 9, 2026 21:41