@mjq mjq commented Nov 20, 2025

Currently, transaction name clustering and normalization happen partly in Relay and partly in Sentry's event_manager.save_transaction_events. This normalization applies only to transactions and to spans extracted from those transactions.

For standalone spans and OTLP spans, we need to apply the clustering elsewhere. We will introduce it into the segment enrichment step of the span ingestion pipeline.

This PR introduces the first step of the process: stripping identifier-like patterns out of names that aren't explicitly parameterized (as indicated by the sentry.span.source attribute). Future PRs will add the interactions with the transaction clusterer (saving seen names and applying compiled clustering rules).
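To illustrate the idea, here is a minimal sketch of identifier stripping. It is not the code from this PR: the function names, the pattern set, and the assumption that a source value of "url" marks an unparameterized name are all hypothetical.

```python
import re

# Hypothetical pattern set (an assumption for illustration only):
# a path part counts as "identifier-like" if it is a UUID, a long hex
# string, or a plain integer.
_IDENTIFIER = re.compile(
    r"""
    [0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}
    | [0-9a-fA-F]{32,}
    | \d+
    """,
    re.VERBOSE,
)


def scrub_identifiers(name: str) -> str:
    """Replace every identifier-like path part with "*"."""
    return "/".join(
        "*" if _IDENTIFIER.fullmatch(part) else part
        for part in name.split("/")
    )


def normalize_segment_name(name: str, source: str) -> str:
    """Scrub only names whose source suggests they are raw, unparameterized.

    Assumption: "url" stands for "not explicitly parameterized"; names from
    any other source are returned untouched.
    """
    if source != "url":
        return name
    return scrub_identifiers(name)
```

Matching whole path parts rather than arbitrary substrings keeps parts such as `v2` intact, since only a part that is entirely identifier-like gets replaced.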

The changes are behind an org flag (organizations:normalize_segment_names_in_span_enrichment) for testing and easy rollback.

Closes ENG-5953.

@mjq mjq requested review from a team as code owners November 20, 2025 17:44
@github-actions github-actions bot added the Scope: Backend label Nov 20, 2025
codecov bot commented Nov 20, 2025

❌ 1 Tests Failed:

Tests completed | Failed | Passed | Skipped
330 | 1 | 329 | 10
View the top 1 failed test(s) by shortest run time
::tests.sentry.ingest.transaction_clusterer.test_normalization
Stack Traces | 0s run time
import file mismatch:
imported module 'test_normalization' has this __file__ attribute:
  .../sentry/event_manager/test_normalization.py
which is not the same as the test file we want to collect:
  .../ingest/transaction_clusterer/test_normalization.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules


@jjbayer jjbayer left a comment

Nice!

@mjq mjq merged commit 1a193b1 into master Nov 24, 2025
69 checks passed
@mjq mjq deleted the mjq/segment-name-clustering-p1-normalization branch November 24, 2025 16:04
mjq added a commit that referenced this pull request Nov 24, 2025
Continues the work started in #103739.

As a first step to adding incremental segment name clustering to segment
enrichment, record each seen segment name (like we do for transaction names in
Sentry's `event_manager`). To work around the different signatures for
transactions and segment spans, I extracted as much of the logic as I could
into a second function, with one caller per scenario.

As with the previous PR, the changes are behind an org flag
(organizations:normalize_segment_names_in_span_enrichment) for testing and easy
rollback.
mjq added a commit that referenced this pull request Nov 26, 2025
Continues the work started in
#103739.

As a first step to adding incremental segment name clustering to segment
enrichment, record each seen segment name (like we do for transaction
names in Sentry's `event_manager`). To work around the different
signatures for transactions and segment spans, I extracted as much of
the logic as I could into a second function, with one caller per
scenario.

As with the previous PR, the changes are behind an org flag
(organizations:normalize_segment_names_in_span_enrichment) for testing
and easy rollback.

Closes ENG-5951.
mjq added a commit that referenced this pull request Dec 2, 2025
Continues the work from #103739
and #103913.

This PR ports the application of transaction name clusterer rules from
Relay into segment enrichment. Relay will continue to apply transaction
name clustering to spans from transactions, while this code will handle
standalone spans (spans v2) and OTLP spans.

See [the preamble to
`sentry.ingest.transaction_clusterer.tree`](https://github.com/getsentry/sentry/blob/faca90e0cc06cf2492da22a01539352568dea565/src/sentry/ingest/transaction_clusterer/tree.py#L1-L46)
for an explanation of how transaction name clustering rules are matched.
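As a rough illustration of rule application, the sketch below assumes rules are slash-separated glob patterns in which `*` matches exactly one path part (and is replaced by `*` in the output) and a trailing `**` matches whatever remains, including nothing. The function names, the rule syntax, and the first-match-wins policy are assumptions made for this example. It is not the code in `tree.py` or Relay.

```python
from typing import Iterable, Optional


def apply_rule(name: str, rule: str) -> Optional[str]:
    """Return the rewritten name if `rule` matches `name`, else None."""
    name_parts = name.split("/")
    rule_parts = rule.split("/")
    out = []
    for i, rule_part in enumerate(rule_parts):
        if rule_part == "**":
            # Trailing wildcard: keep the rest of the name unchanged.
            out.extend(name_parts[i:])
            return "/".join(out)
        if i >= len(name_parts):
            return None  # the name is shorter than the rule
        if rule_part == "*":
            out.append("*")  # replace the matched part with a placeholder
        elif rule_part == name_parts[i]:
            out.append(rule_part)
        else:
            return None  # literal part does not match
    # Without a trailing "**", the rule must cover the whole name.
    if len(name_parts) != len(rule_parts):
        return None
    return "/".join(out)


def apply_rules(name: str, rules: Iterable[str]) -> str:
    """Apply the first matching rule (assumed policy); else return name as-is."""
    for rule in rules:
        rewritten = apply_rule(name, rule)
        if rewritten is not None:
            return rewritten
    return name
```

For example, under these assumptions the rule `/users/*/**` rewrites `/users/alice/settings/profile` to `/users/*/settings/profile` and leaves `/health` unchanged.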

Closes ENG-5747.