Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(transaction): Normalize transaction name #1621

Merged
merged 8 commits into from
Nov 23, 2022

Conversation

olksdr
Copy link
Contributor

@olksdr olksdr commented Nov 22, 2022

Adding feature which, when enabled, supports transaction names normalization by replacing UUIDs, SHAs and numerical IDs in transaction names by placeholders.

This is the first step which should help to prevent high cardinality for transaction names.

ref: #1597

Adding feature which, when enabled, supports transaction names
normalization by replacing UUIDs, SHAs and numerical IDs in transaction
names by placeholders.

This is the first step which should help to prevent high cardinality
for transaction names.
@@ -663,6 +663,7 @@ pub struct LightNormalizationConfig<'a> {
pub measurements_config: Option<&'a MeasurementsConfig>,
pub breakdowns_config: Option<&'a BreakdownsConfig>,
pub normalize_user_agent: Option<bool>,
pub normalize_transaction_name: Option<bool>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Does this need to be an Option<bool>? I don't know why normalize_user_agent is one to be honest, I feel like a simple bool would do just fine.

relay-general/src/store/regexes.rs Show resolved Hide resolved
// ranges will change.
for matches in TRANSACTION_NAME_NORMALIZER_REGEX.captures_iter(trans) {
// Skip first, since it's always the entire match and the following are groups.
// And if there is no matches, just cotinue to next iteration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// And if there is no matches, just cotinue to next iteration.
// And if there is no matches, just continue to next iteration.

Comment on lines 244 to 245
for name in capture_names {
if let Some(caps) = TRANSACTION_NAME_NORMALIZER_REGEX.captures(trans) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for name in capture_names {
if let Some(caps) = TRANSACTION_NAME_NORMALIZER_REGEX.captures(trans) {
if let Some(caps) = TRANSACTION_NAME_NORMALIZER_REGEX.captures(trans) {
for name in capture_names {

if meta.original_value().is_none() {
meta.set_original_value(Some(trans.to_string()))
}
let mut collector = String::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible to allocate a new String only once for each transaction name. Assuming that capture_names are ordered , we can start writing [0..m1.start()], then [m1.end()...m2.start()], etc.

*trans = collector;
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no way we can unify the two for loops in this function?

Comment on lines 1415 to 1421
[
"normalize_transaction_name",
"s",
51,
54
]
],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@olksdr olksdr requested a review from jjbayer November 23, 2022 11:14
@@ -71,6 +71,9 @@ pub enum Feature {
/// Enables ingestion of Session Replays (Replay Recordings and Replay Events)
#[serde(rename = "organizations:session-replay")]
Replays,
/// docs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget the docs :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I knew you will see it! 😄

@olksdr olksdr enabled auto-merge (squash) November 23, 2022 13:17
olksdr added a commit to getsentry/sentry that referenced this pull request Nov 23, 2022
… names (#41642)

See relay side in getsentry/relay#1621 where the
new feature will be consumed.

Once enabled, we will try to normalize the transaction names by
replacing UUID, SHAs, numerical IDs with placeholders.

ref: getsentry/relay#1597
@olksdr olksdr merged commit 2a0a6b5 into master Nov 23, 2022
@olksdr olksdr deleted the feat/hard-trans-parse-rules branch November 23, 2022 13:33
jan-auer added a commit that referenced this pull request Nov 24, 2022
* master:
  ref(system): Introduce a generic recipient type (#1622)
  feat(transaction): Normalize transaction name (#1621)
  feat(server): Track metrics for OpenTelemetry events (#1618)
  bug(replays): Increase reserved space in payload chunks (#1601)
  feat(protocol): Add OpenTelemetry Context (#1617)
jan-auer added a commit that referenced this pull request Nov 24, 2022
* master:
  chore(toolchain): Fix lint and test for beta toolchain (#1623)
  ref(system): Introduce a generic recipient type (#1622)
  feat(transaction): Normalize transaction name (#1621)
  feat(server): Track metrics for OpenTelemetry events (#1618)
  bug(replays): Increase reserved space in payload chunks (#1601)
  feat(protocol): Add OpenTelemetry Context (#1617)
@jjbayer jjbayer mentioned this pull request Dec 13, 2022
6 tasks
jjbayer added a commit to getsentry/sentry that referenced this pull request Jan 19, 2023
Literal asterisks in transaction names (such as created by
getsentry/relay#1621) should not be confused
with the wildcard `*`, which for transaction name rules means "match
and replace".
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants