Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(auto_source_config): Derive in-app stack trace rules #87842

Merged
merged 11 commits into from
Mar 27, 2025

Conversation

armenzg
Copy link
Member

@armenzg armenzg commented Mar 25, 2025

This is the logic to mark frames as in-app when the files are found in the developers' GitHub repositories.

This is very important for languages where the SDK or our internal rules cannot determine which frames are in-app and which are not.

The initial language to get support for this is Java.

This will run in dry run mode to discover errors or scaling problems.

@armenzg armenzg self-assigned this Mar 25, 2025
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 25, 2025
Copy link

codecov bot commented Mar 25, 2025

Codecov Report

Attention: Patch coverage is 96.33028% with 4 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...uto_source_code_config/in_app_stack_trace_rules.py 87.50% 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #87842       +/-   ##
===========================================
+ Coverage   42.12%   87.74%   +45.61%     
===========================================
  Files        9892     9924       +32     
  Lines      561937   563914     +1977     
  Branches    22190    22190               
===========================================
+ Hits       236735   494815   +258080     
+ Misses     324771    68668   -256103     
  Partials      431      431               

expected_stack_root: str,
expected_source_root: str,
expected_stack_root: str | None = None,
expected_source_root: str | None = None,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this merges, this won't be necessary.

@@ -100,12 +102,13 @@ def _process_and_assert_configuration_changes(
)
code_mappings = RepositoryProjectPathConfig.objects.all()
assert len(code_mappings) == expected_num_code_mappings
code_mapping = code_mappings.filter(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block won't be necessary when #87776 merges.

@@ -133,7 +146,8 @@ def _process_and_assert_no_configuration_changes(
repo_trees: Mapping[str, Sequence[str]],
frames: Sequence[Mapping[str, str | bool]],
platform: str,
) -> tuple[list[CodeMapping], list[str]]:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to return in_app_stacrace_rules since we can test it within this function.
image

frames=[
{"module": "com.example.foo.Bar", "abs_path": "Bar.kt"},
{"module": "com.example.foo.Baz", "abs_path": "Baz.kt"},
{"module": "com.example.utils.Helper", "abs_path": "Helper.kt"},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's the utils app and the foo app under the same domain, thus, we will have one in-app stack trace rule but two code mappings.

@armenzg armenzg requested a review from MichaelSun48 March 25, 2025 14:28
@armenzg armenzg force-pushed the in_app_stacktrace/auto_source_config/armenzg branch from 2564abb to d519a3b Compare March 25, 2025 18:28
@armenzg armenzg requested a review from lobsterkatie March 25, 2025 20:07
@armenzg armenzg marked this pull request as ready for review March 25, 2025 20:33
@armenzg armenzg requested review from a team as code owners March 25, 2025 20:33
Copy link
Member

@lobsterkatie lobsterkatie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM! Can't wait to see the effects once it's turned on!

Comment on lines 118 to 123
enhancements_string = (
f"{automatic_enhancements}\n{enhancements_string}"
if enhancements_string
else automatic_enhancements
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This can be simpler. If enhancements_string is falsy, the worst that happens is we add an extra newline to the auto enhancements.

Suggested change
enhancements_string = (
f"{automatic_enhancements}\n{enhancements_string}"
if enhancements_string
else automatic_enhancements
)
enhancements_string = f"{automatic_enhancements}\n{enhancements_string}"


logger = logging.getLogger(__name__)

OPTION_KEY = "sentry:automatic_grouping_enhancements"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think of derived_grouping_enhancements as an option name rather than automatic_grouping_enhancements?

If I didn't know anything about how this worked, my first thought seeing the description "auto enhancements" would be that they were enhancements applied automatically. OTOH, that's not a lie - we do apply them automatically. OTOH, we also apply the default and custom enhancements automatically, so under my gut-reaction interpretation "automatic" isn't a great differentiator.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me 👍🏻

# For packages including the domain name, we want to have less granularity to create less rules, thus,
# reducing the number of rules applied to the stack trace.
# com/example/foo/bar -> com.example
if stacktrace_root.find("/") > 2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this want to be >= 2? You've already stripped the trailing slash, but you still want to catch examples like com/example/foo, right?

if len(parts) > 2:
stacktrace_root = "/".join(parts[:2])

return f"stack.module:{stacktrace_root.replace('/', '.')}.** +app"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this rule match something at the base level of the stacktrace root (module = com.example)? IOW, can .** match with nothing, or does the . mean that it'd have to be com.example. (trailing period) to match? (IDK if that's even a reasonable scenario, though.)

Also, does our glob matching treat . the same way it treats /? Put another way, does it recognize a dot-separated string as a path, or does it fall back to literal string matching?

assert event.data["metadata"]["in_app_frame_mix"] == "system-only"
assert len(event.data["hashes"]) == 1 # Only system hash
system_only_hash = event.data["hashes"][0]
first_enhancements_hash = event.data["grouping_config"]["enhancements"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get why you're referring to this as a "hash", but that means something different than the "hash" on the line before. Maybe first_enhancements_base64_string or something?

@armenzg armenzg changed the base branch from master to ref/multiple_code_mappings/auto_source_config/armenzg March 26, 2025 14:50
Base automatically changed from ref/multiple_code_mappings/auto_source_config/armenzg to master March 26, 2025 17:50
armenzg and others added 8 commits March 26, 2025 14:09
This is the logic to mark frames as in-app when the files are found in the developers' GitHub repositories.

This is very important for languages where the SDK or our internal rules cannot determine which frames are in-app vs not.

The initial language to get support for this is Java.

For now, this will run as dry-run to discover any errors or escaling problems.
Co-authored-by: Katie Byers <katie.byers@sentry.io>
@armenzg armenzg force-pushed the in_app_stacktrace/auto_source_config/armenzg branch from 2c93f6e to f52e04a Compare March 26, 2025 18:10
@armenzg armenzg changed the title feat(auto_source_config): Create in-app stack trace rules feat(auto_source_config): Derive in-app stack trace rules Mar 26, 2025
@@ -593,9 +593,12 @@ def get_path_from_module(module: str, abs_path: str) -> tuple[str, str]:
if "." not in module:
raise DoesNotFollowJavaPackageNamingConvention

# If module has a dot, take everything before the last dot
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes simplify this code block.

It also fixes a bug where we were not restricting the length of the stack root, thus, we would end up with com/example/ and many more levels deep (not intended behaviour).

There's a test down below that ensures we get the right number of levels.

Inputs:

  • repo_file="src/com/example/foo/bar/Baz.kt"
  • self.frame(module="com.example.foo.bar.Baz", abs_path="Baz.kt", in_app=False)

Expected:

  • stack_root="com/example/", source_root="src/com/example/"
  • in_app_rule="stack.module:com.example.** +app"

@@ -87,6 +88,7 @@ def get_config_dict(self, project: Project) -> GroupingConfig:
}

def _get_enhancements(self, project: Project) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious - where/how is this enhancement string used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Primarily, it is used as part of save_event to determine which frames should be considered for app or system hash calculations.

if stacktrace_root == "":
raise ValueError("Stacktrace root is empty")

parts = stacktrace_root.rstrip("/").split("/", 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to account for backslash paths too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for Java.
When we generate the code mappings, we replace periods (e.g. com.foo) with forward slashes (e.g. com/foo).

platform=self.platform,
expected_new_code_mappings=[self.code_mapping("com/example/", "src/com/example/")],
expected_in_app_stack_trace_rules=["stack.module:com.example.** +app"],
expected_num_code_mappings=0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is a bit confusing to me — the expected_num_code_mappings seems to conflict with the expected_new_code_mappings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is. I intend to fix it. Since we're in dry_run mode, it won't create them.

Comment on lines 24 to 25
if not platform_config.is_dry_run_platform():
project.update_option(DERIVED_ENHANCEMENTS_OPTION_KEY, "\n".join(rules))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be checking for duplicates here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locally, I have a test and a fix for this. I will push it later.

@armenzg armenzg enabled auto-merge (squash) March 27, 2025 12:40
@armenzg armenzg merged commit d3afa10 into master Mar 27, 2025
48 checks passed
@armenzg armenzg deleted the in_app_stacktrace/auto_source_config/armenzg branch March 27, 2025 12:50
andrewshie-sentry pushed a commit that referenced this pull request Mar 27, 2025
This is the logic to mark frames as in-app when the files are found in
the developers' GitHub repositories.

This is very important for languages where the SDK or our internal rules
cannot determine which frames are in-app and which are not.

The initial language to get support for this is Java.

This will run in dry run mode to discover errors or scaling problems.

---------

Co-authored-by: Katie Byers <katie.byers@sentry.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants